Per-application system metadata cache reduces startup performance #306
That looks like a bug in KRunner - the cache is transient by default and stored in memory, unless a location is explicitly defined by the client. When running, KRunner could either explicitly specify a cache location, or let AppStream choose one itself.
That won't help much as the system locale is not necessarily what the user specifies anyway.
This is the runner code: https://github.com/KDE/plasma-workspace/blob/19ba5f1cf34cbe5bfe71d3bf40e7cb8bae3520b7/runners/appstream/appstreamrunner.cpp#L139
On every startup I see it parsing a lot of XML data, which seems to be the main source of the memory use. The result of that parsing doesn't appear to be cached anywhere, at least not across multiple starts or even multiple processes.
FWIW, it could also be that I'm reading the KRunner code wrong and it doesn't reload on every search - but at the moment it looks like this is the case to me, and it would also explain your issue really well.
I don't think that is an issue here:
That's only relevant if the same process loads the pool multiple times, right? What I'm looking for is a cross-application cache somewhere in
AFAICT it also protects
It's called only once - when the first search happens; it's function-scoped.
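The pattern being described - loading the pool only when the first search happens, not on every query - can be sketched in C with `pthread_once` as a stand-in for the function-scope construct in the actual C++ runner code; `load_pool` and `search` are hypothetical names, not the real KRunner API:

```c
#include <pthread.h>

/* Counts how many times the expensive load actually ran. */
static int load_count = 0;
static pthread_once_t pool_once = PTHREAD_ONCE_INIT;

static void load_pool(void) {
    /* Imagine the expensive AppStream pool load / XML parse here. */
    load_count++;
}

/* Every search goes through here, but the load runs only once. */
int search(void) {
    pthread_once(&pool_once, load_pool);
    return load_count;
}
```

With this shape, repeated searches in the same process pay the parsing cost exactly once.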
There is no cross-application cache, each one creates its own cache, as apps may individually change the pool contents. But there will be a cache per-app, so once the thing is loaded, memory usage should go down. I'll have a look if that works correctly later.
Yeah, I was dumb there... Shouldn't have read this code so late at night; my brain completely ignored that part. So, the only way I see you getting this problem is if the pool doesn't free used memory once the cache was created, but keeps it. From a quick look at the code, this only ever happens if there was a problem with the cache itself, i.e. we failed to store something in LMDB - but that doesn't seem to be the case here. I'll run some tests when I'm back home.
The system cache works fine though? Moving the system cache from
But not persistent? At least that's what I'm seeing.
Heh - I have a "no public bug reports/issues at night" rule to avoid that ;-)
I don't think the memory is leaked; it's just never given back to the OS. So it's essentially lying around, mostly unused, after the pool got loaded. It's just high peak heap memory use, made up of small allocations (for XML nodes?) which libc can't easily unmap. There's also a huge performance difference.
A good rule - though getting rid of an issue quickly is neat too, especially for the reporter, and especially if my first reply was that this may not be an issue in my project but theirs ^^

Okay, I think I got to the bottom of this - and the solution surprised me quite a bit. So, tl;dr: your "huge memory usage" issue seems to be a mirage. Can you please try throwing a malloc_trim(0) into KRunner's code, to verify that this indeed works for your case and I wasn't looking at the wrong issue the whole time? It certainly "fixed" the supposed high memory usage in my case. It feels wrong for libappstream to call malloc_trim(0) itself, though.
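To make the suggestion concrete, here is a minimal, glibc-specific C sketch of what such a `malloc_trim(0)` call does. The 64-byte churn below is an illustrative stand-in for libxml2's node allocations, not AppStream's actual code:

```c
/* After a phase of many small allocations (like XML parsing), freeing
 * them leaves the memory in glibc's heap rather than returning it to
 * the OS. malloc_trim(0) asks glibc to release what it can. */
#include <malloc.h>
#include <stdlib.h>

int churn_then_trim(void) {
    enum { N = 200000 };
    void **blocks = malloc(N * sizeof *blocks);
    if (!blocks)
        return -1;
    for (int i = 0; i < N; i++)
        blocks[i] = malloc(64);   /* many small, XML-node-sized chunks */
    for (int i = 0; i < N; i++)
        free(blocks[i]);          /* freed, but typically kept by glibc */
    free(blocks);
    /* Returns 1 if memory was actually released back to the system. */
    return malloc_trim(0);
}
```

After such a churn, tools like `ps`/heaptrack show the RSS drop only once the trim runs; without it the freed pages stay attached to the process.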
You just confirmed my theory:
The issue here is not just about memory use (which calling malloc_trim(0) would help with); with the system cache, startup is also much faster.
But the cache is used and created. Just the first run will be slower, as a new cache has to be created.
I think we're finally converging on what this issue is about. The system cache does not have this problem, as it is properly shared between starts of the same application and also between different applications. What I'm asking for is the same feature without requiring root privileges: a transparently used per-user cache.
Yes, but that is a wontfix issue, as the cache is per-application - if it was global, any change to the pool done by Discover or any other app would impact other applications, making fault finding difficult. Also, the caches are only writable by a single process; having multiple processes write to the same cache will fail. Also also, having two processes create a cache at the same time is pretty hard to do properly.
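For what it's worth, the usual way around the "two processes create a cache at the same time" problem is write-to-temp-then-rename; since `rename(2)` is atomic on POSIX, readers see either the old cache or the complete new one, never a half-written file, and the last concurrent writer simply wins. A hedged C sketch (`write_cache_atomically` is a hypothetical helper, not AppStream API):

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Build the new cache under a unique temporary name in the target
 * directory, then atomically swap it into place with rename(2). */
int write_cache_atomically(const char *final_path,
                           const void *data, size_t len) {
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s.tmp.%ld", final_path, (long)getpid());
    FILE *f = fopen(tmp, "wb");
    if (!f)
        return -1;
    if (fwrite(data, 1, len, f) != len || fclose(f) != 0) {
        remove(tmp);
        return -1;
    }
    return rename(tmp, final_path);   /* atomic on POSIX filesystems */
}
```

Whether this plays nicely with LMDB's own file handling is a separate question, as discussed further down.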
Applications could just do an implicit
Mostly the memory use. Introducing a cache would fix both issues though.
There is no increased memory use though, that's what my previous message was about - memory usage is exactly the same, no matter whether the system cache existed or didn't exist before (I measured it). |
I think we're misunderstanding each other again. Without calling
Only for a single application though, and only on the first start after the metadata changed, to refresh the cache. After that, the cache would only be read. Also, at least krunner and plasmashell just use a pristine pool, so this could be detected and the cache written to disk directly.
If the cache is used, it doesn't do any XML parsing and thus doesn't allocate 150MiB of memory in the process. See also the issue description.
So, have you tried that now? If this works for your case, it means that memory has already been freed and isn't used anymore; it's just kept associated with the process in case it needs it back quickly. It's a performance feature of glibc.
No, this is actually about real memory usage. The allocated pages were written to, so the kernel can't just throw them away. If at all, they could be swapped out to move them out of main RAM, but they can't simply be discarded. This really is about 150MiB of wasted memory per process. |
I really do wonder whether we are talking past each other or actually talking about completely different things. Repeating the exact same thing with the system cache yielded the same results.

This was all with AppStream's Git master though, and I recently, completely by accident, fixed a huge memory leak in AppStream's Qt bindings: basically, whenever you searched anything that could return multiple results, the search results were never properly freed. So the more searches you did, the more memory you would use.

I am against adding malloc_trim(0) calls to libappstream itself.
With system cache,
I don't think that's related (though it's definitely good to know!), because my test search is "xterm"/"asdf" and neither of those produce any results.
I agree.
Is
Currently it wastes 150MiB of RAM in every process using AppStream, every time; with proper caching it would only do that once, in a single process, after each metainfo change.
Definitely not; I deliberately have tons of data. Most of it is in YAML format though. You are on openSUSE, I guess? In that case you would have more XML data, and we allocate and free even more small memory chunks when parsing XML.
It definitely isn't at the moment. And generating that cache also wouldn't really help you much - krunner is one of the earliest apps to start, so it would usually be the application that creates this cache in the first place. That helps KDE Discover's and other apps' startup performance, but that would be about it.

Still, I don't get your memory issue - if we once requested a lot of memory, then freed it again, surely the OS will be able to reclaim the free pages now that we don't use them anymore... If malloc_trim is required to do that (I recently found http://xmlsoft.org/xmlmem.html which makes it seem like it is), then krunner should just call it so it uses the minimal amount of memory, even if it just wrote its temporary cache (which it will always do, actually, even with a system cache active). Might make future heap allocations slower for a bit of time though.
Yes, the 150MiB are all caused by allocations inside libxml. If you need some data to reproduce the issue, take one of the files from http://download.opensuse.org/tumbleweed/repo/oss/repodata/?P=*appdata.xml.gz and put it into
Neither krunner nor plasmashell call malloc_trim.
Yeah, that should work, as long as LMDB cooperates and doesn't try to do any path based access after opening the cache initially.
That depends on what the libc memory allocator does. Generally it only unmaps (i.e. gives back) freed memory it mapped directly, which is usually done only for big allocations served via mmap().
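A small glibc-specific probe of that behaviour, using `mallinfo2()` (available in glibc ≥ 2.33); treat it as a demo of the allocator, not something AppStream does:

```c
#include <malloc.h>
#include <stdlib.h>

/* Returns 1 if a 1 MiB allocation was served via mmap(), visible as an
 * increase in mallinfo2()'s hblkhd (mmapped bytes) counter. glibc uses
 * mmap() for requests above its threshold (128 KiB by default), and
 * unmaps such chunks, i.e. gives them back to the OS, on free().
 * Small chunks instead stay in the process's heap after free(). */
int big_alloc_uses_mmap(void) {
    struct mallinfo2 before = mallinfo2();
    void *big = malloc(1 << 20);
    struct mallinfo2 after = mallinfo2();
    int mmapped = big && after.hblkhd > before.hblkhd;
    free(big);
    return mmapped;
}
```

This is exactly why millions of tiny XML-node allocations behave so differently from a few large buffers: the former never cross the mmap threshold.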
Yes, I'm thinking about that as a workaround; it might even get accepted upstream. I'll play around with it. Neither of those would benefit from the 10x performance increase the cache provides though.
You got to the main point of this bug report: If this behaviour is changed, the issue would mostly disappear.
I still think the memory argument is very weak, but the startup performance argument is very real. AppStream's caching code will get a major rewrite soon (some time this year, maybe...). Fortunately the current caching code is rather easy to read, with the exception of some insane locking going on in AsPool (but that's a mess to untangle at a different time). The new implementation will share a cache for system data between all apps, either using the system one if it's up-to-date, or otherwise creating one per-user. This operation should be atomic (and seems to work fine in my limited tests). This change is very invasive though, so please give it a lot of testing!
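The "use the system cache if up-to-date, otherwise a per-user one" decision could look roughly like this mtime-based C sketch; the paths and the freshness rule are assumptions for illustration, not the actual AppStream logic:

```c
#include <stdbool.h>
#include <sys/stat.h>

/* A cache is fresh if it exists and is at least as new as the
 * metadata it was built from. */
bool cache_is_fresh(const char *cache_path, const char *metadata_path) {
    struct stat cache_st, meta_st;
    if (stat(cache_path, &cache_st) != 0)
        return false;                 /* no cache yet */
    if (stat(metadata_path, &meta_st) != 0)
        return true;                  /* no metadata: nothing is newer */
    return cache_st.st_mtime >= meta_st.st_mtime;
}

/* Prefer the read-only system-wide cache when fresh; otherwise fall
 * back to a per-user cache the app may (re)create itself, e.g. under
 * $XDG_CACHE_HOME. Returns one of its input pointers. */
const char *choose_cache(const char *system_cache,
                         const char *metadata,
                         const char *user_cache) {
    return cache_is_fresh(system_cache, metadata) ? system_cache
                                                  : user_cache;
}
```

The nice property is that no root privileges are needed on the fallback path, which matches what was asked for above.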
Finally had some time to get back to this.
For YAML I agree, as that apparently doesn't need nearly as much memory allocation.
\o/ Thanks a lot! I had a look at the code and might have found some issues (or I'm just misunderstanding the code):
I built latest master (49b937c) and played around a bit with appstreamcli, krunner and plasmashell. I couldn't find any issues so far. The only thing I noticed is that the first run is slower; after that, the cache is established.

I also noticed some rather odd wording. Is there anything else I should test specifically?

I suppose the cache refreshing done on repo updates by the package manager could be dropped now. This way the cache is only generated when it's actually needed by an application, and always in the correct locale. The additional ~1s on initial cache creation happens in separate threads anyway (at least in discover, plasmashell and krunner), so it shouldn't be an issue.
Apparently the fix for this is part of the latest release now. Were any of the issues I found addressed?
This apparently happens sometimes, and with cold caches it means we will load all data twice, leading to a pretty big startup delay. CC: #306
I noticed that krunner uses over 150MiB of private RAM when idle, which I tracked down to be caused by the AppStream module.
Using heaptrack, I can see that most of that is allocated by `AppStream::Pool::load` and its callees.

I tracked that down further by running `G_MESSAGES_DEBUG=all appstreamcli search xterm`, which printed `System cache is stale, ignoring it.`. Running `appstreamcli refresh-cache` manually fixed that.

That's already done by the package manager on repo metadata downloads though, but looking at the generated files I noticed that they are locale specific! The locale the package manager runs as is in most cases not the user's locale: a local root shell uses `POSIX`, `packagekitd` uses `C` as locale, etc. Even if it matched, the user could change it at any point to one different from the system locale and end up with a mysteriously slower system with double (!) the usual RAM use (krunner, plasmashell).

It's much more likely that applications running as the user interact with AppStream metainfo, so IMO it makes sense to keep the cache somewhere in ~, even if only as a fallback if the system cache is not usable.
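To illustrate the locale mismatch, here is a hypothetical C sketch of a locale-keyed cache file name; `cache_name_for_locale` is an illustration of why a cache refreshed under `C`/`POSIX` misses for a `de_DE.UTF-8` session, not AppStream's real naming scheme:

```c
#include <stdio.h>
#include <string.h>

/* Derive a cache file name from a locale string, stripping the
 * encoding/modifier suffix: "de_DE.UTF-8" -> "de_DE". A daemon running
 * under "C" or "POSIX" therefore produces a different file name than
 * the user's session, so its refreshed cache is never picked up. */
void cache_name_for_locale(const char *locale, char *out, size_t outlen) {
    char base[64];
    snprintf(base, sizeof base, "%s", locale);
    base[strcspn(base, ".@")] = '\0';
    snprintf(out, outlen, "appstream-cache-%s.mdb", base);
}
```

With a scheme like this, each process only ever looks for the cache matching its own locale, which is exactly the failure mode described above.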