-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Find emulator unnecessarily greedy #184
Comments
Yeah, we've hit similar problems elsewhere, but I hadn't seen it as much here. Soong also has a similar concept, and flat out ignores directories with certain "marker" files in them, so it never recurses into output directories. But for legacy reasons, I suspect kati's find emulator may still need to know about the current output directory. At first glance, I kinda like option 1 -- it would also mean that once everything in a certain subdir got converted to Soong, we'd no longer need to walk that directory with Kati (at least with enough cleanup). But as you say, it could be a large project. And would we limit it to each top level directory, or something further down? It's possible that someone is using a directory to store their output directories... (and we'd want further down for some of the soong conversion optimizations) Option 3 seems to only work if you've got problematic symlinks, but not if there are actually multiple output trees in the source tree. I guess with option 2 we'd read the full list in soong_ui, filter out the known output dirs, and then add back the current output directory? I may take a quick look at option 1 and see if a lazy loading approach would work. It may not be too bad. |
Thanks for you comment Dan. I hadn't realised that some Android.mk's actually use find commands in the output directory! It shouldn't make too much difference for either option 1 or 2 though. I think with option 1 you don't have to assume anything about top-level directories, instead you just cache the results every time you emulate a find command for a given path (which means that you never traverse more directories than the actual find command would have traversed). For option 2 I was thinking about having some kind of hardcoded top-level directories list in Soong (+ the current output directory), but this may not be ideal. Soong should know the path of all projects though, so it could in principle figure out what are the top-level directories. |
Soong doesn't know the paths that are in your source -- it's agnostic to whether you're using git / tarballs / perforce / etc. (We may care about folders that don't contain Android.mk / Android.bp files too) I'm still planning on looking at the on-demand solution, but another option I just thought about is to use the marker files to prune the list, and use the fallback path if someone does use find / etc into those directories. (If the find emulator doesn't understand the find command, or it's pointing outside the directory the find emulator cached, it falls back to actually calling Separately, I'd love to get rid of that fallback path, since it's an easy way to make your build times much longer than they should be, but that's a separate topic. |
Using marker files and falling back if needed sounds like a good solution for the time being, it also makes it consistent with Soong's own finder. That shouldn't really increase the use of the fallback path, since these directories are not supposed to be looked into in the first place. |
Initialing the find emulator was taking longer and longer the more files you had under your source tree (including multiple output directories). In some of my tests on AOSP: 1.81s (1025k nodes) 3.29s (1707k nodes) 4.97s (2180k nodes) 5.83s (2736k notes) Issue google#186 mentions times of >10 minutes, and this was a report of nearly 22 minutes (3071k notes): https: //groups.google.com/forum/#!msg/android-building/ruqXTcxicFQ/9i5HV1OvAQAJ Change-Id: Ib69ef99a97574c97b9fe7c396fcdfc51276b08d2 --- With this change, my tests from above take 0.37s and only load 143k nodes, regardless of how many files actually exist. This change essentially just delays the loading of directories and symlinks until the first time we attempt to access them, instead of loading them all the first time we ever run 'find'. The improvement to our internal builds doesn't look as nice, but they're still less than half the time / nodes loaded compared to before this change. Fixes: github issue google#184 Test: AOSP build-aosp_flame.ninja is the same before/after Test: internal build-flame.ninja is the same before/after Change-Id: Idb93e096e3f968c7c6b513dea5cbe34786e7edb4
Initializing the find emulator was taking longer and longer the more files you had under your source tree (including multiple output directories). In some of my tests on AOSP: 1.81s (1025k nodes) 3.29s (1707k nodes) 4.97s (2180k nodes) 5.83s (2736k notes) Issue google#186 mentions times of >10 minutes, and this was a report of nearly 22 minutes (3071k notes): https: //groups.google.com/forum/#!msg/android-building/ruqXTcxicFQ/9i5HV1OvAQAJ Change-Id: Ib69ef99a97574c97b9fe7c396fcdfc51276b08d2 --- With this change, my tests from above take 0.37s and only load 143k nodes, regardless of how many files actually exist. This change essentially just delays the loading of directories and symlinks until the first time we attempt to access them, instead of loading them all the first time we ever run 'find'. The improvement to our internal builds doesn't look as nice, but they're still less than half the time / nodes loaded compared to before this change. Fixes: github issue google#184 Test: AOSP build-aosp_flame.ninja is the same before/after Test: internal build-flame.ninja is the same before/after Change-Id: Idb93e096e3f968c7c6b513dea5cbe34786e7edb4
Initializing the find emulator was taking longer and longer the more files you had under your source tree (including multiple output directories). In some of my tests on AOSP: 1.81s (1025k nodes) 3.29s (1707k nodes) 4.97s (2180k nodes) 5.83s (2736k notes) Issue google#186 mentions times of >10 minutes, and this was a report of nearly 22 minutes (3071k notes): https://groups.google.com/forum/#!msg/android-building/ruqXTcxicFQ/9i5HV1OvAQAJ With this change, my tests from above take 0.37s and only load 143k nodes, regardless of how many files actually exist. This change essentially just delays the loading of directories and symlinks until the first time we attempt to access them, instead of loading them all the first time we ever run 'find'. The improvement to our internal builds doesn't look as nice, but they're still less than half the time / nodes loaded compared to before this change. Fixes: github issue google#184 Test: AOSP build-aosp_flame.ninja is the same before/after Test: internal build-flame.ninja is the same before/after Change-Id: Idb93e096e3f968c7c6b513dea5cbe34786e7edb4
Initializing the find emulator was taking longer and longer the more files you had under your source tree (including multiple output directories). In some of my tests on AOSP: 1.81s (1025k nodes) 3.29s (1707k nodes) 4.97s (2180k nodes) 5.83s (2736k notes) Issue google#184 mentions times of >10 minutes, and this was a report of nearly 22 minutes (3071k notes): https://groups.google.com/forum/#!msg/android-building/ruqXTcxicFQ/9i5HV1OvAQAJ With this change, my tests from above take 0.37s and only load 143k nodes, regardless of how many files actually exist. This change essentially just delays the loading of directories and symlinks until the first time we attempt to access them, instead of loading them all the first time we ever run 'find'. The improvement to our internal builds doesn't look as nice, but they're still less than half the time / nodes loaded compared to before this change. Fixes google#184 Test: AOSP build-aosp_flame.ninja is the same before/after Test: internal build-flame.ninja is the same before/after Change-Id: Idb93e096e3f968c7c6b513dea5cbe34786e7edb4
Initializing the find emulator was taking longer and longer the more files you had under your source tree (including multiple output directories). In some of my tests on AOSP: 1.81s (1025k nodes) 3.29s (1707k nodes) 4.97s (2180k nodes) 5.83s (2736k notes) Issue google#184 mentions times of >10 minutes, and this was a report of nearly 22 minutes (3071k notes): https://groups.google.com/forum/#!msg/android-building/ruqXTcxicFQ/9i5HV1OvAQAJ With this change, my tests from above take 0.37s and only load 143k nodes, regardless of how many files actually exist. This change essentially just delays the loading of directories and symlinks until the first time we attempt to access them, instead of loading them all the first time we ever run 'find'. The improvement to our internal builds doesn't look as nice, but they're still less than half the time / nodes loaded compared to before this change. Fixes google#184 Test: AOSP build-aosp_flame.ninja is the same before/after Test: internal build-flame.ninja is the same before/after Change-Id: Idb93e096e3f968c7c6b513dea5cbe34786e7edb4
Initializing the find emulator was taking longer and longer the more files you had under your source tree (including multiple output directories). In some of my tests on AOSP: 1.81s (1025k nodes) 3.29s (1707k nodes) 4.97s (2180k nodes) 5.83s (2736k notes) Issue google#184 mentions times of >10 minutes, and this was a report of nearly 22 minutes (3071k notes): https://groups.google.com/forum/#!msg/android-building/ruqXTcxicFQ/9i5HV1OvAQAJ With this change, my tests from above take 0.37s and only load 143k nodes, regardless of how many files actually exist. This change essentially just delays the loading of directories and symlinks until the first time we attempt to access them, instead of loading them all the first time we ever run 'find'. The improvement to our internal builds doesn't look as nice, but they're still less than half the time / nodes loaded compared to before this change. Fixes google#184 Test: AOSP build-aosp_flame.ninja is the same before/after Test: internal build-flame.ninja is the same before/after Change-Id: Idb93e096e3f968c7c6b513dea5cbe34786e7edb4
I have been working on Android for a few years now, and I have been facing a rather annoying issue for almost as long: whenever the build system decides that all the Ninja files need to be regenerated and the Makefiles included again, kati gets stuck on including
art/Android.mk
, sometimes for a very long time (10 min+). Other members of my team have also encountered that issue.A few weeks ago, I finally realised why this happens. Since Soong passes
--use_find_emulator
, when kati encounters the firstfind
in the Makefiles (which happens to be inart/Android.mk
), it callsConstructDirectoryTree("")
, creating a representation of the whole tree in memory. This is already a very greedy approach when starting from a clean tree, but in my case it gets much worse, as I store multiple output directories in the root of the tree. In fact, since I don't have enough space on my SSD to store all those output directories, I instead store them on HDD and put symlinks at the root. As a result, kati spends a lot of time constructing a tree of irrelevant directories on a slow medium. #86 seems to complain about a similar same issue, but AFAICT the underlying issue didn't actually get resolved.Unsurprisingly, disabling the find emulator results in a large performance increase in my situation. However, I can see how that approach can be useful; I think it just needs to be made less greedy. I can think of a few ways to achieve that:
find
command ever starts from the root of the tree, it always specifies a subdirectory. By indexing only the relevant subdirectories as encountered, unrelated directories would not cause any issue.--use_find_emulator
, so that only those are indexed.Option 1 is probably the best, but requires refactoring the find emulator quite a bit. Option 2 is much easier to implement but requires a corresponding change in Soong. Option 3 is probably too ad-hoc.
Any thoughts?
The text was updated successfully, but these errors were encountered: