Subninja support and multithreaded loading #112
Conversation
Force-pushed from aa7d8f3 to 60055f0.
CI is passing now.
Wow, impressive! I was just wondering this weekend whether you were still tinkering in this area... Some minor thoughts I had while skimming this:
By the way, did you ever get the copy of the Android ninja files I shared with you after you requested them in #94? You should have a shared Google Drive zip.
Thanks! I grabbed a copy onto my Google Drive. I haven't looked into the rest of this PR yet, sorry. :( It seems pretty fancy though; do you think you'll use it for Android going forward?
This removes the need for maintaining two maps, one from ids to files and one from strings to ids.
These turn out to be faster than spawning individual tasks, even if it requires more single-threaded parts of the code.
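The commit doesn't spell out what "these" are; assuming it means a fixed set of scoped worker threads that each take one chunk of the work, a minimal sketch of the pattern might look like this (parse_chunk and parse_all are hypothetical stand-ins, not the PR's code):

```rust
use std::thread;

// Hypothetical stand-in for parsing one chunk of statements.
fn parse_chunk(chunk: &[String]) -> usize {
    chunk.iter().map(|line| line.len()).sum()
}

// Split the work into one chunk per thread instead of spawning a task per
// item; the split and the final merge are the extra single-threaded parts.
fn parse_all(lines: &[String], threads: usize) -> usize {
    let chunk_size = lines.len().div_ceil(threads).max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = lines
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || parse_chunk(chunk)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}
```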
Using Arc<String> allows us to not copy the string into both the hashmap key and the File object.
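A minimal sketch of that sharing, with placeholder FileId/File shapes rather than n2's actual types:

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Placeholder shapes for illustration; n2's real graph types differ.
#[derive(Clone, Copy)]
struct FileId(usize);

struct File {
    name: Arc<String>, // shares one allocation with the map key below
}

fn id_for_name(
    by_name: &mut HashMap<Arc<String>, FileId>,
    files: &mut Vec<File>,
    name: String,
) -> FileId {
    let name = Arc::new(name);
    if let Some(&id) = by_name.get(&name) {
        return id;
    }
    let id = FileId(files.len());
    // Cloning the Arc bumps a refcount; the string bytes are never copied.
    files.push(File { name: Arc::clone(&name) });
    by_name.insert(name, id);
    id
}
```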
Keep the statements around as a Vec<Vec<Statement>> even after parsing, because copying all the builds into a new vector takes too long and causes more memory usage.
Another attempt to reduce copies / memory usage
Using the final graph::Build object is faster because it needs fewer copies between different vector types. It will also allow us to only evaluate the bindings of the Build objects we actually use.
This saves ~0.3 seconds of merging the vecs of builds.
This allows us to not spend time evaluating all the build bindings, but we still need to evaluate all the global variables if we want to not keep the file in memory.
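A toy illustration of the lazy-evaluation idea; the Build and expand shapes here are invented for the example and are much simpler than n2's real binding evaluation:

```rust
use std::collections::HashMap;

// Bindings are kept as raw (name, value) text spans after parsing.
struct Build<'text> {
    unevaluated_bindings: Vec<(&'text str, &'text str)>,
}

impl<'text> Build<'text> {
    // Only called for builds that end up in the requested work, so builds we
    // never touch never pay for $-expansion.
    fn evaluate_bindings(&self, globals: &HashMap<String, String>) -> Vec<(String, String)> {
        self.unevaluated_bindings
            .iter()
            .map(|&(name, value)| (name.to_string(), expand(value, globals)))
            .collect()
    }
}

// Stand-in for real $variable expansion: replace "$name" tokens with their
// global values and leave other tokens untouched.
fn expand(value: &str, globals: &HashMap<String, String>) -> String {
    value
        .split_whitespace()
        .map(|tok| match tok.strip_prefix('$') {
            Some(var) => globals.get(var).cloned().unwrap_or_default(),
            None => tok.to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```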
We only need these within the lifetime of 'text, so we can make them references instead.
Surprisingly this seems to have a measurable ~0.1s performance increase. Also removed some unwraps in the VariableAssignment's evaluation, but those didn't seem to help as much.
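A rough sketch of the reference-into-'text idea; the field names and parse_assignment helper are hypothetical, not the parser's actual code:

```rust
// Rough illustration only; n2's parser types differ in detail.
struct VariableAssignment<'text> {
    name: &'text str,
    unevaluated: &'text str,
}

// Both halves borrow straight out of the loaded file text, so nothing is
// copied into owned Strings during parsing.
fn parse_assignment<'text>(line: &'text str) -> Option<VariableAssignment<'text>> {
    let (name, value) = line.split_once('=')?;
    Some(VariableAssignment {
        name: name.trim(),
        unevaluated: value.trim(),
    })
}
```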
Unmapping files is slow.
So that we don't have to allocate memory for their parsed representations.
This should also improve performance of evaluation in general slightly.
Windows doesn't support mmap (at least not with the same APIs).
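For illustration, one way to structure the platform split, assuming the memmap2 crate for the Unix path (the PR itself may call mmap directly, and the "unmapping is slow" commit suggests the mapping is deliberately never unmapped before exit, e.g. by leaking it; both are assumptions here):

```rust
use std::fs;
use std::ops::Deref;

// The loaded file text: an mmap on Unix, an owned buffer elsewhere.
enum FileText {
    #[cfg(unix)]
    Mapped(memmap2::Mmap),
    Owned(Vec<u8>),
}

impl Deref for FileText {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        match self {
            #[cfg(unix)]
            FileText::Mapped(map) => map,
            FileText::Owned(buf) => buf,
        }
    }
}

#[cfg(unix)]
fn read_build_file(path: &str) -> std::io::Result<FileText> {
    let file = fs::File::open(path)?;
    // Safety: the file must not be truncated while the parser borrows into it.
    Ok(FileText::Mapped(unsafe { memmap2::Mmap::map(&file)? }))
}

#[cfg(not(unix))]
fn read_build_file(path: &str) -> std::io::Result<FileText> {
    Ok(FileText::Owned(fs::read(path)?))
}
```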
Needed for usize::next_multiple_of().
Force-pushed from 60055f0 to 9cc13a8.
Do you mean this PR or n2 in general? For the PR, yes, we definitely need some kind of load speed improvement; 16 seconds to start ninja would cause riots. For n2 in general, I've started the process of forking it into Android, but there are still a number of features that need to be added. I'm hoping to switch to it soon because there are some other teams who either want to add more ninja features, or who have already made their own forks with new features and are now contemplating merging their code back into the Android fork (e.g. abfs, a filesystem that communicates the files read by actions back to the build system). I also want to implement action sandboxing. I think it would be easier to scale to all these new features if they were written in Rust as opposed to C++.
I am glad it is working out! I agree that working from Rust will hopefully make it easier to extend. One of my goals with n2 was to make it separable enough that it's hopefully easier to reason about. BTW I think you'll need #50 if you want to use n2 in production. Or maybe even consider using a real database, I dunno -- the load time is in the critical path so I'm kind of fond of my tricky hack. In any case I'd be happy to give advice on any pieces you end up needing to interact with, feel free to email about them or file bugs. (In particular I have some ideas on how n2 might implement content hashing for files, which is useful both for dirty checking but also for interacting with cloud build systems, and how that can nicely fit into n2's model.)
Yeah I'm aware of that issue. Though what "tricky hack" are you talking about?
Thanks! Though I think I have a good idea about what needs to be done. What would be most helpful would be if you could review changes like this one faster, but I understand that's a lot of work to ask of you. So I'm going to try to get a fork going so we can bring it up to speed faster.
Yeah, I'd like to add that feature as well.
The custom "database" defined in db.rs. I think Ninja has some extra goop in it for the databasey corner cases, where each row is suffixed with a parity check to catch partial writes to it.
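For concreteness, here is a purely illustrative version of that kind of guard (not Ninja's or n2's actual on-disk format): each appended record carries a trailing checksum, and a truncated or corrupt tail is discarded on the next load.

```rust
// Append one length-prefixed record followed by its checksum.
fn append_record(log: &mut Vec<u8>, record: &[u8]) {
    log.extend_from_slice(&(record.len() as u32).to_le_bytes());
    log.extend_from_slice(record);
    log.extend_from_slice(&checksum(record).to_le_bytes());
}

// Read back records, stopping at the first partial or corrupt entry.
fn read_records(mut log: &[u8]) -> Vec<&[u8]> {
    let mut records = Vec::new();
    while log.len() >= 4 {
        let len = u32::from_le_bytes(log[..4].try_into().unwrap()) as usize;
        if log.len() < 4 + len + 4 {
            break; // truncated tail: a partial write, ignore it
        }
        let record = &log[4..4 + len];
        let stored = u32::from_le_bytes(log[4 + len..4 + len + 4].try_into().unwrap());
        if stored != checksum(record) {
            break; // corrupt tail record, stop here
        }
        records.push(record);
        log = &log[4 + len + 4..];
    }
    records
}

fn checksum(data: &[u8]) -> u32 {
    // Toy parity-style check; a real log would use a proper CRC.
    data.iter().fold(0u32, |acc, b| acc.rotate_left(5) ^ *b as u32)
}
```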
dashmap is currently unused, but will be required after merging in evmar/n2#112

Bug: 318434287
Bug: 328273370
Test: added n2 to build-prebuilts.sh, and ran OUT_DIR=out prebuilts/build-tools/build-prebuilts.sh
Change-Id: Ied5af144b8b2c21b385521be430f9d0313c00d93
Fixes #109
Fixes #108
This PR ports over most of the enhancements that are in Android's fork of ninja to n2. n2 goes from taking ~16 seconds to parse the AOSP ninja files to ~1 second (at least on my work computer, where available_parallelism returns 128). It still seems to be ~0.1 seconds slower than the Android fork of ninja, but it's at least in the range where it's a little difficult to measure.
A general list of the changes in this PR:
- id_from_canonical can be called in parallel (a rough sketch of the idea follows below).

One thing that the Android fork has that I didn't include was precomputed hashes. The Android fork has a HashedString and HashedStrView type that it uses to ensure hashes aren't recalculated. This is somewhat difficult to do in Rust: you have to either make concessions about being able to look up map values by their borrowed representation, use the raw entry map APIs on nightly Rust, or use hashbrown's RawTable. And even when I was experimenting with RawTable, it didn't seem to provide a noticeable speedup.
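The AOSP change above mentions dashmap becoming a dependency once this lands, so a parallel id_from_canonical presumably looks something like the sketch below; the FileId shape and the counter details are guesses for illustration, not the PR's actual code.

```rust
use dashmap::DashMap;
use std::sync::atomic::{AtomicUsize, Ordering};

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct FileId(usize);

struct GraphFiles {
    by_name: DashMap<String, FileId>,
    next_id: AtomicUsize,
}

impl GraphFiles {
    // entry() locks only the shard for this key, so many loader threads can
    // intern canonical paths at the same time.
    fn id_from_canonical(&self, path: String) -> FileId {
        *self
            .by_name
            .entry(path)
            .or_insert_with(|| FileId(self.next_id.fetch_add(1, Ordering::Relaxed)))
    }
}
```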
Some of these optimizations are memory optimizations. This turned out to be necessary because apparently the exit_group syscall that quits the process spends time cleaning up memory; it was taking over a second to run before some of these optimizations.
I've also bumped the Rust version needed to compile n2 from 1.70 to 1.73, to be able to use usize::next_multiple_of. But if we want to keep the Rust version low, I can just manually write a next_multiple_of.
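If staying on an older Rust mattered, a hand-written replacement could look like this (matching usize::next_multiple_of for the non-overflowing, nonzero-divisor cases):

```rust
// Round `lhs` up to the nearest multiple of `rhs`.
// e.g. next_multiple_of(13, 8) == 16, next_multiple_of(16, 8) == 16.
fn next_multiple_of(lhs: usize, rhs: usize) -> usize {
    match lhs % rhs {
        0 => lhs,
        rem => lhs + (rhs - rem),
    }
}
```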