Diagnose extremely slow generate_hash_key time #497
Comments
@kwaegel How did you diagnose this? I'd be curious to check hash creation times for my own project; I see questionably slow speedups as well. I work on a multi-platform project: ccache improves build times from 5 minutes to 20 seconds, but so far I see nothing similar with sccache on Windows (I have not checked sccache on Linux yet).
Forget the remark, I was able to diagnose it as well now (using RUST_LOG="sccache::compiler::compiler=debug"). Do you happen to have code with very deep header include hierarchies? In other words: is an expanded (preprocessed) translation unit extremely large for your project?
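For anyone else who wants to reproduce this diagnosis, a minimal sketch of enabling the debug logging mentioned above (assumes a POSIX shell and sccache on PATH; the env var name is taken from this thread, and newer sccache releases may use SCCACHE_LOG instead):

```shell
# Restart the sccache server with compiler-module debug logging enabled,
# so per-file hashing/preprocessing timings show up in the server log.
command -v sccache >/dev/null && sccache --stop-server || true
export RUST_LOG="sccache::compiler::compiler=debug"
command -v sccache >/dev/null && sccache --start-server || true
```

Subsequent compilations through sccache should then log where the time is spent.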
See the just-created related issue #504.
@apriori Your suspicion is correct. I haven't checked the preprocessed source size, but deep (and wide) include chains are an ongoing problem with the project in question; the include-processing step is over half of the compilation time for many of the files. I have (long-pending...) plans to fix this, but it's going to require some significant reorganization. Your suspicion about SHA-512 seems like a good candidate. If you create a fork with a different hash function, I'd be happy to give it a shot and share performance deltas. (I'd be willing to help with development too, but I'm not quite as familiar with Rust as I am with C++.)
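Before forking, it may be worth sanity-checking raw digest throughput on the machine in question. A rough sketch using coreutils (the 64 MB size is an arbitrary stand-in for a large preprocessed translation unit):

```shell
# Compare SHA-512 against a cheaper digest on a file roughly the size of
# a big preprocessed translation unit. If both finish in well under a
# second, the hash function itself is unlikely to be the bottleneck.
dd if=/dev/zero of=/tmp/tu.i bs=1M count=64 2>/dev/null
time sha512sum /tmp/tu.i
time sha1sum /tmp/tu.i
```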
@kwaegel
@kwaegel
and investigate these lines. What is your size/time ratio?
Ok, this is weird output:
Well, it's just wrong logging/measuring. The "Hashed 0 files in 2.836s" line runs as a continuation of future::join_all, so it measures whatever was done before it, not just the portion in question — and in my case that is the preprocessor running. Is that perhaps the actual bottleneck?
I think that
Yes it is. Measuring a future's invocation without a composable abstraction (hello, monad stacks?) is a pain, as it becomes very intrusive and therefore error-prone.
FYI: "actual generate_hash_key" is really measuring "hash_key", and at least in my local fork its runtime is completely negligible. So the runtime was completely dominated by running the preprocessor. Even more funny:
Edit: additional remark. Piping to a file has equivalent runtime to asking cl.exe to preprocess to a file (~3.8s), so somewhere a lot of time gets lost.
Additional remark: even running MSVC with "/Zs" (syntax check only) combined with "/showIncludes" results in nearly identical runtime to full preprocessing. So unless one wants to roll a completely separate parser, this is the lower bound for execution time on Windows. In my case (due to heavy use of Boost and Eigen) I get 4784 includes for a single translation unit, of which 2337 are unique. So even if one implemented something like direct mode for sccache, these duplicates would need to be cached as well, so one does not compute insane amounts of redundant hashes (context: #219).
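The total-vs-unique include counts above can be extracted mechanically from a captured /showIncludes log. A sketch, assuming the standard "Note: including file:" prefix (the sample paths below are made up for illustration):

```shell
# Build a tiny sample /showIncludes log; real logs come from
# `cl /showIncludes ...` redirected to a file.
cat > /tmp/includes.log <<'EOF'
Note: including file: C:\boost\config.hpp
Note: including file:  C:\boost\version.hpp
Note: including file: C:\boost\config.hpp
EOF
grep -c 'including file:' /tmp/includes.log                        # total includes (3 here)
sed 's/.*including file: *//' /tmp/includes.log | sort -u | wc -l  # unique includes (2 here)
```

The gap between the two numbers indicates how much redundant hashing a naive direct-mode implementation would do.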
It sounds like the core problem here is just "you have a lot of includes and the preprocessor is slow". There's not much we can do about that, unfortunately. You might look into using precompiled headers if you haven't already. I'm not sure if sccache can cache PCH compilations but if that speeds up your build that'd be a reasonable feature request. |
Yes, the sad thing is that this is merely a 200k LOC codebase. And all the file in question does is use
Hmm. Let me add a couple of logs to see if they make any sense here. This is from one file I've identified as reliably problematic, with a hot cache. Total wall time (including a couple of dependencies that I can't easily remove) is 26s, so the 13s below is half the build time.
For reference, here's a comparison with ccache. This isn't an ideal test case, since there are a couple of third-party dependencies of my file that I can't easily remove. For longer builds with more of my problematic files, the generate_hash_key time starts to dominate the hot-cache case. Also of interest, here is a copy of the
@kwaegel What is your ccache performance if you force it not to use direct mode?
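For an apples-to-apples comparison, ccache's direct mode can be disabled so that every lookup runs the preprocessor, as sccache does. A sketch using the CCACHE_NODIRECT environment variable (the rebuild command is project-specific and only shown as a comment):

```shell
# Disable ccache's direct mode so every cache lookup goes through the
# preprocessor, matching sccache's behaviour; then rebuild and compare.
export CCACHE_NODIRECT=1
# e.g.: time make -j"$(nproc)"
```

If ccache without direct mode is similarly slow, the difference really is direct mode, not anything sccache-specific.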
I have an (unfortunately private) project that is seeing very little speedup from sccache. Checking the server logs, generate_hash_key is taking an extremely long time, upwards of 30–200 seconds per file. Is there a good way to diagnose what is going on here?

My initial guess is that the preprocessor is just extremely slow, so the lack of direct mode is hurting me (ccache builds the entire project in 3 minutes with a hot cache), but I'm not sure how to verify this.