Add build profile. #6577
I have some concerns about this, so I wanted to open this for discussion.
dev/release switching
With this feature, if a dev build is made, and then a release build is made, all of the
Extra artifacts
I'm very worried about this causing shared dependencies to now be built multiple times. Some options:
I've been doing some analysis on crates.io to try to understand the impact. 3187 of 21098 crates have at least one overlapping dependency (gist). crossgen has a whopping 161 crates in common in the worst case.
I did some tests "without" build profile and "with" build profile, at a concurrency of either 12 or 2. All times are in seconds. This is just a rough idea. I ran each attempt multiple times, but this is particular to my hardware, and running on macOS.
As you can see, sometimes it is a little faster, but usually it is slower (sometimes much slower). Here are a few more pieces of data that seemed interesting (of 21098 crates):
Default settings
The current defaults may not be the best. Turning off debug improves speed and reduces disk space, but then you lose good backtraces. It's also questionable whether it matters if debug-assertions or overflow-checks are off. Setting opt-level=1 had a noticeable increase in compile time on the few projects I tried, so I left it at 0.
alexcrichton reviewed Jan 22, 2019
Thanks so much for doing the analysis here!
To make sure I understand this, the PR proposed as-is changes the default build settings for build scripts/procedural macros in both debug/release modes. This means that entire dependency trees rooted in procedural macros and build scripts are now compiled differently, and any previous sharing which happened no longer occurs, accounting for longer build times.
I'm curious if you know if there are some particularly bad "root offenders"? How do crates like
FWIW absolute compile times aren't always the most interesting metric in my opinion. Incremental builds almost always occur because there's previous artifacts and/or build times were already bad enough to motivate tools like
debug-assertions = false
codegen-units = 16
panic = 'unwind'
incremental = false
alexcrichton (Member), Jan 22, 2019
Shouldn't this be true? (was this copy/pasted from somewhere else that needs an update?)
ehuss (Author, Contributor), Jan 22, 2019
Right now it is false. The default build profile is defined here based on the default here. This is similar to release mode.
I don't have a strong opinion about any of the defaults. I think the theory on this one is that build scripts are rarely modified. But I can see how it would be annoying when you are actively working on it.
Maybe you are thinking of #6564, which hasn't merged yet? That would change the default.
alexcrichton (Member), Jan 23, 2019
Oh nah I just wanted to confirm. I think that we should have incremental turned on by default as the overhead only applies for path dependencies anyway, in this case typically just the build script itself.
I could see this going either way though. Build scripts are typically quite small and fast to compile, which means that incremental isn't a hit for them, nor does it really matter too much. I'd personally err on the side of enabling incremental though.
debug = false
rpath = false
lto = false
debug-assertions = false
alexcrichton (Member), Jan 22, 2019
I'm a little wary about this being false; it seems like this may want to be true by default to help weed out mistakes more quickly.
ehuss (Author, Contributor), Jan 22, 2019
Yea, I wouldn't mind making it true. Maybe the same for overflow-checks? I don't have a sense of how much slower that typically makes things, but I suspect it would not be perceivable by most scripts/macros. I think debug is the bigger question of how it should be defaulted.
alexcrichton (Member), Jan 23, 2019
Yeah I think this and overflow-checks should probably default to on. Build scripts are rarely a bottleneck, and if they are, both of these options can be disabled pretty easily (both from crates.io via changing APIs or locally by changing profiles).
For debug I wonder if we could perhaps try setting a default of 1? That means we only generate line tables for backtraces, but no local variable info, as no one's really using gdb on these.
It sounds like you understand it well. I'll look into offenders soon.
Here is some analysis of root offenders: https://gist.github.com/ehuss/0c9fb074d4b8720316b8ede243006f78. I tried to weight them by how often they are used and how many shared dependencies they tend to have. Maybe not the best weighting strategy. The top offender is
Here is a detailed look at cargo-crev: https://gist.github.com/ehuss/a15704fc8c9d9a345a0d71739e3db32e. It's interesting because there isn't one bad offender, but a bunch of them (bindgen, clear_on_drop, failure, phf_codegen, cc, rand).
Ok thanks for that analysis! I agree it's pretty hard to draw a trend from that. My main conclusion is largely just that the ecosystem of build dependencies is basically the same as normal dependencies: they themselves are built on a number of crates in the ecosystem, and there are some big ones and some small ones.
When thinking about the build as a whole, as mentioned before this change is basically irrelevant for incremental builds. It's also largely irrelevant for builds using caching solutions like
One aspect of those builds I've often noticed is that for larger projects all hardware parallelism is eaten up during the first half-or-so of the build, but the second half is often more serial as dependencies become chained and all the quick crates are out of the way. The relatively small percentage increase in build times you measured above may, I think, be explained by the "time to a serial build" moving back, so that we're using the otherwise unused parallelism at that portion of the build to finish up build dependencies. Now of course those same dependencies can also push back the build, because the serial chain of crates could depend on everything being finished.
Overall I still personally feel pretty good about this change. Local projects can always reconfigure back to today's configuration if cold builds matter a lot, and otherwise this should provide a general improvement for working with build scripts and procedural macros.
Spot on. Do you have any thoughts about how to organize the artifact directory? To address something like #1774 it would need to change so that dev/release will share the same build artifacts.
My preference would be to remove the debug/release directory separation. I suspect there might be opposition to that, though it could maybe be done in a backwards compatible fashion with links. From a functional standpoint of using the
If that is untenable, a dedicated
Or it could just stay as-is, which allows for sharing, but causes rebuilds when switching dev/release. Or maybe some other option, like build artifacts are always in the
I definitely think we should solve the rebuilding problem, but I think we could either do that by placing output in a new directory or by hashing more into the filename. I'm actually somewhat surprised that their filenames are conflicting today. Do you know what's not being hashed to cause the filenames to be different and avoid colliding into the same filename?
We definitely can't easily remove debug/release folders as they're so widely ingrained today. What I think we could do, however, is move towards a world where those folders only contain final output artifacts rather than intermediate ones. Sort of like how we have
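To make the "hashing more into the filename" idea concrete, here is a minimal sketch of how profile settings could feed into an artifact's file stem so that identical settings yield the same name and different settings yield different names. `ProfileSettings` and `artifact_stem` are hypothetical names invented for illustration; this is not Cargo's actual metadata hashing, and `DefaultHasher` is only a stand-in for whatever stable hasher a real implementation would use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical subset of profile settings (not Cargo's real `Profile` type).
#[derive(Hash, Clone, PartialEq, Eq, Debug)]
struct ProfileSettings {
    opt_level: u8,
    debug: u8,
    debug_assertions: bool,
    overflow_checks: bool,
}

/// Build a file stem of the form `lib<name>-<16 hex digits>`.
/// Everything that affects the compiled output is fed into the hash,
/// so units compiled with different settings never share a filename.
fn artifact_stem(crate_name: &str, version: &str, profile: &ProfileSettings) -> String {
    let mut hasher = DefaultHasher::new();
    crate_name.hash(&mut hasher);
    version.hash(&mut hasher);
    profile.hash(&mut hasher);
    format!("lib{}-{:016x}", crate_name, hasher.finish())
}
```

Under this scheme, a build dependency compiled with the same settings from a dev build and from a release build would map to the same stem, which is what would allow the two to share one artifact.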
I'm a little confused. I was saying that they don't conflict, so there should be no reason they need to be in separate directories.
Yea, that's what I meant by "backwards compatible fashion with links": it would keep the debug/release directories and just link final artifacts there for any tools that expect them. I'll take a look at implementing that soon and see if there are any major drawbacks. I expect there to be a lot of little changes throughout the code, but overall it should be straightforward. I'd like to do that in a separate PR if that's OK?
Oh sorry I was misunderstanding the rebuild point. It's not that we're thrashing a cache but that the same artifacts are cached in two locations. That doesn't happen today as the settings are basically always different, but after this change the build profile for dev/release is the same, so the artifacts are actually the same.
In the long term I think we're going to move to a global build cache for Cargo, so I think it's fine to go ahead and experiment with it ahead of time. I'm thinking something along the lines of "everything stays exactly the same as it is today", but all files are just hard links to a build cache elsewhere. The build cache is just a dump of everything Cargo ever does, completely unorganized.
I implemented a unified deps directory, but ran into some problems dealing with backwards compatibility. I've been trying a few different approaches, but they all have drawbacks.
Any ideas?
If we break very old Windows I think that's fine, I thought that
I don't actually know any systems that don't support hard links on the same filesystem, but have we hit some in the wild we wanted to handle?
I think breaking rustbuild is fine (especially if we see the breakage coming!).
Overall I think we definitely need to preserve backcompat to ensure that the current patterns for finding a test binary work somewhat (although we have broken this before...). Otherwise it should be fine to ignore older Windows, and I think it's fine to assume hard links for perf (although I may be forgetting something critical there). If we only hard link/copy the final binaries, that could perhaps mitigate the impact on systems without hard links and overall reduce the amount of traffic on the filesystem?
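As a rough illustration of the "hard link/copy the final binaries" idea, the sketch below tries a hard link first and falls back to a plain copy when linking fails (for example, on a filesystem that does not support hard links). `link_or_copy` and `Placed` are hypothetical names for this example, built only on `std::fs`; it is not a description of how Cargo itself places artifacts.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// How the artifact ended up at its destination.
#[derive(Debug, PartialEq, Eq)]
enum Placed {
    HardLinked,
    Copied,
}

/// Place `src` at `dst`, preferring a hard link and falling back to a copy.
fn link_or_copy(src: &Path, dst: &Path) -> io::Result<Placed> {
    // `hard_link` fails if `dst` already exists, so remove any stale file first.
    match fs::remove_file(dst) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    match fs::hard_link(src, dst) {
        Ok(()) => Ok(Placed::HardLinked),
        // Any link failure (unsupported filesystem, different device, ...)
        // falls back to copying the file contents.
        Err(_) => fs::copy(src, dst).map(|_| Placed::Copied),
    }
}
```

Restricting this to final binaries, as suggested above, means the slower copy path would only be paid for a handful of files rather than for every intermediate artifact.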
It is fairly recent. Creating symlinks historically required admin permissions until Windows 10 Creators Update (released mid 2017). The reason you can run on older systems is because A. I don't think we ever try to link directories on Windows. I can only think of macOS with dSYM.
I believe some network filesystems do not support it. Sometime soonish, unless you have any other feedback, I'll try out the hybrid approach and see how it goes.
Oh sorry right yeah symlinks won't work but I think that directory junctions are supported much further back on Windows, right? (I forget if that's what
Hm network filesystems is a bummer... I think the hybrid approach would be best there though long-term!
nrc assigned alexcrichton Mar 6, 2019
ehuss added the S-blocked label Mar 7, 2019
ehuss referenced this pull request Mar 14, 2019: Tracking issue for RFC 2282 - Cargo profile dependencies #48683 (Open)
ehuss force-pushed the ehuss:build-profile branch from 59139bc to 4ca9e0e Apr 1, 2019
ehuss referenced this pull request Apr 1, 2019: Include proc-macros in `build-override`. #6811 (Merged)
bors added a commit that referenced this pull request Apr 2, 2019
ehuss commented Jan 21, 2019
This adds a `build` profile as discussed at rust-lang/rust#48683. See `unstable.md` for a brief description.