Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to optimize just dependencies #1359

Closed
alexcrichton opened this issue Feb 27, 2015 · 50 comments
Closed

Add an option to optimize just dependencies #1359

alexcrichton opened this issue Feb 27, 2015 · 50 comments
Labels
A-configuration Area: cargo config files and env vars

Comments

@alexcrichton
Copy link
Member

The package being built (and all targets within) will not be built in release mode, but all dependencies would be built in release mode.

cc #784
cc #784 (comment)

@alexcrichton alexcrichton added the A-configuration Area: cargo config files and env vars label Feb 27, 2015
@mystise
Copy link

mystise commented Mar 11, 2015

Would this apply to the standard library as well? (I.E. compiling your code in debug mode while the standard lib would be compiled in release mode)

@alexcrichton
Copy link
Member Author

Unfortunately no, the standard library currently only comes as optimized.

@mystise
Copy link

mystise commented Mar 11, 2015

Oh, excellent. That's actually what I was hoping would happen, as then most of the bottlenecks are going to be in my code that calls the std lib rather than in the std lib itself.

@pcwalton
Copy link

+1. I want this for Servo ipc-channel.

pcwalton added a commit to pcwalton/cargo that referenced this issue Jul 23, 2015
Inside the `dependencies.foo` section, you may now specify a minimum
optimization level to be applied to this dependency. If multiple such
optimization levels are supplied, Cargo chooses the highest one.

This will probably need to go through an RFC before landing.

Addresses rust-lang#1359.
@matklad
Copy link
Member

matklad commented Feb 7, 2016

@alexcrichton I think I can work on this.

From the design point of view, we can add an optimize-dependencies flag, or we can add a separate dev-dependencies profile. The first option is simpler, the second is more flexible and would allow to implement a default like

[profiles.dev-deps] 
opt-level = 2 
debug = true 
debug-assertions = true

which may be better then the current default of slow dependencies.

These two options are somewhat orthogonal, and it should be possible to implement one first and add the other later.

Another design question is what exactly a dependency is? If you have some path dependencies, do you want them to be optimized? And what about overridden dependencies? I can suggest two solutions here.

A package is not a dependency if it is

  1. a root package.
  2. a package for which source_id().is_path() is true (this includes the root package)

From the implementation point of view I hope that the only thing that should be changed is lib_profile function :)

@alexcrichton
Copy link
Member Author

@matklad awesome! I think that this definitely has a bit of a design aspect to it before charging ahead, although I think you're correct in that the function you highlighted is the only one that needs to change (convenient, eh?).

I like the idea of optimize-dependencies not being the right flag as there may be other things you've got going on. One possibility could be:

[profiles.dev] 
dependency-profile = 'release'

or something like that. Basically you select a profile for dependencies rather than any specific information about them. If we later add the ability to specify custom profiles that could also be selected.

I also agree about your heuristic about what a dependency is and what isn't. In general I think that all path dependencies are just implementation details (or units of incrementality) that are part of the current crate being worked on. That being said, however, the name "dependency" may be a bit misleading there because they are indeed dependencies.

I wonder if perhaps the term "upstream" could be used? That may help distinguish "other people's code" from "my local code". The term "upstream" and "downstream" are sometimes conflated, though, so that may not be best...

And all that being said you may still want to optimize path dependencies. For example if you temporarily override a perf-critical dependency to a local crate, you probably still want to optimize it even though you're working on it.

I think that I might somewhat lean towards "dependency" meaning "anything not the root package". We could eventually support something where you can specify profiles for specific dependencies in a Cargo.toml, and that could perhaps be used to optimize everything but a few packages.

@matklad
Copy link
Member

matklad commented Feb 8, 2016

Basically you select a profile for dependencies rather than any specific information about them. If we later add the ability to specify custom profiles that could also be selected.

Yes, I think it is the best approach.

I wonder if perhaps the term "upstream" could be used?

I find "upstream" even more confusing than "dependency". What about "local package"?

I think that I might somewhat lean towards "dependency" meaning "anything not the root package".

I'm more for "dependency = anything non local". The original motivation for "optimize dependencies" is that you build them only once, so it makes sense to spend more time on compiling them efficiently. Local packages however are rebuild more often (n times) then deps (1 time), and optimizing them will affect compile times negatively. If you really need fast local packages the right solution might be to add opt-level=2 to profiles.dev.

Out of curiosity, how this will work with monomorphisation? If I have a template function in a dependency compiled in release mode, and use it from a crate compiled in debug mode, will I get any benefits? Suppose that the templated function does not call other non-templated functions.

@alexcrichton
Copy link
Member Author

"local" does seem kinda on the right track, yeah, although it may be kind of odd saying:

[profile.dev]
non-local-dependency-profile = 'release'

Quite a mouthful!

This will indeed not really work well with monomorphization, all instantiations will just be optimized at the same level as the crate they're instantiated into.

@whitequark
Copy link
Member

Maybe vendor instead of non-local?

@matklad
Copy link
Member

matklad commented Feb 9, 2016

I like the idea of optimize-dependencies not being the right flag as there may be other things you've got going on. One possibility could be:

[profiles.dev] 
dependency-profile = 'release'

Hm, in this piece of TOML two ideas are expressed:

  1. specify dependency profile override by name,
  2. specify dependency override in the profile itself.

I totally agree with 1, but 2 does not feel right: ideally profile should be set per package, but dependency-profile is global property.

What about

[profile-overrides.dev]
vendor-dependencies="release"

? This can in future be extended to handle more fine-grain overrides.

@alexcrichton
Copy link
Member Author

Hm yeah I guess if you set all dependencies to a release profile you'd want that to happen in both test and dev. Although we currently have two profiles for that, so any configuration in one already needs to be reflected in another, so maybe it's not so bad?

I'd be somewhat wary of inventing new top-level keys like profile-overrides (maybe we could stick it in [project] if we want?). I'm also not sure that "vendor" conveys the right information here because to me "vendor" typically means what's literally included locally, rather than what I'm using from crates.io

@matklad
Copy link
Member

matklad commented Feb 10, 2016

Hm yeah I guess if you set all dependencies to a release profile you'd want that to happen in both test and dev. Although we currently have two profiles for that, so any configuration in one already needs to be reflected in another, so maybe it's not so bad?

I'd like to ask what exactly a profile is? I mean, there is test profile and release profile, but there is also cargo test and cargo test --release, which do different things.

Here is a model I have in mind for this feature:

A profile is basically a description of a set of flags, applied during compilation of a package. And there is "compilation mode" which determines what profiles are applied to what packages. Like, there is --release mode, which applies release profile to all packages.

As far as I understand, currently in cargo the notion of compilation mode is not present first class. For example, in Context build_config.release and unit.profile.test are used to make decisions about compilation. This lead to surprising (?) effects. If I have

[profile.test] 
opt-level = 3 

in Cargo.toml, then for cargo test only my crate will be build with optimizations enabled, but all dependencies will use flags from the dev profile.

@alexcrichton
Copy link
Member Author

Ah yeah to clarify, when I say profile I mean what you're writing down in Cargo.toml like test, release, dev, and bench. I will agree, however, that Cargo's treatment of these profiles is kinda ad-hoc and in general "not great" as it may be surprising (as you're encountering here).

That being said, though, my motivation of foo = "profile-name" is because we may one day support custom profiles, and otherwise I think the interaction of profiles here may be somewhat orthogonal to the feature at hand?

@matklad
Copy link
Member

matklad commented Feb 11, 2016

I think the interaction of profiles here may be somewhat orthogonal to the feature at hand?

Yes, if we specify override globally, like

[project]
profile-overrides = {
   upstream-dependencies="profile-name"
}

Likely not, if we use something like

[profiles.test]
profile-overrides = {
   upstream-dependencies="profile-name"
}

@matklad
Copy link
Member

matklad commented Feb 11, 2016

@alexcrichton what about

I'm more for "dependency = anything non local". The original motivation for "optimize dependencies" is that you build them only once, so it makes sense to spend more time on compiling them efficiently. Local packages however are rebuild more often (n times) then deps (1 time), and optimizing them will affect compile times negatively. If you really need fast local packages the right solution might be to add opt-level=2 to profiles.dev.

@alexcrichton
Copy link
Member Author

Yeah we could in theory support a top-level "always override dependencies with this profile", but I suspect that it's likely to be a local decision per-profile (especially if we grow custom profiles) rather than an always-true option. Note that the dev/test profile management here may want to be improved...

I can also see the "dependency = anything non local" logic, yeah, although I think it can work both ways. You may sometimes be frobbing all the path dependencies at once, but there could also be legitimate cases where only the main one is being modified. I suspect though that the "anything non local" case is more common.

At least in terms of performance, I would expect any bottlenecked dependencies to all be upstream rather than being worked on locally.

@alexcrichton
Copy link
Member Author

There are some concerns on #2380, outlining few tensions to balance when implementing this.

@matklad
Copy link
Member

matklad commented Feb 26, 2016

@alexcrichton I agree with your analysis. One thing I want to add is that there is a possible case of optimizing (O1?) dependencies by default, which still can (maybe) bring some benefits.

And for the future refence I've done some dirty "benchmarks" of the approach in #2380. I've benchmarked this crate (this crate was the reason I found this issue :) ). It does some OpenGL rendering in a super inefficient and unidiomatic way. The major run time bottleneck is texture loading (which is done by a library).

So here are the measurements of a recompile cycle after a trivial whitespace edit in "src/lib.rs" and a corresponding time to see something rendered.

Usual debug build

$CARGO build --bin mirror  24.93s user 0.92s system 100% cpu 25.840 total
$CARGO build --bin mirror  25.22s user 0.89s system 100% cpu 26.096 total
./target/debug/mirror  68.11s user 0.24s system 97% cpu 1:10.14 total
./target/debug/mirror  68.26s user 0.22s system 98% cpu 1:09.60 total

Usual --release build

$CARGO build --release --bin mirror  72.13s user 0.72s system 100% cpu 1:12.81 total
$CARGO build --release --bin mirror  69.92s user 0.73s system 100% cpu 1:10.61 total
./target/release/mirror  1.59s user 0.21s system 63% cpu 2.836 total
./target/release/mirror  1.66s user 0.20s system 71% cpu 2.602 total

Debug build with release deps

[profile.release]
debug = true
debug-assertions = true

[profile.dev]
dependencies-profile = "release"
$CARGO build --bin mirror  24.13s user 0.71s system 100% cpu 24.832 total
$CARGO build --bin mirror  24.27s user 0.75s system 100% cpu 25.005 total
./target/debug/mirror  25.40s user 0.22s system 97% cpu 26.354 total
./target/debug/mirror  25.38s user 0.23s system 96% cpu 26.536 total

Debug build with opt-level=1 deps

[profile.release]
debug = true
debug-assertions = true
opt-level = 1

[profile.dev]
dependencies-profile = "release"
$CARGO build --bin mirror  24.65s user 0.80s system 100% cpu 25.432 total
$CARGO build --bin mirror  24.58s user 0.84s system 100% cpu 25.404 total
./target/debug/mirror  50.54s user 0.25s system 98% cpu 51.672 total
./target/debug/mirror  50.30s user 0.23s system 98% cpu 51.325 total

Debug build with opt-level=1 for everything

[profile.dev]
opt-level = 1
$CARGO build --bin mirror  38.91s user 0.81s system 100% cpu 39.699 total
$CARGO build --bin mirror  38.68s user 0.80s system 100% cpu 39.459 total
./target/debug/mirror  34.61s user 0.26s system 95% cpu 36.404 total
./target/debug/mirror  34.55s user 0.25s system 96% cpu 36.139 total

debug release deps-release deps opt-level=1 all opt-level=1
recompile 25s 70s 25s 25s 38s
run 68s 2s 25s 51s 36s

I kinda like how compile and run time coincide in the third case =)

@alexcrichton
Copy link
Member Author

Thanks for the data @matklad! Out of curiosity, what do the numbers look like if opt-level is set to 1?

@matklad
Copy link
Member

matklad commented Feb 29, 2016

@alexcrichton updated the table

@alexcrichton
Copy link
Member Author

Oh sorry, I meant for O1 that the entire graph had opt-level 1, not just the dependencies. Does that reduce the runtime at all?

@matklad
Copy link
Member

matklad commented Feb 29, 2016

yep, updated the table.

@alexcrichton
Copy link
Member Author

Thanks! Looks like that kinda confirms that there's a "sweet spot O1" which may suffice for some build configurations?

@matklad
Copy link
Member

matklad commented Mar 1, 2016

Looks like that kinda confirms that there's a "sweet spot O1" which may suffice for some build configurations?

Not sure. First of all, I don't think that this benchmarks are really representative (target crate is just some quick an dirty openGL exercises, nothing close to production). In particular, target crate is pretty small, and I suppose that compile time grows lineary with code size, while run time is sub linear. That is, for large code bases, O1 compile time overhead may kill any run time benefits.

And given current largish compile times, I won't feel comfortable sacrificing compile time for anything at all :)

@mkovaxx
Copy link

mkovaxx commented Jun 11, 2017

Has there been any progress on this? #1826 got pretty close to creating a solution that'd work for me, unfortunately it was never merged. It was abandoned in favor of #942. Was a solution implemented based on that?

@alexcrichton
Copy link
Member Author

@mkovacs currently, no, there hasn't been much progress on this. This feature remains unimplemented but @matklad sounds like he's got thoughts above

@matklad
Copy link
Member

matklad commented Jun 13, 2017

Yeah, this is basically blocked on profiles, and I have not yet written a pre RFC because I don't know what to write in the "proposed solution" section, but I'll link my current take on the problem formulation: #4140 (comment)

@mqudsi
Copy link

mqudsi commented Jul 16, 2017

Could profiles be separated from an overall option to build dependencies as release?

For example, the new default could be debug local code + release crate dependencies with a --debug-all as a counterpart to the existing --release parameter?

@est31
Copy link
Member

est31 commented Jul 16, 2017

I think we should fix the issue in two steps:

  • First, provide an unstable option to support this feature somehow
  • Second, work on profiles 2.0 which hopefully will support something like this, deprecating and removing the unstable option

@chyvonomys
Copy link

Is there any workaround currently to make one of the dependencies to build in release mode (optimized, without debug info)? I have one dependency that is way slower in debug than in release.

@matklad
Copy link
Member

matklad commented Sep 15, 2017

@ozkriff @not-fl3 yesterday at Rust gamedev meetup at Saint Petersburg you've both said that this feature would be very useful for gamedev. So I've decided to update my PR so that you can check if this feature indeed brings the benefits you expect (there's a fear that due to monomorphization optimization level for deps might be irrelevant).

Given that we now have unstable cargo features, I think that we could probably land this in some unstable form. For sure, we won't stabilize anything close to my implementation, because profiles need to be fixed first. However, this won't probably happen soon, so, if this is useful, we can try to help at least nightly users. As this only affects the development experience of binary crates authors, I don't think there's a high risk behind adding this under a feature flag.

To try this out, install Cargo from this branch: https://github.com/matklad/cargo/tree/fast-dependencies-2

cp ~/.cargo/bin/cargo ~/.cargo/bin/rustup-cargo
cargo install --git https://github.com/matklad/cargo --branch fast-dependencies-2 --force  

Then add to your Cargo.toml (for workspaces this must be a non-virtual manifest):

cargo-features = ["always-optimize-deps"]
always-optimize-deps = true

@Ralith
Copy link

Ralith commented Sep 15, 2017

Perhaps rustc could compile monomorphized code from external crates according to the optimization flags given when those crates were built, distinct from the flags applying to the current crate?

@ozkriff
Copy link

ozkriff commented Sep 15, 2017

@matklad It works beautifully for my use cases! :-D

@not-fl3
Copy link
Contributor

not-fl3 commented Sep 16, 2017

@matklad great work!

Made a quick hack not-fl3@fd2a0b5 to cover all of my needs.

Its a way to force even local dependencies to be built in release mode.

Two use cases I wanted to cover:

  • marking "force-release" local slow sub-crates to improve runtime speed of debug builds.

For example, local mesh loading crate. Its not on crates.io, so its added as a local lib. But in the same time its not activly developed, so its really nice to compile it once in release.

  • marking "force-release" local crates, which actually is a separate app running as a child process. Helps a lot in testing.

For example, local server with entrypoint in lib.rs. Nice to have an option to make it fast during development of client.

It would be really great to have something like this in nightly cargo!

@tyoc213
Copy link

tyoc213 commented Sep 16, 2017

I havent read anything about monophormization, but from my point of view the only thing that need to be like This will indeed not really work well with monomorphization, all instantiations will just be optimized at the same level as the crate they're instantiated into. is release mode.

@crumblingstatue
Copy link

I also support the idea of specifying optimization level for individual dependencies.

I'm developing a program that deals with images, and it's a major pain to test it, because the image crate takes forever to load images when compiled with debug profile.

@luser
Copy link
Contributor

luser commented Jan 18, 2018

@Manishearth proposed an RFC: rust-lang/rfcs#2282

@stale
Copy link

stale bot commented Sep 17, 2018

As there hasn't been any activity here in over 6 months I've marked this as stale and if no further activity happens for 7 days I will close it.

I'm a bot so this may be in error! If this issue should remain open, could someone (the author, a team member, or any interested party) please comment to that effect?

The team would be especially grateful if such a comment included details such as:

  • Is this still relevant?
  • If so, what is blocking it?
  • Is it known what could be done to help move this forward?

Thank you for contributing!

(The cargo team is currently evaluating the use of Stale bot, and using #6035 as the tracking issue to gather feedback.)

If you're reading this comment from the distant future, fear not if this was closed automatically. If you believe it's still an issue please leave a comment and a team member can reopen this issue. Opening a new issue is also acceptable!

@stale stale bot added the stale label Sep 17, 2018
@alexcrichton
Copy link
Member Author

This has been implemented on nightly and is tracked at rust-lang/rust, so I'm going to close in favor of those locations.

@SimonSapin
Copy link
Contributor

@alexcrichton Your "implemented" link goes to the docs for profiles in .cargo/config, which as I understand apply to all crates for a given cargo invocation. So it is not directly relevant to this issue. Are crate-specific profiles implemented too? If not, shouldn’t this issue stay open?

@ehuss
Copy link
Contributor

ehuss commented Sep 17, 2018

@SimonSapin I think the link should be: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#profile-overrides The "*" override covers all dependencies.

@SimonSapin
Copy link
Contributor

Oh I see, never mind then :)

@dylemma
Copy link

dylemma commented Apr 26, 2020

I found my way here while wondering if this feature existed, and since this is a years-old thread, looks like the feature is no longer "unstable", and docs are now here: https://doc.rust-lang.org/nightly/cargo/reference/profiles.html#overrides

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-configuration Area: cargo config files and env vars
Projects
None yet
Development

Successfully merging a pull request may close this issue.