Add an option to optimize just dependencies #1359

alexcrichton · 2015-02-27T00:54:28Z

The package being built (and all targets within) will not be built in release mode, but all dependencies would be built in release mode.

cc #784
cc #784 (comment)

mystise · 2015-03-11T05:29:17Z

Would this apply to the standard library as well? (I.E. compiling your code in debug mode while the standard lib would be compiled in release mode)

alexcrichton · 2015-03-11T09:16:37Z

Unfortunately no, the standard library currently only comes as optimized.

mystise · 2015-03-11T19:15:13Z

Oh, excellent. That's actually what I was hoping would happen, as then most of the bottlenecks are going to be in my code that calls the std lib rather than in the std lib itself.

pcwalton · 2015-07-22T22:39:05Z

+1. I want this for Servo ipc-channel.

Inside the `dependencies.foo` section, you may now specify a minimum optimization level to be applied to this dependency. If multiple such optimization levels are supplied, Cargo chooses the highest one. This will probably need to go through an RFC before landing. Addresses rust-lang#1359.

matklad · 2016-02-07T00:17:12Z

@alexcrichton I think I can work on this.

From the design point of view, we can add an optimize-dependencies flag, or we can add a separate dev-dependencies profile. The first option is simpler, the second is more flexible and would allow to implement a default like

[profiles.dev-deps] 
opt-level = 2 
debug = true 
debug-assertions = true

which may be better then the current default of slow dependencies.

These two options are somewhat orthogonal, and it should be possible to implement one first and add the other later.

Another design question is what exactly a dependency is? If you have some path dependencies, do you want them to be optimized? And what about overridden dependencies? I can suggest two solutions here.

A package is not a dependency if it is

a root package.
a package for which source_id().is_path() is true (this includes the root package)

From the implementation point of view I hope that the only thing that should be changed is lib_profile function :)

alexcrichton · 2016-02-08T17:48:56Z

@matklad awesome! I think that this definitely has a bit of a design aspect to it before charging ahead, although I think you're correct in that the function you highlighted is the only one that needs to change (convenient, eh?).

I like the idea of optimize-dependencies not being the right flag as there may be other things you've got going on. One possibility could be:

[profiles.dev] 
dependency-profile = 'release'

or something like that. Basically you select a profile for dependencies rather than any specific information about them. If we later add the ability to specify custom profiles that could also be selected.

I also agree about your heuristic about what a dependency is and what isn't. In general I think that all path dependencies are just implementation details (or units of incrementality) that are part of the current crate being worked on. That being said, however, the name "dependency" may be a bit misleading there because they are indeed dependencies.

I wonder if perhaps the term "upstream" could be used? That may help distinguish "other people's code" from "my local code". The term "upstream" and "downstream" are sometimes conflated, though, so that may not be best...

And all that being said you may still want to optimize path dependencies. For example if you temporarily override a perf-critical dependency to a local crate, you probably still want to optimize it even though you're working on it.

I think that I might somewhat lean towards "dependency" meaning "anything not the root package". We could eventually support something where you can specify profiles for specific dependencies in a Cargo.toml, and that could perhaps be used to optimize everything but a few packages.

matklad · 2016-02-08T22:35:13Z

Basically you select a profile for dependencies rather than any specific information about them. If we later add the ability to specify custom profiles that could also be selected.

Yes, I think it is the best approach.

I wonder if perhaps the term "upstream" could be used?

I find "upstream" even more confusing than "dependency". What about "local package"?

I think that I might somewhat lean towards "dependency" meaning "anything not the root package".

I'm more for "dependency = anything non local". The original motivation for "optimize dependencies" is that you build them only once, so it makes sense to spend more time on compiling them efficiently. Local packages however are rebuild more often (n times) then deps (1 time), and optimizing them will affect compile times negatively. If you really need fast local packages the right solution might be to add opt-level=2 to profiles.dev.

Out of curiosity, how this will work with monomorphisation? If I have a template function in a dependency compiled in release mode, and use it from a crate compiled in debug mode, will I get any benefits? Suppose that the templated function does not call other non-templated functions.

alexcrichton · 2016-02-09T16:11:46Z

"local" does seem kinda on the right track, yeah, although it may be kind of odd saying:

[profile.dev]
non-local-dependency-profile = 'release'

Quite a mouthful!

This will indeed not really work well with monomorphization, all instantiations will just be optimized at the same level as the crate they're instantiated into.

whitequark · 2016-02-09T16:14:04Z

Maybe vendor instead of non-local?

matklad · 2016-02-09T16:30:42Z

I like the idea of optimize-dependencies not being the right flag as there may be other things you've got going on. One possibility could be:
[profiles.dev] 
dependency-profile = 'release'

Hm, in this piece of TOML two ideas are expressed:

specify dependency profile override by name,
specify dependency override in the profile itself.

I totally agree with 1, but 2 does not feel right: ideally profile should be set per package, but dependency-profile is global property.

What about

[profile-overrides.dev]
vendor-dependencies="release"

? This can in future be extended to handle more fine-grain overrides.

alexcrichton · 2016-02-10T18:33:00Z

Hm yeah I guess if you set all dependencies to a release profile you'd want that to happen in both test and dev. Although we currently have two profiles for that, so any configuration in one already needs to be reflected in another, so maybe it's not so bad?

I'd be somewhat wary of inventing new top-level keys like profile-overrides (maybe we could stick it in [project] if we want?). I'm also not sure that "vendor" conveys the right information here because to me "vendor" typically means what's literally included locally, rather than what I'm using from crates.io

matklad · 2016-02-10T23:26:29Z

Hm yeah I guess if you set all dependencies to a release profile you'd want that to happen in both test and dev. Although we currently have two profiles for that, so any configuration in one already needs to be reflected in another, so maybe it's not so bad?

I'd like to ask what exactly a profile is? I mean, there is test profile and release profile, but there is also cargo test and cargo test --release, which do different things.

Here is a model I have in mind for this feature:

A profile is basically a description of a set of flags, applied during compilation of a package. And there is "compilation mode" which determines what profiles are applied to what packages. Like, there is --release mode, which applies release profile to all packages.

As far as I understand, currently in cargo the notion of compilation mode is not present first class. For example, in Context build_config.release and unit.profile.test are used to make decisions about compilation. This lead to surprising (?) effects. If I have

[profile.test] 
opt-level = 3

in Cargo.toml, then for cargo test only my crate will be build with optimizations enabled, but all dependencies will use flags from the dev profile.

alexcrichton · 2016-02-11T00:43:24Z

Ah yeah to clarify, when I say profile I mean what you're writing down in Cargo.toml like test, release, dev, and bench. I will agree, however, that Cargo's treatment of these profiles is kinda ad-hoc and in general "not great" as it may be surprising (as you're encountering here).

That being said, though, my motivation of foo = "profile-name" is because we may one day support custom profiles, and otherwise I think the interaction of profiles here may be somewhat orthogonal to the feature at hand?

matklad · 2016-02-11T00:57:36Z

I think the interaction of profiles here may be somewhat orthogonal to the feature at hand?

Yes, if we specify override globally, like

[project]
profile-overrides = {
   upstream-dependencies="profile-name"
}

Likely not, if we use something like

[profiles.test]
profile-overrides = {
   upstream-dependencies="profile-name"
}

matklad · 2016-02-11T01:01:29Z

@alexcrichton what about

I'm more for "dependency = anything non local". The original motivation for "optimize dependencies" is that you build them only once, so it makes sense to spend more time on compiling them efficiently. Local packages however are rebuild more often (n times) then deps (1 time), and optimizing them will affect compile times negatively. If you really need fast local packages the right solution might be to add opt-level=2 to profiles.dev.

alexcrichton · 2016-02-11T19:17:04Z

Yeah we could in theory support a top-level "always override dependencies with this profile", but I suspect that it's likely to be a local decision per-profile (especially if we grow custom profiles) rather than an always-true option. Note that the dev/test profile management here may want to be improved...

I can also see the "dependency = anything non local" logic, yeah, although I think it can work both ways. You may sometimes be frobbing all the path dependencies at once, but there could also be legitimate cases where only the main one is being modified. I suspect though that the "anything non local" case is more common.

At least in terms of performance, I would expect any bottlenecked dependencies to all be upstream rather than being worked on locally.

alexcrichton · 2016-02-26T19:30:42Z

There are some concerns on #2380, outlining few tensions to balance when implementing this.

matklad · 2016-02-26T23:56:11Z

@alexcrichton I agree with your analysis. One thing I want to add is that there is a possible case of optimizing (O1?) dependencies by default, which still can (maybe) bring some benefits.

And for the future refence I've done some dirty "benchmarks" of the approach in #2380. I've benchmarked this crate (this crate was the reason I found this issue :) ). It does some OpenGL rendering in a super inefficient and unidiomatic way. The major run time bottleneck is texture loading (which is done by a library).

So here are the measurements of a recompile cycle after a trivial whitespace edit in "src/lib.rs" and a corresponding time to see something rendered.

Usual debug build

$CARGO build --bin mirror  24.93s user 0.92s system 100% cpu 25.840 total
$CARGO build --bin mirror  25.22s user 0.89s system 100% cpu 26.096 total
./target/debug/mirror  68.11s user 0.24s system 97% cpu 1:10.14 total
./target/debug/mirror  68.26s user 0.22s system 98% cpu 1:09.60 total

Usual --release build

$CARGO build --release --bin mirror  72.13s user 0.72s system 100% cpu 1:12.81 total
$CARGO build --release --bin mirror  69.92s user 0.73s system 100% cpu 1:10.61 total
./target/release/mirror  1.59s user 0.21s system 63% cpu 2.836 total
./target/release/mirror  1.66s user 0.20s system 71% cpu 2.602 total

Debug build with release deps

[profile.release]
debug = true
debug-assertions = true

[profile.dev]
dependencies-profile = "release"

$CARGO build --bin mirror  24.13s user 0.71s system 100% cpu 24.832 total
$CARGO build --bin mirror  24.27s user 0.75s system 100% cpu 25.005 total
./target/debug/mirror  25.40s user 0.22s system 97% cpu 26.354 total
./target/debug/mirror  25.38s user 0.23s system 96% cpu 26.536 total

Debug build with opt-level=1 deps

[profile.release]
debug = true
debug-assertions = true
opt-level = 1

[profile.dev]
dependencies-profile = "release"

$CARGO build --bin mirror  24.65s user 0.80s system 100% cpu 25.432 total
$CARGO build --bin mirror  24.58s user 0.84s system 100% cpu 25.404 total
./target/debug/mirror  50.54s user 0.25s system 98% cpu 51.672 total
./target/debug/mirror  50.30s user 0.23s system 98% cpu 51.325 total

Debug build with opt-level=1 for everything

[profile.dev]
opt-level = 1

$CARGO build --bin mirror  38.91s user 0.81s system 100% cpu 39.699 total
$CARGO build --bin mirror  38.68s user 0.80s system 100% cpu 39.459 total
./target/debug/mirror  34.61s user 0.26s system 95% cpu 36.404 total
./target/debug/mirror  34.55s user 0.25s system 96% cpu 36.139 total

	debug	release	deps-release	deps opt-level=1	all opt-level=1
recompile	25s	70s	25s	25s	38s
run	68s	2s	25s	51s	36s

I kinda like how compile and run time coincide in the third case =)

alexcrichton · 2016-02-29T18:53:03Z

Thanks for the data @matklad! Out of curiosity, what do the numbers look like if opt-level is set to 1?

matklad · 2016-02-29T19:27:43Z

@alexcrichton updated the table

alexcrichton · 2016-02-29T21:58:02Z

Oh sorry, I meant for O1 that the entire graph had opt-level 1, not just the dependencies. Does that reduce the runtime at all?

matklad · 2016-02-29T23:25:05Z

yep, updated the table.

alexcrichton · 2016-03-01T00:11:40Z

Thanks! Looks like that kinda confirms that there's a "sweet spot O1" which may suffice for some build configurations?

matklad · 2016-03-01T10:07:16Z

Looks like that kinda confirms that there's a "sweet spot O1" which may suffice for some build configurations?

Not sure. First of all, I don't think that this benchmarks are really representative (target crate is just some quick an dirty openGL exercises, nothing close to production). In particular, target crate is pretty small, and I suppose that compile time grows lineary with code size, while run time is sub linear. That is, for large code bases, O1 compile time overhead may kill any run time benefits.

And given current largish compile times, I won't feel comfortable sacrificing compile time for anything at all :)

mkovaxx · 2017-06-11T21:07:24Z

Has there been any progress on this? #1826 got pretty close to creating a solution that'd work for me, unfortunately it was never merged. It was abandoned in favor of #942. Was a solution implemented based on that?

alexcrichton · 2017-06-13T14:31:56Z

@mkovacs currently, no, there hasn't been much progress on this. This feature remains unimplemented but @matklad sounds like he's got thoughts above

matklad · 2017-06-13T15:05:07Z

Yeah, this is basically blocked on profiles, and I have not yet written a pre RFC because I don't know what to write in the "proposed solution" section, but I'll link my current take on the problem formulation: #4140 (comment)

mqudsi · 2017-07-16T21:36:00Z

Could profiles be separated from an overall option to build dependencies as release?

For example, the new default could be debug local code + release crate dependencies with a --debug-all as a counterpart to the existing --release parameter?

est31 · 2017-07-16T21:48:11Z

I think we should fix the issue in two steps:

First, provide an unstable option to support this feature somehow
Second, work on profiles 2.0 which hopefully will support something like this, deprecating and removing the unstable option

chyvonomys · 2017-09-12T21:26:01Z

Is there any workaround currently to make one of the dependencies to build in release mode (optimized, without debug info)? I have one dependency that is way slower in debug than in release.

matklad · 2017-09-15T01:35:13Z

@ozkriff @not-fl3 yesterday at Rust gamedev meetup at Saint Petersburg you've both said that this feature would be very useful for gamedev. So I've decided to update my PR so that you can check if this feature indeed brings the benefits you expect (there's a fear that due to monomorphization optimization level for deps might be irrelevant).

Given that we now have unstable cargo features, I think that we could probably land this in some unstable form. For sure, we won't stabilize anything close to my implementation, because profiles need to be fixed first. However, this won't probably happen soon, so, if this is useful, we can try to help at least nightly users. As this only affects the development experience of binary crates authors, I don't think there's a high risk behind adding this under a feature flag.

To try this out, install Cargo from this branch: https://github.com/matklad/cargo/tree/fast-dependencies-2

cp ~/.cargo/bin/cargo ~/.cargo/bin/rustup-cargo
cargo install --git https://github.com/matklad/cargo --branch fast-dependencies-2 --force

Then add to your Cargo.toml (for workspaces this must be a non-virtual manifest):

cargo-features = ["always-optimize-deps"]
always-optimize-deps = true

Ralith · 2017-09-15T01:43:14Z

Perhaps rustc could compile monomorphized code from external crates according to the optimization flags given when those crates were built, distinct from the flags applying to the current crate?

ozkriff · 2017-09-15T18:54:57Z

@matklad It works beautifully for my use cases! :-D

not-fl3 · 2017-09-16T08:40:20Z

@matklad great work!

Made a quick hack not-fl3@fd2a0b5 to cover all of my needs.

Its a way to force even local dependencies to be built in release mode.

Two use cases I wanted to cover:

marking "force-release" local slow sub-crates to improve runtime speed of debug builds.

For example, local mesh loading crate. Its not on crates.io, so its added as a local lib. But in the same time its not activly developed, so its really nice to compile it once in release.

marking "force-release" local crates, which actually is a separate app running as a child process. Helps a lot in testing.

For example, local server with entrypoint in lib.rs. Nice to have an option to make it fast during development of client.

It would be really great to have something like this in nightly cargo!

tyoc213 · 2017-09-16T14:33:19Z

I havent read anything about monophormization, but from my point of view the only thing that need to be like This will indeed not really work well with monomorphization, all instantiations will just be optimized at the same level as the crate they're instantiated into. is release mode.

crumblingstatue · 2017-11-18T13:22:48Z

I also support the idea of specifying optimization level for individual dependencies.

I'm developing a program that deals with images, and it's a major pain to test it, because the image crate takes forever to load images when compiled with debug profile.

luser · 2018-01-18T18:58:27Z

@Manishearth proposed an RFC: rust-lang/rfcs#2282

stale · 2018-09-17T15:57:57Z

As there hasn't been any activity here in over 6 months I've marked this as stale and if no further activity happens for 7 days I will close it.

I'm a bot so this may be in error! If this issue should remain open, could someone (the author, a team member, or any interested party) please comment to that effect?

The team would be especially grateful if such a comment included details such as:

Is this still relevant?
If so, what is blocking it?
Is it known what could be done to help move this forward?

Thank you for contributing!

(The cargo team is currently evaluating the use of Stale bot, and using #6035 as the tracking issue to gather feedback.)

If you're reading this comment from the distant future, fear not if this was closed automatically. If you believe it's still an issue please leave a comment and a team member can reopen this issue. Opening a new issue is also acceptable!

alexcrichton · 2018-09-17T16:26:35Z

This has been implemented on nightly and is tracked at rust-lang/rust, so I'm going to close in favor of those locations.

SimonSapin · 2018-09-17T16:40:09Z

@alexcrichton Your "implemented" link goes to the docs for profiles in .cargo/config, which as I understand apply to all crates for a given cargo invocation. So it is not directly relevant to this issue. Are crate-specific profiles implemented too? If not, shouldn’t this issue stay open?

ehuss · 2018-09-17T16:50:30Z

@SimonSapin I think the link should be: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#profile-overrides The "*" override covers all dependencies.

SimonSapin · 2018-09-17T17:18:06Z

Oh I see, never mind then :)

dylemma · 2020-04-26T06:03:12Z

I found my way here while wondering if this feature existed, and since this is a years-old thread, looks like the feature is no longer "unstable", and docs are now here: https://doc.rust-lang.org/nightly/cargo/reference/profiles.html#overrides

alexcrichton added the A-configuration Area: cargo config files and env vars label Feb 27, 2015

alexcrichton mentioned this issue Feb 27, 2015

Make --release default (at least for dependencies) #784

Closed

pcwalton mentioned this issue Jul 23, 2015

Implement selective optimization levels on a per-dependency basis. #1826

Closed

erickt mentioned this issue Sep 15, 2015

Canvas IPC is still too slow for debug WPT servo/servo#7638

Open

matklad mentioned this issue Feb 11, 2016

add ability to optimize dependencies in dev #2380

Closed

est31 mentioned this issue Jul 8, 2017

Rust should have a pleasant edit-compile-debug cycle rust-lang/rust-roadmap-2017#1

Open

CAD97 mentioned this issue Jul 16, 2017

[ucd/category] Provisionally add unic-ucd-category open-i18n/rust-unic#31

Merged

defyrlt mentioned this issue Sep 2, 2017

Very long parse time for simple template djc/askama#44

Closed

est31 mentioned this issue Nov 19, 2017

Support building crate dependencies at different optimization levels #4719

Closed

SimonSapin mentioned this issue Dec 22, 2017

Repeated "warning: profiles for the non root package will be ignored" when building Rust rust-lang/rust#46951

Closed

Mark-Simulacrum mentioned this issue Jun 10, 2018

Should proc macro crates always be compiled in release even if target crate is being compiled in debug? #5622

Closed

mersinvald mentioned this issue Aug 1, 2018

Rust Scrypt deprecation emeraldpay/emerald-vault#292

Merged

stale bot added the stale label Sep 17, 2018

alexcrichton closed this as completed Sep 17, 2018

iceiix mentioned this issue Jan 4, 2020

steven_blocks: could not compiler under windows? memory allocation of 15728640 bytes failed, STATUS_STACK_BUFFER_OVERRUN iceiix/stevenarella#184

Closed

extrawurst mentioned this issue Dec 12, 2020

improve performance in debug build extrawurst/gitui#445

Closed

Add an option to optimize just dependencies #1359

Add an option to optimize just dependencies #1359

Comments

alexcrichton commented Feb 27, 2015

mystise commented Mar 11, 2015

alexcrichton commented Mar 11, 2015

mystise commented Mar 11, 2015

pcwalton commented Jul 22, 2015

matklad commented Feb 7, 2016

alexcrichton commented Feb 8, 2016

matklad commented Feb 8, 2016

alexcrichton commented Feb 9, 2016

whitequark commented Feb 9, 2016

matklad commented Feb 9, 2016

alexcrichton commented Feb 10, 2016

matklad commented Feb 10, 2016

alexcrichton commented Feb 11, 2016

matklad commented Feb 11, 2016

matklad commented Feb 11, 2016

alexcrichton commented Feb 11, 2016

alexcrichton commented Feb 26, 2016

matklad commented Feb 26, 2016

Usual debug build

Usual --release build

Debug build with release deps

Debug build with opt-level=1 deps

Debug build with opt-level=1 for everything

alexcrichton commented Feb 29, 2016

matklad commented Feb 29, 2016

alexcrichton commented Feb 29, 2016

matklad commented Feb 29, 2016

alexcrichton commented Mar 1, 2016

matklad commented Mar 1, 2016

mkovaxx commented Jun 11, 2017

alexcrichton commented Jun 13, 2017

matklad commented Jun 13, 2017 • edited

mqudsi commented Jul 16, 2017

est31 commented Jul 16, 2017

chyvonomys commented Sep 12, 2017

matklad commented Sep 15, 2017

Ralith commented Sep 15, 2017

ozkriff commented Sep 15, 2017

not-fl3 commented Sep 16, 2017

tyoc213 commented Sep 16, 2017

crumblingstatue commented Nov 18, 2017

luser commented Jan 18, 2018

stale bot commented Sep 17, 2018

alexcrichton commented Sep 17, 2018

SimonSapin commented Sep 17, 2018

ehuss commented Sep 17, 2018

SimonSapin commented Sep 17, 2018

dylemma commented Apr 26, 2020

matklad commented Jun 13, 2017 •

edited