Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test setting a lower import limit for ThinLTO. #66625

Conversation

michaelwoerister
Copy link
Member

The Firefox build system sets a lower ThinLTO import limit since recently but because they discovered that that can significantly lower build times without really affecting runtime performance too much. Let's see how this would affect our benchmarks.

r? @ghost

@michaelwoerister
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Nov 22, 2019

⌛ Trying commit 9db5aeb with merge 3972c8d...

bors added a commit that referenced this pull request Nov 22, 2019
…it, r=<try>

Test setting a lower import limit for ThinLTO.

The Firefox build system [sets a lower ThinLTO import limit](https://searchfox.org/mozilla-central/search?q=-import-instr-limit&path=) since recently but because they discovered that that can significantly lower build times without really affecting runtime performance too much. Let's see how this would affect our benchmarks.

r? @ghost
@bors
Copy link
Contributor

bors commented Nov 22, 2019

☀️ Try build successful - checks-azure
Build commit: 3972c8d (3972c8db90dc87ad3de1334b952269f6c9f381ff)

@rust-timer
Copy link
Collaborator

Queued 3972c8d with parent bd816fd, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 3972c8d, comparison URL.

@michaelwoerister
Copy link
Member Author

OK, if I'm interpreting this correctly these results show two things:

  1. The potential compile-time wins for tuning the ThinLTO import limit are pretty big, especially for incremental ThinLTO: 5-30% wall-time wins.
  2. Lowering the limit too much does affect the runtime performance of the generated code. This can be observed quite clearly in the check and debug runs which don't do any ThinLTO, so we only see the negative effect of the stricter import limit.

I think it's a good idea to investigate further and see if there is a setting that still provides wins without impacting performance.

Medium term I'd like us to make more use of the different optimization levels we offer. Then O2 could have a lower limit than O3 and O1 could maybe skip ThinLTO entirely.

@michaelwoerister
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Nov 25, 2019

⌛ Trying commit 35aec48 with merge 3683e14...

bors added a commit that referenced this pull request Nov 25, 2019
…it, r=<try>

Test setting a lower import limit for ThinLTO.

The Firefox build system [sets a lower ThinLTO import limit](https://searchfox.org/mozilla-central/search?q=-import-instr-limit&path=) since recently but because they discovered that that can significantly lower build times without really affecting runtime performance too much. Let's see how this would affect our benchmarks.

r? @ghost
@bors
Copy link
Contributor

bors commented Nov 25, 2019

☀️ Try build successful - checks-azure
Build commit: 3683e14 (3683e142564f0a32dea086e63301f4d0601e8c30)

@rust-timer
Copy link
Collaborator

Queued 3683e14 with parent 4eee955, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 3683e14, comparison URL.

@michaelwoerister
Copy link
Member Author

The results are expected: Compile improvements go down as runtime performance goes up. Let's see if we can find a setting the still gives improvements without the losses.

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Nov 26, 2019

⌛ Trying commit 18dcb29 with merge 8d916b6441f6dc09860687842f23feced42d2404...

@bors
Copy link
Contributor

bors commented Nov 26, 2019

☀️ Try build successful - checks-azure
Build commit: 8d916b6441f6dc09860687842f23feced42d2404 (8d916b6441f6dc09860687842f23feced42d2404)

@michaelwoerister
Copy link
Member Author

@rust-timer build 8d916b6441f6dc09860687842f23feced42d2404

@rust-timer
Copy link
Collaborator

Queued 8d916b6441f6dc09860687842f23feced42d2404 with parent 0f6f66f, future comparison URL.

@Mark-Simulacrum
Copy link
Member

@bors retry try @rust-timer queue

(master was broken for a time)

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Nov 26, 2019

⌛ Trying commit 18dcb29 with merge a2cd298...

bors added a commit that referenced this pull request Nov 26, 2019
…it, r=<try>

Test setting a lower import limit for ThinLTO.

The Firefox build system [sets a lower ThinLTO import limit](https://searchfox.org/mozilla-central/search?q=-import-instr-limit&path=) since recently but because they discovered that that can significantly lower build times without really affecting runtime performance too much. Let's see how this would affect our benchmarks.

r? @ghost
@bors
Copy link
Contributor

bors commented Nov 26, 2019

☀️ Try build successful - checks-azure
Build commit: a2cd298 (a2cd298558c368497b643139d8ce052726941759)

@rust-timer
Copy link
Collaborator

Queued a2cd298 with parent 797fd92, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit a2cd298, comparison URL.

@hdhoang hdhoang added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Nov 28, 2019
@michaelwoerister
Copy link
Member Author

It looks like there are no substantial compile time wins to be had without also introducing runtime regressions, so we can't lower the limit by default, at least not for -Copt-level=3. But if a slight runtime performance regression is acceptable, e.g. for -Copt-level=2 and -Copt-level=1, then the wins can be sizeable.
We should also look into leveraging those for ./x.py where a (local) performance loss of 5% is outweighed by doing 20% less work. In that case we can also get rid of the performance hits by strategic placement of inline attributes.

cc @nnethercote in case you are interested.

@michaelwoerister
Copy link
Member Author

I've been doing some performance measurements for bootstrapping the compiler and the numbers look pretty good. The tests were done on a 4C/8T Core i7 machine and only for the when compiling the compiler with incremental = true in config.toml, as that is a setting where we clearly favor cycle times over raw compiler performance. It is also a configuration that doesn't affect the compiler we ship, so the bar for making changes there should be relatively lower.

default -import-instr-limit=10
x.py build --stage 1 src/libstd 21m 59s 18m 58s
x.py build --stage 1 src/libstd (after change) 9m 53s 8m 16s
x.py test --stage 1 src/test/ui 30m 26s 30m 6s
x.py test --stage 2 src/test/run-make-fulldeps 44m 25s 42m 15s

So that looks like a nice speedup when mostly compiling the compiler and no slowdown when running lots of tests with that slightly less optimized compiler.

@michaelwoerister
Copy link
Member Author

@rust-lang/infra: The above might also be interesting for non-dist CI builds.

@hdhoang hdhoang added A-codegen Area: Code generation T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. labels Dec 12, 2019
@michaelwoerister
Copy link
Member Author

This PR was only ever meant to help gather numbers. Follow up in #67450.

bors added a commit that referenced this pull request Jan 2, 2020
…k-Simulacrum

Allow for setting a ThinLTO import limit during bootstrap

The benchmarks in #66625 have shown that a lower ThinLTO import limit can be a net win for bootstrap times. This PR:
- exposes the setting to `config.toml`,
- defaults to a lower limit if `incremental = true` in `config.toml`, and
- sets a lower limit for `x86_64-gnu-llvm-7` CI image in order to make the jobs complete more quickly (which remains to be tested).

This setting will affect how the compiler and it's tools are compiled. It will not affect the settings the compiler uses when compiling user code.

r? @pietroalbini
cc @rust-lang/infra
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Jan 3, 2020
…imit, r=Mark-Simulacrum

Allow for setting a ThinLTO import limit during bootstrap

The benchmarks in rust-lang#66625 have shown that a lower ThinLTO import limit can be a net win for bootstrap times. This PR:
- exposes the setting to `config.toml`,
- defaults to a lower limit if `incremental = true` in `config.toml`, and
- sets a lower limit for `x86_64-gnu-llvm-7` CI image in order to make the jobs complete more quickly (which remains to be tested).

This setting will affect how the compiler and it's tools are compiled. It will not affect the settings the compiler uses when compiling user code.

r? @pietroalbini
cc @rust-lang/infra
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants