ci: Conditionally build parallel compiler on `try` #59417

alexcrichton · 2019-03-25T18:43:19Z

This commit configures Travis/AppVeyor to conditionally compile parallel
compilers on `@bors: try`. This is currently an experiment to see how it
plays out. The intention is that if the commit message contains the term
"parallel-compiler", then issuing `@bors: try` behaves differently than
the try branch does today and builds three compilers: Linux, macOS, and
Windows. We currently have no try builders for macOS or Windows due to
the usual capacity issues, so this is intended to be used only sparingly,
when necessary.

cc #48685, tracking issue for parallel compilation
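
As a rough illustration of the switch described above, here is a minimal sketch in Rust. It is not the actual Travis/AppVeyor configuration from this PR. The names `TryMode`, `try_mode`, and `try_targets` are hypothetical, and the assumption that the default try build covers only Linux is inferred from the description.

```rust
/// Marker that, per the PR description, the commit message must contain.
const MARKER: &str = "parallel-compiler";

/// Hypothetical type naming the two behaviours of a try build.
#[derive(Debug, PartialEq)]
pub enum TryMode {
    /// What the try branch does today.
    Default,
    /// Build parallel compilers for Linux, macOS, and Windows.
    ParallelCompilers,
}

/// Pick the mode by looking for the marker in the commit message.
pub fn try_mode(commit_message: &str) -> TryMode {
    if commit_message.contains(MARKER) {
        TryMode::ParallelCompilers
    } else {
        TryMode::Default
    }
}

/// Platforms to build for each mode (assumption: the default is Linux only).
pub fn try_targets(mode: &TryMode) -> &'static [&'static str] {
    match mode {
        TryMode::Default => &["linux"],
        TryMode::ParallelCompilers => &["linux", "macos", "windows"],
    }
}
```

A commit message ending in `[parallel-compiler]` would select `TryMode::ParallelCompilers` here, and any other message would fall back to the default mode.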

rust-highfive · 2019-03-25T18:43:29Z

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

alexcrichton · 2019-03-25T18:44:49Z

@bors: try

Let's see what happens...

ci: Conditionally build parallel compiler on `try` This commit configures Travis/AppVeyor to conditionally compile parallel compilers on `@bors: try`. This is an experiment currently to see how this plays out, but the intention is that if the commit message contains the term "parallel-compiler" then when `@bors: try` is issued it will perform differently than the try branch does today, building three compilers: Linux, macOS, and Windows. We currently have no `try` builders for macOS or Windows due to typical capacity issues, so it's intended that this is only very sparingly used from time to time when necessary. [parallel-compiler]

alexcrichton · 2019-04-01T18:28:37Z

@bors: try

bors · 2019-04-01T18:28:48Z

⌛ Trying commit 4382c25 with merge 9743253127ed10bcfb5b046647599d211f0b0a0b...

alexcrichton · 2019-04-01T18:30:14Z

@bors: try

Ok I'm gonna run a test real quick and see what happens if we just enable compiling a parallel compiler but it defaults to 1 thread instead of num_cpus. I'm curious what sort of perf slowdown we'll see

bors · 2019-04-01T18:30:25Z

⌛ Trying commit c4fa1e0 with merge cb23a46f60b0233f366367dd11b5fdc241bce328...

Zoxc · 2019-04-01T18:40:33Z

There's another parallel compiler bug introduced in #58929, but it should still work with a single thread.

bors · 2019-04-01T20:44:05Z

☀️ Try build successful - checks-travis
Build commit: cb23a46f60b0233f366367dd11b5fdc241bce328

alexcrichton · 2019-04-01T20:46:13Z

Let's see if we can get some ballpark numbers...

@rust-timer build cb23a46f60b0233f366367dd11b5fdc241bce328

rust-timer · 2019-04-01T20:46:15Z

Success: Queued cb23a46f60b0233f366367dd11b5fdc241bce328 with parent eab3eb3, comparison URL.

rust-timer · 2019-04-02T01:00:50Z

Finished benchmarking try commit cb23a46f60b0233f366367dd11b5fdc241bce328

alexcrichton · 2019-04-02T14:10:28Z

Ok so that looks like it's a blanket 2-3% slowdown across the board if we enable parallel compilation and peg rustc to just one thread. That's... frankly amazing!

I think given those kind of numbers it might actually be feasible to just turn this on by default for all releases. We'd get a small slowdown which we'd quickly be able to recover many times over by bumping up the default number of threads.

@Zoxc I'm curious, do you think that rustc might be ready for this transition? Concretely I'd imagine that we'd turn on the parallel_compiler #[cfg] by default for CI (like done in this PR), but we'd still default to -Zthreads=1 (or something like that). We'd afterwards likely quickly remove the #[cfg(parallel_compiler)] annotations, only using the parallel versions.

Later once ready we could default threads to greater than 1, but until then we could allow testing on nightly via the unstable -Zthreads option for local investigations.
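
To make the `#[cfg(parallel_compiler)]` idea concrete, here is a minimal sketch of a lock type that compiles to a `RefCell` in the non-parallel build and to a `Mutex` when the cfg is set. It assumes nothing about rustc's real internals: `Lock` and `count_items` are illustrative names, and the cfg would have to be passed explicitly (for example with `--cfg parallel_compiler`).

```rust
/// Non-parallel build: a plain `RefCell`, no atomic operations.
#[cfg(not(parallel_compiler))]
pub struct Lock<T>(std::cell::RefCell<T>);

/// Parallel build: a real mutex, usable from several threads.
#[cfg(parallel_compiler)]
pub struct Lock<T>(std::sync::Mutex<T>);

impl<T> Lock<T> {
    #[cfg(not(parallel_compiler))]
    pub fn new(value: T) -> Self {
        Lock(std::cell::RefCell::new(value))
    }

    #[cfg(parallel_compiler)]
    pub fn new(value: T) -> Self {
        Lock(std::sync::Mutex::new(value))
    }

    /// Same call site in both builds; only the guard type differs.
    #[cfg(not(parallel_compiler))]
    pub fn lock(&self) -> std::cell::RefMut<'_, T> {
        self.0.borrow_mut()
    }

    #[cfg(parallel_compiler)]
    pub fn lock(&self) -> std::sync::MutexGuard<'_, T> {
        self.0.lock().unwrap()
    }
}

/// Example caller that does not care which build it is in.
pub fn count_items(items: &[u32]) -> u32 {
    let total = Lock::new(0u32);
    for item in items {
        *total.lock() += *item;
    }
    let result = *total.lock();
    result
}
```

Removing the `#[cfg(parallel_compiler)]` annotations, as suggested above, would correspond to keeping only the `Mutex` variant in this sketch.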

alexcrichton · 2019-04-02T14:13:08Z

I've also started some chat on zulip about this!

mati865 · 2019-04-02T14:15:10Z

Wall-time doesn't look convincing, maybe there is a lot of blocking?

alexcrichton · 2019-04-02T14:24:43Z

Ah yes sorry my mistake! The interesting metric here indeed is wall time, not instructions executed. That comparison URL is located here and is indeed unfortunately less inspiring, showing a blanket ~10% slowdown, which I think is a bit too serious to land just now.

@Zoxc in addition to the question above about whether the compiler is ready for this, are you aware of low hanging fruit for optimizing?

alexcrichton · 2019-04-02T14:27:58Z

Hm, it's also probably worth noting the magnitude of the changes here. It looks like all the 10%+ changes are in tiny crates, regressing from 5 to 6 seconds for example. Larger crates like the servo ones do regress, but on the order of a second in an already 15 second compilation, so I don't think it's necessarily damning evidence.
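
The relative-versus-absolute point can be checked with a little arithmetic. The sketch below uses a hypothetical helper, `relative_slowdown_percent`. The 5 to 6 second case comes from the comment above, while reading "on the order of a second" as exactly 15 to 16 seconds is an assumption made for illustration.

```rust
/// Relative slowdown in percent, given wall times in seconds.
pub fn relative_slowdown_percent(before_secs: f64, after_secs: f64) -> f64 {
    (after_secs - before_secs) / before_secs * 100.0
}

/// Absolute slowdown in seconds.
pub fn absolute_slowdown_secs(before_secs: f64, after_secs: f64) -> f64 {
    after_secs - before_secs
}
```

With these inputs, a tiny crate going from 5 to 6 seconds is a 20% regression, while a larger crate going from 15 to 16 seconds loses the same one second but regresses by under 7%.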

mati865 · 2019-04-02T14:35:43Z

packed-simd already takes a very long time to build and it regresses here by ~10%. That's quite worrying and could be worth checking.

Zoxc · 2019-04-02T19:24:56Z

Turning on parallel_compiler with a single thread should be fine from a correctness standpoint.

I'm not aware of any low hanging fruit, there might be some though as I've mostly focused on making the code run in parallel. Maybe @nnethercote wants to help here and look for some excessive locking?

I'm working on getting rid of Arcs in the query system, but I suspect the overhead comes mostly from locking, and maybe some from Rayon. I can measure this though.

nnethercote · 2019-04-02T21:57:04Z

One distinct downside of (true, multi-thread) parallelization is that it makes benchmarking much harder. Wall time is the true metric of interest, but it's so noisy that e.g. 1% improvements/regressions are impossible to spot. Currently we can use instruction counts as a proxy for wall time, because instruction counts have much less variation, and they're a pretty good proxy. But true parallelization will majorly reduce the usefulness of instruction counts as a metric.

It's a hard problem, I'm not sure what to do about it. It's one reason why I'm biased toward trying to improve coarse-grained parallelization (e.g. pipelining) rather than fine-grained parallelization.
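
One way to see the noise problem is to time the same work several times and compare the spread of the samples with the size of the change being looked for. The sketch below is only an illustration with made-up helper names (`time_runs`, `median`, `relative_spread`). It is not how perf.rust-lang.org collects its data.

```rust
use std::time::Instant;

/// Run `work` `runs` times and return each wall time in seconds.
pub fn time_runs<F: FnMut()>(runs: usize, mut work: F) -> Vec<f64> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed().as_secs_f64()
        })
        .collect()
}

/// Median of the samples, or `None` for an empty slice.
pub fn median(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("samples must not be NaN"));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// (max - min) / median: a crude measure of run-to-run noise.
pub fn relative_spread(samples: &[f64]) -> Option<f64> {
    let mid = median(samples)?;
    if mid == 0.0 {
        return None;
    }
    let min = samples.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = samples.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    Some((max - min) / mid)
}
```

If `relative_spread` comes out at several percent for repeated runs of identical work, a 1% change between two compilers cannot be distinguished from noise using wall time alone.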

alexcrichton · 2019-04-02T22:05:30Z

@nnethercote that's a good point! I suspect though we can solve it by always passing -Zthreads=1 to benchmarks on perf.r-l.o?
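
As a sketch of what pinning the benchmarks to one thread could look like, the helper below builds an argument list containing the unstable `-Zthreads` option mentioned earlier in the thread. The function name and the source file path are hypothetical, and how perf.rust-lang.org actually invokes rustc is not shown in this discussion.

```rust
/// Build a rustc argument list that pins the compiler to `threads` threads.
/// `-Zthreads` is the unstable option discussed in this thread.
pub fn benchmark_rustc_args(source_file: &str, threads: u32) -> Vec<String> {
    vec![format!("-Zthreads={}", threads), source_file.to_string()]
}
```

Calling `benchmark_rustc_args("src/lib.rs", 1)` yields `["-Zthreads=1", "src/lib.rs"]`, which a harness could then pass to a nightly rustc.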

alexcrichton · 2019-04-02T22:09:53Z

@Zoxc I don't mind also doing some profiling to look into this, the packed_simd case pointed out by @mati865 is a good one to look into, so I'll try to investigate that and post some results here tomorrow

nnethercote · 2019-04-02T22:18:40Z

@alexcrichton: Let's assume we reach the point where we are shipping a parallel compiler.
Passing -Zthreads=1 would mean that instruction counts are still a good proxy for wall time, which is good... but it would also mean that we are measuring something that we aren't shipping, which is bad :(

Zoxc · 2019-04-02T22:28:55Z

I opened #59649, #59647, #59644 and #59641 to find out what the causes of the regressions are at a high level.

Zoxc · 2019-04-03T03:37:19Z

Seems like the locks are to blame for most of the regressions.
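
For intuition on why locks can cost something even with a single thread, here is a small micro-benchmark sketch comparing an uncontended `Mutex` with a `RefCell`. It is not the measurement done in the PRs above, the function names are made up, and an optimizing build may simplify either loop, so the timings are indicative at best.

```rust
use std::cell::RefCell;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Increment a counter behind a `RefCell` and time the loop.
pub fn bump_refcell(iterations: u64) -> (u64, Duration) {
    let counter = RefCell::new(0u64);
    let start = Instant::now();
    for _ in 0..iterations {
        *counter.borrow_mut() += 1;
    }
    let elapsed = start.elapsed();
    let total = *counter.borrow();
    (total, elapsed)
}

/// Same loop, but every increment takes and releases a `Mutex`.
pub fn bump_mutex(iterations: u64) -> (u64, Duration) {
    let counter = Mutex::new(0u64);
    let start = Instant::now();
    for _ in 0..iterations {
        *counter.lock().unwrap() += 1;
    }
    let elapsed = start.elapsed();
    let total = *counter.lock().unwrap();
    (total, elapsed)
}
```

Both functions return the same count. Comparing the two durations on a given machine gives a rough feel for the per-access cost that a single-threaded parallel-enabled compiler would pay.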

alexcrichton · 2019-04-03T14:32:21Z

Ok I think that we've identified an actionable way to go (thanks @Zoxc) so I don't think we're going to want to take this approach where @bors: try builds parallel compilers and we recommend rustup-toolchain-install-master for testing.

To that end I'm going to close this and I've opened up #59667 to track further work.

rust-highfive assigned nikomatsakis Mar 25, 2019

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 25, 2019

alexcrichton force-pushed the parallel-binaries branch from d6b926b to 50e89c3 Compare March 25, 2019 18:46

bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 25, 2019

alexcrichton force-pushed the parallel-binaries branch from 50e89c3 to 70ad29c Compare March 25, 2019 18:51

alexcrichton force-pushed the parallel-binaries branch 4 times, most recently from b47a1c3 to acf826e Compare March 25, 2019 19:59

alexcrichton force-pushed the parallel-binaries branch from acf826e to f21a904 Compare March 25, 2019 20:01

alexcrichton force-pushed the parallel-binaries branch from f21a904 to c357a1a Compare March 26, 2019 14:41

More AppVeyor tweaks

4382c25

Deafult to 1 thread

c4fa1e0

alexcrichton mentioned this pull request Apr 3, 2019

Release nightly compilers with ability to internally parallelize #59667

Closed

alexcrichton closed this Apr 3, 2019
