
Release nightly compilers with ability to internally parallelize #59667

Open
alexcrichton opened this issue Apr 3, 2019 · 3 comments
Labels
A-parallel-queries T-compiler WG-compiler-performance

Comments

@alexcrichton
Member

@alexcrichton alexcrichton commented Apr 3, 2019

This is a tracking issue for releasing nightly compilers that have the ability to internally parallelize themselves but still default to single-threaded mode. This is part of the larger parallel compiler tracking issue, and is intended as an incremental step towards fully closing that out.

A recent attempt to build binaries of the parallel compiler led to the question of whether we could just enable a parallel compiler by default. Note that there are two axes we can change here over time:

  • Whether or not the compiler can be parallelized at all, aka whether it's built with the --cfg parallel_compiler flag.
  • Whether or not the compiler by default is parallelized, aka the default value of -Z threads

The proposal in this issue is to default to -Z threads=1 (or the moral equivalent) but build nightly compilers with --cfg parallel_compiler (or the equivalent thereof). The intention is to get us closer to shipping a parallel compiler while buying us time to continue to fix any issues that arise. This would allow, for example, for users to very easily test out parallel compilation locally by using RUSTFLAGS=-Zthreads=16.

The main blocker for doing this is performance. As pointed out in a recent thread, it's important not to watch the comparison of instruction counts but rather the wall-time numbers. The instruction counts regress only 2-3%, which looks deceptively good, but the wall-time numbers regress roughly 10-20%, which is much more serious.

Some further investigation shows that most of the slowdown is likely coming from the use of mutexes (as opposed to other avenues like removing parallel code, the overhead of using rayon, or using Arc instead of Rc).

The next step here is to investigate whether we can recover the performance lost to the mutexes, most likely by removing them one way or another.

This issue will likely receive many updates over time!

@alexcrichton alexcrichton added A-parallel-queries T-compiler WG-compiler-performance labels Apr 3, 2019
@alexcrichton
Member Author

@alexcrichton alexcrichton commented Apr 3, 2019

@Zoxc do you know off the top of your head what some hot mutexes might be? Some local profiling of a compiler from #59644 and the commit just before is not very illuminating, while everything does get a bit slower it's hard to see where it's getting slower.

It does look like get_query (presumably this lock?) is pretty hot, but that also seems somewhat fundamental.

@HadrienG2

@HadrienG2 HadrienG2 commented Apr 3, 2019

You may want to try this lock contention profiling tool and see if it works for you: http://0pointer.de/blog/projects/mutrace.html .

@Aaron1011
Member

@Aaron1011 Aaron1011 commented Jul 5, 2019

As a temporary workaround, we could try doing something similar to the fragile crate. At runtime, we would inspect -Z threads:

  1. If -Z threads > 1, we use a normal Mutex.
  2. If -Z threads = 1, we use a 'fake' mutex - a type which implements Send/Sync, but panics if used on any thread other than the one which created it. Since only one thread should ever be accessing these Mutexes, the panic should never actually occur.

Hopefully, the overhead of these runtime checks would be much less than the overhead of a full Mutex. This would allow a parallelizable compiler to be shipped while we continue to work on improving single-threaded performance with actual Mutexes.

bors added a commit to rust-lang-ci/rust that referenced this issue Nov 14, 2020
Build the compiler with -Ctarget-cpu=x86-64-v2

This PR instructs rustbuild to compile the compiler with the `-Ctarget-cpu=x86-64-v2` option enabled in hope of getting some optimization gains from autovectorization.

The PR also adds support for `x86-64-{2,3,4}` target CPUs by [backporting an LLVM 12.0 commit](llvm/llvm-project@012dd42e027e).

I'm opening this to get a perf run to gauge the potential speedups on the rustc side. The LLVM side isn't built with the option enabled, as that would require clang 12.0 or manual enabling of the target features corresponding to the target CPU.

If the perf run shows nice improvements, we can talk about how to get this to users. We can't just enable this unconditionally for all users, as it would break for users of older CPUs. It's a similar question to rust-lang#59667.