Run translation and LLVM in parallel when compiling with multiple CGUs #43506
rust-highfive assigned arielb1 Jul 27, 2017
r? @arielb1 (rust_highfive has picked a reviewer for you, use r? to override)
Pre-assigning @alexcrichton for review, so he can already start reading
rust-highfive assigned alexcrichton and unassigned arielb1 Jul 27, 2017
As a future thought for how
alexcrichton reviewed Jul 27, 2017
Looking great! The general framework of who's running what when seemed a little confusing to follow, but I think a comment would go a long way towards helping that.
if let Ok(token) = token {
    tokens.push(token);
} else {
    shared_emitter.fatal("failed to acquire jobserver token");
alexcrichton
Jul 28, 2017
Member
Oh just something like:
match token {
    Ok(token) => tokens.push(token),
    Err(e) => shared_emitter.fatal(&format!("failed to acquire jobserver token: {}", e)),
}
}

Message::TranslationDone { llvm_work_item, is_last } => {
    work_items.insert(0, llvm_work_item);
michaelwoerister
Jul 28, 2017
Author
Contributor
No reason. I changed that during a debugging session. I'll switch it back to a push.
    assert_eq!(trans_worker_state, TransWorkerState::LLVMing);
    trans_worker_state = TransWorkerState::Idle;
} else {
    drop(tokens.pop());
alexcrichton
Jul 27, 2017
Member
I think we discussed this a while ago, but should we perhaps not drop the token here? If we greedily hold on to our tokens then that means we can more quickly finish this compilation, which in theory may be desirable to reduce overall memory usage?
michaelwoerister
Jul 28, 2017
Author
Contributor
Yes, good idea. The truncate above should take care of dropping the Token if we actually don't need it.
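To make the exchange above concrete, here is a minimal sketch of releasing only surplus tokens with `Vec::truncate` instead of popping one eagerly. It is not the PR's code: `Token` is a hypothetical stand-in for a jobserver token whose slot is released on drop, and `release_surplus` and `needed` are names invented for illustration.

```rust
/// Hypothetical stand-in for a jobserver token; dropping it releases the slot.
struct Token;

/// Keep at most `needed` tokens and release the rest by dropping them.
/// Returns how many tokens were released.
fn release_surplus(tokens: &mut Vec<Token>, needed: usize) -> usize {
    let before = tokens.len();
    // `truncate` drops every element past index `needed`; it is a no-op
    // when the vector is already that short.
    tokens.truncate(needed);
    before - tokens.len()
}
```

Holding on to tokens until they are demonstrably unneeded is the "greedy" behaviour suggested in the review; whether that actually lowers memory usage would have to be measured.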
        }
    }
} else {
    match trans_worker_state {
alexcrichton
Jul 27, 2017
Member
I'm currently finding the logic here sort of hard to follow in terms of what this "trans worker" is doing. Could you be sure to add a comment with a high-level architecture of what the relationship is between this coordinator thread, the main translation thread, and the worker codegen threads?
alexcrichton
Jul 27, 2017
Member
Ok so to see if I understand this:
- The documentation above refers to a "translation worker"
- This trans worker seems to represent the "ephemeral token" that we inherently have to run work from a jobserver
- The literal thread representing the trans worker can change I think? Sometimes it's literally the thread doing translation, sometimes it's a spawned worker here to translate an existing module.
- If we happen to reach 4 translated codegen units but not codegen'd codegen units then we request the translating main thread to stop, and continue its work with a freshly spawned thread to codegen a module.
Does that sound roughly right?
Ah I see now there's also a crucial piece where when any codegen thread finishes we consider the "trans worker" thread available again. This means after any codegen thread finishes we may be a candidate to start translation of another unit of work I think, right?
michaelwoerister
Jul 28, 2017
Author
Contributor
> Ah I see now there's also a crucial piece where when any codegen thread finishes we consider the "trans worker" thread available again. This means after any codegen thread finishes we may be a candidate to start translation of another unit of work I think, right?
No, but it's an excellent idea :D
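The relationship discussed in this thread could be sketched as a small state machine for the implicit "trans worker" slot. This is only an illustration under assumptions: `Idle` and `LLVMing` appear in the diff under review, while the `Translating` variant, the `Event` enum, and `next_state` are hypothetical names. The last transition encodes the suggestion that any finished LLVM worker frees the slot.

```rust
/// State of the implicit "translation worker" slot as seen by the coordinator.
/// `Idle` and `LLVMing` appear in the diff; `Translating` is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TransWorkerState {
    Idle,
    Translating,
    LLVMing,
}

/// Hypothetical events the coordinator thread reacts to.
enum Event {
    TranslationRequested,
    TranslationDone,
    MainThreadStartedLlvm,
    AnyLlvmWorkerDone,
}

/// Compute the next state of the slot for a given event.
fn next_state(state: TransWorkerState, event: Event) -> TransWorkerState {
    use TransWorkerState::*;
    match (state, event) {
        (Idle, Event::TranslationRequested) => Translating,
        (Translating, Event::TranslationDone) => Idle,
        (Idle, Event::MainThreadStartedLlvm) => LLVMing,
        // Suggested in review: any finished LLVM worker frees the slot,
        // not only the one running on the main thread.
        (LLVMing, Event::AnyLlvmWorkerDone) => Idle,
        // Every other combination leaves the state unchanged.
        (s, _) => s,
    }
}
```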
alexcrichton added the S-waiting-on-author label Jul 27, 2017
michaelwoerister force-pushed the michaelwoerister:async-llvm branch from d2cdb54 to b6c9e69 Jul 28, 2017
Docs look great! So one overall meta-comment as well now. I'm having some difficulty articulating this but I'm slightly worried about a situation like:
I haven't convinced myself this is possible and I think that the various handling makes it ok? I've found the interaction between all these threads and the implicit token a little difficult to follow, but does this make sense to you? Can you think of a case where we accidentally starve the translation thread due to our heuristic?
michaelwoerister force-pushed the michaelwoerister:async-llvm branch from b6c9e69 to dbaee99 Jul 28, 2017
So with the latest change, the main thread will be free again as soon as the first LLVM worker is done (thanks to your suggestion). It's still possible to run into an LLVM work shortage the way you describe though. I just adapted the strategy to estimate the cost of an LLVM WorkItem. Maybe the main thread should always start the cheapest one available, so it can get back to translating sooner, if needed? I don't think this is much of a problem though (with no data to back this claim up in any way).
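The "cheapest one available" idea could look roughly like the sketch below. The `WorkItem` struct with `name` and `cost` fields and the `take_cheapest` helper are hypothetical; how the PR actually estimates cost is not shown in this conversation.

```rust
/// Hypothetical LLVM work item carrying an estimated cost
/// (for example, something proportional to module size).
#[derive(Debug, Clone, PartialEq, Eq)]
struct WorkItem {
    name: String,
    cost: u64,
}

/// Remove and return the cheapest queued item, so that the main thread
/// can get back to translating as soon as possible.
/// Returns `None` when the queue is empty.
fn take_cheapest(queue: &mut Vec<WorkItem>) -> Option<WorkItem> {
    let idx = queue
        .iter()
        .enumerate()
        .min_by_key(|(_, item)| item.cost)
        .map(|(i, _)| i)?;
    Some(queue.remove(idx))
}
```

A linear scan is enough for a queue bounded by a handful of codegen units; a heap would only matter for much longer queues.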
Thinking about this a bit more, I think this is my mental model for what's happening: on each turn of the loop we'll have N slots of work to fill up depending on what's currently running and what amount of tokens we have. Given the choice of whether to translate a new unit or codegen an existing unit it seems fine to have a heuristic. Whenever something happens though it'll turn the loop and cause everything to start over. I think that's roughly what's implemented right now, but I think that it means that we should consider the translation thread idle as soon as we've acquired a new token? That way if the translation thread was blocked and we get a token it should get unblocked? (which I don't think happens today?) I may not be following the code quite right though...
Neat!
Yes, that makes sense. It's a variation of considering the translation thread idle when a package is finished (another way of getting an additional token).
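The mental model of "N slots per turn of the loop" can be sketched as a pure decision function that is re-evaluated whenever something happens (a token arrives, a work item finishes). Everything here is an assumption for illustration: the `Action` enum, the `decide` function, its parameters, and the `max_queued` threshold (standing in for the "4 translated but not yet codegen'd units" mentioned earlier) are not taken from the PR.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// Let the translation thread produce another codegen unit.
    Translate,
    /// Start LLVM on an already translated module.
    RunLlvm,
    /// No free slot, or nothing to do.
    Wait,
}

/// Decide how to fill one slot on a turn of the coordinator loop.
/// `tokens` counts acquired jobserver tokens; one extra slot comes from
/// the implicit token every process owns.
fn decide(
    tokens: usize,
    running: usize,
    queued_llvm: usize,
    translation_pending: bool,
    max_queued: usize,
) -> Action {
    let slots = tokens + 1;
    if running >= slots {
        return Action::Wait;
    }
    if translation_pending && queued_llvm < max_queued {
        // Keep the LLVM queue topped up while it is short.
        Action::Translate
    } else if queued_llvm > 0 {
        // Enough modules are waiting; drain the queue to bound memory.
        Action::RunLlvm
    } else if translation_pending {
        Action::Translate
    } else {
        Action::Wait
    }
}
```

Because the function is evaluated afresh on every turn, acquiring a new token simply increases `slots`, which is one way to read "consider the translation thread idle as soon as we've acquired a new token".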
michaelwoerister force-pushed the michaelwoerister:async-llvm branch from dbaee99 to 29989ef Jul 31, 2017
michaelwoerister added some commits Jul 21, 2017
Thanks, @kennytm. Compiling with LLVM 3.7 right now...
Can't reproduce with LLVM 3.7 either
michaelwoerister force-pushed the michaelwoerister:async-llvm branch from 3b2af87 to b8d4413 Aug 1, 2017
I pushed another change that should give a sensible error message at a likely point of failure. Also adapted the scheduler heuristic slightly. Let's see.
New error, excellent
@bors r=alexcrichton OK, passes travis now.
bors added a commit that referenced this pull request Aug 1, 2017
michaelwoerister commented Jul 27, 2017 • edited
This is still a work in progress but the bulk of the implementation is done, so I thought it would be good to get it in front of more eyes.
This PR makes the compiler start running LLVM while translation is still in progress, effectively allowing for more parallelism towards the end of the compilation pipeline. It also allows the main thread to switch between translation and running LLVM, which reduces peak memory usage since not all LLVM modules have to be kept in memory until linking. This is especially good for incr. comp. but it works just as well when running with -Ccodegen-units=N.

In order to help tuning and debugging the work scheduler, the PR adds the -Ztrans-time-graph flag, which spits out HTML files that show how work packages were scheduled: (red is translation, green is LLVM)

One side effect here is that -Ztime-passes might show something not quite correct because trans and LLVM are not strictly separated anymore. I plan to have some special handling there that will try to produce useful output.

One open question is how to determine whether the trans-thread should switch to intermediate LLVM processing.

TODO:
-Z time-passes output for LLVM.

cc @alexcrichton @rust-lang/compiler
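As a rough illustration of the overlap described in this PR description, the sketch below "translates" units on the calling thread and hands each one to a worker thread that "runs LLVM" as soon as it arrives, using only `std::sync::mpsc` and `std::thread`. It is a toy model under stated assumptions: `compile_pipelined`, the string-based "modules", and the `.ll`/`.o` suffixes are hypothetical and stand in for real translation and codegen.

```rust
use std::sync::mpsc;
use std::thread;

/// Toy pipeline: translate on the calling thread while one worker thread
/// processes each translated module as soon as it is sent.
fn compile_pipelined(units: Vec<String>) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<String>();

    // Worker: stands in for an LLVM thread. It starts consuming modules
    // before translation of the remaining units has finished.
    let llvm_worker = thread::spawn(move || {
        rx.iter()
            .map(|module| format!("{}.o", module))
            .collect::<Vec<_>>()
    });

    // Calling thread: stands in for the translation thread.
    for unit in units {
        let module = format!("{}.ll", unit);
        tx.send(module).expect("LLVM worker hung up");
    }
    // Closing the channel lets the worker's iterator terminate.
    drop(tx);

    llvm_worker.join().expect("LLVM worker panicked")
}
```

Since each module is handed off as soon as it exists, at most the in-flight modules need to stay alive at once, which is the memory argument made above.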