[ELF] Cap parallel::strategy to 16 threads when --threads= is unspecified

When --threads= is unspecified, we set it to
`parallel::strategy.compute_thread_count()`, which uses
sched_getaffinity (Linux)/cpuset_getaffinity (FreeBSD)/std::thread::hardware_concurrency (others).
Extensive testing with varying workloads on many machines (configurations drawn from
{aarch64,x86-64} x {Linux,FreeBSD,Windows} x allocators(native,mimalloc,rpmalloc))
showed that when the concurrency is larger than 16, linking is slower than with
--threads=16 because parallelism overhead outweighs the gains from
parallelization. This is particularly harmful on machines with many cores or
when the link job competes with other jobs.
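
For illustration, the detection described above can be sketched roughly as follows. This is not lld's or LLVM's implementation: `availableConcurrency` is a hypothetical helper, the FreeBSD cpuset_getaffinity branch is omitted, and the fallback to 1 when the count is unknown is an assumption of this sketch.

```cpp
#include <thread>
#if defined(__linux__)
#include <sched.h>
#endif

// Hypothetical helper (not lld's code): number of CPUs this process may
// currently run on.
unsigned availableConcurrency() {
#if defined(__linux__)
  // On Linux, respect the affinity mask so that taskset/cgroup cpusets
  // reduce the reported count.
  cpu_set_t set;
  CPU_ZERO(&set);
  if (sched_getaffinity(0, sizeof(set), &set) == 0)
    return static_cast<unsigned>(CPU_COUNT(&set));
#endif
  // Elsewhere (or if the affinity query fails), use the standard library.
  // hardware_concurrency() may return 0 when the value is unknown; this
  // sketch assumes one thread in that case.
  unsigned n = std::thread::hardware_concurrency();
  return n == 0 ? 1 : n;
}
```

On a many-core build machine this value can be well above 16, which is the situation the cap targets.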

Cap parallel::strategy at 16 threads when --threads= is unspecified.
For some workloads, even raising the concurrency from 8 to 16 yields almost no improvement.

--thinlto-jobs= is unchanged since ThinLTO backend compiles are embarrassingly
parallel.

Link: https://discourse.llvm.org/t/avoidable-overhead-from-threading-by-default/69160

Reviewed By: peter.smith, andrewng

Differential Revision: https://reviews.llvm.org/D147493
MaskRay committed Apr 20, 2023
1 parent 0c7fe52 commit a8788de
Showing 1 changed file with 6 additions and 1 deletion.
7 changes: 6 additions & 1 deletion lld/ELF/Driver.cpp
@@ -1421,7 +1421,9 @@ static void readConfigs(opt::InputArgList &args) {
   }

   // --threads= takes a positive integer and provides the default value for
-  // --thinlto-jobs=.
+  // --thinlto-jobs=. If unspecified, cap the number of threads since
+  // overhead outweighs optimization for used parallel algorithms for the
+  // non-LTO parts.
   if (auto *arg = args.getLastArg(OPT_threads)) {
     StringRef v(arg->getValue());
     unsigned threads = 0;
@@ -1430,6 +1432,9 @@ static void readConfigs(opt::InputArgList &args) {
             arg->getValue() + "'");
     parallel::strategy = hardware_concurrency(threads);
     config->thinLTOJobs = v;
+  } else if (parallel::strategy.compute_thread_count() > 16) {
+    log("set maximum concurrency to 16, specify --threads= to change");
+    parallel::strategy = hardware_concurrency(16);
   }
   if (auto *arg = args.getLastArg(OPT_thinlto_jobs_eq))
     config->thinLTOJobs = arg->getValue();
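
To make the resulting precedence easier to follow, here is a minimal standalone model of the logic in this hunk. It is a sketch under stated assumptions, not code from lld: `ThreadConfig`, `resolveThreads`, and `kDefaultCap` are hypothetical names, option validation is left out, and an empty `thinLTOJobs` string stands in for "left at whatever default the ThinLTO backend uses".

```cpp
#include <optional>
#include <string>

// Hypothetical model of the settings touched by the diff.
struct ThreadConfig {
  unsigned threads;        // concurrency for the non-LTO parallel algorithms
  std::string thinLTOJobs; // empty: not set here, ThinLTO default applies
};

constexpr unsigned kDefaultCap = 16; // the cap introduced by this commit

// threadsArg:     value of --threads=, if given (assumed already validated)
// thinltoJobsArg: value of --thinlto-jobs=, if given
// detected:       thread count computed from the machine
ThreadConfig resolveThreads(std::optional<unsigned> threadsArg,
                            std::optional<std::string> thinltoJobsArg,
                            unsigned detected) {
  ThreadConfig cfg{detected, ""};
  if (threadsArg) {
    // An explicit --threads= is honored as-is and also becomes the default
    // for --thinlto-jobs=.
    cfg.threads = *threadsArg;
    cfg.thinLTOJobs = std::to_string(*threadsArg);
  } else if (detected > kDefaultCap) {
    // Unspecified: cap only the non-LTO concurrency.
    cfg.threads = kDefaultCap;
  }
  // An explicit --thinlto-jobs= always wins for the ThinLTO backend.
  if (thinltoJobsArg)
    cfg.thinLTOJobs = *thinltoJobsArg;
  return cfg;
}
```

In this model, a 64-core machine with no options resolves to 16 threads with ThinLTO jobs untouched, while `--threads=32` on the same machine yields 32 for both.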