Respect MSBUILDCOPYTASKPARALLELISM for Copy parallelism#13955
Respect MSBUILDCOPYTASKPARALLELISM for Copy parallelism#13955jankratochvilcz wants to merge 1 commit into
Conversation
a30c07c to
a8ef53a
Compare
Perfstar showed OrchardCore multi-threaded incremental builds regressing badly relative to non-MT: +126% slower on a 16-core Mac and +117% slower in CI on a 4-vCPU agent. The Copy task's static process-wide worker pool is part of that regression: /m non-MT builds get one pool per worker process, while /mt uses one process and therefore one pool, reducing Copy queue depth even though aggregate Copy CPU was essentially identical (5463 vs 5502 thread-seconds, 1.01x). Keep using the existing MSBUILDCOPYTASKPARALLELISM trait and its established semantics: less than zero uses the default heuristic, zero maps to int.MaxValue, one disables parallel Copy, and values greater than one explicitly set the Copy worker count. The existing default heuristic remains unchanged for everyone else. Also log the effective Copy parallelism once per process at low importance and cover the environment-variable override with Traits tests plus Copy child-process tests. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
a8ef53a to
39c2e9c
Compare
There was a problem hiding this comment.
Pull request overview
This PR updates the built-in Copy task to honor MSBUILDCOPYTASKPARALLELISM as an explicit override for copy thread parallelism (not just as a switch between parallel vs. legacy single-threaded behavior), and adds tests to validate the behavior and environment variable parsing.
Changes:
- Use
Traits.Instance.CopyTaskParallelism(when > 0) to setCopy’s default parallelism thread count. - Add a diagnostic log line to help confirm the effective copy parallelism (used by tests).
- Add unit tests covering both the env-var parsing (
Traits) and end-to-end behavior via a bootstrapped MSBuild run.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
src/Tasks/Copy.cs |
Applies MSBUILDCOPYTASKPARALLELISM as a thread-count override and logs effective parallelism. |
src/Tasks.UnitTests/Copy_Tests.cs |
Adds an end-to-end test verifying the override is applied in a child MSBuild process via diagnostic output. |
src/Framework.UnitTests/Traits_Tests.cs |
Adds a focused unit test validating Traits.CopyTaskParallelism environment variable parsing. |
| // Computed once at process start from Traits.Instance.CopyTaskParallelism. | ||
| // > 0 = explicit override; < 0 (default) or 0 = use empirical heuristic below. | ||
| private static readonly int DefaultCopyParallelism = |
| Traits.Instance.CopyTaskParallelism > 0 | ||
| ? Traits.Instance.CopyTaskParallelism | ||
| : NativeMethodsShared.GetLogicalCoreCount() > 4 ? 6 : 4; |
There was a problem hiding this comment.
yeah I'd clamp to like
| Traits.Instance.CopyTaskParallelism > 0 | |
| ? Traits.Instance.CopyTaskParallelism | |
| : NativeMethodsShared.GetLogicalCoreCount() > 4 ? 6 : 4; | |
| Traits.Instance.CopyTaskParallelism > 0 | |
| ? Math.Min(Traits.Instance.CopyTaskParallelism, 64) | |
| : NativeMethodsShared.GetLogicalCoreCount() > 4 ? 6 : 4; |
which is way more than I'd expect to help on even big hardware
| bool overridden = Traits.Instance.CopyTaskParallelism > 0; | ||
| Log.LogMessage(MessageImportance.Low, "Copy parallelism = {0}{1}", DefaultCopyParallelism, overridden ? " (overridden)" : ""); |
There was a problem hiding this comment.
Agreed, I'd log only if overridden and then feel ok about it not being localized.
| #nullable disable | ||
|
|
||
| namespace Microsoft.Build.UnitTests | ||
| { | ||
| public class Traits_Tests | ||
| { | ||
| [Theory] | ||
| [InlineData(null, -1)] | ||
| [InlineData("", -1)] | ||
| [InlineData("0", 0)] | ||
| [InlineData("-1", -1)] | ||
| [InlineData("garbage", -1)] | ||
| [InlineData("1", 1)] | ||
| [InlineData("16", 16)] | ||
| public void CopyTaskParallelismReadsIntegerOverride(string value, int expectedParallelism) |
| // instead of int.MaxValue. | ||
| private static readonly int DefaultCopyParallelism = NativeMethodsShared.GetLogicalCoreCount() > 4 ? 6 : 4; | ||
| // Computed once at process start from Traits.Instance.CopyTaskParallelism. | ||
| // > 0 = explicit override; < 0 (default) or 0 = use empirical heuristic below. |
There was a problem hiding this comment.
The comment lists 3 different modes of behavior, but the implementation only does 2 (>0 and everything else).
| private static readonly int DefaultCopyParallelism = | ||
| Traits.Instance.CopyTaskParallelism > 0 | ||
| ? Traits.Instance.CopyTaskParallelism | ||
| : NativeMethodsShared.GetLogicalCoreCount() > 4 ? 6 : 4; |
There was a problem hiding this comment.
The numbers look like magic - if there's reasoning behind them, add a comment about them.
There was a problem hiding this comment.
This is the old behavior; there's a comment above--numbers derived by benchmarks on stale hardware.
🚧 Draft for discussion
Motivation
I'm hoping to merge this so I can run it on our perf bench with different core counts to see if it moves the numbers for MT since we've effectively lowered the max-parallelization of copies as before we would have potentially have copies spread out over multiple processes.
Mechanism
I noticed we already do have some override already, but it's not fully being used, so this PR could hopefully bring it full circle and then I could try runs on PerfStar with different counts just so see if there is any impact on OC.