perf: parallelize test metadata collection for source-generated tests #5221
Conversation
When source generation is enabled with many test classes (1,000+), the sequential GetTests() calls in CollectTests created a bottleneck: each call JIT-compiles a per-class method and creates TestMetadata objects, taking ~560ms sequentially for 10,000 tests.

This change parallelizes the GetTests() calls using Parallel.ForEach when the source count exceeds the parallel threshold. Each source's GetTests is independent and safe to call concurrently. Small source sets (<8) remain sequential to avoid task-scheduling overhead.

Benchmarked with 10,000 tests across 1,000 classes:
- Source gen (before): ~3.4s
- Source gen (after): ~3.1s
- Reflection mode (unchanged): ~2.7s
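The approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual TUnit code: ITestSource, TestMetadata, and MinItemsForParallel are stand-ins for the real types and constants, and it shows the ConcurrentBag-based collection as originally submitted (the review below suggests replacing it).

```csharp
// Sketch of the threshold-guarded parallel collection described above.
// ITestSource, TestMetadata, and MinItemsForParallel are illustrative
// stand-ins, not the exact TUnit implementation.
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public interface ITestSource
{
    IReadOnlyList<TestMetadata> GetTests(string testSessionId);
}

public sealed class TestMetadata { /* elided */ }

public static class TestCollector
{
    private const int MinItemsForParallel = 8;

    public static List<TestMetadata> CollectTests(
        IReadOnlyList<ITestSource> testSourcesList, string testSessionId)
    {
        var combined = new List<TestMetadata>();

        if (testSourcesList.Count < MinItemsForParallel)
        {
            // Small source sets stay sequential: no task-scheduling overhead.
            foreach (var source in testSourcesList)
            {
                combined.AddRange(source.GetTests(testSessionId));
            }
            return combined;
        }

        // Each source's GetTests is independent, so the calls are safe to
        // run concurrently.
        var allResults = new ConcurrentBag<IReadOnlyList<TestMetadata>>();
        Parallel.ForEach(testSourcesList,
            source => allResults.Add(source.GetTests(testSessionId)));

        foreach (var batch in allResults)
        {
            combined.AddRange(batch);
        }
        return combined;
    }
}
```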
Code Review
This is a clean, well-targeted performance improvement. The benchmark data, PR description, and implementation are all solid. A few things worth considering:
Non-deterministic test ordering
ConcurrentBag<T> does not preserve insertion order — its enumeration order is implementation-defined and non-deterministic under concurrent inserts (in practice, it follows thread-local LIFO). This means the order of TestMetadata objects in combined can differ between runs.
If anything downstream depends on collection order (e.g., for reproducible test IDs or stable display order before scheduling kicks in), this could cause subtle inconsistencies. A simpler approach that preserves original source ordering:
var resultArrays = new IReadOnlyList<TestMetadata>[testSourcesList.Count];
Parallel.For(0, testSourcesList.Count, i =>
{
resultArrays[i] = testSourcesList[i].GetTests(testSessionId);
});
var combined = new List<TestMetadata>();
foreach (var batch in resultArrays)
{
for (var i = 0; i < batch.Count; i++)
combined.Add(batch[i]);
}
return combined;
Parallel.For with index-based writes to a pre-allocated array is thread-safe (each index is written exactly once), eliminates ConcurrentBag overhead, and keeps the source ordering deterministic.
Missing ParallelOptions / cancellation
Parallel.ForEach without ParallelOptions uses all available threads, which might be aggressive on CI agents with constrained CPU. Consider threading through a cancellation token and capping MaxDegreeOfParallelism:
Parallel.ForEach(testSourcesList,
new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
testSource => allResults.Add(testSource.GetTests(testSessionId)));
Two-phase discovery path not parallelized
CollectTestsWithTwoPhaseDiscovery (the filter-hints path) is still sequential. For projects using filters (e.g., CI jobs running a subset of tests), this path won't benefit. Whether that matters depends on how common filter usage is in the large-project scenario, but it's worth noting the optimization is asymmetric.
Threshold value
The threshold of 8 matches TestBuilderPipeline.MinItemsForParallel, which is good for consistency: fewer than MinItemsForParallel sources run sequentially, and 8 or more run in parallel, which aligns with the "8+ sources" language in the PR description.
Comment removal
The removed comments were substantive — they explained the two-path design rationale, single-pass vs. double-enumeration, etc. The remaining comment // Filtered: enumerate descriptors, apply filters, expand dependencies, materialize matches is much terser. The architecture is non-obvious enough that the original explanations added real value. Consider preserving at least the XML doc-level rationale on the method.
Overall: The change is correct and the parallelization approach is sound. The main actionable concern is the ConcurrentBag ordering non-determinism — switching to index-based Parallel.For with a pre-allocated array would be strictly better (deterministic, no concurrent collection overhead, same thread safety). Good work on the benchmark-driven approach and the small-project threshold guard.
Code Review (Follow-up)
Previous review raised two main concerns. Here is how they were addressed:
Previous concern: Non-deterministic ordering (ConcurrentBag) — ADDRESSED
The index-based Parallel.For with a pre-allocated IReadOnlyList<TestMetadata>[] array is exactly the right fix. Each index is written exactly once (thread-safe without locking), source order is fully deterministic, and there is no concurrent collection overhead. This is a better implementation than what was originally suggested — good simplification.
Previous concern: Missing ParallelOptions — Partially addressed
Parallel.For without ParallelOptions uses the default threadpool scheduler, which is fine in practice. Cancellation support is still absent but for a startup/discovery path this is low priority.
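If cancellation support were added later, it could be threaded through ParallelOptions. A minimal sketch, assuming a CancellationToken is available at the discovery call site and using the variable names from the discussion above (testSourcesList, resultArrays, testSessionId), which may differ from the real method:

```csharp
// Sketch only: threading a cancellation token through the parallel loop.
// cancellationToken, testSourcesList, resultArrays, and testSessionId are
// assumed names from the surrounding discussion.
var options = new ParallelOptions
{
    CancellationToken = cancellationToken,
    MaxDegreeOfParallelism = Environment.ProcessorCount
};

try
{
    Parallel.For(0, testSourcesList.Count, options, i =>
    {
        resultArrays[i] = testSourcesList[i].GetTests(testSessionId);
    });
}
catch (OperationCanceledException)
{
    // Discovery was cancelled; propagate so the host can shut down cleanly.
    throw;
}
```

Parallel.For checks the token between iterations and throws OperationCanceledException once cancellation is observed, so this composes naturally with a host-level shutdown token.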
New observation: Threshold check removed, but PR description is stale
The second commit removed the threshold guard (< MinItemsForParallel → sequential), but the PR description still says:
Small test sets (<8 sources) use sequential path — no overhead for small projects
This is no longer true. Parallel.For(0, 1, ...) for a single-source project will now spin up TPL machinery for a single item. In practice this overhead is negligible and the code is simpler, so removing the threshold is a valid call — but the PR description should be updated to match actual behavior.
CollectTestsWithTwoPhaseDiscovery still sequential
The filter-hints path is still not parallelized. Flagged as informational last time and unchanged — acceptable if large-project workloads predominantly go through the unfiltered path.
Overall
The critical fix (deterministic ordering via pre-sized array) has been cleanly applied. The implementation is correct and the approach is sound. Only actionable item: update the PR description to remove the stale threshold language. The code itself is good to merge.
Code Review (Follow-up #2)
All previously raised concerns have been addressed. Here is the final assessment:
Previous concern: Non-deterministic ordering (ConcurrentBag) — RESOLVED
The implementation now uses index-based Parallel.For with a pre-allocated IReadOnlyList<TestMetadata>[] array. Each index is written exactly once, preserving deterministic source ordering with no concurrent collection overhead. Additionally, the total count is pre-computed to size the final List<TestMetadata> exactly, avoiding any reallocations. This is a clean, allocation-efficient implementation.
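The pre-computed sizing described above could look roughly like this. This is a sketch, assuming resultArrays is the IReadOnlyList<TestMetadata>[] filled by the parallel loop; the actual TUnit code may differ:

```csharp
// Sketch: flatten per-source results into a list sized exactly once.
// resultArrays is assumed to be the pre-allocated array from the
// Parallel.For step discussed above.
var totalCount = 0;
foreach (var batch in resultArrays)
{
    totalCount += batch.Count;
}

// Pre-sizing avoids List<T> growth reallocations during the final copy.
var combined = new List<TestMetadata>(totalCount);
foreach (var batch in resultArrays)
{
    combined.AddRange(batch);
}
return combined;
```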
Previous concern: Stale PR description (threshold language) — RESOLVED
The PR description has been updated and no longer mentions the <8 sources sequential path. It now accurately describes the unconditional Parallel.For approach with the pre-sized array.
Previous concern: Missing ParallelOptions — Accepted as-is
Parallel.For without ParallelOptions uses the default threadpool scheduler, which is reasonable for a startup/discovery path. Cancellation is still absent but low priority here.
CollectTestsWithTwoPhaseDiscovery still sequential — Accepted as-is
The filter-hints path remains sequential. This asymmetry is acceptable if large-project workloads primarily go through the unfiltered branch. Not a blocker.
Overall
All critical issues have been resolved. The implementation is correct, deterministic, allocation-efficient, and well-structured. The code is ready to merge.
Summary
- Parallelize GetTests() calls in AotTestDataCollector.CollectTests using Parallel.For with a pre-sized array, reducing startup time for projects with many source-generated test classes
- CollectTestsTraditional → CollectTestsContext

Reported in #5043: with 10,000 tests across 1,000 classes, source-generated tests were significantly slower than reflection mode. The root cause was sequential GetTests() calls: each one JIT-compiles a per-class method and creates TestMetadata objects, taking ~560ms sequentially for 10,000 tests.

Parallel.For at the source level is the right granularity: each source batches ~10 tests, amortizing ClassMetadata lookups. Index-based writes to a pre-allocated array preserve deterministic ordering with no concurrent collection overhead.

Benchmark (10,000 tests, 1,000 classes)
Test plan