perf: reduce async state machine overhead in test execution pipeline #5214

Merged
thomhurst merged 3 commits into main from perf/reduce-async-overhead on Mar 22, 2026

@thomhurst (Owner) commented Mar 22, 2026

Summary

  • TestCoordinator: Replace async lambda wrapper with direct method group reference for ExecuteWithRetry, eliminating state machine + closure allocation
  • RetryHelper: Change ExecuteWithRetry to accept Func<ValueTask> instead of Func<Task> to avoid adapter allocation; cache Task.FromResult results as static readonly fields in ShouldRetry
  • TestBuilder / TestBuilderPipeline: Cache DateTimeOffset.UtcNow where two adjacent reads occurred to ensure consistent Start/End timestamps and eliminate redundant syscalls
  • DiscoveryCircuitBreaker: Replace Stopwatch instance with Stopwatch.GetTimestamp() / Stopwatch.GetElapsedTime() on .NET 8+ to avoid object allocation
  • TestExecutor: Remove async/await in discovery hook methods, returning tasks directly
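As a rough illustration of the TestCoordinator and RetryHelper bullets above, the sketch below contrasts an async lambda wrapper with a method group passed to a `Func<ValueTask>` parameter. It is not the TUnit code: `RetrySketch`, its `ExecuteWithRetry` signature, `RunOnce`, and the retry policy are hypothetical stand-ins chosen only to show the shape of the change.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a retry helper; not the TUnit implementation.
public static class RetrySketch
{
    public static int Calls;

    // Accepts Func<ValueTask>, so a ValueTask-returning method can be passed directly.
    public static async ValueTask ExecuteWithRetry(Func<ValueTask> action, int maxAttempts)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await action();
                return;
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Swallow and retry; a real helper would consult a retry policy and back off.
            }
        }
    }

    // Hypothetical unit of work that already returns ValueTask.
    public static ValueTask RunOnce()
    {
        Calls++;
        return default; // an already-completed ValueTask
    }

    // Before: the async lambda adds its own compiler-generated state machine around RunOnce.
    public static ValueTask Before() => ExecuteWithRetry(async () => await RunOnce(), 3);

    // After: method group conversion, with no wrapper lambda in between.
    public static ValueTask After() => ExecuteWithRetry(RunOnce, 3);
}
```

Both `RetrySketch.Before()` and `RetrySketch.After()` run the work once when it succeeds; the difference is only in what the compiler generates around the call. Whether a delegate or closure object is still allocated depends on what the real method captures, which this sketch does not model.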

Rationale

Profiling shows ~3.8% exclusive CPU in AsyncMethodBuilderCore.Start (2.59%) and AsyncLocalValueMap.Set (1.21%) from deep async call chains, plus 1.16% in DateTime.get_UtcNow. These changes reduce async state machine overhead and unnecessary clock reads.

Test plan

  • Verify all existing tests pass (no behavioral changes)
  • Confirm build succeeds across all target frameworks

- Elide async/await in forwarding methods (TestExecutor discovery hooks,
  TestMethodInvoker, RetryHelper.ShouldRetry/ApplyBackoffDelay) to avoid
  unnecessary state machine allocations
- Replace async lambda wrappers with direct ValueTask construction in
  TestExecutor.ExecuteTestAsync and TestCoordinator retry path
- Change RetryHelper.ExecuteWithRetry to accept Func<ValueTask> instead
  of Func<Task> to avoid ValueTask-to-Task conversion on the retry path
- Cache DateTimeOffset.UtcNow in TestStateManager.MarkFailed,
  TestBuilder, and TestBuilderPipeline to avoid redundant clock reads
- Replace Stopwatch instance with Stopwatch.GetTimestamp() /
  Stopwatch.GetElapsedTime() in DiscoveryCircuitBreaker on .NET 8+
  (falls back to Stopwatch on netstandard2.0)
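The timestamp-caching item can be pictured with a small sketch. The `TimingSketch` record and `TimestampSketch` class are hypothetical and only stand in for whatever result type the real builders populate; the point is that reading the clock once guarantees identical Start and End values, while two reads may differ slightly.

```csharp
using System;

// Hypothetical result shape; the real TUnit types are not shown in this thread.
public sealed record TimingSketch(DateTimeOffset Start, DateTimeOffset End);

public static class TimestampSketch
{
    // Before: two separate clock reads, which can yield slightly different values.
    public static TimingSketch TwoReads()
        => new TimingSketch(DateTimeOffset.UtcNow, DateTimeOffset.UtcNow);

    // After: one clock read reused for both fields.
    public static TimingSketch OneRead()
    {
        var now = DateTimeOffset.UtcNow;
        return new TimingSketch(now, now);
    }
}
```

With `OneRead()`, `Start == End` always holds, which is the consistency property the PR description mentions for failed tests.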

@claude (bot, Contributor) left a comment

Code Review

Well-structured performance PR. The changes are correct and the overall direction is good — reducing state machine allocations in hot paths is the right focus. A few observations worth discussing.

Solid wins

  • TestCoordinator.cs: Cleanest change in the PR. Replacing the async lambda wrapper with a direct method group reference is a genuine allocation improvement (eliminates state machine + closure), and it works perfectly now that ExecuteWithRetry accepts Func<ValueTask>.

  • TestBuilder.cs / TestBuilderPipeline.cs: Caching DateTimeOffset.UtcNow where two adjacent reads occurred is correct: it ensures consistent Start/End timestamps for failed tests and eliminates a syscall.

  • DiscoveryCircuitBreaker.cs: The Stopwatch.GetTimestamp() / Stopwatch.GetElapsedTime() approach on #if NET avoids the Stopwatch object allocation. The #if NET symbol is the right choice here (covers net8.0+, not netstandard2.0).


Issues / Questions

1. TestStateManager.MarkFailed: no-op caching

The PR description says "where the same timestamp was read twice in adjacent lines", but there's only one DateTimeOffset.UtcNow call in MarkFailed (the ??= line). The cached now variable is only used once, so this change has no effect on performance. The test.Result initializer reads test.EndTime, not DateTimeOffset.UtcNow. This could be removed to keep the diff clean, or the PR description should be corrected.

2. TestExecutor.cs discovery hooks: .AsTask() allocation concern

HookExecutor.ExecuteBeforeTestDiscoveryHooksAsync is async ValueTask and iterates hooks, so it's virtually never synchronously complete. Calling .AsTask() on an incomplete ValueTask allocates a Task object internally. The previous async Task wrapper allocated a state machine instead. These are roughly allocation-equivalent for the common (non-trivially-complete) case, so this is more of a style change than a perf win for discovery hooks. If these callers were async ValueTask themselves, the .AsTask() step could be avoided entirely. Not a blocker, but worth being clear about what's actually saved here.

3. TestMethodInvoker.cs: Task→ValueTask→Task round-trip

test.InvokeTestAsync(...) returns Task. Wrapping it in new ValueTask(task) is correct (no copy, just a wrapper). Then .AsTask() on an incomplete ValueTask allocates again. The original async wrapper path also allocated a state machine for this branch. So this is allocation-neutral in the common case: not a regression, but the PR description's claim of "avoiding a state machine + closure allocation" applies more to TestCoordinator.cs than here.

The non-executor path (return test.InvokeTestAsync(...) directly) is a genuine win: no allocation at all.

4. ShouldRetry: Task.FromResult allocates on every retry

Task.FromResult<T> allocates a new Task<bool> on each call. In the retry path this is invoked on each exception. Caching the two possible results as static readonly Task<bool> fields would avoid this.

This is a micro-optimization, but since the original was async Task<bool> (which also allocated a state machine), neither approach is particularly allocation-heavy. Low priority.


Minor

  • TestMethodInvoker and TestExecutor.ExecuteTestAsync both contain identical logic for the testExecutor branch. This pre-existing duplication is unrelated to this PR but could be a future consolidation target.
  • The DiscoveryCircuitBreaker.Dispose() null-check removal (_stopwatch?.Stop() → _stopwatch.Stop(), guarded by #if !NET) is correct since _stopwatch is always initialized in that branch.

Overall this is a clean, well-motivated PR. The TestCoordinator.cs and DateTimeOffset caching changes are the clearest wins. The async elision changes are correct but the allocation benefit is more nuanced than described for the discovery hook and TestMethodInvoker paths.

@claude (bot, Contributor) left a comment

Code Review

Well-structured performance PR. The changes are correct and the overall direction is good — reducing state machine allocations in hot paths is the right focus. A few observations worth discussing.

Solid wins

  • TestCoordinator.cs: Cleanest change in the PR. Replacing the async lambda wrapper with a direct method group reference is a genuine allocation improvement (eliminates state machine + closure), and works perfectly now that ExecuteWithRetry accepts Func<ValueTask>.

  • TestBuilder.cs / TestBuilderPipeline.cs: Caching DateTimeOffset.UtcNow where two adjacent reads occurred is correct — ensures consistent Start/End timestamps for failed tests and eliminates a redundant syscall.

  • DiscoveryCircuitBreaker.cs: The Stopwatch.GetTimestamp() / Stopwatch.GetElapsedTime() approach on #if NET avoids the Stopwatch object allocation. The #if NET symbol is the right choice (covers net8.0+, not netstandard2.0).
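The DiscoveryCircuitBreaker point can be sketched roughly as follows. `ElapsedTracker` is a hypothetical class, not the TUnit type. The sketch guards on `NET7_0_OR_GREATER` rather than `NET` so that it also compiles if an older target such as net6.0 is present, since `Stopwatch.GetElapsedTime` was introduced in .NET 7; the PR's `#if NET` is equivalent when every modern target is net8.0 or later, as the review assumes.

```csharp
using System;
using System.Diagnostics;

// Hypothetical elapsed-time tracker; illustrates the pattern only.
public sealed class ElapsedTracker
{
#if NET7_0_OR_GREATER
    // A long timestamp replaces the Stopwatch object on modern .NET.
    private readonly long _startTimestamp = Stopwatch.GetTimestamp();

    public TimeSpan GetElapsed() => Stopwatch.GetElapsedTime(_startTimestamp);
#else
    // Fallback for targets such as netstandard2.0, where GetElapsedTime is unavailable.
    private readonly Stopwatch _stopwatch = Stopwatch.StartNew();

    public TimeSpan GetElapsed() => _stopwatch.Elapsed;
#endif
}
```

Callers see the same `GetElapsed()` surface on both branches, so the conditional compilation stays contained in one class.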


Issues / Questions

1. TestStateManager.MarkFailed — no-op caching

The PR description says "where the same timestamp was read twice in adjacent lines", but there is only one DateTimeOffset.UtcNow call in MarkFailed (the ??= line). The cached now variable is only used once, and test.Result uses test.EndTime (not another UtcNow call). This change has zero performance impact and could be removed to keep the diff clean, or the PR description should be corrected.

2. TestExecutor.cs discovery hooks — .AsTask() allocation concern

HookExecutor.ExecuteBeforeTestDiscoveryHooksAsync is async ValueTask and iterates hooks — it is virtually never synchronously complete. Calling .AsTask() on an incomplete ValueTask allocates a Task object internally. The previous async Task wrapper allocated a state machine instead. These are roughly allocation-equivalent in the common case, so this is more of a style change than a clear perf win for discovery hooks. If the callers were changed to async ValueTask, the .AsTask() step could be eliminated entirely. Not a blocker, but worth being clear about what is actually saved here.

3. TestMethodInvoker.cs — Task→ValueTask→Task round-trip

test.InvokeTestAsync(...) returns Task. Wrapping it in new ValueTask(task) is correct (just a struct wrapper, no copy). Then .AsTask() on an incomplete ValueTask allocates again. The original async wrapper path also allocated a state machine for this branch. This is allocation-neutral in the common case — the PR description's claim of "avoiding a state machine + closure allocation" applies more to TestCoordinator.cs than here.

The non-executor fast path (return test.InvokeTestAsync(...) directly) is a genuine win — zero overhead.
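The round trip in item 3 is easy to check in isolation. The sketch below uses a hypothetical `RoundTripSketch` helper and covers only the case of a `ValueTask` constructed from a `Task`: there, `AsTask()` hands back the same `Task` instance rather than creating a new one. A `ValueTask` backed by an `IValueTaskSource` behaves differently and is not covered here, so the allocation picture for the executor branch depends on what that branch actually returns.

```csharp
using System.Threading.Tasks;

public static class RoundTripSketch
{
    // Wraps a Task in a ValueTask and converts it back, mirroring Task -> ValueTask -> Task.
    public static Task RoundTrip(Task task) => new ValueTask(task).AsTask();
}
```

For example, given `var tcs = new TaskCompletionSource<bool>();`, the call `RoundTripSketch.RoundTrip(tcs.Task)` returns a reference equal to `tcs.Task` even though the task has not completed.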

4. ShouldRetry — Task.FromResult allocates on every call

Task.FromResult<T> allocates a new Task<bool> on each invocation. Since the original async Task<bool> also allocated a state machine, neither is zero-cost. A simple improvement:

private static readonly Task<bool> s_shouldRetryTrue = Task.FromResult(true);
private static readonly Task<bool> s_shouldRetryFalse = Task.FromResult(false);

Low priority micro-optimization, but it is cheap to add and eliminates the allocation on the retry hot path.
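A minimal sketch of how the two cached fields might be used is shown below. The `ShouldRetry` signature here (an attempt count and a retry limit) is an assumption made for illustration; the real method's parameters and retry logic are not shown in this thread.

```csharp
using System.Threading.Tasks;

public static class ShouldRetrySketch
{
    private static readonly Task<bool> s_shouldRetryTrue = Task.FromResult(true);
    private static readonly Task<bool> s_shouldRetryFalse = Task.FromResult(false);

    // Hypothetical signature: returns a cached, already-completed task on both paths.
    public static Task<bool> ShouldRetry(int attempt, int retryLimit)
        => attempt < retryLimit ? s_shouldRetryTrue : s_shouldRetryFalse;
}
```

Every call that takes the same branch returns the same `Task<bool>` instance, so the early-exit paths do not create a task per call regardless of how `Task.FromResult` behaves on a given runtime.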


Minor

  • TestMethodInvoker and TestExecutor.ExecuteTestAsync contain identical logic for the testExecutor branch. Pre-existing duplication, not introduced here, but a future consolidation target.
  • The DiscoveryCircuitBreaker.Dispose() null-check removal (_stopwatch?.Stop() → _stopwatch.Stop(), guarded by #if !NET) is correct since _stopwatch is always initialized in that branch.

Overall a clean, well-motivated PR. The TestCoordinator.cs simplification and DateTimeOffset caching changes in TestBuilder/TestBuilderPipeline are the clearest wins. The async-elision changes are correct but the allocation benefit is more nuanced than described for the discovery hook and TestMethodInvoker paths. Looks good to merge.

@claude (bot, Contributor) left a comment

Code Review (Follow-up)

This review compares against the previous Claude review to check which concerns have been addressed.


What's been addressed ✅

Issue #4 (ShouldRetry caching): The suggestion to cache Task.FromResult(true/false) as static readonly fields has been implemented:

private static readonly Task<bool> s_shouldRetryTrue = Task.FromResult(true);
private static readonly Task<bool> s_shouldRetryFalse = Task.FromResult(false);

This is a real improvement: it eliminates one Task<bool> allocation per retry on both early-exit paths. Good addition.


Still open from previous review

Issue #1 (PR description inaccuracy): The description still says "Cache DateTimeOffset.UtcNow in TestStateManager.MarkFailed", but TestStateManager.cs is not in the diff. Worth correcting the description to avoid confusing reviewers.

Issue #2 (.AsTask() allocation for discovery hooks): ExecuteBeforeTestDiscoveryHooksAsync / ExecuteAfterTestDiscoveryHooksAsync in TestExecutor.cs convert a (nearly always incomplete) ValueTask to Task via .AsTask(), which allocates a Task object. The original async Task wrappers also allocated a state machine. These are roughly allocation-equivalent, so this is a style change, not a perf win. Not a blocker, but the PR description shouldn't count this as a reduction in state machine overhead.

Issue #3 (TestMethodInvoker Task→ValueTask→Task round-trip): The new ValueTask(task).AsTask() path is still present. Same concern as above: the fast path (no executor) is a real win; the executor path is allocation-neutral, not an improvement.


New observations

DiscoveryCircuitBreaker.Dispose() null-check removal correctness: The change from _stopwatch?.Stop() to _stopwatch.Stop() (under #if !NET) is safe because _stopwatch is assigned in the constructor on that branch. DiscoveryCircuitBreaker is also sealed, so there's no risk of a derived class calling Dispose() before the constructor runs. Correct as written.

GetResourceUsage minor improvement: Caching elapsed to avoid calling GetElapsed() twice in the same method is a small but clean win, independent of the #if NET optimization.


Summary

The s_shouldRetryTrue/False caching addition is a concrete improvement over the previous version. The TestCoordinator.cs lambda-to-method-group change and TestBuilder/TestBuilderPipeline timestamp caching remain the clearest wins in the PR. The remaining open items (#1, #2, #3) from the prior review are non-blocking. This looks good to merge as-is, with a minor suggestion to correct the PR description.

@claude (bot, Contributor) left a comment

Code Review (Follow-up on latest commit)

This reviews the latest commit 74e63ecc7b33 perf: eliminate unnecessary Task/ValueTask round-trips against the previous Claude review.


All prior issues addressed ✅

Issue #2 (.AsTask() allocation for discovery hooks): Now fixed. ExecuteBeforeTestDiscoveryHooksAsync and ExecuteAfterTestDiscoveryHooksAsync in TestExecutor.cs now return ValueTask directly (pass-through from _hookExecutor), eliminating the .AsTask() heap allocation. This is a genuine perf improvement over the previous version.

Issue #3 (TestMethodInvoker Task→ValueTask→Task round-trip): Now fixed. The return type is changed to ValueTask and the .AsTask() call in the executor branch is gone. The fast path wraps Task in new ValueTask(task) (struct, no heap allocation). Clean.

TestExecutor.cs, new ValueTask<List<Exception>>(task.AsTask()) wrappers: All four occurrences are gone. _hookExecutor.ExecuteAfterTestSessionHooksAsync and ExecuteAfterAssemblyHooksAsync now return ValueTask<List<Exception>> directly to their callers, skipping the unnecessary Task conversion. This is the most impactful change in the latest commit.
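The pass-through shape described in these items can be sketched as follows. `HookExecutorSketch` and `ExecutorSketch` are hypothetical classes standing in for the hook executor and its caller; only the difference between an awaiting wrapper and a direct `ValueTask` return is meant to carry over.

```csharp
using System.Threading.Tasks;

// Hypothetical hook executor whose work completes asynchronously.
public sealed class HookExecutorSketch
{
    public int Runs;

    public async ValueTask ExecuteHooksAsync()
    {
        await Task.Yield(); // force an asynchronous completion
        Runs++;
    }
}

public sealed class ExecutorSketch
{
    private readonly HookExecutorSketch _hookExecutor;

    public ExecutorSketch(HookExecutorSketch hookExecutor) => _hookExecutor = hookExecutor;

    // Before: an async wrapper that only awaits adds a second state machine.
    public async Task WrappedAsync() => await _hookExecutor.ExecuteHooksAsync();

    // After: the ValueTask is handed straight back to the caller.
    public ValueTask PassThroughAsync() => _hookExecutor.ExecuteHooksAsync();
}
```

Eliding `async`/`await` like this is only appropriate for pure forwarding methods; if the wrapper had a `try`/`finally` or a `using` around the call, removing the `await` would change when that cleanup runs.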


Remaining minor items (non-blocking)

  • Missing newline at EOF in TestMethodInvoker.cs: The No newline at end of file marker is still present. Minor, but worth a one-character fix.
  • Pre-existing duplication: TestMethodInvoker.InvokeTestAsync and TestExecutor.ExecuteTestAsync still both contain the testExecutor.ExecuteTest(...) pattern independently. Not introduced by this PR — a future consolidation opportunity.

Summary

The three-commit arc of this PR is well-structured: initial optimization → fix no-op + cache Task.FromResult → eliminate .AsTask() round-trips. Each commit responds directly to review feedback. The final state is meaningfully better than the original in all targeted paths. Ready to merge.

This was referenced Mar 23, 2026
github-actions bot pushed a commit to IntelliTect/CodingGuidelines that referenced this pull request Mar 23, 2026

Updated [TUnit.Core](https://github.com/thomhurst/TUnit) from 1.19.57 to
1.21.6.

<details>
<summary>Release notes</summary>

_Sourced from [TUnit.Core's
releases](https://github.com/thomhurst/TUnit/releases)._

## 1.21.6

<!-- Release notes generated using configuration in .github/release.yml
at v1.21.6 -->

## What's Changed
### Other Changes
* perf: replace object locks with Lock type for efficient
synchronization by @​thomhurst in
thomhurst/TUnit#5219
* perf: parallelize test metadata collection for source-generated tests
by @​thomhurst in thomhurst/TUnit#5221
* perf: use GetOrAdd args overload to eliminate closure allocations in
event receivers by @​thomhurst in
thomhurst/TUnit#5222
* perf: self-contained TestEntry<T> with consolidated switch invokers
eliminates per-test JIT by @​thomhurst in
thomhurst/TUnit#5223
### Dependencies
* chore(deps): update tunit to 1.21.0 by @​thomhurst in
thomhurst/TUnit#5220


**Full Changelog**:
thomhurst/TUnit@v1.21.0...v1.21.6

## 1.21.0

<!-- Release notes generated using configuration in .github/release.yml
at v1.21.0 -->

## What's Changed
### Other Changes
* perf: reduce ConcurrentDictionary closure allocations in hot paths by
@​thomhurst in thomhurst/TUnit#5210
* perf: reduce async state machine overhead in test execution pipeline
by @​thomhurst in thomhurst/TUnit#5214
* perf: reduce allocations in EventReceiverOrchestrator and
TestContextExtensions by @​thomhurst in
thomhurst/TUnit#5212
* perf: skip timeout machinery when no timeout configured by @​thomhurst
in thomhurst/TUnit#5211
* perf: reduce allocations and lock contention in ObjectTracker by
@​thomhurst in thomhurst/TUnit#5213
* Feat/numeric tolerance by @​agray in
thomhurst/TUnit#5110
* perf: remove unnecessary lock in ObjectTracker.TrackObjects by
@​thomhurst in thomhurst/TUnit#5217
* perf: eliminate async state machine in
TestCoordinator.ExecuteTestAsync by @​thomhurst in
thomhurst/TUnit#5216
* perf: eliminate LINQ allocation in ObjectTracker.UntrackObjectsAsync
by @​thomhurst in thomhurst/TUnit#5215
* perf: consolidate module initializers into single .cctor via partial
class by @​thomhurst in thomhurst/TUnit#5218
### Dependencies
* chore(deps): update tunit to 1.20.0 by @​thomhurst in
thomhurst/TUnit#5205
* chore(deps): update dependency nunit3testadapter to 6.2.0 by
@​thomhurst in thomhurst/TUnit#5206
* chore(deps): update dependency cliwrap to 3.10.1 by @​thomhurst in
thomhurst/TUnit#5207


**Full Changelog**:
thomhurst/TUnit@v1.20.0...v1.21.0

## 1.20.0

<!-- Release notes generated using configuration in .github/release.yml
at v1.20.0 -->

## What's Changed
### Other Changes
* Fix inverted colors in HTML report ring chart due to locale-dependent
decimal formatting by @​Copilot in
thomhurst/TUnit#5185
* Fix nullable warnings when using Member() on nullable properties by
@​Copilot in thomhurst/TUnit#5191
* Add CS8629 suppression and member access expression matching to
IsNotNullAssertionSuppressor by @​Copilot in
thomhurst/TUnit#5201
* feat: add ConfigureAppHost hook to AspireFixture by @​thomhurst in
thomhurst/TUnit#5202
* Fix ConfigureTestConfiguration being invoked twice by @​thomhurst in
thomhurst/TUnit#5203
* Add IsEquivalentTo assertion for Memory<T> and ReadOnlyMemory<T> by
@​thomhurst in thomhurst/TUnit#5204
### Dependencies
* chore(deps): update dependency gitversion.tool to v6.6.2 by
@​thomhurst in thomhurst/TUnit#5181
* chore(deps): update dependency gitversion.msbuild to 6.6.2 by
@​thomhurst in thomhurst/TUnit#5180
* chore(deps): update tunit to 1.19.74 by @​thomhurst in
thomhurst/TUnit#5179
* chore(deps): update verify to 31.13.3 by @​thomhurst in
thomhurst/TUnit#5182
* chore(deps): update verify to 31.13.5 by @​thomhurst in
thomhurst/TUnit#5183
* chore(deps): update aspire to 13.1.3 by @​thomhurst in
thomhurst/TUnit#5189
* chore(deps): update dependency stackexchange.redis to 2.12.4 by
@​thomhurst in thomhurst/TUnit#5193
* chore(deps): update microsoft/setup-msbuild action to v3 by
@​thomhurst in thomhurst/TUnit#5197


**Full Changelog**:
thomhurst/TUnit@v1.19.74...v1.20.0

## 1.19.74

<!-- Release notes generated using configuration in .github/release.yml
at v1.19.74 -->

## What's Changed
### Other Changes
* feat: per-hook activity spans with method names by @​thomhurst in
thomhurst/TUnit#5159
* fix: add tooltip to truncated span names in HTML report by @​thomhurst
in thomhurst/TUnit#5164
* Use enum names instead of numeric values in test display names by
@​Copilot in thomhurst/TUnit#5178
* fix: resolve CS8920 when mocking interfaces whose members return
static-abstract interfaces by @​lucaxchaves in
thomhurst/TUnit#5154
### Dependencies
* chore(deps): update tunit to 1.19.57 by @​thomhurst in
thomhurst/TUnit#5157
* chore(deps): update dependency gitversion.msbuild to 6.6.1 by
@​thomhurst in thomhurst/TUnit#5160
* chore(deps): update dependency gitversion.tool to v6.6.1 by
@​thomhurst in thomhurst/TUnit#5161
* chore(deps): update dependency polyfill to 9.20.0 by @​thomhurst in
thomhurst/TUnit#5163
* chore(deps): update dependency polyfill to 9.20.0 by @​thomhurst in
thomhurst/TUnit#5162
* chore(deps): update dependency polyfill to 9.21.0 by @​thomhurst in
thomhurst/TUnit#5166
* chore(deps): update dependency polyfill to 9.21.0 by @​thomhurst in
thomhurst/TUnit#5167
* chore(deps): update dependency polyfill to 9.22.0 by @​thomhurst in
thomhurst/TUnit#5168
* chore(deps): update dependency polyfill to 9.22.0 by @​thomhurst in
thomhurst/TUnit#5169
* chore(deps): update dependency coverlet.collector to 8.0.1 by
@​thomhurst in thomhurst/TUnit#5177

## New Contributors
* @​lucaxchaves made their first contribution in
thomhurst/TUnit#5154

**Full Changelog**:
thomhurst/TUnit@v1.19.57...v1.19.74

Commits viewable in [compare
view](thomhurst/TUnit@v1.19.57...v1.21.6).
</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>