JIT: Optimize AsyncHelpers await helpers and enable inlining for them #128528

Draft
jakobbotsch wants to merge 14 commits into dotnet:main from jakobbotsch:inline-await

@jakobbotsch (Member) commented May 23, 2026

  • Allow inlining of async functions when all awaits are tail awaits (using AsyncHelpers.TailAwait)
  • Rewrite all await helpers in the tail await shape to allow inlining of them
  • Optimize all await paths to use hand-rolled, cached continuations that avoid additional async1 continuation allocations

Based on #128320

ValueTaskPerf benchmarks from dotnet/performance:

| Method | Toolchain | Mean | Error | StdDev | Median | Min | Max | Ratio |
|---|---|---|---|---|---|---|---|---|
| Await_FromResult | \main\corerun.exe | 12.010 ns | 0.2031 ns | 0.1900 ns | 12.024 ns | 11.730 ns | 12.353 ns | 1.00 |
| Await_FromResult | \pr\corerun.exe | 8.318 ns | 0.0989 ns | 0.0925 ns | 8.347 ns | 8.180 ns | 8.434 ns | 0.69 |
| Await_FromCompletedTask | \main\corerun.exe | 16.347 ns | 0.5397 ns | 0.6215 ns | 16.225 ns | 15.546 ns | 17.593 ns | 1.00 |
| Await_FromCompletedTask | \pr\corerun.exe | 13.291 ns | 0.2445 ns | 0.2168 ns | 13.230 ns | 13.030 ns | 13.715 ns | 0.81 |
| Await_FromCompletedValueTaskSource | \main\corerun.exe | 26.584 ns | 2.2930 ns | 2.6407 ns | 25.976 ns | 23.479 ns | 30.889 ns | 1.00 |
| Await_FromCompletedValueTaskSource | \pr\corerun.exe | 22.867 ns | 0.1866 ns | 0.1654 ns | 22.812 ns | 22.623 ns | 23.281 ns | 0.87 |
| CreateAndAwait_FromResult | \main\corerun.exe | 11.605 ns | 0.1786 ns | 0.1670 ns | 11.556 ns | 11.342 ns | 11.898 ns | 1.00 |
| CreateAndAwait_FromResult | \pr\corerun.exe | 8.446 ns | 0.1681 ns | 0.1726 ns | 8.421 ns | 8.217 ns | 8.719 ns | 0.73 |
| CreateAndAwait_FromResult_ConfigureAwait | \main\corerun.exe | 15.849 ns | 0.3147 ns | 0.2944 ns | 16.007 ns | 15.381 ns | 16.277 ns | 1.00 |
| CreateAndAwait_FromResult_ConfigureAwait | \pr\corerun.exe | 8.364 ns | 0.1429 ns | 0.1337 ns | 8.421 ns | 8.098 ns | 8.544 ns | 0.53 |
| CreateAndAwait_FromCompletedTask | \main\corerun.exe | 11.933 ns | 0.1488 ns | 0.1392 ns | 11.916 ns | 11.704 ns | 12.190 ns | 1.00 |
| CreateAndAwait_FromCompletedTask | \pr\corerun.exe | 10.023 ns | 0.0853 ns | 0.0798 ns | 10.030 ns | 9.831 ns | 10.156 ns | 0.84 |
| CreateAndAwait_FromCompletedTask_ConfigureAwait | \main\corerun.exe | 14.678 ns | 0.3420 ns | 0.3938 ns | 14.517 ns | 14.262 ns | 15.185 ns | 1.00 |
| CreateAndAwait_FromCompletedTask_ConfigureAwait | \pr\corerun.exe | 10.986 ns | 0.0637 ns | 0.0564 ns | 10.973 ns | 10.904 ns | 11.091 ns | 0.75 |
| CreateAndAwait_FromCompletedValueTaskSource | \main\corerun.exe | 13.822 ns | 0.0354 ns | 0.0331 ns | 13.812 ns | 13.772 ns | 13.884 ns | 1.00 |
| CreateAndAwait_FromCompletedValueTaskSource | \pr\corerun.exe | 14.331 ns | 0.0870 ns | 0.0813 ns | 14.313 ns | 14.235 ns | 14.465 ns | 1.04 |
| CreateAndAwait_FromYieldingAsyncMethod | \main\corerun.exe | 261.416 ns | 3.3161 ns | 3.1019 ns | 261.140 ns | 257.977 ns | 269.236 ns | 1.00 |
| CreateAndAwait_FromYieldingAsyncMethod | \pr\corerun.exe | 249.452 ns | 3.6599 ns | 3.0562 ns | 249.537 ns | 244.468 ns | 254.448 ns | 0.95 |
| CreateAndAwait_FromDelayedTCS | \main\corerun.exe | 1,830.124 ns | 89.6522 ns | 103.2436 ns | 1,801.707 ns | 1,615.933 ns | 2,016.293 ns | 1.00 |
| CreateAndAwait_FromDelayedTCS | \pr\corerun.exe | 1,003.680 ns | 44.8132 ns | 51.6070 ns | 976.779 ns | 941.729 ns | 1,099.581 ns | 0.55 |
| Copy_PassAsArgumentAndReturn_FromResult | \main\corerun.exe | 4.759 ns | 0.0122 ns | 0.0114 ns | 4.760 ns | 4.741 ns | 4.779 ns | 1.00 |
| Copy_PassAsArgumentAndReturn_FromResult | \pr\corerun.exe | 4.777 ns | 0.0197 ns | 0.0184 ns | 4.782 ns | 4.745 ns | 4.804 ns | 1.00 |
| Copy_PassAsArgumentAndReturn_FromTask | \main\corerun.exe | 8.188 ns | 0.0148 ns | 0.0131 ns | 8.191 ns | 8.160 ns | 8.213 ns | 1.00 |
| Copy_PassAsArgumentAndReturn_FromTask | \pr\corerun.exe | 8.236 ns | 0.1016 ns | 0.0950 ns | 8.199 ns | 8.142 ns | 8.423 ns | 1.01 |
| Copy_PassAsArgumentAndReturn_FromValueTaskSource | \main\corerun.exe | 12.425 ns | 0.1548 ns | 0.1448 ns | 12.410 ns | 12.264 ns | 12.664 ns | 1.00 |
| Copy_PassAsArgumentAndReturn_FromValueTaskSource | \pr\corerun.exe | 12.650 ns | 0.0285 ns | 0.0253 ns | 12.651 ns | 12.612 ns | 12.690 ns | 1.02 |
| CreateAndAwait_FromCompletedValueTaskSource_ConfigureAwait | \main\corerun.exe | 16.194 ns | 0.1986 ns | 0.1858 ns | 16.182 ns | 15.943 ns | 16.477 ns | 1.00 |
| CreateAndAwait_FromCompletedValueTaskSource_ConfigureAwait | \pr\corerun.exe | 14.792 ns | 0.1161 ns | 0.1086 ns | 14.767 ns | 14.663 ns | 15.080 ns | 0.91 |
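
As a sanity check on the Ratio column, the values above are consistent with the PR mean divided by the main mean, rounded to two decimals. The sketch below is illustrative only: `RoundedRatio` is a hypothetical helper, not part of dotnet/performance, and the benchmark harness may derive the ratio from the full measurement distribution, so the last digit could differ in other runs.

```cpp
#include <cmath>

// Ratio of the PR mean to the baseline (main) mean, rounded to two decimals
// to match how the table prints it. Both inputs are "Mean" values in ns.
// Hypothetical helper for checking the table by hand.
double RoundedRatio(double prMeanNs, double mainMeanNs)
{
    return std::round(prMeanNs / mainMeanNs * 100.0) / 100.0;
}
```

For example, `RoundedRatio(8.318, 12.010)` reproduces the 0.69 reported for Await_FromResult, and `RoundedRatio(14.331, 13.822)` reproduces the 1.04 regression reported for CreateAndAwait_FromCompletedValueTaskSource.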

Contributor

Copilot AI left a comment

Pull request overview

This PR refactors the runtime-async await helper pipeline to enable more inlining (when all awaits are “tail awaits”) and to reduce allocations by using cached, hand-rolled continuations for Task / ValueTaskSource paths. It also updates async thunk IL generation (CoreCLR + tool) and adjusts the JIT importer to support inlining for tail-await-only async callees and to inherit async context args from the inliner when needed.

Changes:

  • Rewrites AsyncHelpers.Await* helpers into a tail-await shape and introduces cached TaskContinuation / ValueTaskSourceContinuation continuations to reduce continuation allocations.
  • Updates CoreCLR async thunk emission to tail-await TransparentAwait* helpers (including new generic TransparentAwaitOfT) and mirrors the change in the managed type-system stub generator.
  • Updates the JIT importer to (a) allow inlining async callees when awaits are marked tail-await, and (b) propagate async context args from the inliner into inlined async calls.
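
The idea of a cached continuation can be sketched generically as a single-slot object cache: return the object after use and hand the same instance out on the next await instead of allocating again. The C++ sketch below is only an illustration of that pattern under stated assumptions. `Continuation`, `ContinuationCache`, `Rent`, and `Return` are hypothetical names, and the PR's actual `TaskContinuation` / `ValueTaskSourceContinuation` caching policy may differ.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for a continuation object; not the runtime's type.
struct Continuation
{
    int resumeState = 0;  // where to resume once the awaited operation completes
    int allocationId = 0; // illustration only: identifies the underlying allocation
};

// Single-slot cache (an assumption; the real caching policy is not shown in the PR text).
class ContinuationCache
{
public:
    // Reuse the cached instance if one is available, otherwise allocate a new one.
    std::unique_ptr<Continuation> Rent()
    {
        if (m_cached)
        {
            m_cached->resumeState = 0;  // reset per-await state before reuse
            return std::move(m_cached); // leaves the cache slot empty
        }
        ++m_allocations;
        auto fresh = std::make_unique<Continuation>();
        fresh->allocationId = m_allocations;
        return fresh;
    }

    // Hand a finished continuation back so that the next Rent() can reuse it.
    void Return(std::unique_ptr<Continuation> c) { m_cached = std::move(c); }

    // Number of real allocations performed so far.
    int Allocations() const { return m_allocations; }

private:
    std::unique_ptr<Continuation> m_cached;
    int m_allocations = 0;
};
```

With this shape, a rent/return/rent sequence performs one allocation rather than two, which is the kind of saving the overview attributes to the cached continuations.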

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.

| File | Description |
|---|---|
| src/tests/async/valuetask-source/valuetask-source.cs | Adds debug output (currently noisy) while validating scheduling-context flags in a ValueTaskSource test. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/TaskAwaiter.cs | Makes `ConfiguredTaskAwaiter<TResult>.m_task` internal to enable fast-path access from AsyncHelpers. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/ConfiguredValueTaskAwaitable.cs | Makes configured ValueTask awaitable backing fields internal for fast-path access. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.cs | Rewrites await helpers to tail-await and route through optimized AwaitTask / AwaitValueTaskSource paths. |
| src/coreclr/vm/corelib.h | Adds binder entry for AsyncHelpers.TransparentAwaitOfT. |
| src/coreclr/vm/asyncthunks.cpp | Emits thunk IL that tail-awaits TransparentAwait* and returns immediately (incl. generic `Task<T>` case). |
| src/coreclr/tools/Common/TypeSystem/IL/Stubs/AsyncThunks.cs | Mirrors thunk IL updates in the managed stub generator (adds ret after transparent await). |
| src/coreclr/System.Private.CoreLib/System.Private.CoreLib.csproj | Switches async helper continuation sources to new TaskContinuation and ValueTaskSourceContinuation files; minor formatting tweaks. |
| src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncProfiler.CoreCLR.cs | Updates comment text to reflect new continuation type naming. |
| src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.ValueTaskSourceContinuation.cs | Renames/specializes continuation for IValueTaskSource* only and simplifies result handling accordingly. |
| src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.TaskContinuation.cs | Introduces cached continuation that can also perform context-queueing before dispatching the runtime-async task. |
| src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs | Adds AwaitTask* helpers, splits ValueTask handling between Task vs ValueTaskSource, and updates dispatch logic to use new continuations/caches. |
| src/coreclr/nativeaot/System.Private.CoreLib/src/System.Private.CoreLib.csproj | Includes the new continuation source files for NativeAOT CoreLib build. |
| src/coreclr/jit/importercalls.cpp | Enables inlining for tail-await-only async callees; adds async-context inheritance for inlined async calls; renames ldvirtftn helper. |
| src/coreclr/jit/compiler.h | Updates declarations for renamed async-arg insertion helper and adds async-context inheritance helper. |
| src/coreclr/debug/di/rsstackwalk.cpp | Updates comment text to reflect new continuation type naming. |

// - in optimized (as in JitOptimizeAwait=1, which is the default) case
// we do not distinguish "false" config vs. having default/no scheduling context,
// and "None" is passed in either case.
System.Console.WriteLine(sUseContext.trace);

Comment on lines +7056 to +7066
+//------------------------------------------------------------------------
+// impAddAsyncArgsToInlinedCall:
+//   Inherit async args from inlining call as part of a new async call.
+//
+// Arguments:
+//   call - The async call
+//
+// Remarks:
+//   Currently we only allow inlining of async calls when all awaits are tail
+//   awaits. In that case inlining is simplified as we can just inherit
+//   everything from the inlining call.

Comment on lines 7100 to +7112
@@ -7050,7 +7109,7 @@ void Compiler::impSetupAsyncCall(GenTreeCall* call, OPCODE opcode, unsigned pref
 // Should be called before the 'this' arg is inserted, but after other IL args
 // have been inserted.
 //
-void Compiler::impInsertAsyncContinuationForLdvirtftnCall(GenTreeCall* call)
+void Compiler::impInsertAsyncArgsForLdvirtftnCall(GenTreeCall* call)

Debug.Assert(continuationContext is TaskScheduler { });
TaskScheduler sched = (TaskScheduler)continuationContext;

// TODO: We do not need TaskSchedulerAwaitTaskContinuation here, just need to refactor its Run method...

Comment on lines +95 to +96
+var taskSchedCont = new TaskSchedulerAwaitTaskContinuation(sched, (Action)RuntimeAsyncTask.m_action!, flowExecutionContext: false);
+taskSchedCont.Run(Task.CompletedTask, canInlineContinuationTask: true);

Comment on lines +65 to 73
+[MethodImpl(MethodImplOptions.Async | MethodImplOptions.AggressiveInlining)]
 [StackTraceHidden]
 public static T Await<T>(Task<T> task)
 {
-    TaskAwaiter<T> awaiter = task.GetAwaiter();
-    if (!awaiter.IsCompleted)
+    if (!task.IsCompleted)
     {
-        UnsafeAwaitAwaiter(awaiter);
+        TailAwait();
+        return AwaitTask(task, ConfigureAwaitOptions.ContinueOnCapturedContext);
     }
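
The shape of the rewritten helper, a completed-task fast path followed by a slow-path call in tail position, can be modeled outside the runtime. The C++ sketch below is a synchronous analogy only: `FakeTask`, `AwaitSlow`, and the `fallback` parameter are hypothetical, nothing here suspends, and it does not reproduce `AsyncHelpers.TailAwait` or `AwaitTask`. It shows why the shape is friendly to inlining: after the slow-path call the function does no further work, so there is no caller state to preserve across it.

```cpp
#include <optional>

// Hypothetical stand-in for Task<T>: holds a value once it has completed.
template <typename T>
struct FakeTask
{
    std::optional<T> result;
    bool IsCompleted() const { return result.has_value(); }
};

// Slow-path stand-in. In the PR this is where suspension would occur; here it
// simply returns a caller-supplied fallback so that the sketch stays synchronous.
template <typename T>
T AwaitSlow(const FakeTask<T>& task, T fallback)
{
    (void)task; // unused in this placeholder
    return fallback;
}

// Fast path first; the slow-path call is the last thing the function does.
template <typename T>
T Await(const FakeTask<T>& task, T fallback)
{
    if (!task.IsCompleted())
    {
        return AwaitSlow(task, fallback); // tail position: no work after the call
    }
    return *task.result; // completed: read the result directly
}
```

A completed `FakeTask<int>{42}` returns 42 through the fast path, while an empty one is routed to `AwaitSlow` and yields the fallback.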
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @dotnet/area-system-threading-tasks
See info in area-owners.md if you want to be subscribed.
