Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce async overhead #2111

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

timcassell
Copy link
Collaborator

@timcassell timcassell commented Sep 20, 2022

Followup to #2108, (redo of #1968 with more tests).

Refactored delegates to reduce async measurement overhead (sync measurements are the same).

Master:

|                      Method |          Mean |      Error |     StdDev |  Gen 0 | Allocated |
|---------------------------- |--------------:|-----------:|-----------:|-------:|----------:|
|          AsyncYieldTaskVoid | 1,777.6317 ns | 11.4813 ns | 10.1779 ns | 0.0381 |     120 B |
|      AsyncImmediateTaskVoid |     2.3197 ns |  0.0294 ns |  0.0246 ns |      - |         - |
|     AsyncYieldValueTaskVoid | 1,852.0409 ns |  2.9352 ns |  2.7456 ns | 0.0381 |     120 B |
| AsyncImmediateValueTaskVoid |    15.3422 ns |  0.1608 ns |  0.1505 ns |      - |         - |
|                    SyncVoid |     0.1591 ns |  0.0042 ns |  0.0037 ns |      - |         - |
|           AsyncYieldTaskInt | 1,850.0632 ns | 13.8105 ns | 12.2426 ns | 0.0381 |     120 B |
|       AsyncImmediateTaskInt |    39.0781 ns |  0.2674 ns |  0.2371 ns | 0.0229 |      72 B |
|      AsyncYieldValueTaskInt | 1,889.4795 ns | 19.5146 ns | 18.2539 ns | 0.0401 |     128 B |
|  AsyncImmediateValueTaskInt |    41.2959 ns |  0.1417 ns |  0.1183 ns |      - |         - |
|                     SyncInt |     0.3043 ns |  0.0089 ns |  0.0083 ns |      - |         - |

This PR:

|                      Method |        Job | UnrollFactor |          Mean |      Error |      StdDev |  Gen 0 | Allocated |
|---------------------------- |----------- |------------- |--------------:|-----------:|------------:|-------:|----------:|
|                    SyncVoid | DefaultJob |           16 |     0.1627 ns |  0.0058 ns |   0.0051 ns |      - |         - |
|                     SyncInt | DefaultJob |           16 |     0.3143 ns |  0.0363 ns |   0.0322 ns |      - |         - |
|          AsyncYieldTaskVoid | Job-CXHPYP |            1 | 1,461.6091 ns | 80.2756 ns | 235.4344 ns | 0.0381 |     120 B |
|      AsyncImmediateTaskVoid | Job-CXHPYP |            1 |     4.4395 ns |  0.0255 ns |   0.0238 ns |      - |         - |
|     AsyncYieldValueTaskVoid | Job-CXHPYP |            1 | 1,543.3745 ns | 30.5159 ns |  57.3162 ns | 0.0381 |     120 B |
| AsyncImmediateValueTaskVoid | Job-CXHPYP |            1 |    11.1546 ns |  0.0417 ns |   0.0390 ns |      - |         - |
|           AsyncYieldTaskInt | Job-CXHPYP |            1 | 1,597.0157 ns | 15.7257 ns |  13.9404 ns | 0.0381 |     120 B |
|       AsyncImmediateTaskInt | Job-CXHPYP |            1 |    43.2262 ns |  0.1489 ns |   0.1393 ns | 0.0229 |      72 B |
|      AsyncYieldValueTaskInt | Job-CXHPYP |            1 | 1,576.6878 ns | 30.8910 ns |  53.2854 ns | 0.0401 |     128 B |
|  AsyncImmediateValueTaskInt | Job-CXHPYP |            1 |    33.3715 ns |  0.2657 ns |   0.2356 ns |      - |         - |

Merging this PR will also make it possible to add async engine support in the future.

Also added IterationSetup/Cleanup async support (As part of the refactor, it was easier to add the support than not to (code re-use). I thought it could be useful, so I added tests for it also).

@timcassell
Copy link
Collaborator Author

timcassell commented Oct 19, 2022

@YegorStepanov Moving the conversation here since it's off topic for #2114.

Since you are writing a library for Unity, I thought it was related to Unity.

Anyway, we need to support taskable object to support UniTask and UnityEngine.Awaitable/UnityTask without converting them to Task.

To do this, we need to slightly modify your AwaitHelper

Aye, I thought about supporting awaitable types other than Task and ValueTask, but it's hard to do because unfortunately there is no awaitable interface that awaitables must implement to build off of (C# uses duck typing for awaitables). It's easy to do in code-gen and IL Emit, but how could that be done in InProcessNoEmitToolchain? How could I add a method in AwaitHelper to accept an arbitrary awaitable? The only solution I could think of was to add an awaitable interface that users can implement for a wrapper to forward to their custom awaitable type. But that's pretty ugly. [Edit] Technically, if a custom awaitable is convertible to a ValueTask (like UniTask), then it's already doing that (though not exactly).

And none of that has anything to do with the runtime, that's all compiler/dynamic code/DI stuff.

@YegorStepanov
Copy link
Contributor

How could I add a method in AwaitHelper to accept an arbitrary awaitable?

Like I did #2114:

  1. it should accept object
  2. via reflection we get awaitable methods (by name + signature)
  3. we need to await it...
    If I'm not stupid now, we need to write similar code that the await generates, but for object....

Let InProcessNoEmitToolchain not work with this (it already doesn't work with Arguments). It's not very important.

@timcassell
Copy link
Collaborator Author

timcassell commented Oct 19, 2022

  1. it should accept object

Eh... that's 2 boxes (UniTask and its awaiter are structs), and a 3rd box if the result is a struct. I suppose that's acceptable as long as it's not done in the workload method.

  1. via reflection we get awaitable methods (by name + signature)

This would not work for awaitable extensions.

  1. we need to await it...
    If I'm not stupid now, we need to write similar code that the await generates, but for object....

Do-able, but still pretty heavy (I mentioned boxing above).

Technically, once this PR is merged, there will be no need to add custom awaitable support for GetResult, as only the engine uses it (and once the engine goes full async, that will no longer be needed, either), unless it's really needed for testing purposes. We could add that overload for ToValueTaskVoid to make it easier to implement async MyCustomAwaitable GlobalSetup etc instead of full code-gen, though it will be a bit heavier (acceptable as long as it's not workload method)... Or we could only allow (Value)Task for GlobalSetup etc methods.

Let InProcessNoEmitToolchain not work with this (it already doesn't work with Arguments). It's not very important.

I mean, if everyone is okay with that, I'd be willing to do the work to add it to the toolchains that it can work in. But awaitable extensions still won't work, unless we provide a way for the user to tell us what type the extensions are in.

And anyway, neither of those toolchains are usable in Unity's IL2CPP currently. We'd need a new Unity toolchain (#1860), or a new source-gen toolchain (#1770).

@YegorStepanov
Copy link
Contributor

Eh... that's 2 boxes (UniTask and its awaiter are structs), and a 3rd box if the result is a struct. I suppose that's acceptable as long as it's not done in the workload method.

We can calculate the boxing size and subtract it from the result.

But awaitable extensions still won't work, unless we provide a way for the user to tell us what type the extensions are in.

I forgot about it. I remembered how someone made bool awaitable.
Can we enumerate all assemblies and find extensions methods?

We need InProcessNoEmitToolchain only for iOS?

SourceGeneratorToolchain will also fix the missing Arguments for InProcessNoEmitToolchain.

@timcassell
Copy link
Collaborator Author

timcassell commented Oct 19, 2022

We can calculate the boxing size and subtract it from the result.

We can't calculate the extra time that the boxing and reflection will consume, though. But that shouldn't matter anyway if we're not doing that for workload methods (we don't measure setup/cleanup methods in the diagnosers).

I forgot about it. I remembered how someone made bool awaitable. Can we enumerate all assemblies and find extensions methods?

What if there are multiple extensions for the same type? We might pick the wrong one. Also, user may not want a return value to be awaited, only consumed normally (like your bool example).

We need InProcessNoEmitToolchain only for iOS?

IL2CPP is used on many platforms. Required on WebGL, optional on WindowsStandalone.

SourceGeneratorToolchain will also fix the missing Arguments for InProcessNoEmitToolchain.

Aye

@YegorStepanov
Copy link
Contributor

What if there are multiple extensions for the same type?

What is a niche case that needs to be handled.
It requires for SourceGenToolchain too.
The user will have to pass the type to the BDN via attribute or config.

We need InProcessNoEmitToolchain only for iOS?

IL2CPP is used on many platforms. Required on WebGL, optional on WindowsStandalone.

I'm not talking about Unity. InProcessNoEmitToolchain looks outdated. I wonder why it wasn't removed.
@adamsitnik What are the use cases for InProcessNoEmitToolchain? Xamarin.IOS only?
Can SourceGenToolchain(#1770) be used for MAUI/Xamarin.IOS?

@timcassell
Copy link
Collaborator Author

timcassell commented Oct 20, 2022

If we do awaitable interface forwarding, it could look something like this:

Details

public interface IForwardAwaitable<TAwaitable, TAwaiter>
{
    TAwaiter GetAwaiter(ref TAwaitable awaitable);
    bool IsCompleted(ref TAwaiter awaiter);
    void OnCompleted(ref TAwaiter awaiter, Action continuation);
    void GetResult(ref TAwaiter awaiter);
}

[AttributeUsage(AttributeTargets.Method, AllowMultiple = false)]
public class AwaitReturnAttribute : Attribute
{
    public Type ForwardAwaitableType { get; }

    public AwaitReturnAttribute(Type forwardAwaitableType)
    {
        ForwardAwaitableType = forwardAwaitableType;
    }
}

With example:

public struct UniTaskForwardAwait : IForwardAwaitable<UniTask, UniTask.Awaiter>
{
    public UniTask.Awaiter GetAwaiter(ref UniTask awaitable) => awaitable.GetAwaiter();
    public void GetResult(ref UniTask.Awaiter awaiter) => awaiter.GetResult();
    public bool IsCompleted(ref UniTask.Awaiter awaiter) => awaiter.IsCompleted;
    public void OnCompleted(ref UniTask.Awaiter awaiter, Action continuation) => awaiter.UnsafeOnCompleted(continuation);
}

public struct UniTaskForwardAwait<T> : IForwardAwaitable<UniTask<T>, UniTask<T>.Awaiter>
{
    public UniTask<T>.Awaiter GetAwaiter(ref UniTask<T> awaitable) => awaitable.GetAwaiter();
    public void GetResult(ref UniTask<T>.Awaiter awaiter) => awaiter.GetResult();
    public bool IsCompleted(ref UniTask<T>.Awaiter awaiter) => awaiter.IsCompleted;
    public void OnCompleted(ref UniTask<T>.Awaiter awaiter, Action continuation) => awaiter.UnsafeOnCompleted(continuation);
}

public class AsyncBenchmark
{
    [Benchmark]
    [AwaitReturn(typeof(UniTaskForwardAwait))]
    public async UniTask AsyncUniTask()
    {
        // ...
    }

    [Benchmark]
    [AwaitReturn(typeof(UniTaskForwardAwait<int>))]
    public async UniTask<int> AsyncUniTaskInt()
    {
        // ...
    }

    [Benchmark]
    public async Task AsyncTask()
    {
        // ...
    }

    [Benchmark]
    public async Task<int> AsyncTaskInt()
    {
        // ...
    }
}

[Edit] Or a slightly simpler form:

public interface IForwardAwaitable<TAwaitable>
{
    void GetAndStoreAwaiter(ref TAwaitable awaitable);
    bool IsCompleted { get; }
    void OnCompleted(Action continuation);
    void GetResult();
}
public struct UniTaskForwardAwait : IForwardAwaitable<UniTask>
{
    private UniTask.Awaiter awaiter;

    public void GetAndStoreAwaiter(ref UniTask awaitable) => awaiter = awaitable.GetAwaiter();
    public bool IsCompleted => awaiter.IsCompleted;
    public void GetResult() => awaiter.GetResult();
    public void OnCompleted(Action continuation) => awaiter.UnsafeOnCompleted(continuation);
}

This would work for extensions and InProcessNoEmitToolchain (and no boxing!). It just requires a bit more work from the user.

# Conflicts:
#	tests/BenchmarkDotNet.IntegrationTests/InProcessTest.cs
…ead-new

# Conflicts:
#	tests/BenchmarkDotNet.IntegrationTests/InProcessTest.cs
@YegorStepanov
Copy link
Contributor

It seems to me that your solution can do a source generator (this is not related to SourceGenToolchain, because I don't understand what it exactly will do).

When the user writes:

[Benchmark] public UniTask Method();

SourceGenerator generates your adapter.

@timcassell
Copy link
Collaborator Author

timcassell commented Oct 21, 2022

It seems to me that your solution can do a source generator (this is not related to SourceGenToolchain, because I don't understand what it exactly will do).

When the user writes:

[Benchmark] public UniTask Method();

SourceGenerator generates your adapter.

Well yeah, like I said it's easy for source gen and IL emit, but that would be required by the user for InProcessNoEmitToolchain and it would also be required if they wanted to use await extensions. And I think it would be weird to sometimes require it and sometimes not.

@YegorStepanov
Copy link
Contributor

Anyway, SourceGenerators doesn't support F#/VB, so your attribute should be available.

I don't think await extensions are really being used. Especially if there are multiple extensions.

Such cases can be resolved by an analyser or validator:

BDN cannot find await methods, please specify them with the AwaitReturnAttribute.

@timcassell
Copy link
Collaborator Author

Well, would you like to open a new issue for it? I don't want to include it in this PR. I can do a followup PR for it if BDN authorities say the proposal is good.

@YegorStepanov
Copy link
Contributor

Oh, it's just discussion, not a suggestion for this PR.

There is currently no need for such support for async.
When Unity support gets closer, we can get back to the "taskable" problem

# Conflicts:
#	src/BenchmarkDotNet/Engines/IEngine.cs
#	src/BenchmarkDotNet/Toolchains/InProcess.Emit.Implementation/ConsumableTypeInfo.cs
#	src/BenchmarkDotNet/Toolchains/InProcess.Emit.Implementation/Emitters/RunnableEmitter.cs
# Conflicts:
#	src/BenchmarkDotNet/Toolchains/InProcess/BenchmarkAction.cs
#	src/BenchmarkDotNet/Toolchains/InProcess/BenchmarkActionFactory.cs
#	src/BenchmarkDotNet/Toolchains/InProcess/BenchmarkActionFactory_Implementations.cs
#	src/BenchmarkDotNet/Toolchains/InProcess/Emit/Implementation/Emitters/TaskConsumeEmitter.cs
#	src/BenchmarkDotNet/Toolchains/InProcess/InProcessRunner.cs
#	tests/BenchmarkDotNet.IntegrationTests/InProcessTest.cs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Please, add async support for iteration setup and cleanup
2 participants