Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for async/await #12

Open
3 of 4 tasks
Miista opened this issue Jan 13, 2024 · 31 comments · May be fixed by #29
Open
3 of 4 tasks

Add support for async/await #12

Miista opened this issue Jan 13, 2024 · 31 comments · May be fixed by #29
Assignees
Labels
enhancement New feature or request priority: high The issue has high priority
Milestone

Comments

@Miista
Copy link
Owner

Miista commented Jan 13, 2024

Plan:

Before we start working on this issue, though, I want to get the solution working first.

Related:

Blocked by:

@Miista Miista added enhancement New feature or request priority: medium The issue has medium priority blocked This issue is blocked by one or more issues labels Jan 13, 2024
@Miista Miista mentioned this issue Jan 14, 2024
8 tasks
@Miista
Copy link
Owner Author

Miista commented Jan 14, 2024

For what it's worth I no longer get an "invalid program" when I try to shim async methods. The program compiles and runs fine. The shim, however, is never invoked.

I suspect this might be (directly) related to how the compiler translates async methods into state machines. Thus, the original method (as it was in code) does not exist in the same way in the compiled code. The method does exist. Its body, however, has been moved to a state machine, emitted by the compiler, and the method merely starts the state machine.

When stepping through the code, I can see that we actively rewrite the code for AsyncTaskMethodBuilder. This means that we at least get to that level which means that we should at least be able to also rewrite the intended method.

Related resource: https://devblogs.microsoft.com/premier-developer/extending-the-async-methods-in-c/

I tried asking ChatGPT how best to solve this issue: https://chat.openai.com/share/6f30f7a9-18d6-4b68-9d20-3f557fb5d7dd

Miista pushed a commit that referenced this issue Jan 15, 2024
Miista pushed a commit that referenced this issue Jan 17, 2024
@Miista
Copy link
Owner Author

Miista commented Jan 20, 2024

I believe we need to rewrite the state machine at runtime. That is, the state machine that is emitted by the compiler.

Currently, we cannot get to that point because we do not rewrite methods on types in System.Private.CoreLib. We would need to change that so that we allow rewriting methods on AsyncMethodBuilderCore–which is the type that provides the implementation of Start.

Alternatively we could rewrite the call to SetStateMachine. However, this comes with the same change as above. The SetStateMachine method is legacy from a time where implementations of IAsyncStateMachine needed to be boxed, and the method is not called from anywhere in the async machinery code.

Please note that I have only attempted this with AsyncTaskMethodBuilder. I will need to make sure that this works with all method builders.

After some more testing (see comment below) I believe we can make do with just rewriting the Start method on AsyncMethodBuilderCore. We would need to add some more IL than what is normally present. Specifically we need to:

  1. Find the MoveNext method on the current instance of IAsyncStateMachine
  2. Rewrite it (note that the state machine may hold the call to the actual method we want to replace)
  3. Invoke it: We can do that as follows: methodBase.Invoke(stateMachine, new object[]{stateMachine}) Create a delegate from the rewritten method: We can do that as follows: methodBase.CreateDelegate(typeof(Action))
  4. Invoke it: delegate.DynamicInvoke()

Rewriting the state machine is necessary because we normally only rewrite the methods (and their calls) from the entry point. However, when we add async into the mix, the methods cannot be found in the call tree of the entry point. This is due to how async methods are compiled.

That said, rewriting the state machine enables us to follow the call tree which results from the async method calls.

Please note that AsyncMethodBuilderCore is internal to System.Private.CoreLib, meaning we can't shim it as we normally would (using Shim.Replace). Maybe this plays into #11.

Alternative

An alternative to the above would be to provide our own implementation of AsyncTaskMethodBuilder. That would mean we would have to reimplement (we can borrow from the existing implementation) for each target framework where they differ.

If we do this, it should be enough to rewrite the method in Start.

@Miista
Copy link
Owner Author

Miista commented Jan 20, 2024

I got a minor breakthrough today.

I've been able to intercept the call if I supply my own instance of AsyncTaskMethodBuilder.

There are, however, still a lot of intricacies I need to iron out. For instance, the method won't return.

@Miista Miista added priority: high The issue has high priority and removed priority: medium The issue has medium priority labels Jan 20, 2024
@Miista
Copy link
Owner Author

Miista commented Jan 21, 2024

Update on the findings from here: #12 (comment)

When I attempt to rewrite the method I am unable to properly execute the (resulting) program. At this point, I think I need to go back to the basics; I need to get the IL for the compiled assembly and then rewrite the necessary parts manually by hand to get a feeling for the IL code. Only then can I know what IL to emit.

Also mentioned in the previous comment was the need for shimming the Start method on AsyncMethodBuilderCore. My attempts went in a different direction. Instead of shimming the method I updated MethodRewriter and Stubs to be able to handle (unsuccessfully as evident from the above) async methods. The idea is that we simply rewrite the method immediately. "Rewriting" in the sense that we actually only rewrite the IL just before the call to MoveNext. Here we attempt to add the necessary IL from the previous comment. After that we skip the instruction to call the original state machine. Instead we call our rewritten state machine.

@Miista
Copy link
Owner Author

Miista commented Jan 22, 2024

Following is the IL for the Sandbox app: Sandbox.txt

What we want to do now is inject just enough IL to:

  1. Load the state machine onto the stack
  2. Get the MoveNext method from the state machine
  3. Rewrite the MoveNext method
  4. Invoke the rewritten method

The next concern is what we should do with the remaining instructions. Should we still emit them? Does it make sense?

@Miista
Copy link
Owner Author

Miista commented Jan 22, 2024

During my attempts to rewrite the state machine I've run into the issue that the rewrites complains that the MoveNext method does not contain a body. This is by definition correct as the method it attempts to rewrite belongs to an interface.

However, I think it could be fixed by simply emitting the call as-is without attempting to rewrite it.

After all, the MoveNext method for the state machine contains all of the code for the async method. That is, the entire body is contained within the MoveNext method. The body is divided into chunks delimited by calls to await i.e. every time there's an await, code is emitted to save the state of the state machine and to resume it later.

I believe we can add support for async methods. What we need to recognize is that the original async method does not exist in the compiled code. What exists instead is a state machine emitted by the compiler. This state machine implements IAsyncStateMachine (a requirement upheld by the compiler). The IAsyncStateMachine has the MoveNext method which contains:

  • The code for the async machinery, and
  • The code for the original async method (as it is defined in the source code)

To support async methods we need to recognize this fact. This means that we should not look for the method by the name given in the expression to Shim.Replace (as this name won't exist in the compiled IL). Rather we should look for the MoveNext method on the emitted state machine. Note that will require some genius code as we cannot know the names of the emitted state machines beforehand (as this is decided upon by the compiler at compile time).

Assuming that we come up with a way to correctly recognize the MoveNext method on the correct state machine, I believe it's really just a matter of emitting the code for the replacement method in the correct places when rewriting the MoveNext method.

We can fix this in the following ways:

  1. We can forward to the last label in the MoveNext method. This requires that we locate that label, ideally before we emit the first instruction. Actually, there's a simpler solution. We simply set the state as to reflect that the state machine is completed. This should make the state machine itself jump to the label which makes it return. Will need to look up this information in Jon Skeet's C# book.
  2. We emit our own state machine. This is a huge undertaking—one which I am not comfortable to undertake alone.

I think option 1 is feasible. We could locate the label by going backwards through the instructions.

Of course, this is complicated by the fact that we still need to honor all the states of the emitted state machine. In other words, if the original method contains two awaits, then the emitted state machine will have at least two states. If the replacement method does not have two awaits, then we do not honor the states, and I'm afraid the state machine won't finish or execute correctly.

I see the following solutions to this:

  1. No awaits: We make it a requirement that the replacement method does not have any awaits. This means that we won't have to delve into any of the async machinery. It is going to feel like an arbitrary requirement from the consumer's point of view. I'm not a fan of this.
  2. Single await: We allow a single await in the replacement method. Just as with the first option, this is going to feel like an arbitrary requirement. As such I am not a fan of this approach either. (Even just writing it up here, it feels arbitrary. Why 1?)
  3. Unlimited awaits: We don't put an upper limit on the number of awaits in the replacement method. This feels like the most natural approach. However, it is also the most complex approach as it requires us to emit code to store the state and retrieve it again later i.e. delve into the async machinery. Additionally, we will have to handle emitting new states into the MoveNext method. This includes:
    1. Marking an appropriate label, allowing the state machine to navigate to it.
    2. Emitting code up till the point of an await.
    3. Store the state of the state machine and then pause it. (We can draw inspiration from how the compiler does it.)
    4. Restore the state when the state machine is resumed. This leads us back to the first bullet.

Note that for this to work we cannot reuse the state numbers from the original method i.e. we need to use state numbers at least one higher than the highest state number in the original method.

Addendum: Come to think of it, the state machine can be found by reflection by inspecting the nested types of the type containing the async method. (Example follows below.) The name of the state machine will always be the (full?) name of the method suffixed by some characters which makes the name an invalid type name in C# (the name is only valid in IL). We could maybe get the type of the state machine this way when the user calls Shim.Replace. In this way, we could at least know which shims relate to which state machine.

Example: The following class:

public class Program
{
  public static async Task<int> GetIntAsync() => await Task.FromResult(1);
}

Will be compiled to something like the following:

public class Program
{
  public static async Task<int> GetIntAsync() => // Code to invoke state machine below

  internal class GetIntAsync()d__2 : IAsyncStateMachine
  {
    // Implementation of state machine among which we find:
    void MoveNext()
    {
      // Code for async machinery and somewhere down the line...
      Task.FromResult(1);
      // Code to set result on the state machine
    }
  }
}

Please note that I'm not entirely sure about the names and access modifiers (I'm writing this from my phone).

@Miista Miista added this to the v2.1 milestone Jan 23, 2024
Miista pushed a commit that referenced this issue Jan 24, 2024
@asukachev
Copy link

Hi, I can help with the testing, I have a lot of cases, which I can mock up.
This is a gamechanging feature, which I'd love to see. Thumbs up for this!

Miista pushed a commit that referenced this issue Jan 27, 2024
Miista pushed a commit that referenced this issue Jan 27, 2024
@Miista Miista linked a pull request Jan 27, 2024 that will close this issue
@Miista
Copy link
Owner Author

Miista commented Jan 27, 2024

@asukachev You're not going to believe it! Support for async methods is here!

Please, please, please, check out the code locally and run it through its paces on your machine!

The tests target the following frameworks:

  • .NET Core 2.0
  • .NET Core 3.0
  • .NET Core 3.1
  • .NET 4.7
  • .NET 4.8
  • .NET 6
  • .NET 7
  • .NET 8

Miista pushed a commit that referenced this issue Jan 27, 2024
@Miista
Copy link
Owner Author

Miista commented Jan 27, 2024

It turns out that the solution was simply to devirtualize the method differently if and only if the method is an async method.

Normally, we can get the method simply by calling GetMethod on the type where the method is declared. However, MoveNext cannot be found this way. I suspect this is due to it being explicitly implemented.

Whatever the reason, we can find it by going through the IAsyncStateMachine interface itself. We do this by navigating via the "interface map" to the explicitly implemented MoveNext method. From here, the interface map provides us the target method on the actual type (the state machine).

An added bonus is that the replacement methods will have their own state machines already generated by the compiler. Therefore, at that point we're really just moving the pointer from original state machine to the replacement state machine.

This gives us the best solution I could hope for as:

  1. We are able to provide unlimited number of awaits in the replacement method.
  2. We don't have to handle the async machinery.

Rationale

The reason that it async methods work is that we now actively rewrite the MoveNext method. We've always rewritten the entry point (that is, the method passed to PoseContext.Isolate). What is new is that we now resolve the MoveNext method correctly and that we therefore now are able to rewrite it correctly.

Resolving the MoveNext method correctly means that we actively find the state machine's implementation of MoveNext and rewrite that.

Miista pushed a commit that referenced this issue Jan 27, 2024
@Miista Miista removed the blocked This issue is blocked by one or more issues label Jan 27, 2024
@Miista Miista self-assigned this Jan 27, 2024
@asukachev
Copy link

    public static void SyncContext(String st)
    {
        Console.WriteLine("Started StaticMethod()");

        int value = st.GetCount();

        Console.WriteLine("Value produced: " + value);

        Thread.Sleep(1000);

        Console.WriteLine("Completed StaticMethod()");
    }
    static void Main()
    {

        Shim extensionShim = Shim.Replace(() => TestExtensions.GetCount(Is.A<string>())).With((string str) => 2);
        Shim contextShim = Shim.Replace(() => Program.SyncContext(Is.A<string>())).With((string s) =>
        {
            Console.WriteLine("Did nothing");
        });

        PoseContext.Isolate(() =>
        {
            String str = "qwerwq";
            SyncContext(str);

        }, contextShim, extensionShim);

        Console.WriteLine("Finished");
    }

This works 100% fine.

Both method and property are replaced.

@Miista
Copy link
Owner Author

Miista commented Feb 5, 2024

@asukachev You should be able to run the async method in the sync context as follows:

    public async static Task AsyncContext(String st)
    {
        Console.WriteLine("Started StaticMethod()");

        int value = st.GetCount();

        Console.WriteLine("Value produced: " + value);

        await Task.Delay(1000);

        Console.WriteLine("Completed StaticMethod()");
    }

    async static Task Main()
    {
        Shim extensionShim = Shim.Replace(() => TestExtensions.GetCount(Is.A<string>())).With((string str) => 2);

        PoseContext.Isolate(() =>
        {
            String str = "qwerwq";
            AsyncContext(str).GetAwaiter().GetResult();
        }, extensionShim);

        Console.WriteLine("Finished");
    }

Although I see now that you are shimming an extension method. While shimming an extension method in an async context should be supported, I haven't actually verified it. That said, I don't see why it wouldn't work.

The examples in the README use the async context, though I have only gotten that to work from an async Main. I haven't been able to shim using the async context using the testing framework. Please refer to: #12 (comment)

@asukachev
Copy link

asukachev commented Feb 5, 2024

That's great!

It is able to replace the method indeed.
Still, sadly, unable to shim extension method when it is used in async context.
From this example, string extension method just returns string.Length, but I want to hard replace it to return 2. Using this code fails to shim the method.

    public async static Task AsyncContext(String st)
    {
        Console.WriteLine("Started StaticMethod()");

        int value = st.GetCount();

        Console.WriteLine("Value produced: " + value);

        await Task.Delay(1000);

        Console.WriteLine("Completed StaticMethod()");
    }

    static void Main()
    {

        Shim extensionShim = Shim.Replace(() => TestExtensions.GetCount(Is.A<string>())).With((string str) => 2);

        PoseContext.Isolate(() =>
        {
            String str = "qwerwq";
            AsyncContext(str).GetAwaiter().GetResult();

        }, extensionShim);

        Console.WriteLine("Finished");
    }

@Miista
Copy link
Owner Author

Miista commented Feb 5, 2024

@asukachev Try this:

    public async static Task AsyncContext(String st)
    {
        Console.WriteLine("Started StaticMethod()");

        int value = st.GetCount();

        Console.WriteLine("Value produced: " + value);

        await Task.Delay(1000);

        Console.WriteLine("Completed StaticMethod()");
    }

    static void Main()
    {

        Shim extensionShim = Shim.Replace(() => TestExtensions.GetCount(Is.A<string>())).With(delegate (string str) { return 2; });

        PoseContext.Isolate(() =>
        {
            String str = "qwerwq";
            AsyncContext(str).GetAwaiter().GetResult();

        }, extensionShim);

        Console.WriteLine("Finished");
    }

Ref: #27 (comment)

@asukachev
Copy link

I assume it must work, but it didn't for me.
Just in case, this is my environment:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net7.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
  </PropertyGroup>

@Miista
Copy link
Owner Author

Miista commented Feb 5, 2024

@asukachev I will look into it later today. Thank you for providing your environment.

UPDATE 2024-02-05: I was able to get everything working with the following two changes:

  1. Target framework: The shim was never invoked when targeting net7.0 so I tried targeting netcoreapp2.0 which worked. There seems to be a difference in how async state machines are invoked between netcoreapp2.0 and net7.0.
  2. Reorder operations: I had to move st.GetCount() before the call to Task.Delay. That is, the original code (targeting netcoreapp2.0) would execute but the extension method would be invoked two times instead of just one.

That leads me to believe there are the following issues we must address:

  1. Async methods seem to be compiled (or at least started) differently in .NET 7 (and possibly earlier). My gut feeling is that this is simply a case of us not intercepting the method correctly (e.g. direct call vs. virtual call). Additionally, when targeting net7.0 I experienced that the code could not determine that the actual state machine had the CompilerGeneratedAttribute. The call to GetCustomAttribute<ConpilerGeneratedAttribute> simply returned null.
  2. We are hooking into the async state machine too late apparently. At least, it seems that code before the first await executes twice (more if there are more awaits?).

It seems that async methods are compiled (or at least started) differently in .NET 7. I've compiled a diff (link: https://www.diffchecker.com/fLB4MLdp/) of the output from running the code below against .NET Core 2.0 and .NET 7. Have a look at line 132 (image below).

Perhaps this issue is present in targets lower than .NET 7. Will need to investigate further.
In the IL for .NET Core 2.0 I'm seeing calls to box Pose.Sandbox.Program+<AsyncContext>d__4 which are missing in the IL for .NET 7. This seems to be the case everywhere a state machine is started.
As mentioned earlier in this comment I have a feeling this is simply a case of us not intercepting the MoveNext method correctly for net7.0. The calls to box could back up this hypothesis.

image

public async static Task<int> AsyncContext(string st)
{
    Console.WriteLine("Started StaticMethod()");

    int value = st.GetCount();

    Console.WriteLine("Value produced: " + value);

    await Task.Delay(1000);

    Console.WriteLine("Completed StaticMethod()");

    return 1;
}

public static void Main(string[] args)
{
    Shim extensionShim = Shim
        .Replace(() => TestExtensions.GetCount(Is.A<string>()))
        .With(delegate (string str) { return 2; });

    PoseContext.Isolate( () =>
    {
        var x = AsyncContext("abc").GetAwaiter().GetResult();
    }, extensionShim);

    Console.WriteLine("Finished");
}

UPDATE 2024-03-05: I've made some progress and discoveries related to this. Please see comments: #12 (comment) and #12 (comment)

@asukachev Could I ask you to try the steps described in the Workaround section in #12 (comment)?

@Miista
Copy link
Owner Author

Miista commented Feb 8, 2024

As mentioned in the above comment (#12 (comment)) the following two changes were needed to make the code work as expected:

  1. Target framework: Targeting netcoreapp2.0 makes it so that the shim is invoked. I've yet to check later targets.
  2. Reorder operations: I had to move st.GetCount() before the call to Task.Delay. Having Task.Delay as the first operation causes the method to execute twice with different results each time. I've yet to verify what happens if there are multiple awaits.

In the following I am using the shim below.

Shim
    .Replace(() => TestExtensions.GetCount(Is.A<string>()))
    .With(delegate (string str) { return 2; });

Order of operations

Call before Task.Delay

With the following order of operations:

public async static Task<int> AsyncContext(string st)
{
    Console.WriteLine("Started StaticMethod()");
    int value = st.GetCount();
    Console.WriteLine("Value produced: " + value);
    await Task.Delay(1000);
    Console.WriteLine("Completed StaticMethod()");
    return 1;
}

I get the following output;

Started StaticMethod()
Value produced: 2
Started StaticMethod()
Value produced: 3
Completed StaticMethod()

Analysis

Notice that Started StaticMethod() occurs twice, and that Value produced: also occurs twice although with different results each time.

Task.Delay before call

However, with the following order of operations (Task.Delay comes before st.GetCount()):

public async static Task<int> AsyncContext(string st)
{
    Console.WriteLine("Started StaticMethod()");
    await Task.Delay(1000);
    int value = st.GetCount();
    Console.WriteLine("Value produced: " + value);
    Console.WriteLine("Completed StaticMethod()");
    return 1;
}

I get the following output;

Started StaticMethod()
Started StaticMethod()
Value produced: 2
Completed StaticMethod()

Analysis

Notice that Started StaticMethod() occurs twice, and that Value produced: also just once as it should and it has the expected result (due to the shim being invoked).
However, Started StaticMethod() still occurs twice. This leads me to believe that the rewritten state machine is not emitted correctly.

2 x await

With the following code:

public async static Task<int> AsyncContext(string st)
{
    Console.WriteLine("Started StaticMethod()");
    int value = st.GetCount();
    await Task.Delay(1000);
    Console.WriteLine("Value produced: " + value);
    await Task.Delay(1000);
    Console.WriteLine("Completed StaticMethod()");
    return 1;
}

I get the following output:

Started StaticMethod()
Started StaticMethod()
Value produced: 2
Started StaticMethod()
Completed StaticMethod()

Analysis

Occurrences:

  • Started StaticMethod(): Occurs thrice. Most likely:
    1. Before the first await
    2. When the state machine is resumed before the second await
    3. When the state machine is resumed for the last time
  • Value produced: Occurs once (as expected) with the expected result

3 x await

With the following code:

public async static Task<int> AsyncContext(string st)
{
    Console.WriteLine("Started StaticMethod()");
    await Task.Delay(1000);
    int value = st.GetCount();
    await Task.Delay(1000);
    Console.WriteLine("Value produced: " + value);
    await Task.Delay(1000);
    Console.WriteLine("Completed StaticMethod()");
    return 1;
}

I get the following output:

Started StaticMethod()
Started StaticMethod()
Started StaticMethod()
Value produced: 2
Started StaticMethod()
Completed StaticMethod()

Analysis

Occurrences:

  • Started StaticMethod(): Occurs four times. Most likely:
    1. Before the first await
    2. When the state machine is resumed before the second await
    3. When the state machine is resumed before the third await
    4. When the state machine is resumed for the last time
  • Value produced: Occurs once (as expected) with the expected result

Based on the analysis of having 2 or 3 awaits I believe we can formulate the following hypothesis:

Occurences of $P = 1+A$ where $P =$ Started StaticMethod() and $A =$ await $\blacksquare$

In other words, Started StaticMethod() will occur once as expected and then once for every await.
The question, then, is why does this happen. This is not normal behavior for an async method (or any method for that matter). If only there was some way to inspect the IL that the rewriter produces.

@Miista
Copy link
Owner Author

Miista commented Feb 8, 2024

Following is the IL emitted by the method rewriter:

Analysis

Let's get the obvious of out the way:

  • File "real" and "rewritten" are equal in length meaning they have the same number of lines. File "implementation", however, are three lines shorter than the rest so I will focus on that.

I will compare files "implementation" and "real". The difference in length seems to be caused by the following three instructions:

  • IL_006e: leave IL_01cc
  • IL_00e5: leave IL_01cc
  • IL_0167: leave.s IL_01cc

As of right now I cannot conclude anything with regards to them not being present in the implementation. Perhaps they are needed; perhaps not.

Debugging my way through Rewrite I can see that the MoveNext method indeed has 164 instructions which fits with the (correct) files having a line count of 165 (which includes the method name on the first line). This leads me to believe that we might be leaving out the three abovementioned leave instructions when we rewrite the MoveNext method.

I will proceed to step my way through Rewrite once we get to the first leave instruction.

Alas. Please enjoy the following from the predictive debugger in Rider:
image

A grayed out line means that it will not be executed. Also note the comment regarding emitting the leave instruction if and only if it's NOT being used in an exception block. This is why those three instructions are left out.

Let's see what would happen if we were to emit them anyway.

With those three leave instructions emitted I now get the following output:
image

I am now experiencing something very weird: When I execute the code normally (without a debugger attached), the shim is not called. However, when the debugger is attached (and I explicitly break in the GetIndexOfMatchingShim method), the shim is invoked. Oh well, that's an adventure for another day!

@Miista
Copy link
Owner Author

Miista commented Feb 8, 2024

Notes to self for the adventure on another day:

  • The method seems to be correctly rewritten if the call to st.GetCount() comes before the first await. The first thing should be to verify that this is still correct given that we now emit the leave instructions.

Adventure

If st.GetCount() comes before the first await, the shim is invoked as expected. However, if it comes after the first await then the shim is not invoked

Let's compare the compiler emitted IL against our emitted IL.

Following are the updated IL files:

It seems that the implementation now has two leave instructions too many. Specifically:

  • Line 147, and
  • Line 158

Leaving those out (via debugging and manually moving the stack pointer to not emit the instruction) leaves me with IL which has the expeced number of leave instructions.

Below I've highlighted just the leave instructions.

"Implementation" instruction "Real" instruction
IL_0077: leave IL_01e4 IL_006e: leave IL_01cc
IL_00f1: leave IL_01e4 IL_00e5: leave IL_01cc
IL_0176: leave IL_01e4 IL_0167: leave.s IL_01cc
IL_01ad: leave IL_01cf IL_019b: leave.s IL_01b7
IL_01ca: leave IL_01cf IL_01b5: leave.s IL_01cc

Some notes:

  • IL_01e4 is the very end of the method
  • IL_01cf updates the state of the state machine and returns

What is interesting here is that the 5th instruction jumps to a different label than the expected one. In fact, the label it jumps to will update the state of the state machine and only then return. I'm having a hard time seeing this being the issue.

UPDATE 2024-02-12: I can consistently reproduce the issue by attempting to shim anything from within an async method. The peculiar thing is that if the leave instructions are not emitted (which causes the method to always start from the beginning), then at least one of the executions will have the correct value. However, if the leave instructions are emitted, then there is just one execution (as expected) but that execution has the incorrect value.

As mentioned above the very last leave instruction points to a different label than expected. This instruction, though, is not directly emitted by us. It is emitted via EmitILForExceptionHandlers.

@Miista
Copy link
Owner Author

Miista commented Feb 21, 2024

Just to be clear: shimming async methods should work if the core logic of the replacement comes before the first await.

I know that's far from ideal. I just wanted to get it out there while I'm continuing to work on thorough support for shimming async methods.

@Miista
Copy link
Owner Author

Miista commented Mar 1, 2024

I tried trimming the example to reach a minimum example. I came up with the following:

await Task.Delay(1);
var value = TestExtensions.GetCount(st);
Console.WriteLine("Value produced: " + value);
return 1;

This outputs:

Value produced: 3

Now, look at the example below.

await Task.CompletedTask; // This task is already complete
var value = TestExtensions.GetCount(st);
Console.WriteLine("Value produced: " + value);
return 1;

The example above works and outputs:

Value produced: 2

I suspect this might be due to the task already being completed. In this case, the state machine does not halt; it simply continues to the next line of code.

This may be a hint that the issue lies elsewhere. Perhaps we need to up one level. I will need to inspect the IL where the state machine is started. Maybe the issue is with how control is yielded back to the callee.

Comparing the IL for the two above examples, I cannot see any difference (aside from the call to Task.Delay vs Task.CompletedTask). The files are attached here:

Additional files

In this section I have used the following examples:

Task.Delay

var value = TestExtensions.GetCount(st);
await Task.Delay(1);
Console.WriteLine("Value produced: " + value);
return 1;

With the following IL:

Task.CompletedTask

var value = TestExtensions.GetCount(st);
await Task.CompletedTask; // This task is already complete
Console.WriteLine("Value produced: " + value);
return 1;

With the following IL:

Analysis (skip this whole section; I was wrong!)

Content to follow Scratch that!

Looking at the emitted IL for where the state machine is started I noticed something strange. In both cases (whether utilizing Task.CompletedTask or Task.Delay, the actual IL is vastly different from the emitted IL. I'll elaborate below.

The emitted IL is 79 lines whereas the actual (and expected) IL is just 19 lines. I have attached the files.

It would seem that we don't rewrite the async method entirely as expected. But this is even more strange because I can see when debugging that we do indeed get the (by the compiler) rewritten instructions for the method when rewriting it. Please see screenshot below.

image

This must mean that it goes wrong somewhere after this point.

UPDATE 2024-03-05: Never mind! It was a mistake on my part. I had modified the tracing code to output the body for the AsyncContext method, or so I thought. I had simply forgotten to modify the code which outputted the emitted instructions.

The conclusion is that we emit the correct instructions for the AsyncContext method and thereby also the place where the state machine is started.

For good measure (and to be thorough) I ran the example code again (this time outputting the actual emitted instructions). The results are attached below.

The diff shows that there are no differences aside from the expected verbosity of the actual IL. With this I guess we are back to the fault being with how we rewrite the method.

Testing for more insight

One test we could run (which might give some more insight) would be to see whether we are actually running our rewritten code after an await. Perhaps the runtime changes back to the non-rewritten method seeing as the state machine is started before we rewrite the method.

We should be able to verify this using the following example:

Console.WriteLine("Value: 1")
await Task.Delay(1);
Console.WriteLine("Value: 2")
return 1;

Then we will rewrite this as usual except that all calls to Console.WriteLine will be hard-coded to return a different value. Not via shims but via the MethodRewriter::Rewrite method.

I have hard-coded calls to Console.WriteLine to print Hello.

And we have a winner!!

When I run the following:

Console.WriteLine("Value: 1");
await Task.CompletedTask;
Console.WriteLine("Value: 2");
return 1;

I get the following output:

Hello
Hello

However, if I run the following:

Console.WriteLine("Value: 1");
await Task.Delay(1);
Console.WriteLine("Value: 2");
return 1;

I get the following output:

Hello
Value: 2

This means that the code after the first await was either

  1. Never rewritten or
  2. It wasn't reached.

I know for a fact that 1) is not true because I can see our emitted IL. Second, if I have the breakpoints shown below only the second breakpoint will trigger. We don't hit the first breakpoint because we are in the rewritten IL. However, the second breakpoint exist in the non-rewritten portion of the code and we can therefore break at it. This is a breakthrough (forgive the pun).

image

My hypothesis is that the incorrect continuation is registered when the state machine yields due to a task not yet having completed. Whether this fault lies with the state machine or with our rewriter remains to be discovered. Follow up can be found here: #12 (comment)

It works?

Now for something extremely weird. If I follow the steps laid out in the Workaround section of #12 (comment), shimming async methods works as expected across awaits on .NET Core 2.0 (and above).

I cannot (yet) explain why.

That said, it still requires having a debugger attached.

@Miista
Copy link
Owner Author

Miista commented Mar 4, 2024

Follow up

I am following up on the findings in #12 (comment)

I will attempt to set a breakpoint in AsyncTaskMethodBuilder and see what is being passed in when AwaitUnsafeOnCompleted is called. Unfortunately, that's not so straight forward as I initially thought. For some reason, I cannot view the source when targetting .NET Core 2.0. Well, then I'll just have to do it by hand.

Given the following excerpt from the emitted IL (full file below the example) we know:

  • That the call to Console.WriteLine("Value: 1") is replaced correctly as evident by it not appearing in the output.
  • That the call to Console.WriteLine("Value: 2") is not replaced as it appears verbatim in the output.

That means that the issue lies somewhere in those ~40 lines.

We can narrow this further by noting that:

  • Calls not related to the async machinery are unlikely to affect the rewrite process.

That means that the fault has to be somewhere between label IL_0041 and IL_008c.

IL_0018: ldstr "Value: 1"
IL_0022: call Void WriteLine(System.String)
IL_0028: nop
IL_0029: ldc.i4.1
IL_002a: call System.Threading.Tasks.Task stub_call_System.Threading.Tasks.Task::Delay(Int32)
IL_002f: call System.Runtime.CompilerServices.TaskAwaiter stub_callvirt_System.Threading.Tasks.Task::GetAwaiter(System.Threading.Tasks.Task)
IL_0034: stloc.2
IL_0035: ldloca.s System.Runtime.CompilerServices.TaskAwaiter (2)
IL_0037: call Boolean stub_call_System.Runtime.CompilerServices.TaskAwaiter::get_IsCompleted(System.Runtime.CompilerServices.TaskAwaiter ByRef)
IL_003c: brtrue IL_0084
IL_0041: ldarg.0
IL_0042: ldc.i4.0
IL_0043: dup
IL_0044: stloc.0
IL_0045: stfld Int32 <>1__state
IL_004a: ldarg.0
IL_004b: ldloc.2
IL_004c: stfld System.Runtime.CompilerServices.TaskAwaiter <>u__1
IL_0051: ldarg.0
IL_0052: stloc.3
IL_0053: ldarg.0
IL_0054: ldflda System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.Int32] <>t__builder
IL_0059: ldloca.s System.Runtime.CompilerServices.TaskAwaiter (2)
IL_005b: ldloca.s Pose.Sandbox.Program+<Test>d__0 (3)
IL_005d: call Void stub_call_System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.Int32]::AwaitUnsafeOnCompleted[TaskAwaiter,<Test>d__0](System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.Int32] ByRef, System.Runtime.CompilerServices.TaskAwaiter ByRef, <Test>d__0 ByRef)
IL_0062: nop
IL_0063: leave IL_00d6
IL_0068: ldarg.0
IL_0069: ldfld System.Runtime.CompilerServices.TaskAwaiter <>u__1
IL_006e: stloc.2
IL_006f: ldarg.0
IL_0070: ldflda System.Runtime.CompilerServices.TaskAwaiter <>u__1
IL_0075: initobj System.Runtime.CompilerServices.TaskAwaiter
IL_007b: ldarg.0
IL_007c: ldc.i4.m1
IL_007d: dup
IL_007e: stloc.0
IL_007f: stfld Int32 <>1__state
IL_0084: ldloca.s System.Runtime.CompilerServices.TaskAwaiter (2)
IL_0086: call Void stub_call_System.Runtime.CompilerServices.TaskAwaiter::GetResult(System.Runtime.CompilerServices.TaskAwaiter ByRef)
IL_008b: nop
IL_008c: ldstr "Value: 2"
IL_0096: call Void WriteLine(System.String)

Full IL: mve-tracing-test-emitted.txt

It would be interesting to see which state machine we are executing before and after the await. Perhaps we could achieve this by reflecting on the code at runtime.

It works?

For more context please see the following comment: #12 (comment)

Next steps are to investigate the significance of having a breakpoint after the first await. Updates will follow here: #12 (comment)

@Miista
Copy link
Owner Author

Miista commented Mar 4, 2024

Adding support for .NET Core 2.1+

In completely other news, I tried adding support for .NET Core 2.1 and greater. I was successful. It turns out that from .NET Core 2.1 and later, the async machinery forwards to AsyncMethodBuilderCore which is a private type. Thus, ShouldForward returns true (as we don't want to rewrite inaccesible constructors in System.Private.CoreLib/mscorlib).

I added an exception specifically for AsyncMethodBuilderCore.

This seems to have added support for shimming async methods.

What's more is that it looks like it works across awaits. That is, given the following code:

public static class TestExtensions
{
    public static int GetCount(this string s)
    {
        return 23;
    }
}

public async static Task<int> Test()
{
    Console.WriteLine("123".GetCount());
    await Task.Delay(1);
    Console.WriteLine("1234".GetCount());
    return 1;
}

public static void Main(string[] args)
{
    Shim extensionShim = Shim
        .Replace(() => TestExtensions.GetCount(Is.A<string>()))
        .With((string str) => 2);

    PoseContext.Isolate( () =>
    {
        Test().GetAwaiter().GetResult();
    }, extensionShim);

    Console.WriteLine("Finished");
}

I get the following output:

2
2
Finished

Caveat

The caveats are that:

  1. I must execute the code with a debugger attached, and
  2. I must have a breakpoint in MethodRewriter::EmitILForMethod and Stubs::GenerateStubForDirectCall. It's enough to have the breakpoint in MethodRewriter::EmilILForMethod. Why? I don't know yet.

The breakpoint is hit twice:

  1. When rewriting Test
  2. When rewriting AsyncMethodBuilderCore::Start

Further investigation shows that it's really only the second hit which is significant.

Below are the variables for the MethodRewriter at the point of hitting the breakpoint:

1. When rewriting Test

image

This invocation is when the state machine is started from Test.

2. When rewriting AsyncMethodBuilderCore::Start

image

This invocation is when AsyncTaskMethodBuilder<int> invokes AsyncMethodBuilderCore.

I still don't know why this breakpoint is significant. However, its significance seems to be related to the fact that the breakpoint occurs after the await. I think I read some comment somewhere in the source code for AsyncMethodBuilderCore regarding debuggers.

Workaround

Just to be perfectly clear. Using the following steps I can shim async methods (across awaits):

  1. Target .NET Core 2.1
  2. Attach a debugger
  3. Add a breakpoint in MethodRewriter::EmitILForMethod with the condition: methodInfo.Name.Contains("Start")
  4. Debug the code
  5. Hit the breakpoint twice

@Miista
Copy link
Owner Author

Miista commented Mar 5, 2024

The significance of the breakpoint

I will attempt to inspect the stack before and after the await. It will be a bit tricky to inspect it before the await as we cannot set a breakpoint and hit it (due the method having been rewritten). However, we can inspect it by calling creating an instance of StackTrace manually and going through its Frames. This should at least give us an indication.

I will be printing the stack trace using the following code:

var stackTrace = new StackTrace();
foreach (var stackFrame in stackTrace.GetFrames())
{
    Console.WriteLine(stackFrame);
}

With breakpoints

As indicated in the Workaround section in #12 (comment) there is a difference between running with a breakpoint and running without a breakpoint. In the following, the breakpoint is set as indicated by step 3 in the aforementioned section. This gives me the following stack traces:

Analysis

NOTE: Methods starting with "stub_" and "impl_" are rewritten methods.

There is a single difference between the stack traces. Namely, that the "after" stack trace contains the MoveNext method as its immediate successor. This is expected as the "after" stack trace is the resulting awaiting and thereby being scheduled as a continuation. The specific lines are shown below.

impl_Pose.Sandbox.Program+<Test>d__0::MoveNext at offset 794 in file:line:column <filename unknown>:0:0
stub_callvirt_System.Runtime.CompilerServices.IAsyncStateMachine::MoveNext at offset 317 in file:line:column <filename unknown>:0:0

Disregard the above difference, we are left with two stack traces which look very similar. However, then we notice that the "before" stack trace contains 2 calls to Start. Specifically the following lines:

impl_System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.Int32]::Start[<Test>d__0] at offset 131 in file:line:column <filename unknown>:0:0
impl_System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.Int32]::Start[<Test>d__0] at offset 131 in file:line:column <filename unknown>:0:0

Without breakpoints

In the following I will be running the code without breakpoints. That is, simply executing the code. This gives me the following stack traces:

Analysis

The two stack traces are wildly different. The "after" stack trace is a mere 3 lines. Specifically:

MoveNext at offset 846 in file:line:column <filename unknown>:0:0
Run at offset 110 in file:line:column <filename unknown>:0:0
Dispatch at offset 425 in file:line:column <filename unknown>:0:0

We notice that none of the methods in the above stack trace start with "stub_" or "impl_", meaning that they are not our (rewritten) methods but the original non-rewritten methods.

With breakpoint vs. without breakpoint

In the following I will compare the "before" stack trace when running with breakpoints and running without breakpoints.

The major difference is that the stack trace with breakpoints enabled contains 2 calls to Start whereas the stack trace without breakpoints contains just a single call. The lines are shown below.

impl_System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.Int32]::Start[<Test>d__0] at offset 131 in file:line:column <filename unknown>:0:0
impl_System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.Int32]::Start[<Test>d__0] at offset 131 in file:line:column <filename unknown>:0:0
stub_call_System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1[System.Int32]::Start[<Test>d__0] at offset 484 in file:line:column <filename unknown>:0:0

What's more interesting is that the calls seem to have the following order:

  1. stub - Start
  2. impl - Start
  3. impl - Start

In other words, the Start method seems to be called twice in succession. Specifically, the implementation of the Start method. Could this be related to the debugger being attached?

Additional conundrum

I tried rerunning the code without breakpoints and encountered the following alternative stack trace:

MoveNext at offset 754 in file:line:column <filename unknown>:0:0
Run at offset 110 in file:line:column <filename unknown>:0:0
RunOrScheduleAction at offset 97 in file:line:column <filename unknown>:0:0
RunContinuations at offset 300 in file:line:column <filename unknown>:0:0
TrySetResult at offset 101 in file:line:column <filename unknown>:0:0
Complete at offset 168 in file:line:column <filename unknown>:0:0
Run at offset 110 in file:line:column <filename unknown>:0:0
CallCallback at offset 235 in file:line:column <filename unknown>:0:0
Fire at offset 141 in file:line:column <filename unknown>:0:0
FireNextTimers at offset 320 in file:line:column <filename unknown>:0:0

This stack trace is very different from the one I'm normally seeing which is this:

MoveNext at offset 846 in file:line:column <filename unknown>:0:0
Run at offset 110 in file:line:column <filename unknown>:0:0
Dispatch at offset 425 in file:line:column <filename unknown>:0:0

Maybe this is nothing. I would hazard a guess that we are simply seeing the inner workings of the async machinery.

Miista added a commit that referenced this issue May 2, 2024
Miista added a commit that referenced this issue May 2, 2024
Miista added a commit that referenced this issue May 2, 2024
Miista added a commit that referenced this issue May 2, 2024
Miista added a commit that referenced this issue May 2, 2024
Miista added a commit that referenced this issue May 2, 2024
Miista added a commit that referenced this issue May 2, 2024
@Miista Miista modified the milestones: v2.1, v2.2 Jun 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request priority: high The issue has high priority
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants