Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP2: Hang when concurrent calls with cancellation #39608

Closed
JamesNK opened this issue Jul 19, 2020 · 17 comments · Fixed by #40243
Closed

HTTP2: Hang when concurrent calls with cancellation #39608

JamesNK opened this issue Jul 19, 2020 · 17 comments · Fixed by #40243
Assignees
Milestone

Comments

@JamesNK
Copy link
Member

JamesNK commented Jul 19, 2020

Description

Grpc.Net.Client (which uses HttpClient internally) is hanging when there are 20 concurrent calls with cancellation happening in a loop.

Repo: repos.zip

  1. Extract to directory
  2. dotnet run -p GrpcService28
  3. dotnet run -p ConsoleApp50
  4. ConsoleApp will throw timeout exceptions after a few hundred/thousand iterations.

Changing ConsoleApp50 target framework from net5.0 to netcoreapp3.1 fixes the timeout exceptions.

Normally I'd attempt to remove Grpc.Net.Client library from the repo, but in this case the gRPC call is using a number of fairly advanced features (bidi streaming, cancellation) so I have left it in.

What console app is doing (20 parallel threads on one HTTP/2 connection):

  1. Starting bidi streaming call, client.EchoAllData(cancellationToken: cts.Token, headers: headers)
  2. Waiting for the call response headers from server, await call.ResponseHeadersAsync.DefaultTimeout();
  3. Writing 64kb message, await call.RequestStream.WriteAsync(...) <- this is where the hang happens
  4. Cancelling call, cts.Cancel();
  5. Go to 1

Configuration

Using nightly SDK: 5.0.100-rc.1.20367.2

Regression?

Yes, this worked in netcoreapp3.1

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Net.Http untriaged New issue has not been triaged by the area owner labels Jul 19, 2020
@msftbot
Copy link
Contributor

msftbot bot commented Jul 19, 2020

Tagging subscribers to this area: @dotnet/ncl
Notify danmosemsft if you want to be subscribed.

@scalablecory scalablecory added this to the 5.0.0 milestone Jul 20, 2020
@scalablecory scalablecory removed the untriaged New issue has not been triaged by the area owner label Jul 20, 2020
@geoffkizer
Copy link
Contributor

@JamesNK Just to clarify... it's not always hanging, right? Only after some period of time/load?

I can see one possible issue introduced by my change. If we get stuck waiting for a write on the connection to complete, then we will fail to cancel the outstanding request write. Do you know if that could be happening here? I.e. is the server itself possibly reading from the connection very slowly, causing backpressure on the client side of the connection?

@geoffkizer
Copy link
Contributor

@JamesNK I'm trying to repro and I get this from the client:

Unhandled exception. Grpc.Core.RpcException: Status(StatusCode="Internal", Detail="Error starting gRPC call. HttpRequestException: The SSL connection could not be established, see inner exception. AuthenticationException: The remote certificate is invalid according to the validation procedure.", DebugException="System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.

I assume I need to set up certs somehow?

@JamesNK
Copy link
Member Author

JamesNK commented Jul 20, 2020

You need to trust the ASP.NET Core dev cert: https://docs.microsoft.com/en-us/aspnet/core/grpc/troubleshoot?view=aspnetcore-3.1#call-a-grpc-service-with-an-untrustedinvalid-certificate

You could also change the client to use http://localhost:5000 (if you do this then you'll need the AppContext switch to support h2c)

@JamesNK
Copy link
Member Author

JamesNK commented Jul 20, 2020

@JamesNK Just to clarify... it's not always hanging, right? Only after some period of time/load?

Correct

I can see one possible issue introduced by my change. If we get stuck waiting for a write on the connection to complete, then we will fail to cancel the outstanding request write. Do you know if that could be happening here? I.e. is the server itself possibly reading from the connection very slowly, causing backpressure on the client side of the connection?

It might not necessarily be from your change. I have only just started running gRPC unit tests against net5.0.

@halter73
Copy link
Member

You could also change the client to use http://localhost:5000 (if you do this then you'll need the AppContext switch to support h2c)

@JamesNK You also need to put the server into h2c mode for this to work, right? E.g:

            webBuilder.ConfigureKestrel(options =>
            {
                options.Listen(IPAddress.Any, 5000, listenOptions =>
                {
                    // Set protocol to HTTP/2-only to put Kestrel in h2c prior knowledge mode.
                    listenOptions.Protocols = HttpProtocols.Http2;
                    // ...

@JamesNK
Copy link
Member Author

JamesNK commented Jul 20, 2020

Not needed. The gRPC template has http on 5000, https on 5001, and both are HTTP/2 only because of appsettings.json.

  "Kestrel": {
    "EndpointDefaults": {
      "Protocols": "Http2"
    }
  }

@geoffkizer
Copy link
Contributor

#39654 does not fix this. I suspect it is unrelated to my change.

@ManickaP ManickaP self-assigned this Jul 22, 2020
@karelz karelz assigned alnikola and unassigned ManickaP Jul 28, 2020
@alnikola
Copy link
Contributor

@JamesNK I investigated your repro and it appears to me there isn't any network related issue. Timeouts you observed are almost surely caused by tasks getting into deadlocks due to missing .ConfigureAwait(false). Once I added it to lines 57 and 65 as follows, everything started working properly and test completed successfully after 20000 requests.

                    if (waitForHeaders)
                    {
                        await call.ResponseHeadersAsync.DefaultTimeout().ConfigureAwait(false);
                    }

                    try
                    {
                        await call.RequestStream.WriteAsync(new DataMessage
                        {
                            Data = ByteString.CopyFrom(data)
                        }).DefaultTimeout().ConfigureAwait(false);
                    }
                    catch (Exception ex)
                    {
                        Console.WriteLine(ex);
                        throw;
                    }

@alnikola
Copy link
Contributor

cc: @geoffkizer

@JamesNK
Copy link
Member Author

JamesNK commented Jul 29, 2020

That's odd. This is a console app. I thought there was no sync context in console apps?

Also the console app is correctly awaiting all tasks. There are no .Result or .GetAwaiter().GetResult() calls to block tasks. It shouldn't require users of the library to add .ConfigureAwait(false).

I will double check what happens on my computer.

@JamesNK
Copy link
Member Author

JamesNK commented Jul 29, 2020

I added ConfigureAwait(false) and it still hangs.

if (waitForHeaders)
{
    await call.ResponseHeadersAsync.DefaultTimeout().ConfigureAwait(false);
}

try
{
    await call.RequestStream.WriteAsync(new DataMessage
    {
        Data = ByteString.CopyFrom(data)
    }).DefaultTimeout().ConfigureAwait(false);
}
catch (Exception ex)
{
    Console.WriteLine(ex);
    throw;
}
C:\Downloads\repos\ConsoleApp50> dotnet run
Go!
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
0 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
50 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
100 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
150 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
200 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
250 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
300 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
350 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
400 ConfigureAwait
450 ConfigureAwait
450 ConfigureAwait
450 ConfigureAwait
450 ConfigureAwait
450 ConfigureAwait
450 ConfigureAwait
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 147
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55

@alnikola
Copy link
Contributor

Sorry, my mistake, there should be one more .ConfigureAwait(false) after await RunParallel({...}). Please, find the full Program type's code below.

class Program
    {
        static async System.Threading.Tasks.Task Main(string[] args)
        {
            Console.WriteLine("Go!");

            var handler = new HttpClientHandler
            {
                ServerCertificateCustomValidationCallback = HttpClientHandler.DangerousAcceptAnyServerCertificateValidator
            };

            GrpcChannel c = GrpcChannel.ForAddress("https://localhost:5001", new GrpcChannelOptions
            {
                HttpHandler = handler,
                ThrowOperationCanceledOnCancellation = true
            });
            Greeter.GreeterClient client = new Greeter.GreeterClient(c);

            // Arrange
            var data = new byte[1024 * 64];
            const int interations = 20000;
            bool waitForHeaders = true;

            await RunParallel(count: 20, async () =>
            {
                for (int i = 0; i < interations; i++)
                {
                    if (i % 50 == 0)
                    {
                        Console.WriteLine(i);
                    }

                    var cts = new CancellationTokenSource();
                    var headers = new Metadata();
                    if (waitForHeaders)
                    {
                        headers.Add("flush-headers", bool.TrueString);
                    }
                    var call = client.EchoAllData(cancellationToken: cts.Token, headers: headers);

                    if (waitForHeaders)
                    {
                        await call.ResponseHeadersAsync.DefaultTimeout().ConfigureAwait(false);
                    }

                    try
                    {
                        await call.RequestStream.WriteAsync(new DataMessage
                        {
                            Data = ByteString.CopyFrom(data)
                        }).DefaultTimeout().ConfigureAwait(false);
                    }
                    catch (Exception ex)
                    {
                        Console.WriteLine(ex);
                        throw;
                    }

                    cts.Cancel();
                }
            }).ConfigureAwait(false);

            Console.WriteLine("Done");
            Console.ReadKey();
        }

        public static Task RunParallel(int count, Func<Task> action)
        {
            var actionTasks = new Task[count];
            for (int i = 0; i < actionTasks.Length; i++)
            {
                actionTasks[i] = action();
            }

            return Task.WhenAll(actionTasks);
        }
    }

@alnikola
Copy link
Contributor

@JamesNK Could you please confirm whether the full version above fixes the issue on your machine?

@JamesNK
Copy link
Member Author

JamesNK commented Jul 31, 2020

It still fails

await RunParallel(count: 20, async () =>
{
    for (int i = 0; i < interations; i++)
    {
        if (i % 50 == 0)
        {
            Console.WriteLine(i + " ConfigureAwait + RunParallel");
        }

        var cts = new CancellationTokenSource();
        var headers = new Metadata();
        if (waitForHeaders)
        {
            headers.Add("flush-headers", bool.TrueString);
        }
        var call = client.EchoAllData(cancellationToken: cts.Token, headers: headers);

        if (waitForHeaders)
        {
            await call.ResponseHeadersAsync.DefaultTimeout().ConfigureAwait(false);
        }

        try
        {
            await call.RequestStream.WriteAsync(new DataMessage
            {
                Data = ByteString.CopyFrom(data)
            }).DefaultTimeout().ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex);
            throw;
        }

        cts.Cancel();
    }
}).ConfigureAwait(false);
C:\Downloads\repos\ConsoleApp50> dotnet run -c Release
Go!
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
0 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
50 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
100 ConfigureAwait + RunParallel
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55
System.TimeoutException: The operation at C:\Downloads\repos\ConsoleApp50\Program.cs:58 timed out after reaching the limit of 5000ms.
   at ConsoleApp50.TaskExtensions.TimeoutAfter(Task task, TimeSpan timeout, String filePath, Int32 lineNumber) in C:\Downloads\repos\ConsoleApp50\Program.cs:line 149
   at ConsoleApp50.Program.<>c__DisplayClass0_0.<<Main>b__0>d.MoveNext() in C:\Downloads\repos\ConsoleApp50\Program.cs:line 55

@alnikola
Copy link
Contributor

Hmm, you are right, it started happening again on my machine, too. That's really strange because before I fully ran it several times without any issues. Will continue investigation.

@alnikola
Copy link
Contributor

It seems I found the root cause. There is a deadlock between processing WindowUpdateFrame and cancelling Http2Stream.

  1. Http2Stream.SendDataAsync calls Http2Connection.SendStreamDataAsync passing _requestBodyCancellationSource.Token to cancel sending request body if a need arises later.
  2. Http2Connection.SendStreamDataAsync tries to acquire connection window credit of the frame's size to send a next chunk of data. To do that, it calls await _connectionWindow.RequestCreditAsync(frameSize, cancellationToken) where cancellationToken is the token supplied by Http2Stream at the step 1.
  3. In case no credit is available, CreditManager.RequestCreditAsync creates CancelableCreditWaiter passing this (the reference to the current CreditManager instance) as SyncObject and still the same cancellationToken to the constructor.
  4. CancelableCreditWaiter registers a callback on that cancellationToken and stores the CancellationTokenRegistration in _registration. The first thing the callback will do is to acquire lock on _syncObject that points to CreditManager from the step 3.
  5. Some time later. GrpcCall<T>.CancelCall disposes HttpResponseMessage that subsequently disposes HttpConnectionResponseContent where Http2Stream.Cancel gets called at some point.
  6. Http2Stream.Cancel checks whether a request is still in progress and since it is so, it tries to cancel the request by calling requestBodyCancellationSource?.Cancel().
  7. requestBodyCancellationSource calls the CancelableCreditWaiter's cancellation callback registered at step 4 that tries to lock on _syncObejct which still points to _connectionWindow from step 2. However, the lock happens to be already taken, so it waits.
  8. In parallel with steps 5-7, Http2Connection.ProcessIncomingFramesAsync receives a WindowUpdate frame and calls ProcessWindowUpdateFrame.
  9. ProcessWindowUpdateFrame updates connection window credits by calling _connectionWindow.AdjustCredit(amount). _connectionWindow is the same CreditManager on which we called RequestCreditAsync at the step 2 and which passed itself (this) as _syncObject to CancelableCreditWaiter.
  10. CreditManager.AdjustCredit locks on itself (this) and calls waiter.Dispose() where waiter is CancelableCreditWaiter from step 4 with a cancellation callback.
  11. CancelableCreditWaiter.Dispose eventually calls _registration.Dispose. _registration was assigned to at step 4 and is linked to request cancellation token belonging to _requestBodyCancellationSource from step 1.
  12. _registration.Dispose eventually calls CancellationTokenSource.WaitForCallbackToComplete that start waiting for an executing cancellation callback to complete by looping over SpinLock with while (ExecutingCallback == id). However, the callback being executed right now is the one where thread closing Http2Stream got blocked at step 7 while waiting for a lock on the same _connectionWindow object as the one we locked on at step 10.
  13. Deadlock!

@msftbot msftbot bot locked as resolved and limited conversation to collaborators Dec 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants