"async" fails as the number of threads increases #28480

@ghost

Description

UPDATE 03/05/2021:
A few months ago we discovered that this was somehow tied to the use of the async keyword, and we have since been able to prove it in the AcmeWebApi test application. As @davidfowl has stated in our other thread, this is most likely a race condition. Since it is now virtually impossible to write any application that doesn't use async in some way (due to the core library changes), it is not possible to run tests that only use synchronous code. I can say that when we did have synchronous code, we were seeing drastically higher performance than we are seeing now.

Leaving the following, as that is what we started the thread with:

We have run into several issues with ASP.NET Core that appear to be threading related. I initially created #26955, as that was the first issue that we ran into, but creating an application that can be tested is still ongoing. In the process of creating an application for that purpose, we were able to replicate another issue, which is the topic of this thread. The application linked below replicates this issue under the following conditions:

  • 3,500+ concurrent clients (we are required to exceed 8,000).
  • Average total throughput across all connections is 6,000 req/s (this is the minimum peak throughput that we must support).
  • VM has 2 dedicated CPUs and 4GB RAM.
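
For a rough sense of scale, the figures above can be turned into a per-client and per-CPU budget. The sketch below is plain arithmetic and is not part of the test application. The function name `load_budget` is hypothetical, and the even split of load across both CPUs is an assumption made only for illustration (it ignores GC, kernel, and networking overhead).

```python
# Back-of-the-envelope load budget derived from the conditions listed above.
# Assumption: requests are spread evenly across clients and across CPUs.

CONCURRENT_CLIENTS = 3_500   # lower bound at which the issue reproduces
TOTAL_REQ_PER_SEC = 6_000    # average total throughput across all connections
CPU_COUNT = 2                # dedicated CPUs on the VM


def load_budget(clients: int, total_rps: float, cpus: int) -> dict:
    """Return per-client and per-CPU figures for the given load."""
    per_cpu_rps = total_rps / cpus
    return {
        # Average request rate each client contributes.
        "per_client_rps": total_rps / clients,
        # Requests each CPU must complete per second.
        "per_cpu_rps": per_cpu_rps,
        # Upper bound on CPU time available per request, in microseconds.
        "cpu_budget_us": 1_000_000 / per_cpu_rps,
        # Open connections each CPU has to service.
        "clients_per_cpu": clients / cpus,
    }


budget = load_budget(CONCURRENT_CLIENTS, TOTAL_REQ_PER_SEC, CPU_COUNT)
for name, value in budget.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions each client sends fewer than two requests per second on average, while each CPU has at most about a third of a millisecond per request. The pressure therefore comes from the number of concurrent connections and the tight per-request budget rather than from any single busy client.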

Under these conditions we observe numerous HeartbeatSlow issues across random connections (threads), which in our full application leads to complete system failure over time. We are working on providing ways to replicate the other issues that we have observed, but these are currently the only ones that we can replicate for you in a test application.

This issue ONLY exists on Linux (we used Ubuntu 20.04.1 LTS for verification) and results in both a significant reduction of throughput and a significantly higher latency. When running in our full application, this issue, along with others, causes a complete system failure (APPCRASH) as the process runs out of memory. No matter how much we try, this issue, and the others, cannot be replicated in a Windows environment (Windows 10 and Windows Server 2019 were tested).

SDK: 3.1.301
VS: 16.8.2

The test application is available in the private repo AcmeWebApi.

FULL DISCLOSURE
I work for Webroot / Carbonite / OpenText and the application discussed above is the property of said entities. Microsoft is a direct / indirect customer of ours, so I am limited on the information that I am allowed to provide.

Metadata

Labels: Perf; area-networking (includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions)
