Skip to content

Avoid race that can cause Kestrel's RequestAborted to not fire #62385

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 15, 2025

Conversation

halter73
Copy link
Member

@halter73 halter73 commented Jun 18, 2025

Before this change, Kestrel had a narrow race where it could fail to trigger the RequestAborted token despite the underlying connection closing. Because CancellationTokenSource.TryReset() is not thread-safe, concurrent calls to Cancel() made by CancelRequestAbortedTokenCallback() on a background thread can end up being ignored.

https://github.com/dotnet/runtime/blob/2dbdff1a7aebd64b4b265d99b173cd77f088c36e/src/libraries/System.Private.CoreLib/src/System/Threading/CancellationTokenSource.cs#L477-L479

I observed exactly this when debugging a hanging test from Send MCP-Protocol-Version header in Streamable HTTP client transport · modelcontextprotocol/csharp-sdk@735be0f by looking at the memory dump from testresults-windows-latest-Release. There you can see that there's a SSE request hanging because RequestAborted never firing despite ConnectionContext.ConnectionClosed firing and Http1Connection._connectionAborted being set.

I think this is definitely the right change for HTTP/1.1, and I copied the CanReuse logic from HTTP/2 for HTTP/3 so it hopefully shouldn't cause issues there, but please double check this @JamesNK. If the new logic does cause issues, it likely means the previous logic was even more broken for HTTP/3, because that would mean _connectionAborted could be Reset() to false while CancelRequestAbortedTokenCallback() from a call to HttpContext.Abort() from a previous request was still scheduled on the thread pool.

To fix this, I made _connectionAborted a terminal state and never reset _abortedCts to null. I think the only alternatives would be to somehow synchronize with the CancelRequestAbortedTokenCallback() and delay resetting until that completes or to not pool altogether.

@halter73 halter73 requested a review from stephentoub June 18, 2025 01:34
@github-actions github-actions bot added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label Jun 18, 2025
@halter73
Copy link
Member Author

I think we should also backport this to .NET 8 and 9.

@JamesNK
Copy link
Member

JamesNK commented Jun 18, 2025

Add a test that a HTTP/3 stream isn't reused when aborted?

@dotnet-policy-service dotnet-policy-service bot added the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Jun 25, 2025
@Copilot Copilot AI review requested due to automatic review settings August 15, 2025 21:37
@halter73 halter73 force-pushed the thread-safe-request-aborted branch from 5630fc3 to 21db906 Compare August 15, 2025 21:37
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a race condition in Kestrel where the RequestAborted token could fail to fire despite the underlying connection closing. The issue stems from the non-thread-safe nature of CancellationTokenSource.TryReset() when concurrent Cancel() calls are made by background threads.

Key changes:

  • Made _connectionAborted a terminal state that is never reset to prevent race conditions
  • Added CanReuse property to HTTP/3 streams to determine reusability based on connection state
  • Updated stream reuse logic to check CanReuse before reusing HTTP/3 streams
  • Added comprehensive tests to verify state is not reused for aborted requests

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/Servers/Kestrel/Core/src/Internal/Http/HttpProtocol.cs Modified Reset() method to make _connectionAborted terminal and only reset CancellationTokenSource when connection is not aborted
src/Servers/Kestrel/Core/src/Internal/Http2/Http2Stream.cs Removed reset of _connectionAborted flag in OnReset() to maintain terminal state
src/Servers/Kestrel/Core/src/Internal/Http3/Http3Stream.cs Added CanReuse property and removed _connectionAborted reset in OnReset()
src/Servers/Kestrel/Core/src/Internal/Http3/Http3Connection.cs Updated stream reuse logic to check CanReuse property before reusing streams
src/Servers/Kestrel/test/Interop.FunctionalTests/Http3/Http3RequestTests.cs Added tests for client and server-aborted requests to verify state is not reused

@halter73 halter73 removed the pending-ci-rerun When assigned to a PR indicates that the CI checks should be rerun label Aug 15, 2025
@halter73 halter73 enabled auto-merge (squash) August 15, 2025 22:02
@halter73 halter73 merged commit 0ad8374 into main Aug 15, 2025
29 checks passed
@halter73 halter73 deleted the thread-safe-request-aborted branch August 15, 2025 22:45
@dotnet-policy-service dotnet-policy-service bot added this to the 10.0-rc1 milestone Aug 15, 2025
wtgodbe added a commit that referenced this pull request Aug 18, 2025
* Avoid race that can cause Kestrel's RequestAborted to not fire (#62385)

* Send Keep-Alive Ping Immediately When Previous Ping Is Overdue (#63195)

* Make new validations consistent with System.ComponentModel.DataAnnotations behavior (#63231)

* Add support for type-level validation attributes, update validation ordering

* Code review fix, test fix

* Fix trimming annotation

* Fix trimming annotation

* Separate caches for property and type attributes

* Fix typo

Co-authored-by: Brennan <brecon@microsoft.com>

* Fix typo

Co-authored-by: Brennan <brecon@microsoft.com>

* Fix and simplify the emitted code

* Update src/Validation/test/Microsoft.Extensions.Validation.GeneratorTests/ValidationsGeneratorTestBase.cs

---------

Co-authored-by: Brennan <brecon@microsoft.com>
Co-authored-by: Safia Abdalla <safia@microsoft.com>

* Search for slnx files when setting solution-relative content root (#61305)

* Address API review feedback for what was IApiEndpointMetadata (#63283)

---------

Co-authored-by: Stephen Halter <halter73@gmail.com>
Co-authored-by: Reagan Yuan <reaganyuan27@gmail.com>
Co-authored-by: Ondřej Roztočil <roztocil@outlook.com>
Co-authored-by: Brennan <brecon@microsoft.com>
Co-authored-by: Safia Abdalla <safia@microsoft.com>
Co-authored-by: Jacob Bundgaard <jbu@templafy.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants