Description
After upgrading from .NET 9 to .NET 10 (no code or package changes), HTTPS requests through SocketsHttpHandler more frequently take longer than 500ms when a connection has to be established. On .NET 9 the same requests consistently stayed within 500ms.
This is causing failures in the Azure Cosmos DB .NET SDK, which uses a hard-coded 500ms first-attempt timeout for internal metadata/address resolution requests.
We have filed Azure/azure-cosmos-dotnet-v3#5642 for the SDK side, but the underlying question remains: what changed in .NET 10's SocketsHttpHandler/SslStream pipeline that increases HTTPS request latency compared to .NET 9?
We cloned the Cosmos DB SDK and changed the first-attempt timeout from 500ms to 5 seconds, after which all errors disappeared. This confirms that requests which completed within 500ms on .NET 9 now intermittently exceed 500ms on .NET 10, with the runtime upgrade as the only change.
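To illustrate why a request that runs slightly over budget surfaces as a TaskCanceledException, here is a minimal sketch of a per-attempt timeout ladder. It is not the Cosmos DB SDK's actual code: `FirstAttemptTimeoutSketch`, `AttemptTimeouts`, `SendWithAttemptTimeoutsAsync`, and `SlowFirstCallHandler` are hypothetical names, and the 5-second second step simply mirrors the value we used in our experiment.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class FirstAttemptTimeoutSketch
{
    // Hypothetical ladder: a tight first attempt, then a more generous retry.
    public static readonly TimeSpan[] AttemptTimeouts =
    {
        TimeSpan.FromMilliseconds(500),
        TimeSpan.FromSeconds(5),
    };

    public static async Task<HttpResponseMessage> SendWithAttemptTimeoutsAsync(
        HttpClient client,
        Func<HttpRequestMessage> requestFactory,
        CancellationToken cancellationToken = default)
    {
        for (int attempt = 0; ; attempt++)
        {
            // Each attempt gets its own budget, linked to the caller's token.
            using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
            cts.CancelAfter(AttemptTimeouts[attempt]);
            try
            {
                // A request message cannot be re-sent, so build a new one per attempt.
                using var request = requestFactory();
                return await client.SendAsync(request, cts.Token);
            }
            catch (OperationCanceledException) when (
                !cancellationToken.IsCancellationRequested &&
                attempt < AttemptTimeouts.Length - 1)
            {
                // The per-attempt budget expired; fall through to the next, longer attempt.
                // On the last attempt the filter is false and the exception propagates.
            }
        }
    }
}

// Stand-in for a slow endpoint: only the first call exceeds the 500 ms budget.
public sealed class SlowFirstCallHandler : HttpMessageHandler
{
    private int _calls;
    public int Calls => _calls;

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        if (Interlocked.Increment(ref _calls) == 1)
        {
            await Task.Delay(TimeSpan.FromMilliseconds(800), cancellationToken);
        }
        return new HttpResponseMessage(HttpStatusCode.OK);
    }
}
```

With `SlowFirstCallHandler` the first attempt is canceled and the second succeeds. If the ladder had only the 500ms step, the cancellation would propagate to the caller, which is the failure mode described here.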
See the detailed reproduction steps and screenshots in azure-cosmos-dotnet-v3#5642.
Reproduction Steps
See the detailed reproduction steps and screenshots in azure-cosmos-dotnet-v3#5642, which includes before/after comparisons with traffic-shaped latency on both .NET 9 and .NET 10.
Expected behavior
HTTPS requests through SocketsHttpHandler should have comparable connection establishment latency to .NET 9. Requests that consistently completed within 500ms on .NET 9 should not intermittently exceed 500ms on .NET 10.
Actual behavior
- Intermittent TaskCanceledException on HTTPS requests that take slightly over 500ms
- The same requests consistently complete within 500ms on .NET 9
- We can reproduce this locally using traffic shaping and introducing artificial delays
- It happens intermittently inside of Azure
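For anyone who wants to compare the two runtimes without the Cosmos SDK in the picture, the following is a rough sketch of a cold-connection probe. It is an assumption about how one might measure this, not the harness used for the linked issue: `ColdConnectionLatencyProbe`, `MeasureAsync`, and `CountOverBudget` are hypothetical names, and the target URI is whatever HTTPS endpoint you choose to test against.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

public static class ColdConnectionLatencyProbe
{
    // Times `samples` requests, each on a brand-new handler so that no pooled
    // connection can be reused and every request pays for TCP + TLS setup.
    public static async Task<List<TimeSpan>> MeasureAsync(Uri target, int samples)
    {
        var results = new List<TimeSpan>(samples);
        for (int i = 0; i < samples; i++)
        {
            using var handler = new SocketsHttpHandler();
            using var client = new HttpClient(handler);

            var stopwatch = Stopwatch.StartNew();
            // Stop timing once headers arrive; body download is not of interest here.
            using var response = await client.GetAsync(
                target, HttpCompletionOption.ResponseHeadersRead);
            stopwatch.Stop();

            results.Add(stopwatch.Elapsed);
        }
        return results;
    }

    // Counts how many samples would have missed a given timeout budget.
    public static int CountOverBudget(IEnumerable<TimeSpan> samples, TimeSpan budget)
        => samples.Count(sample => sample > budget);
}
```

Running the same probe on .NET 9 and .NET 10 against the same endpoint, under the same traffic shaping, and comparing `CountOverBudget(results, TimeSpan.FromMilliseconds(500))` should show whether the difference reproduces outside the SDK. The measured time also includes DNS resolution and server processing, so it is an upper bound on connection establishment time rather than an exact figure.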
Regression?
No response
Known Workarounds
No response
Configuration
- .NET 9: No issues (identical code and packages)
- .NET 10.0.x: Reproducible
- OS: Linux (Azure App Services) and Windows
- Downstream library: Azure Cosmos DB SDK 3.46.0
Other information
Potentially relevant .NET 10 changes
- #112383: Disposed HTTP/1.1 connections are no longer returned to the pool, potentially reducing the pool hit rate and forcing more fresh TCP+TLS connection establishments
- #110744: Race condition fix in connection timeout CTS assignment, which may change connection establishment timing
Stack trace
System.Threading.Tasks.TaskCanceledException: The operation was canceled.
 ---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled.
 ---> System.Net.Sockets.SocketException (125): Operation canceled
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(...)
   at System.Net.Security.SslStream.EnsureFullTlsFrameAsync...
   at System.Net.Security.SslStream.ReadAsyncInternal...
   at System.Net.Http.HttpConnection.SendAsync(...)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(...)
   at System.Net.Http.Metrics.MetricsHandler.SendAsyncWithMetrics(...)
   at System.Net.Http.DiagnosticsHandler.SendAsyncCore(...)
Impact
The Azure Cosmos DB .NET SDK has an internal 500ms first-attempt timeout for control plane operations that worked reliably on .NET 9. After upgrading to .NET 10, these requests intermittently exceed 500ms, causing recurring TaskCanceledException errors across multiple microservices in production (Azure App Services).
While the immediate fix belongs in the Cosmos SDK (azure-cosmos-dotnet-v3#5642), the latency regression in SocketsHttpHandler may affect other libraries with similar internal timeout policies.