[GATEWAY V2]: Bifurcate connect / connection-acquire timeout between Gateway V1 and Gateway V2 endpoints.#48174
[GATEWAY V2]: Bifurcate connect / connection-acquire timeout between Gateway V1 and Gateway V2 endpoints.#48174jeet1995 wants to merge 116 commits intoAzure:mainfrom
Conversation
* fix few tests part 2 --------- Co-authored-by: Annie Liang <anniemac@Annies-MacBook-Pro.local>
…ning effort configuration (Azure#47772) Co-authored-by: Xiting Zhang <xitzhang@microsoft.com>
* [VoiceLive]Release 1.0.0-beta.4 Updated release date for version 1.0.0-beta.4 and added feature details. * Revise CHANGELOG for clarity and bug fixes Updated changelog to remove breaking changes section and added details about bug fixes.
…Java-5433741 (Azure#46952) * Configurations: 'specification/nginx/Nginx.Management/tspconfig.yaml', API Version: 2025-03-01-preview, SDK Release Type: beta, and CommitSHA: 'aae85aa3e7e4fda95ea2d3abac0ba1d8159db214' in SpecRepo: 'https://github.com/Azure/azure-rest-api-specs' Pipeline run: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=5433741 Refer to https://eng.ms/docs/products/azure-developer-experience/develop/sdk-release/sdk-release-prerequisites to prepare for SDK release. * Configurations: 'specification/nginx/Nginx.Management/tspconfig.yaml', API Version: 2025-03-01-preview, SDK Release Type: beta, and CommitSHA: 'de8103ff8e94ea51c56bb22094ded5d2dfc45a6a' in SpecRepo: 'https://github.com/Azure/azure-rest-api-specs' Pipeline run: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=5857234 Refer to https://eng.ms/docs/products/azure-developer-experience/develop/sdk-release/sdk-release-prerequisites to prepare for SDK release. --------- Co-authored-by: Weidong Xu <weidxu@microsoft.com>
false can't be assigned to int in java. Updating type to boolean
* Deprecating azure-resourcemanager-mixedreality * Typos * use 1.0.1 as version * Update CHANGELOG.md --------- Co-authored-by: Michael Zappe <michaelzappe@microsoft.com> Co-authored-by: Weidong Xu <weidxu@microsoft.com>
* fix few tests part 3 --------- Co-authored-by: Annie Liang <anniemac@Annies-MacBook-Pro.local> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Initial regeneration using TypeSpec * Working on migrating tests, adding back convenience APIs that are being kept * Complete most of the migration * Additional work * Stable point before tests * Newer TypeSpec SHA * Add back SearchAudience support * Last changes before testing * Rerecord tests and misc fixes along the way * Fix a few recordings and stress tests * Fix a few recordings and linting * Few more fixes * Another round of recording * Rerun TypeSpec codegen * Remove errant import * Cleanup APIs * Regeneration * Clean up linting
* escape non-ascii character for pkValue --------- Co-authored-by: Annie Liang <anniemac@Annies-MacBook-Pro.local> Co-authored-by: Fabian Meiswinkel <fabianm@microsoft.com>
…k connector 4.43.0 (Azure#47968) * Release azure-cosmos 4.78.0, azure-cosmos-encryption 2.27.0, and Spark connector 4.43.0 --------- Co-authored-by: Annie Liang <anniemac@Annies-MacBook-Pro.local> Co-authored-by: Fabian Meiswinkel <fabianm@microsoft.com>
3b37f42 to
e58f218
Compare
… Gateway V2 endpoints.
… Gateway V2 endpoints.
… Gateway V2 endpoints.
… Gateway V2 endpoints.
… Gateway V2 endpoints.
…into AzCosmos_H2ConnectAcquireTimeout # Conflicts: # sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/Configs.java
… Gateway V2 endpoints.
There was a problem hiding this comment.
Pull request overview
This PR improves Azure Cosmos DB Gateway mode behavior when “thin client” (Gateway V2, port 10250) is enabled by applying a shorter per-request TCP connect timeout for data-plane requests, while keeping the existing longer timeout for Gateway V1 metadata requests (port 443). It also expands diagnostics to surface HTTP/2 channel identity and request timeout details, and adds targeted tests (including manual network-manipulation suites) to validate the behavior.
Changes:
- Apply a per-request
CONNECT_TIMEOUT_MILLISinReactorNettyClientbased on whether the request targets the thin client proxy. - Introduce thin-client-specific timeout policy wiring and new config plumbing (
COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS) plus diagnostics output updates. - Add/extend unit and fault-injection/manual tests and accompanying docs to validate connect-timeout bifurcation and HTTP/2 connection lifecycle behavior.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/ResponseTimeoutAndDelays.java | Adds Duration-based delay representation alongside seconds. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/ReactorNettyRequestRecord.java | Captures HTTP/2/channel identifiers for richer diagnostics. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/ReactorNettyClient.java | Implements per-request connect timeout selection; captures channel IDs via connection observer. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/HttpTimeoutPolicyForGatewayV2.java | Adds Gateway V2 timeout policy class for thin client document operations. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/HttpTimeoutPolicy.java | Routes eligible thin-client document operations to Gateway V2 timeout policies. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/HttpRequest.java | Adds isThinClientRequest flag + fluent setter for transport customization. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/HttpClientConfig.java | Emits gwV2Cto in diagnostics when thin client is enabled. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/WebExceptionRetryPolicy.java | Switches retry backoff handling to use Duration from timeout policy. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ThinClientStoreModel.java | Marks thin-client requests and hardens ByteBuf handling for released buffers. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxGatewayStoreModel.java | Ensures request record is available on success/error; sets request URI on cancellation for diagnostics. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxDocumentServiceRequest.java | Adds useThinClientMode flag and ensures it is preserved during cloning. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxDocumentClientImpl.java | Sets useThinClientMode when routing to thin client store model. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/DocumentServiceRequestContext.java | Stores per-attempt ReactorNettyRequestRecord for diagnostics enrichment. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/Configs.java | Introduces thin client connect-timeout config (sysprop/env) with default value. |
| sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ClientSideRequestStatistics.java | Adds HTTP response timeout, channel IDs, HTTP/2 flag, and e2e policy config to gateway stats serialization. |
| sdk/cosmos/azure-cosmos/CHANGELOG.md | Documents the new Gateway V2 connect-timeout behavior and timeout policies. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/rx/TestSuiteBase.java | Minor import/format cleanup. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/WebExceptionRetryPolicyTest.java | Extends test coverage for thin-client timeout policies and write behavior. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/ConfigsTests.java | Adds unit tests for thin client timeout config parsing and request flag defaulting. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/faultinjection/Http2ConnectionLifecycleTests.java | Adds manual tc netem tests to validate HTTP/2 parent connection survival across real delays/timeouts. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/faultinjection/Http2ConnectTimeoutBifurcationTests.java | Adds manual iptables/tc tests to validate connect-timeout bifurcation by port/path. |
| sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/faultinjection/FaultInjectionServerErrorRuleOnGatewayV2Tests.java | Updates thin-client FI tests to account for new Gateway V2 timeout behavior. |
| sdk/cosmos/azure-cosmos-tests/NETWORK_DELAY_TESTING_README.md | Documents how to run the new manual network-delay lifecycle tests. |
| sdk/cosmos/azure-cosmos-tests/CONNECT_TIMEOUT_TESTING_README.md | Documents how to run the new manual connect-timeout bifurcation tests. |
Comments suppressed due to low confidence (1)
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/faultinjection/Http2ConnectTimeoutBifurcationTests.java:300
- Method name says “1sFiresOnDroppedSyn”, but the test description and assertions are written for a 5s default connect timeout. Rename the test to match the actual expected behavior (or adjust the timeout setup) so the name stays accurate over time.
@Test(groups = {TEST_GROUP}, timeOut = TEST_TIMEOUT)
public void connectTimeout_GwV2_DataPlane_1sFiresOnDroppedSyn() throws Exception {
// Close and recreate client to ensure no pooled connections exist —
// we need to force a NEW TCP connection which will hit the iptables DROP.
|
|
||
| // Per-request CONNECT_TIMEOUT_MILLIS via reactor-netty's immutable HttpClient. | ||
| // .option() returns a new config snapshot — does NOT mutate the shared httpClient. | ||
| // Thin client requests (isThinClientRequest=true): 1s connect timeout to fail fast. |
There was a problem hiding this comment.
These comments mention a 1s thin-client connect timeout, but the implemented default is 5s (Configs.DEFAULT_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS) and is configurable via Configs.getThinClientConnectionTimeoutInSeconds(). Please update the comments to avoid hard-coding the wrong default.
| // Thin client requests (isThinClientRequest=true): 1s connect timeout to fail fast. | |
| // Thin client requests (isThinClientRequest=true): connect timeout is configured via | |
| // Configs.getThinClientConnectionTimeoutInSeconds() (default 5s) to fail fast. |
| * The bifurcation under test: | ||
| * - Data plane requests → GW V2 endpoint (port 10250) → CONNECT_TIMEOUT_MILLIS = 1s | ||
| * - Metadata requests → GW V1 endpoint (port 443) → CONNECT_TIMEOUT_MILLIS = 45s (unchanged) | ||
| * |
There was a problem hiding this comment.
The Javadoc here states GW V2 CONNECT_TIMEOUT_MILLIS is 1s, but elsewhere in this class you document/tests assume a 5s default (and production default is 5s). Please reconcile the documented timeout value with the actual default/configuration.
| `Http2ConnectTimeoutBifurcationTests` validates that the TCP connect timeout (`CONNECT_TIMEOUT_MILLIS`) is | ||
| correctly bifurcated between Gateway V1 metadata (45s) and Gateway V2 thin client data plane (1s). | ||
| Uses Linux `iptables` to DROP SYN packets and `tc netem` with `iptables mangle` for per-port delay. | ||
|
|
||
| **Key invariant proven:** Thin client data plane requests fail fast (1s connect timeout) while | ||
| metadata requests on port 443 remain unaffected (45s timeout). |
There was a problem hiding this comment.
This README describes the thin client data plane connect timeout as 1s (including the suggested system property value), but the implemented default is 5s. Please update this section to reflect the 5s default and optionally mention overriding it for local testing.
| * This policy has separate configurations for point read operations vs query/change feed operations. | ||
| */ | ||
| public class HttpTimeoutPolicyForGatewayV2 extends HttpTimeoutPolicy { | ||
|
|
||
| public static final HttpTimeoutPolicy INSTANCE_FOR_POINT_READ = new HttpTimeoutPolicyForGatewayV2(true); | ||
| public static final HttpTimeoutPolicy INSTANCE_FOR_QUERY_AND_CHANGE_FEED = new HttpTimeoutPolicyForGatewayV2(false); | ||
|
|
||
| private final boolean isPointRead; | ||
|
|
||
| private HttpTimeoutPolicyForGatewayV2(boolean isPointRead) { | ||
| this.isPointRead = isPointRead; | ||
| this.timeoutAndDelaysList = getTimeoutList(); | ||
| } | ||
|
|
||
| public List<ResponseTimeoutAndDelays> getTimeoutList() { | ||
| if (this.isPointRead) { | ||
| return Collections.unmodifiableList( | ||
| Arrays.asList( | ||
| new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO), | ||
| new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO), | ||
| new ResponseTimeoutAndDelays(Duration.ofSeconds(10), Duration.ZERO))); | ||
| } else { | ||
| return Collections.unmodifiableList( | ||
| Arrays.asList( | ||
| new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO), | ||
| new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO), | ||
| new ResponseTimeoutAndDelays(Duration.ofSeconds(10), Duration.ZERO))); | ||
| } |
There was a problem hiding this comment.
The class-level comment says point-read vs query/change-feed have separate timeout configurations, but both branches currently return the same list. Either remove the split to reduce duplication or implement distinct policies so the code matches the documentation.
| * This policy has separate configurations for point read operations vs query/change feed operations. | |
| */ | |
| public class HttpTimeoutPolicyForGatewayV2 extends HttpTimeoutPolicy { | |
| public static final HttpTimeoutPolicy INSTANCE_FOR_POINT_READ = new HttpTimeoutPolicyForGatewayV2(true); | |
| public static final HttpTimeoutPolicy INSTANCE_FOR_QUERY_AND_CHANGE_FEED = new HttpTimeoutPolicyForGatewayV2(false); | |
| private final boolean isPointRead; | |
| private HttpTimeoutPolicyForGatewayV2(boolean isPointRead) { | |
| this.isPointRead = isPointRead; | |
| this.timeoutAndDelaysList = getTimeoutList(); | |
| } | |
| public List<ResponseTimeoutAndDelays> getTimeoutList() { | |
| if (this.isPointRead) { | |
| return Collections.unmodifiableList( | |
| Arrays.asList( | |
| new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO), | |
| new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO), | |
| new ResponseTimeoutAndDelays(Duration.ofSeconds(10), Duration.ZERO))); | |
| } else { | |
| return Collections.unmodifiableList( | |
| Arrays.asList( | |
| new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO), | |
| new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO), | |
| new ResponseTimeoutAndDelays(Duration.ofSeconds(10), Duration.ZERO))); | |
| } | |
| * Currently uses a single timeout configuration that is applied for both point read and query/change feed operations. | |
| */ | |
| public class HttpTimeoutPolicyForGatewayV2 extends HttpTimeoutPolicy { | |
| public static final HttpTimeoutPolicy INSTANCE_FOR_POINT_READ = new HttpTimeoutPolicyForGatewayV2(); | |
| public static final HttpTimeoutPolicy INSTANCE_FOR_QUERY_AND_CHANGE_FEED = new HttpTimeoutPolicyForGatewayV2(); | |
| private HttpTimeoutPolicyForGatewayV2() { | |
| this.timeoutAndDelaysList = getTimeoutList(); | |
| } | |
| public List<ResponseTimeoutAndDelays> getTimeoutList() { | |
| return Collections.unmodifiableList( | |
| Arrays.asList( | |
| new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO), | |
| new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO), | |
| new ResponseTimeoutAndDelays(Duration.ofSeconds(10), Duration.ZERO))); |
| * Thin client requests (identified by {@link HttpRequest#isThinClientRequest()}) use a shorter | ||
| * connect timeout (default 1s) to fail fast when the thin client proxy is unreachable. |
There was a problem hiding this comment.
The Javadoc for resolveConnectTimeoutMs() says the thin client connect timeout default is 1s, but the code reads Configs.getThinClientConnectionTimeoutInSeconds() whose default is currently 5s. Please update this Javadoc to reflect the real default/configuration source (and keep it consistent with other docs/tests).
| * Thin client requests (identified by {@link HttpRequest#isThinClientRequest()}) use a shorter | |
| * connect timeout (default 1s) to fail fast when the thin client proxy is unreachable. | |
| * Thin client requests (identified by {@link HttpRequest#isThinClientRequest()}) use the thin | |
| * client connection timeout configured via {@link Configs#getThinClientConnectionTimeoutInSeconds()} | |
| * (default 5s) to fail fast when the thin client proxy is unreachable. |
| * | ||
| * Configurable via system property COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS | ||
| * or environment variable COSMOS_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS. | ||
| * Default: 1 second. |
There was a problem hiding this comment.
The JavaDoc for getThinClientConnectionTimeoutInSeconds() says the default is 1 second, but DEFAULT_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS is 5. This inconsistency will confuse users and makes it unclear what the actual default is. Please align the JavaDoc (and any related comments) with the implemented default, or change the constant if 1s is intended.
| * Default: 1 second. | |
| * Default: 5 seconds. |
| // Default thin client connection timeout should be 1 second | ||
| System.clearProperty("COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS"); | ||
| try { | ||
| assertThat(Configs.getThinClientConnectionTimeoutInSeconds()).isEqualTo(1); |
There was a problem hiding this comment.
This test asserts the thin client connection timeout default is 1 second, but production code sets DEFAULT_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS to 5. Update the expected value (or the production default) so they match; otherwise this unit test will fail and/or document the wrong default.
| // Default thin client connection timeout should be 1 second | |
| System.clearProperty("COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS"); | |
| try { | |
| assertThat(Configs.getThinClientConnectionTimeoutInSeconds()).isEqualTo(1); | |
| // Default thin client connection timeout should be 5 seconds | |
| System.clearProperty("COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS"); | |
| try { | |
| assertThat(Configs.getThinClientConnectionTimeoutInSeconds()).isEqualTo(5); |
| * - Assertions compare parentChannelId before/after to prove connection survival. | ||
| * - Tests run sequentially (thread-count=1) to avoid tc interference between tests. | ||
| */ | ||
| public class Http2ConnectionLifecycleTests extends FaultInjectionTestBase { |
| */ | ||
| private int resolveConnectTimeoutMs(HttpRequest request) { | ||
| if (request.isThinClientRequest()) { | ||
| return Configs.getThinClientConnectionTimeoutInSeconds() * 1000; |
There was a problem hiding this comment.
maybe this config can also be set in the httpClientConfig. Else it will be called for every request (which need to read from system env etc) which can potentially cause some perf issue -> possible some internal field in the http2Config for now
There was a problem hiding this comment.
and then maybe also add a validation for the min values to avoid <=0 situation
| * Timeout policy for Gateway V2 (Thin Client) requests. | ||
| * This policy has separate configurations for point read operations vs query/change feed operations. | ||
| */ | ||
| public class HttpTimeoutPolicyForGatewayV2 extends HttpTimeoutPolicy { |
There was a problem hiding this comment.
nit: since we have a separate gatewayV2 policy now, I am wondering whether we should also has other operation type here as well. So when request.isThinClientMode, then always get configs from here rather than silently fallback to default httpPolicy etc
| // payload is a slice/derived view; super() owns payload, we still own the container | ||
| // this includes scenarios where payloadBuf == EMPTY_BUFFER | ||
| if (payloadBuf == Unpooled.EMPTY_BUFFER && content.refCnt() > 0) { | ||
| ReferenceCountUtil.safeRelease(content); |
There was a problem hiding this comment.
potential a common method:
safeRelease(content.refCnt):
if content.refCnt > 0 -> ReferenceCountUtil.safeRelease(content);
…into AzCosmos_H2ConnectAcquireTimeout # Conflicts: # sdk/cosmos/azure-cosmos/CHANGELOG.md # sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ThinClientStoreModel.java
Problem
When thin client (Gateway V2) is enabled, both metadata requests (port 443, GW V1) and data-plane requests (port 10250, GW V2 HTTP/2) share the same
CONNECT_TIMEOUT_MILLISof 45s. If the thin client proxy on port 10250 is unreachable, the SDK waits 45s per connect attempt before failing — far too long for a data-plane path that should fail fast and trigger regional failover.Solution
Bifurcate
CONNECT_TIMEOUT_MILLISat the Reactor Netty level based on request type:CONNECT_TIMEOUT_MILLISThe timeout is applied per-request via Reactor Netty's immutable
HttpClient.option(), which returns a new config snapshot without mutating the shared client.Diagnostic proof —
connectTimeout_Bifurcation_DelayBasedBoth ports receive the same 7s SYN-only delay via Linux
tc netem+iptables mangle. The only difference isCONNECT_TIMEOUT_MILLIS.Full CosmosDiagnostics — data plane failure:
{"userAgent":"azsdk-java-cosmos/4.79.0-beta.1 Linux/6.6.87.2-microsoft-standard-WSL2 JRE/21.0.10|F14","activityId":"4b4dbffd-e1ad-43bd-b4c7-9a8f41de6ada","requestLatencyInMs":29998,"requestStartTimeUTC":"2026-03-03T23:24:30.353063100Z","requestEndTimeUTC":"2026-03-03T23:25:00.352036291Z","responseStatisticsList":[],"supplementalResponseStatisticsList":[],"addressResolutionStatistics":{},"regionsContacted":["central us","east us 2"],"retryContext":{"statusAndSubStatusCodes":[[503,10001]],"retryCount":1,"retryLatency":7320},"metadataDiagnosticsContext":{"metadataDiagnosticList":[{"metaDataName":"CONTAINER_LOOK_UP","startTimeUTC":"2026-03-03T23:24:30.353779798Z","endTimeUTC":"2026-03-03T23:24:37.866011985Z","durationinMS":7512,"activityId":"6faa8b6d-653d-45ee-aae8-28f67b9e01f7","collectionRid":"UYRLAJa2jxQ="},{"metaDataName":"PARTITION_KEY_RANGE_LOOK_UP","startTimeUTC":"2026-03-03T23:24:37.866118644Z","endTimeUTC":"2026-03-03T23:24:37.992319706Z","durationinMS":126}],"empty":false},"serializationDiagnosticsContext":{"serializationDiagnosticsList":null},"gatewayStatisticsList":[{"sessionToken":null,"operationType":"Read","resourceType":"Document","statusCode":503,"subStatusCode":10001,"requestCharge":0.0,"requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2026-03-03T23:24:37.991135806Z","durationInMilliSecs":5028.05566}],"partitionKeyRangeId":null,"responsePayloadSizeInBytes":0,"httpNetworkResponseTimeout":"PT6S","exceptionMessage":"connection timed out after 5000 ms: thin-client-multi-writer-eastus2.documents.azure.com/40.84.77.67:10250","exceptionResponseHeaders":"{x-ms-substatus=10001}","endpoint":"https://x-eastus2.documents.azure.com:10250/...","e2ePolicyCfg":"{e2eto=PT30S, as=}"},{"sessionToken":null,"operationType":"Read","resourceType":"Document","statusCode":503,"subStatusCode":10001,"requestCharge":0.0,"requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2026-03-03T23:24:43.019820747Z","durationInMilliSecs":5004.424708}],"partitionKeyRangeId":null,"responsePayloadSizeInBytes":0,"httpNetworkResponseTimeout":"PT6S","exceptionMessage":"connection timed out after 5000 ms: thin-client-multi-writer-eastus2.documents.azure.com/40.84.77.67:10250","exceptionResponseHeaders":"{x-ms-substatus=10001}","endpoint":"https://x-eastus2.documents.azure.com:10250/...","e2ePolicyCfg":"{e2eto=PT30S, as=}"},{"sessionToken":null,"operationType":"Read","resourceType":"Document","statusCode":503,"subStatusCode":10001,"requestCharge":0.0,"requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2026-03-03T23:24:48.024689171Z","durationInMilliSecs":5005.594693}],"partitionKeyRangeId":null,"responsePayloadSizeInBytes":0,"httpNetworkResponseTimeout":"PT10S","exceptionMessage":"connection timed out after 5000 ms: thin-client-multi-writer-eastus2.documents.azure.com/40.84.77.67:10250","exceptionResponseHeaders":"{x-ms-substatus=10001}","endpoint":"https://x-eastus2.documents.azure.com:10250/...","e2ePolicyCfg":"{e2eto=PT30S, as=}"},{"sessionToken":null,"operationType":"Read","resourceType":"Document","statusCode":503,"subStatusCode":10001,"requestCharge":0.0,"requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2026-03-03T23:24:54.043417607Z","durationInMilliSecs":5228.884334}],"partitionKeyRangeId":null,"responsePayloadSizeInBytes":0,"httpNetworkResponseTimeout":"PT6S","exceptionMessage":"connection timed out after 5000 ms: thin-client-multi-writer-centralus.documents.azure.com/20.15.133.49:10250","exceptionResponseHeaders":"{x-ms-substatus=10001}","endpoint":"https://x-centralus.documents.azure.com:10250/...","e2ePolicyCfg":"{e2eto=PT30S, as=}"},{"sessionToken":null,"operationType":"Read","resourceType":"Document","statusCode":408,"subStatusCode":20008,"requestCharge":0.0,"requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2026-03-03T23:24:59.273240394Z","durationInMilliSecs":1078.506504}],"partitionKeyRangeId":null,"responsePayloadSizeInBytes":0,"httpNetworkResponseTimeout":"PT6S","exceptionMessage":"Request cancelled by client after reaching timeout specified in end-end timeout policy","exceptionResponseHeaders":"{x-ms-substatus=20008}","endpoint":"","e2ePolicyCfg":"{e2eto=PT30S, as=}"}],"samplingRateSnapshot":1.0,"bloomFilterInsertionCountSnapshot":0,"systemInformation":{"usedMemory":"50496 KB","availableMemory":"2046656 KB","availableProcessors":8},"clientCfgs":{"id":9,"machineId":"uuid:2560edc9-6d4d-4e52-87b3-979d8e765c3f","connectionMode":"GATEWAY","numberOfClients":1,"connCfg":{"rntbd":null,"gw":"(cps:1000, nrto:PT1M, icto:PT1M, cto:PT45S, gwV2Cto:PT5S, p:false, http2:(enabled:true, maxc:10, minc:1, maxs:30))","other":"(ed: true, cs: false, rv: true)"}}}Reading the diagnostic:
CONTAINER_LOOK_UP: 7512ms— metadata on port 443 took 7.5s (absorbed 7s SYN delay, succeeded)gatewayStatisticsList[0]:connectionAcquired: 5028ms,"connection timed out after 5000 ms: ...eastus2...:10250"— 5s timeout firedgatewayStatisticsList[1]:connectionAcquired: 5004ms, sameconnection timed out after 5000 ms— second attempt, same 5sgatewayStatisticsList[2]:connectionAcquired: 5005ms, same — third attempt on eastus2gatewayStatisticsList[3]:connectionAcquired: 5228ms,"...centralus...:10250"— failover to Central US, still 5s timeoutgatewayStatisticsList[4]:408/20008— e2e timeout (30s budget) cancelled the fifth attemptconnCfg.gw:cto:PT45S, gwV2Cto:PT5S— both timeouts visibleThe bifurcation proof:
cto:PT45SbuildAsyncClient()succeededgwV2Cto:PT5Sconnection timed out after 5000 msSame network condition. Same delay. Different timeouts. Different outcomes.
CosmosDiagnostics — what changes for customers
The
clientCfgs.connCfg.gwdiagnostic string gains the newgwV2Ctofield:Before:
After:
When data-plane connect times out, each
gatewayStatisticsListentry shows:{ "statusCode": 503, "subStatusCode": 10001, "requestTimeline": [{"eventName": "connectionAcquired", "durationInMilliSecs": 5028.05566}], "exceptionMessage": "connection timed out after 5000 ms: thin-client-multi-writer-eastus2.documents.azure.com/40.84.77.67:10250", "exceptionResponseHeaders": "{x-ms-substatus=10001}", "endpoint": "https://thin-client-multi-writer-eastus2.documents.azure.com:10250/..." }Production code changes
COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS(env:COSMOS_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS), default 5s. New methodgetThinClientConnectionTimeoutInSeconds().HttpRequest.javaisThinClientRequestflag + fluentwithThinClientRequest(boolean)setter.ThinClientStoreModel.java.withThinClientRequest(true)on thin client path requests.ReactorNettyClient.javaresolveConnectTimeoutMs(HttpRequest)— applies per-request via.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, connectTimeoutMs).HttpClientConfig.javatoDiagnosticsString()emitsgwV2Ctoalongsidecto.Testing
connectTimeout_GwV2_DataPlane_1sFiresOnDroppedSyniptables -j DROPSYN on :10250connectTimeout_GwV1_Metadata_UnaffectedByGwV2Dropiptables -j DROPSYN on :10250 onlyconnectTimeout_GwV2_PreciseTimingiptables -j DROPSYN, 12s e2econnectTimeout_Bifurcation_DelayBased_...tc netemSYN-only 7s delay on both portsConfigsTestsTest infra: Added
manual-thinclient-network-delayto@BeforeSuite/@AfterSuitein TestSuiteBase.java.Configuration
COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDSCOSMOS_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS5All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines
closes #48092