Skip to content

[GATEWAY V2]: Bifurcate connect / connection-acquire timeout between Gateway V1 and Gateway V2 endpoints.#48174

Open
jeet1995 wants to merge 116 commits intoAzure:mainfrom
jeet1995:AzCosmos_H2ConnectAcquireTimeout
Open

[GATEWAY V2]: Bifurcate connect / connection-acquire timeout between Gateway V1 and Gateway V2 endpoints.#48174
jeet1995 wants to merge 116 commits intoAzure:mainfrom
jeet1995:AzCosmos_H2ConnectAcquireTimeout

Conversation

@jeet1995
Copy link
Member

@jeet1995 jeet1995 commented Feb 28, 2026

Problem

When thin client (Gateway V2) is enabled, both metadata requests (port 443, GW V1) and data-plane requests (port 10250, GW V2 HTTP/2) share the same CONNECT_TIMEOUT_MILLIS of 45s. If the thin client proxy on port 10250 is unreachable, the SDK waits 45s per connect attempt before failing — far too long for a data-plane path that should fail fast and trigger regional failover.

Solution

Bifurcate CONNECT_TIMEOUT_MILLIS at the Reactor Netty level based on request type:

Path Port CONNECT_TIMEOUT_MILLIS Rationale
Metadata (GW V1) 443 45s (unchanged) Account init, address resolution — long timeout is expected
Data plane (GW V2) 10250 5s (new default) Thin client proxy — fail fast, let SDK retry/failover

The timeout is applied per-request via Reactor Netty's immutable HttpClient.option(), which returns a new config snapshot without mutating the shared client.


Diagnostic proof — connectTimeout_Bifurcation_DelayBased

Both ports receive the same 7s SYN-only delay via Linux tc netem + iptables mangle. The only difference is CONNECT_TIMEOUT_MILLIS.

Full CosmosDiagnostics — data plane failure:

{"userAgent":"azsdk-java-cosmos/4.79.0-beta.1 Linux/6.6.87.2-microsoft-standard-WSL2 JRE/21.0.10|F14","activityId":"4b4dbffd-e1ad-43bd-b4c7-9a8f41de6ada","requestLatencyInMs":29998,"requestStartTimeUTC":"2026-03-03T23:24:30.353063100Z","requestEndTimeUTC":"2026-03-03T23:25:00.352036291Z","responseStatisticsList":[],"supplementalResponseStatisticsList":[],"addressResolutionStatistics":{},"regionsContacted":["central us","east us 2"],"retryContext":{"statusAndSubStatusCodes":[[503,10001]],"retryCount":1,"retryLatency":7320},"metadataDiagnosticsContext":{"metadataDiagnosticList":[{"metaDataName":"CONTAINER_LOOK_UP","startTimeUTC":"2026-03-03T23:24:30.353779798Z","endTimeUTC":"2026-03-03T23:24:37.866011985Z","durationinMS":7512,"activityId":"6faa8b6d-653d-45ee-aae8-28f67b9e01f7","collectionRid":"UYRLAJa2jxQ="},{"metaDataName":"PARTITION_KEY_RANGE_LOOK_UP","startTimeUTC":"2026-03-03T23:24:37.866118644Z","endTimeUTC":"2026-03-03T23:24:37.992319706Z","durationinMS":126}],"empty":false},"serializationDiagnosticsContext":{"serializationDiagnosticsList":null},"gatewayStatisticsList":[{"sessionToken":null,"operationType":"Read","resourceType":"Document","statusCode":503,"subStatusCode":10001,"requestCharge":0.0,"requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2026-03-03T23:24:37.991135806Z","durationInMilliSecs":5028.05566}],"partitionKeyRangeId":null,"responsePayloadSizeInBytes":0,"httpNetworkResponseTimeout":"PT6S","exceptionMessage":"connection timed out after 5000 ms: thin-client-multi-writer-eastus2.documents.azure.com/40.84.77.67:10250","exceptionResponseHeaders":"{x-ms-substatus=10001}","endpoint":"https://x-eastus2.documents.azure.com:10250/...","e2ePolicyCfg":"{e2eto=PT30S, as=}"},{"sessionToken":null,"operationType":"Read","resourceType":"Document","statusCode":503,"subStatusCode":10001,"requestCharge":0.0,"requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2026-03-03T23:24:43.019820747Z","durationInMilliSecs":5004.424708}],"partitionKeyRangeId":null,"responsePayloadSizeInBytes":0,"httpNetworkResponseTimeout":"PT6S","exceptionMessage":"connection timed out after 5000 ms: thin-client-multi-writer-eastus2.documents.azure.com/40.84.77.67:10250","exceptionResponseHeaders":"{x-ms-substatus=10001}","endpoint":"https://x-eastus2.documents.azure.com:10250/...","e2ePolicyCfg":"{e2eto=PT30S, as=}"},{"sessionToken":null,"operationType":"Read","resourceType":"Document","statusCode":503,"subStatusCode":10001,"requestCharge":0.0,"requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2026-03-03T23:24:48.024689171Z","durationInMilliSecs":5005.594693}],"partitionKeyRangeId":null,"responsePayloadSizeInBytes":0,"httpNetworkResponseTimeout":"PT10S","exceptionMessage":"connection timed out after 5000 ms: thin-client-multi-writer-eastus2.documents.azure.com/40.84.77.67:10250","exceptionResponseHeaders":"{x-ms-substatus=10001}","endpoint":"https://x-eastus2.documents.azure.com:10250/...","e2ePolicyCfg":"{e2eto=PT30S, as=}"},{"sessionToken":null,"operationType":"Read","resourceType":"Document","statusCode":503,"subStatusCode":10001,"requestCharge":0.0,"requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2026-03-03T23:24:54.043417607Z","durationInMilliSecs":5228.884334}],"partitionKeyRangeId":null,"responsePayloadSizeInBytes":0,"httpNetworkResponseTimeout":"PT6S","exceptionMessage":"connection timed out after 5000 ms: thin-client-multi-writer-centralus.documents.azure.com/20.15.133.49:10250","exceptionResponseHeaders":"{x-ms-substatus=10001}","endpoint":"https://x-centralus.documents.azure.com:10250/...","e2ePolicyCfg":"{e2eto=PT30S, as=}"},{"sessionToken":null,"operationType":"Read","resourceType":"Document","statusCode":408,"subStatusCode":20008,"requestCharge":0.0,"requestTimeline":[{"eventName":"connectionAcquired","startTimeUTC":"2026-03-03T23:24:59.273240394Z","durationInMilliSecs":1078.506504}],"partitionKeyRangeId":null,"responsePayloadSizeInBytes":0,"httpNetworkResponseTimeout":"PT6S","exceptionMessage":"Request cancelled by client after reaching timeout specified in end-end timeout policy","exceptionResponseHeaders":"{x-ms-substatus=20008}","endpoint":"","e2ePolicyCfg":"{e2eto=PT30S, as=}"}],"samplingRateSnapshot":1.0,"bloomFilterInsertionCountSnapshot":0,"systemInformation":{"usedMemory":"50496 KB","availableMemory":"2046656 KB","availableProcessors":8},"clientCfgs":{"id":9,"machineId":"uuid:2560edc9-6d4d-4e52-87b3-979d8e765c3f","connectionMode":"GATEWAY","numberOfClients":1,"connCfg":{"rntbd":null,"gw":"(cps:1000, nrto:PT1M, icto:PT1M, cto:PT45S, gwV2Cto:PT5S, p:false, http2:(enabled:true, maxc:10, minc:1, maxs:30))","other":"(ed: true, cs: false, rv: true)"}}}

Reading the diagnostic:

  • CONTAINER_LOOK_UP: 7512ms — metadata on port 443 took 7.5s (absorbed 7s SYN delay, succeeded)
  • gatewayStatisticsList[0]: connectionAcquired: 5028ms, "connection timed out after 5000 ms: ...eastus2...:10250"5s timeout fired
  • gatewayStatisticsList[1]: connectionAcquired: 5004ms, same connection timed out after 5000 ms — second attempt, same 5s
  • gatewayStatisticsList[2]: connectionAcquired: 5005ms, same — third attempt on eastus2
  • gatewayStatisticsList[3]: connectionAcquired: 5228ms, "...centralus...:10250" — failover to Central US, still 5s timeout
  • gatewayStatisticsList[4]: 408/20008 — e2e timeout (30s budget) cancelled the fifth attempt
  • connCfg.gw: cto:PT45S, gwV2Cto:PT5S — both timeouts visible

The bifurcation proof:

connCfg.gw: (cps:1000, nrto:PT1M, icto:PT1M, cto:PT45S, gwV2Cto:PT5S, p:false, http2:(enabled:true, maxc:10, minc:1, maxs:30))
                                                 ^^^^^^^^  ^^^^^^^^^^^
                                                 metadata   data plane
                                                 (GW V1)    (GW V2)
Port SYN delay Timeout config Outcome Evidence
443 7s cto:PT45S Connected in 7430ms buildAsyncClient() succeeded
10250 7s gwV2Cto:PT5S Timed out at ~5005ms per attempt connection timed out after 5000 ms

Same network condition. Same delay. Different timeouts. Different outcomes.


CosmosDiagnostics — what changes for customers

The clientCfgs.connCfg.gw diagnostic string gains the new gwV2Cto field:

Before:

"gw":"(cps:1000, nrto:PT1M, icto:PT1M, cto:PT45S, p:false, http2:(enabled:true, maxc:10, minc:1, maxs:30))"

After:

"gw":"(cps:1000, nrto:PT1M, icto:PT1M, cto:PT45S, gwV2Cto:PT5S, p:false, http2:(enabled:true, maxc:10, minc:1, maxs:30))"

When data-plane connect times out, each gatewayStatisticsList entry shows:

{
  "statusCode": 503,
  "subStatusCode": 10001,
  "requestTimeline": [{"eventName": "connectionAcquired", "durationInMilliSecs": 5028.05566}],
  "exceptionMessage": "connection timed out after 5000 ms: thin-client-multi-writer-eastus2.documents.azure.com/40.84.77.67:10250",
  "exceptionResponseHeaders": "{x-ms-substatus=10001}",
  "endpoint": "https://thin-client-multi-writer-eastus2.documents.azure.com:10250/..."
}

Production code changes

File Change
Configs.java New system property COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS (env: COSMOS_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS), default 5s. New method getThinClientConnectionTimeoutInSeconds().
HttpRequest.java New isThinClientRequest flag + fluent withThinClientRequest(boolean) setter.
ThinClientStoreModel.java Calls .withThinClientRequest(true) on thin client path requests.
ReactorNettyClient.java New resolveConnectTimeoutMs(HttpRequest) — applies per-request via .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, connectTimeoutMs).
HttpClientConfig.java toDiagnosticsString() emits gwV2Cto alongside cto.

Testing

Test Technique Result
connectTimeout_GwV2_DataPlane_1sFiresOnDroppedSyn iptables -j DROP SYN on :10250 PASSED
connectTimeout_GwV1_Metadata_UnaffectedByGwV2Drop iptables -j DROP SYN on :10250 only PASSED
connectTimeout_GwV2_PreciseTiming iptables -j DROP SYN, 12s e2e PASSED (3 entries in 12s — proves ~5s per attempt)
connectTimeout_Bifurcation_DelayBased_... tc netem SYN-only 7s delay on both ports PASSED (see diagnostic proof above)
ConfigsTests Unit test for config parsing PASSED

Test infra: Added manual-thinclient-network-delay to @BeforeSuite/@AfterSuite in TestSuiteBase.java.

Configuration

Property Env Variable Default
COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS COSMOS_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS 5

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

closes #48092

jeet1995 and others added 30 commits February 2, 2026 17:17
* fix few tests part 2

---------

Co-authored-by: Annie Liang <anniemac@Annies-MacBook-Pro.local>
…ning effort configuration (Azure#47772)

Co-authored-by: Xiting Zhang <xitzhang@microsoft.com>
* [VoiceLive]Release 1.0.0-beta.4

Updated release date for version 1.0.0-beta.4 and added feature details.

* Revise CHANGELOG for clarity and bug fixes

Updated changelog to remove breaking changes section and added details about bug fixes.
…Java-5433741 (Azure#46952)

* Configurations:  'specification/nginx/Nginx.Management/tspconfig.yaml', API Version: 2025-03-01-preview, SDK Release Type: beta, and CommitSHA: 'aae85aa3e7e4fda95ea2d3abac0ba1d8159db214' in SpecRepo: 'https://github.com/Azure/azure-rest-api-specs' Pipeline run: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=5433741 Refer to https://eng.ms/docs/products/azure-developer-experience/develop/sdk-release/sdk-release-prerequisites to prepare for SDK release.

* Configurations:  'specification/nginx/Nginx.Management/tspconfig.yaml', API Version: 2025-03-01-preview, SDK Release Type: beta, and CommitSHA: 'de8103ff8e94ea51c56bb22094ded5d2dfc45a6a' in SpecRepo: 'https://github.com/Azure/azure-rest-api-specs' Pipeline run: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=5857234 Refer to https://eng.ms/docs/products/azure-developer-experience/develop/sdk-release/sdk-release-prerequisites to prepare for SDK release.

---------

Co-authored-by: Weidong Xu <weidxu@microsoft.com>
false can't be assigned to int in java. Updating type to boolean
* Deprecating azure-resourcemanager-mixedreality

* Typos

* use 1.0.1 as version

* Update CHANGELOG.md

---------

Co-authored-by: Michael Zappe <michaelzappe@microsoft.com>
Co-authored-by: Weidong Xu <weidxu@microsoft.com>
* fix few tests part 3


---------

Co-authored-by: Annie Liang <anniemac@Annies-MacBook-Pro.local>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Initial regeneration using TypeSpec

* Working on migrating tests, adding back convenience APIs that are being kept

* Complete most of the migration

* Additional work

* Stable point before tests

* Newer TypeSpec SHA

* Add back SearchAudience support

* Last changes before testing

* Rerecord tests and misc fixes along the way

* Fix a few recordings and stress tests

* Fix a few recordings and linting

* Few more fixes

* Another round of recording

* Rerun TypeSpec codegen

* Remove errant import

* Cleanup APIs

* Regeneration

* Clean up linting
* escape non-ascii character for pkValue

---------

Co-authored-by: Annie Liang <anniemac@Annies-MacBook-Pro.local>
Co-authored-by: Fabian Meiswinkel <fabianm@microsoft.com>
…k connector 4.43.0 (Azure#47968)

* Release azure-cosmos 4.78.0, azure-cosmos-encryption 2.27.0, and Spark connector 4.43.0

---------

Co-authored-by: Annie Liang <anniemac@Annies-MacBook-Pro.local>
Co-authored-by: Fabian Meiswinkel <fabianm@microsoft.com>
@jeet1995 jeet1995 force-pushed the AzCosmos_H2ConnectAcquireTimeout branch from 3b37f42 to e58f218 Compare February 28, 2026 17:40
@jeet1995 jeet1995 changed the title Az cosmos h2 connect acquire timeout Bifurcate connect / connection-acquire timeout between Gateway V1 and Gateway V2 endpoints. Mar 4, 2026
@jeet1995 jeet1995 marked this pull request as ready for review March 4, 2026 00:08
Copilot AI review requested due to automatic review settings March 4, 2026 00:08
@jeet1995 jeet1995 requested review from a team as code owners March 4, 2026 00:08
@jeet1995 jeet1995 changed the title Bifurcate connect / connection-acquire timeout between Gateway V1 and Gateway V2 endpoints. [GATEWAY V2]: Bifurcate connect / connection-acquire timeout between Gateway V1 and Gateway V2 endpoints. Mar 4, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves Azure Cosmos DB Gateway mode behavior when “thin client” (Gateway V2, port 10250) is enabled by applying a shorter per-request TCP connect timeout for data-plane requests, while keeping the existing longer timeout for Gateway V1 metadata requests (port 443). It also expands diagnostics to surface HTTP/2 channel identity and request timeout details, and adds targeted tests (including manual network-manipulation suites) to validate the behavior.

Changes:

  • Apply a per-request CONNECT_TIMEOUT_MILLIS in ReactorNettyClient based on whether the request targets the thin client proxy.
  • Introduce thin-client-specific timeout policy wiring and new config plumbing (COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS) plus diagnostics output updates.
  • Add/extend unit and fault-injection/manual tests and accompanying docs to validate connect-timeout bifurcation and HTTP/2 connection lifecycle behavior.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/ResponseTimeoutAndDelays.java Adds Duration-based delay representation alongside seconds.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/ReactorNettyRequestRecord.java Captures HTTP/2/channel identifiers for richer diagnostics.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/ReactorNettyClient.java Implements per-request connect timeout selection; captures channel IDs via connection observer.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/HttpTimeoutPolicyForGatewayV2.java Adds Gateway V2 timeout policy class for thin client document operations.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/HttpTimeoutPolicy.java Routes eligible thin-client document operations to Gateway V2 timeout policies.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/HttpRequest.java Adds isThinClientRequest flag + fluent setter for transport customization.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/http/HttpClientConfig.java Emits gwV2Cto in diagnostics when thin client is enabled.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/WebExceptionRetryPolicy.java Switches retry backoff handling to use Duration from timeout policy.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ThinClientStoreModel.java Marks thin-client requests and hardens ByteBuf handling for released buffers.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxGatewayStoreModel.java Ensures request record is available on success/error; sets request URI on cancellation for diagnostics.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxDocumentServiceRequest.java Adds useThinClientMode flag and ensures it is preserved during cloning.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/RxDocumentClientImpl.java Sets useThinClientMode when routing to thin client store model.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/DocumentServiceRequestContext.java Stores per-attempt ReactorNettyRequestRecord for diagnostics enrichment.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/Configs.java Introduces thin client connect-timeout config (sysprop/env) with default value.
sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ClientSideRequestStatistics.java Adds HTTP response timeout, channel IDs, HTTP/2 flag, and e2e policy config to gateway stats serialization.
sdk/cosmos/azure-cosmos/CHANGELOG.md Documents the new Gateway V2 connect-timeout behavior and timeout policies.
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/rx/TestSuiteBase.java Minor import/format cleanup.
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/WebExceptionRetryPolicyTest.java Extends test coverage for thin-client timeout policies and write behavior.
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/implementation/ConfigsTests.java Adds unit tests for thin client timeout config parsing and request flag defaulting.
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/faultinjection/Http2ConnectionLifecycleTests.java Adds manual tc netem tests to validate HTTP/2 parent connection survival across real delays/timeouts.
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/faultinjection/Http2ConnectTimeoutBifurcationTests.java Adds manual iptables/tc tests to validate connect-timeout bifurcation by port/path.
sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/faultinjection/FaultInjectionServerErrorRuleOnGatewayV2Tests.java Updates thin-client FI tests to account for new Gateway V2 timeout behavior.
sdk/cosmos/azure-cosmos-tests/NETWORK_DELAY_TESTING_README.md Documents how to run the new manual network-delay lifecycle tests.
sdk/cosmos/azure-cosmos-tests/CONNECT_TIMEOUT_TESTING_README.md Documents how to run the new manual connect-timeout bifurcation tests.
Comments suppressed due to low confidence (1)

sdk/cosmos/azure-cosmos-tests/src/test/java/com/azure/cosmos/faultinjection/Http2ConnectTimeoutBifurcationTests.java:300

  • Method name says “1sFiresOnDroppedSyn”, but the test description and assertions are written for a 5s default connect timeout. Rename the test to match the actual expected behavior (or adjust the timeout setup) so the name stays accurate over time.
    @Test(groups = {TEST_GROUP}, timeOut = TEST_TIMEOUT)
    public void connectTimeout_GwV2_DataPlane_1sFiresOnDroppedSyn() throws Exception {
        // Close and recreate client to ensure no pooled connections exist —
        // we need to force a NEW TCP connection which will hit the iptables DROP.


// Per-request CONNECT_TIMEOUT_MILLIS via reactor-netty's immutable HttpClient.
// .option() returns a new config snapshot — does NOT mutate the shared httpClient.
// Thin client requests (isThinClientRequest=true): 1s connect timeout to fail fast.
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments mention a 1s thin-client connect timeout, but the implemented default is 5s (Configs.DEFAULT_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS) and is configurable via Configs.getThinClientConnectionTimeoutInSeconds(). Please update the comments to avoid hard-coding the wrong default.

Suggested change
// Thin client requests (isThinClientRequest=true): 1s connect timeout to fail fast.
// Thin client requests (isThinClientRequest=true): connect timeout is configured via
// Configs.getThinClientConnectionTimeoutInSeconds() (default 5s) to fail fast.

Copilot uses AI. Check for mistakes.
Comment on lines +43 to +46
* The bifurcation under test:
* - Data plane requests → GW V2 endpoint (port 10250) → CONNECT_TIMEOUT_MILLIS = 1s
* - Metadata requests → GW V1 endpoint (port 443) → CONNECT_TIMEOUT_MILLIS = 45s (unchanged)
*
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Javadoc here states GW V2 CONNECT_TIMEOUT_MILLIS is 1s, but elsewhere in this class you document/tests assume a 5s default (and production default is 5s). Please reconcile the documented timeout value with the actual default/configuration.

Copilot uses AI. Check for mistakes.
Comment on lines +5 to +10
`Http2ConnectTimeoutBifurcationTests` validates that the TCP connect timeout (`CONNECT_TIMEOUT_MILLIS`) is
correctly bifurcated between Gateway V1 metadata (45s) and Gateway V2 thin client data plane (1s).
Uses Linux `iptables` to DROP SYN packets and `tc netem` with `iptables mangle` for per-port delay.

**Key invariant proven:** Thin client data plane requests fail fast (1s connect timeout) while
metadata requests on port 443 remain unaffected (45s timeout).
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This README describes the thin client data plane connect timeout as 1s (including the suggested system property value), but the implemented default is 5s. Please update this section to reflect the 5s default and optionally mention overriding it for local testing.

Copilot uses AI. Check for mistakes.
Comment on lines +12 to +39
* This policy has separate configurations for point read operations vs query/change feed operations.
*/
public class HttpTimeoutPolicyForGatewayV2 extends HttpTimeoutPolicy {

public static final HttpTimeoutPolicy INSTANCE_FOR_POINT_READ = new HttpTimeoutPolicyForGatewayV2(true);
public static final HttpTimeoutPolicy INSTANCE_FOR_QUERY_AND_CHANGE_FEED = new HttpTimeoutPolicyForGatewayV2(false);

private final boolean isPointRead;

private HttpTimeoutPolicyForGatewayV2(boolean isPointRead) {
this.isPointRead = isPointRead;
this.timeoutAndDelaysList = getTimeoutList();
}

public List<ResponseTimeoutAndDelays> getTimeoutList() {
if (this.isPointRead) {
return Collections.unmodifiableList(
Arrays.asList(
new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO),
new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO),
new ResponseTimeoutAndDelays(Duration.ofSeconds(10), Duration.ZERO)));
} else {
return Collections.unmodifiableList(
Arrays.asList(
new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO),
new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO),
new ResponseTimeoutAndDelays(Duration.ofSeconds(10), Duration.ZERO)));
}
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class-level comment says point-read vs query/change-feed have separate timeout configurations, but both branches currently return the same list. Either remove the split to reduce duplication or implement distinct policies so the code matches the documentation.

Suggested change
* This policy has separate configurations for point read operations vs query/change feed operations.
*/
public class HttpTimeoutPolicyForGatewayV2 extends HttpTimeoutPolicy {
public static final HttpTimeoutPolicy INSTANCE_FOR_POINT_READ = new HttpTimeoutPolicyForGatewayV2(true);
public static final HttpTimeoutPolicy INSTANCE_FOR_QUERY_AND_CHANGE_FEED = new HttpTimeoutPolicyForGatewayV2(false);
private final boolean isPointRead;
private HttpTimeoutPolicyForGatewayV2(boolean isPointRead) {
this.isPointRead = isPointRead;
this.timeoutAndDelaysList = getTimeoutList();
}
public List<ResponseTimeoutAndDelays> getTimeoutList() {
if (this.isPointRead) {
return Collections.unmodifiableList(
Arrays.asList(
new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO),
new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO),
new ResponseTimeoutAndDelays(Duration.ofSeconds(10), Duration.ZERO)));
} else {
return Collections.unmodifiableList(
Arrays.asList(
new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO),
new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO),
new ResponseTimeoutAndDelays(Duration.ofSeconds(10), Duration.ZERO)));
}
* Currently uses a single timeout configuration that is applied for both point read and query/change feed operations.
*/
public class HttpTimeoutPolicyForGatewayV2 extends HttpTimeoutPolicy {
public static final HttpTimeoutPolicy INSTANCE_FOR_POINT_READ = new HttpTimeoutPolicyForGatewayV2();
public static final HttpTimeoutPolicy INSTANCE_FOR_QUERY_AND_CHANGE_FEED = new HttpTimeoutPolicyForGatewayV2();
private HttpTimeoutPolicyForGatewayV2() {
this.timeoutAndDelaysList = getTimeoutList();
}
public List<ResponseTimeoutAndDelays> getTimeoutList() {
return Collections.unmodifiableList(
Arrays.asList(
new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO),
new ResponseTimeoutAndDelays(Duration.ofSeconds(6), Duration.ZERO),
new ResponseTimeoutAndDelays(Duration.ofSeconds(10), Duration.ZERO)));

Copilot uses AI. Check for mistakes.
Comment on lines +258 to +259
* Thin client requests (identified by {@link HttpRequest#isThinClientRequest()}) use a shorter
* connect timeout (default 1s) to fail fast when the thin client proxy is unreachable.
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Javadoc for resolveConnectTimeoutMs() says the thin client connect timeout default is 1s, but the code reads Configs.getThinClientConnectionTimeoutInSeconds() whose default is currently 5s. Please update this Javadoc to reflect the real default/configuration source (and keep it consistent with other docs/tests).

Suggested change
* Thin client requests (identified by {@link HttpRequest#isThinClientRequest()}) use a shorter
* connect timeout (default 1s) to fail fast when the thin client proxy is unreachable.
* Thin client requests (identified by {@link HttpRequest#isThinClientRequest()}) use the thin
* client connection timeout configured via {@link Configs#getThinClientConnectionTimeoutInSeconds()}
* (default 5s) to fail fast when the thin client proxy is unreachable.

Copilot uses AI. Check for mistakes.
*
* Configurable via system property COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS
* or environment variable COSMOS_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS.
* Default: 1 second.
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JavaDoc for getThinClientConnectionTimeoutInSeconds() says the default is 1 second, but DEFAULT_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS is 5. This inconsistency will confuse users and makes it unclear what the actual default is. Please align the JavaDoc (and any related comments) with the implemented default, or change the constant if 1s is intended.

Suggested change
* Default: 1 second.
* Default: 5 seconds.

Copilot uses AI. Check for mistakes.
Comment on lines +185 to +188
// Default thin client connection timeout should be 1 second
System.clearProperty("COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS");
try {
assertThat(Configs.getThinClientConnectionTimeoutInSeconds()).isEqualTo(1);
Copy link

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test asserts the thin client connection timeout default is 1 second, but production code sets DEFAULT_THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS to 5. Update the expected value (or the production default) so they match; otherwise this unit test will fail and/or document the wrong default.

Suggested change
// Default thin client connection timeout should be 1 second
System.clearProperty("COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS");
try {
assertThat(Configs.getThinClientConnectionTimeoutInSeconds()).isEqualTo(1);
// Default thin client connection timeout should be 5 seconds
System.clearProperty("COSMOS.THINCLIENT_CONNECTION_TIMEOUT_IN_SECONDS");
try {
assertThat(Configs.getThinClientConnectionTimeoutInSeconds()).isEqualTo(5);

Copilot uses AI. Check for mistakes.
Copy link
Member

@FabianMeiswinkel FabianMeiswinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - Thanks!

* - Assertions compare parentChannelId before/after to prove connection survival.
* - Tests run sequentially (thread-count=1) to avoid tc interference between tests.
*/
public class Http2ConnectionLifecycleTests extends FaultInjectionTestBase {
Copy link
Member

@xinlian12 xinlian12 Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

amazing test 👍

*/
private int resolveConnectTimeoutMs(HttpRequest request) {
if (request.isThinClientRequest()) {
return Configs.getThinClientConnectionTimeoutInSeconds() * 1000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this config can also be set in the httpClientConfig. Else it will be called for every request (which need to read from system env etc) which can potentially cause some perf issue -> possible some internal field in the http2Config for now

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and then maybe also add a validation for the min values to avoid <=0 situation

* Timeout policy for Gateway V2 (Thin Client) requests.
* This policy has separate configurations for point read operations vs query/change feed operations.
*/
public class HttpTimeoutPolicyForGatewayV2 extends HttpTimeoutPolicy {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since we have a separate gatewayV2 policy now, I am wondering whether we should also has other operation type here as well. So when request.isThinClientMode, then always get configs from here rather than silently fallback to default httpPolicy etc

// payload is a slice/derived view; super() owns payload, we still own the container
// this includes scenarios where payloadBuf == EMPTY_BUFFER
if (payloadBuf == Unpooled.EMPTY_BUFFER && content.refCnt() > 0) {
ReferenceCountUtil.safeRelease(content);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

potential a common method:

safeRelease(content.refCnt):
if content.refCnt > 0 -> ReferenceCountUtil.safeRelease(content);

Copy link
Member

@xinlian12 xinlian12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks~

…into AzCosmos_H2ConnectAcquireTimeout

# Conflicts:
#	sdk/cosmos/azure-cosmos/CHANGELOG.md
#	sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ThinClientStoreModel.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE REQ] [Thin Client][Gateway V2] Connect timeout bifurcation — 1s for thin client proxy (port 10250) vs 45s for gateway (port 443)