[BUG] Cosmos hangs forever with CosmosEndToEndOperationLatencyPolicyConfig set #40786
Open
3 tasks done
Labels: Client, Cosmos, customer-reported, needs-team-attention, question, Service Attention
Describe the bug
Certain operations cause the Cosmos SDK to hang forever, and others do not respect the timeout set by CosmosEndToEndOperationLatencyPolicyConfig.
It seems the hangs occur for operations that span partitions.
To Reproduce
See this example repository and test: https://github.com/lnist/cosmos-sdk-hang/blob/main/src/test/java/cosmosTimeouts.java
In the test you need to fill in the connection string and master key for Cosmos.
The test uses WireMock to simulate a delay in accessing the Cosmos backend. A self-signed certificate is used for this, since the Cosmos SDK insists on using HTTPS.
If you execute the tests, they are all expected to fail with a timeout from the Cosmos SDK. That does not happen.
The `readAllContainers` and `properties` tests both return the desired data, but they take longer than the configured timeout of 1 second. They should fail instead.

The `readNonDefaultPartitionKey`, `count`, `readAll`, and `writeBulk` tests all respect the timeout of 1 second if the DELAY parameter is set to 2_000, but they hang forever (until the test timeout of 1 minute) if the DELAY parameter is set to 10_000.

Note: The code includes a couple of configurations that I think are redundant, but they were used during extensive testing, so I did not want to change them. A quick test without them seems to indicate the issues are present with default parameters (except, of course, for the CosmosEndToEndOperationLatencyPolicyConfig).
Expected behavior
The API uses the configured timeout.
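To make the expectation concrete, here is a minimal sketch that uses only the JDK and not the Cosmos SDK. It is an analogy, not the SDK's implementation: `delayedCall` is a hypothetical stand-in for a backend call slowed down by a proxy such as WireMock, and `timesOut` is a hypothetical helper that applies an end-to-end time budget with `CompletableFuture.orTimeout`. The point is the contract the tests assume: whenever the delay exceeds the budget, the caller gets a timeout error at roughly the budget, no matter how long the delay is.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class EndToEndTimeoutSketch {

    // Hypothetical stand-in for a backend call that answers after delayMillis.
    static CompletableFuture<String> delayedCall(long delayMillis) {
        return CompletableFuture.supplyAsync(
                () -> "response",
                CompletableFuture.delayedExecutor(delayMillis, TimeUnit.MILLISECONDS));
    }

    // Returns true if the call fails with a TimeoutException because the
    // end-to-end budget elapsed first, false if it completes within the budget.
    static boolean timesOut(long delayMillis, Duration timeout) {
        try {
            delayedCall(delayMillis)
                    .orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS)
                    .join();
            return false;
        } catch (CompletionException e) {
            // join() wraps the TimeoutException set by orTimeout.
            return e.getCause() instanceof TimeoutException;
        }
    }

    public static void main(String[] args) {
        Duration timeout = Duration.ofSeconds(1);
        // Mirrors the DELAY = 2_000 case from the report: budget is exceeded.
        System.out.println("delay 2000 ms times out: " + timesOut(2_000, timeout));
        // A call that is faster than the budget should succeed.
        System.out.println("delay 100 ms times out: " + timesOut(100, timeout));
    }
}
```

Under this contract a delay of 10_000 would behave the same as a delay of 2_000: the caller sees a timeout after about 1 second, rather than a hang.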