-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fixStackOverflowErrorWhenSerializingCosmosDiagnostics #40272
fixStackOverflowErrorWhenSerializingCosmosDiagnostics #40272
Conversation
API change check API changes are not detected in this pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @xinlian12
@@ -90,4 +99,52 @@ public void cancelRecord() throws URISyntaxException, InterruptedException, Json | |||
assertThat(errorStatus).isNotNull(); | |||
assertThat(errorStatus.toString()).isEqualTo(statusString); | |||
} | |||
|
|||
@Test(groups = { "fast" }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mixing Unit tests and fast tests is not a great idea. Now for the unit tests as well, you will initialize TestSuiteBase class for this particular class unnecessary.
Can we keep them separate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is the problem with initializing TestSuiteBase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the tests into cosmosDiagnosticsTests file as part of the test is testing the serialization of the cosmos diagnostics
...t/java/com/azure/cosmos/implementation/directconnectivity/rntbd/RntbdRequestRecordTests.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - thanks!
Co-authored-by: Kushagra Thapar <kushuthapar@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - Thanks!
/azp run java - cosmos - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
Tested the following tests locally: |
/check-enforcer override |
the failed tests does not related to the change in this PR |
Issue:
Within java SDK, when SDK sending the request to server, it will also spin up a separate task which will cancel the request if SDK did not get the response within the expected time. And when SDK get the response back, this task will get cancelled accordingly.
However for some rare cases, when the following combination condition happens, and overflow error can happen:
Because Step2 is not being cancelled properly, so the expire() flow still get triggered, and due to the request failed with exception, as part of the serializing requestRecord flow, it also tried to serialize the exception.
CosmosDiagnostics
is one of the property inCosmosException
, since there is no customized serializer on the CosmosDiagnostics, so it uses default serializer(which decides the property to be serialized based on the property and getter methods), and that is when it can hit theStackOverflowError
.com.fasterxml.jackson.databind.JsonMappingException: Infinite recursion (StackOverflowError) (through reference chain: com.azure.cosmos.CosmosDiagnosticsContext["diagnostics"]->java.util.concurrent.ConcurrentLinkedDeque[0]->com.azure.cosmos.CosmosDiagnostics["diagnosticsContext"]->com.azure.cosmos.CosmosDiagnosticsContext["diagnostics"]->java.util.concurrent.ConcurrentLinkedDeque[0]->com.azure.cosmos.CosmosDiagnostics["diagnosticsContext"]->com.azure.cosmos.CosmosDiagnosticsContext["diagnostics"]->java.util.concurrent.ConcurrentLinkedDeque[0]->com.azure.cosmos.CosmosDiagnostics["diagnosticsContext"]->com.azure.cosmos.CosmosDiagnosticsContext["diagnostics"]->java.util.concurrent.ConcurrentLinkedDeque[0]->com.azure.cosmos.CosmosDiagnostics["diagnosticsContext"]->com.azure.cosmos.CosmosDiagnosticsContext["diagnostics"]->java.util.concurrent.ConcurrentLinkedDeque[0]->com.azure.cosmos.CosmosDiagnostics["diagnosticsContext"]->com.azure.cosmos.CosmosDiagnosticsContext["di
Fixes:
JsonIgnore
ongetDiagnosticsContext
withinCosmosDiagnostics
RntbdRequestRecord
, changed to use error.toString when serializing the error