Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: fix flaky test by buffering possibly chunked content #17923

Merged
merged 1 commit into from
Apr 29, 2024

Conversation

npepinpe
Copy link
Member

@npepinpe npepinpe commented Apr 26, 2024

Description

This PR attempts to fix a flaky test which looks to be caused by the interaction of Jackson and the underlying body subscriber.

The error was previously: https://github.com/camunda/zeebe/actions/runs/8846555512/job/24292686857

java.io.IOException: chunked transfer encoding, state: READING_LENGTH
	at java.net.http/jdk.internal.net.http.HttpClientImpl.send(HttpClientImpl.java:964)
	at java.net.http/jdk.internal.net.http.HttpClientFacade.send(HttpClientFacade.java:133)
	at io.camunda.zeebe.it.management.ControlledActorClockEndpointIT.testPinningTime(ControlledActorClockEndpointIT.java:68)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:194)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
Caused by: java.io.IOException: chunked transfer encoding, state: READING_LENGTH
	at java.net.http/jdk.internal.net.http.common.Utils.wrapWithExtraDetail(Utils.java:391)
	at java.net.http/jdk.internal.net.http.Http1Response$BodyReader.onReadError(Http1Response.java:676)
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver.checkForErrors(Http1AsyncReceiver.java:302)
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver.flush(Http1AsyncReceiver.java:268)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$LockingRestartableTask.run(SequentialScheduler.java:182)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$CompleteRestartableTask.run(SequentialScheduler.java:149)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(SequentialScheduler.java:207)
	at java.net.http/jdk.internal.net.http.HttpClientImpl$DelegatingExecutor.execute(HttpClientImpl.java:177)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:282)
	at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(SequentialScheduler.java:251)
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver.onReadError(Http1AsyncReceiver.java:516)
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver.lambda$handlePendingDelegate$3(Http1AsyncReceiver.java:380)
	at java.net.http/jdk.internal.net.http.Http1AsyncReceiver$Http1AsyncDelegateSubscription.cancel(Http1AsyncReceiver.java:163)
	at java.net.http/jdk.internal.net.http.common.HttpBodySubscriberWrapper$SubscriptionWrapper.cancel(HttpBodySubscriberWrapper.java:92)
	at java.net.http/jdk.internal.net.http.ResponseSubscribers$HttpResponseInputStream.close(ResponseSubscribers.java:653)
	at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._closeInput(UTF8StreamJsonParser.java:251)
	at com.fasterxml.jackson.core.base.ParserBase.close(ParserBase.java:412)
	at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4913)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3885)
	at io.camunda.zeebe.it.management.ControlledActorClockEndpointIT$ResponseHandler.unsafeRead(ControlledActorClockEndpointIT.java:138)
	at java.base/java.util.concurrent.CompletableFuture.uniApplyNow(CompletableFuture.java:684)
	at java.base/java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:662)
	at java.base/java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:2200)
	at java.base/java.util.concurrent.CompletableFuture$MinimalStage.thenApply(CompletableFuture.java:2948)
	at java.net.http/jdk.internal.net.http.ResponseSubscribers$MappingSubscriber.getBody(ResponseSubscribers.java:843)
	at java.net.http/jdk.internal.net.http.common.HttpBodySubscriberWrapper.getBody(HttpBodySubscriberWrapper.java:360)
	at java.net.http/jdk.internal.net.http.ResponseSubscribers.lambda$getBodyAsync$3(ResponseSubscribers.java:1175)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.io.IOException: subscription cancelled

From it, we can see that the body subscriber (which is an InputStream) ends up getting closed by Jackson. On close, it will cancel the subscription. On cancelling, it will run its action, which calls onReadError with the IOException containing the subscription cancelled message.

The question for me here is, why is this not happening every time? Why is it flaky? 🤔

At any rate, I think we can just work around this by simply changing the underlying subscriber to ofByteArray, which will buffer the complete response first. This response tends to be small anyway, and this is only for a test.

@npepinpe npepinpe added backport stable/8.2 backport stable/8.3 Backport a pull request to 8.3.x backport stable/8.4 Backport a pull request to 8.4.x backport stable/8.5 Backport a pull request to stable/8.5 labels Apr 26, 2024
@github-actions github-actions bot added the component/zeebe Related to the Zeebe component/team label Apr 26, 2024
Copy link
Member

@lenaschoenburg lenaschoenburg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reason it's flaky is because it's a race condition between the HTTP handling completing and Jackson finishing the deserialization and closing the InputStream. From BodySubscribers.ofInputStream():

To ensure that all resources associated with the corresponding exchange are properly released the caller must ensure to either read all bytes until EOF is reached, or call InputStream. close if it is unable or unwilling to do so. Calling close before exhausting the stream may cause the underlying HTTP connection to be closed and prevent it from being reused for subsequent operations.

I don't think Jackson waits for EOF necessarily, see FasterXML/jackson-jaxrs-providers#108 (comment)

@npepinpe npepinpe added this pull request to the merge queue Apr 29, 2024
Merged via the queue into main with commit a4b9fd7 Apr 29, 2024
39 checks passed
@npepinpe npepinpe deleted the np-fix-flaky-clock-it branch April 29, 2024 16:35
@backport-action
Copy link
Collaborator

Successfully created backport PR for stable/8.2:

@backport-action
Copy link
Collaborator

Successfully created backport PR for stable/8.3:

@backport-action
Copy link
Collaborator

Successfully created backport PR for stable/8.4:

@backport-action
Copy link
Collaborator

Successfully created backport PR for stable/8.5:

github-merge-queue bot pushed a commit that referenced this pull request Apr 29, 2024
…ked content (#18066)

# Description
Backport of #17923 to `stable/8.4`.

relates to
original author: @npepinpe
github-merge-queue bot pushed a commit that referenced this pull request Apr 29, 2024
…ked content (#18064)

# Description
Backport of #17923 to `stable/8.2`.

relates to
original author: @npepinpe
github-merge-queue bot pushed a commit that referenced this pull request Apr 29, 2024
…ked content (#18065)

# Description
Backport of #17923 to `stable/8.3`.

relates to
original author: @npepinpe
github-merge-queue bot pushed a commit that referenced this pull request May 7, 2024
…ked content (#18067)

# Description
Backport of #17923 to `stable/8.5`.

relates to
original author: @npepinpe
@Zelldon Zelldon added version:8.2.27 Marks an issue as being completely or in parts released in 8.2.27 version:8.4.7 Marks an issue as being completely or in parts released in 8.4.7 version:8.5.1 Marks an issue as being completely or in parts released in 8.5.1 version:8.3.11 Marks an issue as being completely or in parts released in 8.3.11 labels May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport stable/8.3 Backport a pull request to 8.3.x backport stable/8.4 Backport a pull request to 8.4.x backport stable/8.5 Backport a pull request to stable/8.5 component/zeebe Related to the Zeebe component/team version:8.2.27 Marks an issue as being completely or in parts released in 8.2.27 version:8.3.11 Marks an issue as being completely or in parts released in 8.3.11 version:8.4.7 Marks an issue as being completely or in parts released in 8.4.7 version:8.5.1 Marks an issue as being completely or in parts released in 8.5.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants