-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix #4885: addressing a potential hang with jdk streams #4887
Conversation
- there is a possibility that the buffer gets modified after the work is queued - there is a race condition between the done handling and the async buffer handling. If the executor is shutdown before the task starts running the buffers will be lost.
I think I see how that is possible now - the initial speculative change was wrong. We need to handle the done in the serialexecutor as well to ensure that the buffers that have already been sent are processed first - otherwise we're risking dropping some of that. This seems to be more prevalent with jdk / jetty than okhttp likely due to the threading model of how consume is handled - it's an async task for okhttp. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thx!
SonarCloud Quality Gate failed. |
@manusa things look better with the latest e2e run. There is a familiar looking podit failure with vertx, but on 1.25 - so that shouldn't be using containerd correct? https://github.com/fabric8io/kubernetes-client/actions/runs/4200862573/jobs/7287316314#step:5:1022 I'll see if that can be reproduced. |
Nope, that one is using Minikube on baremetal with Docker. |
It did spawn #4891 |
Description
Fix #4885
Addresses PodIT hanging failure with jdk by checking for complete prior to waiting - the issue is that the request for more with jdk will be executed in the calling thread which already holds the lock and can modify the complete flag prior to the wait.
There are also some failures on the JobIT log - but I have not been able to reproduce that. There are a couple of changes here to help with that logic in general - one is better localize when more is requested. That should be done inline with the processing call, rather than in a completion handler because it looks like they can be executed in parallel. The other is that I don't see now anywhere in the jdk client that guarantees the supplied buffers won't change after be passed to consume. Since we already handled that case with jetty we should do the same here.
Type of change
test, version modification, documentation, etc.)
Checklist