Client fail-fast (hazelcast.client.max.concurrent.invocations) not working for async operations causing OOM #8568

Closed
ihsandemir opened this issue Jul 21, 2016 · 1 comment

Comments

Contributor

@ihsandemir ihsandemir commented Jul 21, 2016

The client has a configurable property, hazelcast.client.max.concurrent.invocations, which limits the number of outstanding client requests. We have observed that in some cases this limit is not enforced for async calls, so the number of outstanding requests keeps growing and can eventually cause an OOM.

When examining the issue, we see a lot of client messages in the heap, both request and response messages. This is most probably caused by slow callback execution. Here is a test case that reproduces the issue:
https://github.com/ihsandemir/hazelcast/blob/maxInvocationFix/hazelcast-client/src/test/java/com/hazelcast/client/ClientMaxAllowedInvocationTest.java#L93

@ihsandemir ihsandemir added this to the 3.7 milestone Jul 21, 2016
@ihsandemir ihsandemir self-assigned this Jul 21, 2016
Contributor Author

@ihsandemir ihsandemir commented Jul 21, 2016

Observation: The client tracks the number of outstanding invocations using the correlation id. The count is incremented when a request is registered to be sent and decremented when the response for that request is received from the TCP channel. However, the decrement happens before the future is notified (at ResponseThread.handleClientMessage). Hence the limit does not cover the invocation notify step or anything that runs after it. Because the counter has already been decremented, the client keeps sending new requests, and far more responses than the configured overload limit can pile up at the invocation.notify stage. This is especially true for async calls that have andThen logic.
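
The effect described above can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not Hazelcast code: the class `EarlyReleaseSketch` and the method `peakBacklog` are hypothetical names, responses are assumed to arrive immediately, and the callback executor is assumed to be slow (one callback per round).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the counter behaviour described in this issue.
final class EarlyReleaseSketch {

    /**
     * Returns the peak number of responses that were waiting for their
     * callback to run at any point during the simulation.
     */
    static int peakBacklog(int maxConcurrent, int totalRequests, boolean releaseBeforeCallback) {
        Deque<Integer> pendingCallbacks = new ArrayDeque<>();
        int inFlight = 0; // what the limit actually checks
        int sent = 0;
        int peak = 0;

        while (sent < totalRequests || !pendingCallbacks.isEmpty()) {
            // The caller sends as many requests as the limit allows.
            while (sent < totalRequests && inFlight < maxConcurrent) {
                inFlight++;
                int callId = sent++;
                // Assumption: the response arrives immediately.
                if (releaseBeforeCallback) {
                    inFlight--; // counter decremented before the callback runs
                }
                pendingCallbacks.addLast(callId);
                peak = Math.max(peak, pendingCallbacks.size());
            }
            // Assumption: the slow executor runs one callback per round.
            if (!pendingCallbacks.isEmpty()) {
                pendingCallbacks.removeFirst();
                if (!releaseBeforeCallback) {
                    inFlight--; // counter decremented after the callback runs
                }
            }
        }
        return peak;
    }
}
```

With a limit of 2 and 10 requests, decrementing before the callback lets all 10 responses queue up, while decrementing after the callback keeps the backlog at the limit.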

@enesakar enesakar modified the milestones: 3.8, 3.7 Jul 25, 2016
@asimarslan asimarslan modified the milestones: 3.8.1, 3.8 Jan 27, 2017
sancar added a commit to sancar/hazelcast that referenced this issue Mar 13, 2017
CallIdSequence will be completed to accept next invocation only when
the callbacks running on internal executors are completed.

A second change is made to achieve back pressure safely: if the response
is already available when andThen is called with an internal callback,
the internal callback runs on the calling thread instead of the executor.
Since blocking calls are already not permitted in internal threads,
this achieves natural backpressure.

fixes hazelcast#9665
fixes hazelcast#8568
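
The two changes in the commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual Hazelcast change: `CallbackAwareLimiterSketch`, `invoke`, and `availableSlots` are hypothetical names, a `Semaphore` stands in for CallIdSequence, and `CompletableFuture` stands in for the client invocation future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Semaphore;
import java.util.function.Consumer;

// Hypothetical limiter: a slot is held until the callback has finished.
final class CallbackAwareLimiterSketch {
    private final Semaphore slots;
    private final Executor callbackExecutor;

    CallbackAwareLimiterSketch(int maxConcurrent, Executor callbackExecutor) {
        this.slots = new Semaphore(maxConcurrent);
        this.callbackExecutor = callbackExecutor;
    }

    /** Fail-fast: returns false when the limit is already reached. */
    <T> boolean invoke(CompletableFuture<T> response, Consumer<T> callback) {
        if (!slots.tryAcquire()) {
            return false;
        }
        if (response.isDone()) {
            // Response already available: the callback runs on the calling
            // thread, which slows the caller down (natural back pressure).
            response.whenComplete((value, error) -> finish(value, error, callback));
        } else {
            response.whenCompleteAsync((value, error) -> finish(value, error, callback),
                    callbackExecutor);
        }
        return true;
    }

    private <T> void finish(T value, Throwable error, Consumer<T> callback) {
        try {
            if (error == null) {
                callback.accept(value);
            }
        } finally {
            slots.release(); // released only after the callback has run
        }
    }

    int availableSlots() {
        return slots.availablePermits();
    }
}
```

In this sketch a response that has arrived but whose callback is still queued continues to count against the limit, so new invocations are rejected until the callback completes.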