GEODE-7258: The function retry logic is modified to handle exception #4186
thrown, while trying to connect to a server that's shut down/closed.
Co-authored-by: Anil <agingade@pivotal.io>
Co-authored-by: Xiaojian Zhou <gzhou@pivotal.io>
The two tests I was running to reproduce this issue were failing without these changes and are now passing with them.
Test 1
- start 2 servers with a PARTITION_REDUNDANT region
- run a client that:
  - does a put into each bucket
  - executes onRegion function
  - sleeps
  - executes onRegion function
- kill a server while the client is sleeping
In this case, SingleHopClientExecutor.submitAllHA catches the following ServerConnectivityException and then retries once on a random server.
SingleHopClientExecutor.submitAllHA maxRetryAttempts=1; caught exception=java.util.concurrent.ExecutionException: org.apache.geode.cache.client.ServerConnectivityException: Pool unexpected closed socket on server connection=Pooled Connection to 192.168.1.4:58672: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
ExecuteRegionFunctionSingleHopOp.execute retryAttemptsFromsubmitAllHA=1
ExecuteRegionFunctionOp.execute maxRetryAttempts=-1
OpExecutorImpl.execute conn=Pooled Connection to 192.168.1.4:58675: Connection[192.168.1.4:58675]@83554804
ExecuteRegionFunctionOp.execute done
Executed TestFunction in 21 ms with result: [true]
Test 2
- start 2 servers
- start client executing functions repeatedly
- kill a server
In this case, SingleHopClientExecutor.submitAllHA catches one of the following ServerConnectivityExceptions and then retries once on a random server.
SingleHopClientExecutor.submitAllHA maxRetryAttempts=1; caught exception=java.util.concurrent.ExecutionException: org.apache.geode.cache.client.ServerConnectivityException: Pool unexpected closed socket on server connection=Pooled Connection to 192.168.1.4:58611: Connection[DESTROYED]). Server unreachable: could not connect after 1 attempts
ExecuteRegionFunctionSingleHopOp.execute retryAttemptsFromsubmitAllHA=1
OpExecutorImpl.execute conn=Pooled Connection to 192.168.1.4:58608: Connection[192.168.1.4:58608]@83554804
ExecuteRegionFunctionOp.execute done
Executed TestFunction in 57 ms with result: [true]
SingleHopClientExecutor.submitAllHA maxRetryAttempts=1; e=java.util.concurrent.ExecutionException: org.apache.geode.cache.client.ServerConnectivityException: Could not create a new connection to server 192.168.1.4:58611
ExecuteRegionFunctionSingleHopOp.execute retryAttemptsFromsubmitAllHA=1
OpExecutorImpl.execute conn=Pooled Connection to 192.168.1.4:58608: Connection[192.168.1.4:58608]@757298272
ExecuteRegionFunctionOp.execute done
Executed TestFunction in 4 ms with result: [true]
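To make the behavior in both tests concrete, here is a minimal plain-Java sketch of the "catch a connectivity failure, then retry once on another server" pattern. It is not the Geode implementation: RetryOnceSketch, ConnectivityException and executeWithRetry are hypothetical names, the two Callables stand in for a killed server and a surviving one, and the real code picks its retry server through the pool rather than taking a fixed fallback.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class RetryOnceSketch {

  /** Hypothetical stand-in for Geode's ServerConnectivityException. */
  static class ConnectivityException extends RuntimeException {
    ConnectivityException(String message) {
      super(message);
    }
  }

  /**
   * Submits the primary task; if it fails with a connectivity error, retries on the
   * fallback up to maxRetryAttempts times. Any other failure is not retried.
   */
  static <T> T executeWithRetry(ExecutorService pool, Callable<T> primary,
      Callable<T> fallback, int maxRetryAttempts) {
    if (maxRetryAttempts < 0) {
      throw new IllegalArgumentException("maxRetryAttempts must be >= 0 in this sketch");
    }
    Callable<T> target = primary;
    ConnectivityException last = null;
    for (int attempt = 0; attempt <= maxRetryAttempts; attempt++) {
      try {
        return pool.submit(target).get();
      } catch (ExecutionException e) {
        // Future.get() wraps the task's exception, so inspect the cause.
        if (!(e.getCause() instanceof ConnectivityException)) {
          throw new IllegalStateException("Non-connectivity failure", e.getCause());
        }
        last = (ConnectivityException) e.getCause();
        target = fallback; // every later attempt goes to the other server
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for result", e);
      }
    }
    throw last; // all attempts failed with connectivity errors
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      // Simulates the killed server.
      Callable<Boolean> deadServer = () -> {
        throw new ConnectivityException("Could not create a new connection to server");
      };
      // Simulates the surviving server returning the function result.
      Callable<Boolean> liveServer = () -> true;
      System.out.println(executeWithRetry(pool, deadServer, liveServer, 1)); // prints "true"
    } finally {
      pool.shutdown();
    }
  }
}
```

With maxRetryAttempts set to 1, as in the logs above, the sketch makes one attempt on the failed server and one on the fallback. If the fallback also fails with a connectivity error, the last ConnectivityException is rethrown to the caller.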
Thank you for submitting a contribution to Apache Geode.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
- Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
- Has your PR been rebased against the latest commit within the target branch (typically develop)?
- Is your initial contribution a single, squashed commit?
- Does gradlew build run cleanly?
- Have you written or updated unit tests to verify your changes?
- If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
Note:
Once the PR is submitted, please check Concourse for build issues and
submit an update to your PR as soon as possible. If you need help, please send an
email to dev@geode.apache.org.