GEODE-9552: Handle ForcedDisconnectException in ExecutionHandlerContext #6805
Authored-by: Donal Evans <doevans@vmware.com>
    } else if (rootCause instanceof InterruptedException
        || rootCause instanceof CacheClosedException) {
      return RedisResponse.error(RedisConstants.SERVER_ERROR_SHUTDOWN);
    } else if (rootCause instanceof ForcedDisconnectException) {
Is there a plan, ticket, or other record for removing this once Jedis is confirmed to be fixed?
A server can always be forced out of the Geode cluster due to network partitioning, so I think we will always need to handle this exception.
      // This indicates a member departed or got disconnected
      logger.warn(
          "Closing client connection because one of the servers doing this operation departed.");
      channelInactive(ctx);
I think the ForcedDisconnectException should be handled the same way as a CacheClosedException. It's just another way that the local cache may have been shut down.
I think the best thing to do would be to change the above catch clause for CacheClosedException to handle the parent class CancelException, which covers ForcedDisconnectException, CacheClosedException, and other shutdown-related exceptions.
BTW, this whole handle method looks scary to me! In general I don't think we should be fishing for root causes of exceptions. Having to do that suggests the exception handling in other layers may be at fault.
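To make the suggestion concrete, here is a minimal sketch of checking the root cause against the parent class instead of each subclass. It does not use the real Geode classes: the nested exception types, the `Outcome` enum, and the `classify` and `rootCause` helpers are all hypothetical stand-ins so the example compiles on its own, and the only thing assumed about Geode is the hierarchy described above (both exceptions extending CancelException).

```java
final class ShutdownHandlingSketch {
  // Stand-ins for the Geode exception types (hypothetical, for illustration only).
  static class CancelException extends RuntimeException {
    CancelException(String message) {
      super(message);
    }
  }

  static class CacheClosedException extends CancelException {
    CacheClosedException(String message) {
      super(message);
    }
  }

  static class ForcedDisconnectException extends CancelException {
    ForcedDisconnectException(String message) {
      super(message);
    }
  }

  // Stand-in for the two paths: a shutdown error reply versus everything else.
  enum Outcome { SHUTDOWN_ERROR, OTHER }

  // Walks the cause chain to the innermost throwable.
  static Throwable rootCause(Throwable throwable) {
    Throwable current = throwable;
    while (current.getCause() != null && current.getCause() != current) {
      current = current.getCause();
    }
    return current;
  }

  // One parent-class check covers every shutdown-related subclass.
  static Outcome classify(Throwable throwable) {
    Throwable root = rootCause(throwable);
    if (root instanceof InterruptedException || root instanceof CancelException) {
      return Outcome.SHUTDOWN_ERROR;
    }
    return Outcome.OTHER;
  }

  public static void main(String[] args) {
    Throwable wrapped = new RuntimeException(new ForcedDisconnectException("forced out"));
    System.out.println(classify(wrapped));
    System.out.println(classify(new IllegalStateException("unrelated")));
  }
}
```

With this shape, a forced disconnect and a closed cache take the same branch, so no separate `else if` is needed for ForcedDisconnectException.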
    } else if (rootCause instanceof ForcedDisconnectException) {
      // This indicates a member departed or got disconnected
      logger.warn(
          "Closing client connection because one of the servers doing this operation departed.");
I think the ForcedDisconnectException will now apply to the current server (i.e. the one logging this warning), so I'm not sure we should say "one of the servers". Also, I think we should tack " + rootCause" onto the log string so that whatever info was in the ForcedDisconnectException will be in this message. Should we also describe the type of client connection being closed? When this log message is seen, it will not be obvious that it is Redis-related.
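As a rough illustration of those three points, the message could be built along these lines. The wording, the class name, and the `forcedDisconnectMessage` helper are all hypothetical, not proposed final text; the sketch only shows naming the local server, naming Redis, and appending the root cause.

```java
final class DisconnectLogMessageSketch {
  // Hypothetical wording: names the local server and the Redis client
  // connection, then appends the root cause so its details reach the log.
  static String forcedDisconnectMessage(Throwable rootCause) {
    return "Closing Redis client connection because this server was forced to "
        + "disconnect from the cluster: " + rootCause;
  }

  public static void main(String[] args) {
    // A plain RuntimeException stands in for the real ForcedDisconnectException.
    System.out.println(forcedDisconnectMessage(new RuntimeException("membership lost")));
  }
}
```

String concatenation calls `toString()` on the throwable, so both the exception class name and its message end up in the log line.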
      logger.warn(
          "Closing client connection because one of the servers doing this operation departed.");
      channelInactive(ctx);
      return null;
What does returning null do? Will it signal the client to retry the operation? It seems like "return RedisResponse.error(RedisConstants.SERVER_ERROR_SHUTDOWN);" would be better, since the server was forced to disconnect from the cluster.
Closing this PR for now as further discussion is needed on the best way to handle various server-side exceptions.
Thank you for submitting a contribution to Apache Geode.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
Has your PR been rebased against the latest commit within the target branch (typically develop)?
Is your initial contribution a single, squashed commit?
Does gradlew build run cleanly?
Have you written or updated unit tests to verify your changes?
If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
Note:
Once the PR is submitted, please check Concourse for build issues and
submit an update to your PR as soon as possible. If you need help, please send an
email to dev@geode.apache.org.