GEODE-8473: Hang in ReplyProcessor21 when forced-disconnect does not establish a cancellation cause #5491

Merged
bschuchardt merged 1 commit into apache:develop from bschuchardt:feature/GEODE-8473 on Sep 16, 2020

@bschuchardt

During a forced disconnect, ReplyProcessor21 does not stop waiting for responses to a message unless ClusterDistributionManager is informed of the disconnect. Once informed, ClusterDistributionManager sets a rootCause in its CancelCriterion, which ReplyProcessor21's StoppableCountDownLatch polls.
This commit ensures that ClusterDistributionManager is notified of the disconnect so that it can set that rootCause.

This is a follow-up PR to GEODE-8467, which ensures that a DisconnectThread is launched to execute the GMSMembership.uncleanShutdown() method.
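
For readers unfamiliar with the mechanism, here is a minimal sketch of a latch that polls a cancel criterion while it waits. It is not Geode's code: SketchCancelCriterion, SketchStoppableLatch, ReplyWaitSketch, the 20 ms poll interval, and the use of IllegalStateException are all assumptions made for illustration. It only shows why a waiter stays blocked until something records a root cause.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a cancel criterion; not Geode's CancelCriterion API.
final class SketchCancelCriterion {
  private final AtomicReference<Throwable> rootCause = new AtomicReference<>();

  // Records the first cause only; later calls are ignored.
  void cancel(Throwable cause) {
    rootCause.compareAndSet(null, cause);
  }

  // Throws once a cause has been recorded; otherwise returns normally.
  void checkCancelInProgress() {
    Throwable cause = rootCause.get();
    if (cause != null) {
      throw new IllegalStateException("cancelled: " + cause.getMessage(), cause);
    }
  }
}

// Hypothetical latch that wakes up periodically to poll the criterion.
final class SketchStoppableLatch {
  private final CountDownLatch latch;
  private final SketchCancelCriterion stopper;
  private final long pollMillis;

  SketchStoppableLatch(int count, SketchCancelCriterion stopper, long pollMillis) {
    this.latch = new CountDownLatch(count);
    this.stopper = stopper;
    this.pollMillis = pollMillis;
  }

  void countDown() {
    latch.countDown();
  }

  // Blocks until the count reaches zero, or throws once a cause is recorded.
  void await() throws InterruptedException {
    for (;;) {
      stopper.checkCancelInProgress();
      if (latch.await(pollMillis, TimeUnit.MILLISECONDS)) {
        return;
      }
    }
  }
}

final class ReplyWaitSketch {
  // Returns true if a waiter with no replies was released by cancel().
  static boolean waiterReleasedByCancel() {
    SketchCancelCriterion stopper = new SketchCancelCriterion();
    SketchStoppableLatch replies = new SketchStoppableLatch(1, stopper, 20);
    AtomicReference<Throwable> seen = new AtomicReference<>();
    Thread waiter = new Thread(() -> {
      try {
        replies.await(); // no reply ever arrives in this sketch
      } catch (IllegalStateException | InterruptedException e) {
        seen.set(e);
      }
    });
    waiter.start();
    // Simulates the disconnect notification that records the root cause.
    stopper.cancel(new RuntimeException("forced disconnect"));
    try {
      waiter.join(5000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !waiter.isAlive() && seen.get() instanceof IllegalStateException;
  }

  public static void main(String[] args) {
    System.out.println("released by cancel: " + waiterReleasedByCancel());
  }
}
```

If cancel() is never called in this sketch, await() loops forever, which is the shape of the hang described above.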

@kamilla1201

Thank you for submitting a contribution to Apache Geode.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?

  • Has your PR been rebased against the latest commit within the target branch (typically develop)?

  • Is your initial contribution a single, squashed commit?

  • Does gradlew build run cleanly?

  • Have you written or updated unit tests to verify your changes?

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

Note:

Once the PR is submitted, please check Concourse for build issues and
submit an update to your PR as soon as possible. If you need help, please send an
email to dev@geode.apache.org.

…establish a cancellation cause

Ensure that the cache is informed of a forced-disconnect in the
DisconnectThread.  This is a follow-on commit to GEODE-8467, which
ensured that the DisconnectThread is launched in the presence of cache
XML generation failure.  This commit adds a try/catch in
GMSMembership.uncleanShutdown() to ensure that the up-stream
ClusterDistributionManager is informed of the failure so it can set the
"rootCause" in its CancelCriterion.  ReplyProcessor21 and other objects
that poll for this "rootCause" will then be released from waiting for
responses to messages sent to other members of the cluster.
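
As a rough illustration of the guarding described in this commit message, the sketch below wraps each shutdown step in its own try/catch so that a failure in an earlier step cannot prevent the listener from being told about the failure. The full body of GMSMembership.uncleanShutdown() is not shown in this PR view, so the structure here is an assumption; SketchMembershipListener, UncleanShutdownSketch, and the cleanup Runnable are hypothetical names, and System.err stands in for the logger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener type; only the membershipFailure(reason, cause) call
// shape is taken from the diff excerpt in this PR.
interface SketchMembershipListener {
  void membershipFailure(String reason, Throwable cause);
}

final class UncleanShutdownSketch {
  private final Runnable cleanup; // assumed earlier shutdown step that may throw
  private final SketchMembershipListener listener;

  UncleanShutdownSketch(Runnable cleanup, SketchMembershipListener listener) {
    this.cleanup = cleanup;
    this.listener = listener;
  }

  // Each step is guarded separately, so the listener is always notified.
  void uncleanShutdown(String reason, Exception e) {
    try {
      cleanup.run();
    } catch (RuntimeException re) {
      System.err.println("Exception caught while shutting down: " + re);
    }
    try {
      listener.membershipFailure(reason, e);
    } catch (RuntimeException re) {
      System.err.println("Exception caught while shutting down: " + re);
    }
  }

  // Demo helper: returns the reasons the listener received when cleanup fails.
  static List<String> reasonsSeenWhenCleanupFails() {
    List<String> seen = new ArrayList<>();
    UncleanShutdownSketch sketch = new UncleanShutdownSketch(
        () -> {
          throw new IllegalStateException("cleanup failed");
        },
        (reason, cause) -> seen.add(reason));
    sketch.uncleanShutdown("forced disconnect", new RuntimeException("forced disconnect"));
    return seen;
  }

  public static void main(String[] args) {
    System.out.println(reasonsSeenWhenCleanupFails());
  }
}
```

In this sketch the listener plays the role of the upstream component that would record the root cause; the point is only that notification no longer depends on earlier steps succeeding.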

@Bill Bill left a comment

Nice one Bruce!

.isEqualTo(expectedException);
verify(listener).membershipFailure(isA(String.class), isA(Throwable.class));
}

you tested it all!

listener.membershipFailure(reason, e);
} catch (RuntimeException re) {
logger.warn("Exception caught while shutting down", re);
}

solid

@bschuchardt bschuchardt merged commit c48c0c3 into apache:develop Sep 16, 2020
@bschuchardt bschuchardt deleted the feature/GEODE-8473 branch September 16, 2020 16:22
mkevo pushed a commit to Nordix/geode that referenced this pull request Mar 19, 2021
…establish a cancellation cause (apache#5491)

3 participants