Commit
ZOOKEEPER-3188: Improve resilience to network
This PR is a rebase of the [previous pull request](#730), so all the kudos should go to the original authors...

In [ZOOKEEPER-3188](https://issues.apache.org/jira/browse/ZOOKEEPER-3188) we add the ability to specify several addresses for quorum operations. We also add reconnection attempts if the connection to the leader is lost.

In this PR I rebased the changes onto the current master, resolving some minor conflicts with:

- [ZOOKEEPER-3296](https://issues.apache.org/jira/browse/ZOOKEEPER-3296): Explicitly closing the sslsocket when it failed handshake to prevent issue where peers cannot join quorum
- [ZOOKEEPER-3320](https://issues.apache.org/jira/browse/ZOOKEEPER-3320): Leader election port stop listen when hostname unresolvable for some time
- [ZOOKEEPER-3385](https://issues.apache.org/jira/browse/ZOOKEEPER-3385): Add admin command to display leader
- [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386): Add admin command to display voting view
- [ZOOKEEPER-3398](https://issues.apache.org/jira/browse/ZOOKEEPER-3398): Learner.connectToLeader() may take too long to time-out

I still want to test the feature manually (e.g. using docker containers with multiple virtual networks / interfaces). The steps of the manual test could be recorded in the [google docs](https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing) as well.

Also, I think we could add a few more unit tests that use multiple addresses. The current tests use a single address only.

The ZooKeeper documentation also needs to be changed (e.g. by a follow-up Jira?) to promote the new feature and the new config format (possibly also including the admin command documentation in relation to [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386) and [ZOOKEEPER-3461](https://issues.apache.org/jira/browse/ZOOKEEPER-3461)).

Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com>
Author: Mate Szalay-Beko <mszalay@cloudera.com>

Reviewers: eolivelli@apache.org, andor@apache.org

Closes #1048 from symat/ZOOKEEPER-3188 and squashes the following commits:

3c6fc52 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188
356882d [Mate Szalay-Beko] ZOOKEEPER-3188: document new configuration format for using multiple addresses
45b6c0f [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188
4b6bcea [Mate Szalay-Beko] ZOOKEEPER-3188: MultiAddress unit tests for Quorum TLS and Kerberos/Digest authentication
40bc44c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188
f875f5c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188
31805e7 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188
0f95678 [Mate Szalay-Beko] ZOOKEEPER-3188: skip unreachable addresses when Learner connects to Leader
e232c55 [Mate Szalay-Beko] ZOOKEEPER-3188: fix flaky unit MultiAddress unit test
e892d8d [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188
6f2ab75 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188
2eedf26 [Mate Szalay-Beko] ZOOKEEPER-3188: fix PR commits; handle case when Leader can not bind to port on startup
483d2fc [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188
a5d6bcb [Mate Szalay-Beko] ZOOKEEPER-3188: support for dynamic reconfig + add more unit tests
ed31d2c [Mate Szalay-Beko] ZOOKEEPER-3188: better shutdown for executors (following PR comments)
8713a5b [Mate Szalay-Beko] ZOOKEEPER-3188: add fixes for PR comments
05eae83 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188
e823af4 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188
de7bad2 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188
da98a8d [Mate Szalay-Beko] ZOOKEEPER-3188: fix JDK-13 warning
5bd1f4e [Mate Szalay-Beko] ZOOKEEPER-3188: supress spotbugs warning
42a52a6 [Mate Szalay-Beko] ZOOKEEPER-3188: improve based on code review comments
6c4220a [Mate Szalay-Beko] ZOOKEEPER-3188: fix SendWorker.asyncValidateIfSocketIsStillReachable
5b22432 [Mate Szalay-Beko] ZOOKEEPER-3188: fix LeaderElection to work with multiple election addresses
7bfbe7e [Mate Szalay-Beko] ZOOKEEPER-3188: Improve resilience to network
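
The commit message refers to a new config format for multiple quorum addresses but does not spell it out on this page. As a rough illustration only, the sketch below assumes the pipe-separated form described in the ZooKeeper 3.6 administrator's guide, where one server entry lists several `host:quorumPort:electionPort` triples joined by `|`. The class `MultiAddressSketch`, the type `ServerAddress`, and the host names are hypothetical and are not taken from the patch; the sketch ignores optional suffixes such as the peer role or client port, and it does not handle IPv6 literals.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiAddressSketch {

    /** Hypothetical value type: one host with its quorum and election ports. */
    static final class ServerAddress {
        final String host;
        final int quorumPort;
        final int electionPort;

        ServerAddress(String host, int quorumPort, int electionPort) {
            this.host = host;
            this.quorumPort = quorumPort;
            this.electionPort = electionPort;
        }
    }

    /** Splits "host:port:port|host:port:port" into its individual addresses. */
    static List<ServerAddress> parse(String spec) {
        List<ServerAddress> result = new ArrayList<>();
        // '|' is a regex metacharacter, so it must be escaped for split().
        for (String part : spec.split("\\|")) {
            String[] fields = part.trim().split(":");
            if (fields.length != 3) {
                throw new IllegalArgumentException(
                        "expected host:quorumPort:electionPort, got: " + part);
            }
            result.add(new ServerAddress(
                    fields[0],
                    Integer.parseInt(fields[1]),
                    Integer.parseInt(fields[2])));
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative value of a "server.1=" entry with two network paths.
        String spec = "zoo1-net1:2888:3888|zoo1-net2:2889:3889";
        for (ServerAddress a : parse(spec)) {
            System.out.println(a.host + " quorum=" + a.quorumPort
                    + " election=" + a.electionPort);
        }
    }
}
```

With a list like this in hand, a peer could try each address in turn and skip the ones it cannot reach, which is the behaviour the commit "skip unreachable addresses when Learner connects to Leader" describes; how the patch itself implements that is not shown here.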
Showing 33 changed files with 2,159 additions and 508 deletions.
815c8f2
@symat Is there a way we can backport this fix to 3.5.5 or 3.5.8, as it got released recently? I am having trouble with ZooKeeper 3.6.1 (along with some misbehavior from its dependent services) and I need this fix.
815c8f2
This fix was the cause of several regressions in 3.6.
It is better to fix 3.6.
815c8f2
It is a large, complex patch that also includes some leader election message protocol version changes. There were also several subsequent bugfixes related to the MultiAddress feature after this PR. I don't think we should backport all of these to 3.5.
(We don't backport new major features to older branches, to make sure we don't break anything in a bugfix release.)