New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ZOOKEEPER-3188: Improve resilience to network #1048
Conversation
I created a simple docker config with multiple virtual networks and managed to test the situation when some of the containers loose the access to one of the virtual networks. I uploaded the docker related scripts / configs here: https://github.com/symat/zookeeper-docker-test During these manual tests I found some situations when the previous patch didn't work.
With these modifications I was able to test the following situation successfully:
I will think how to unittest these features. (or should we crate some docker-based automated integration test?) In the mean while I would appreciate a deep review of these changes, as I am quite new in the Zookeeper code... |
retest maven build |
1 similar comment
retest maven build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me, see my comments below.
import java.util.SortedMap; | ||
import java.util.TreeMap; | ||
import java.net.InetSocketAddress; | ||
import java.util.*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid using asterisk imports. This a common coding practice across ZooKeeper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, thanks!
.collect(Collectors.joining(",")); | ||
} | ||
|
||
private String getSingleAddressString(QuorumPeer.QuorumServer qs, InetSocketAddress address) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that we don't need this magic here which is trying to recreate the original config file contents. The output of admin command doesn't need to match with the configuration, so you can just dump the contents of the internal representation (map) which I think more helpful for anyway reading this output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree, I will change it.
This admin command is about to show the internal view of the voting members (how the zookeeper server thinks who the voting members are and where do they listen). I wouldn't complicate this PR any further, but it might be a good idea to create a follow-up ticket to have some admin command showing if the given server can actually reach all the different ports of the other servers (and not only the voting members). It can help debugging network problems, showing if certain network interfaces on some servers are unreachable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the format, this is how the voting_view
admin command responds now:
{
"current_config" : {
"1" : {
"server_addresses" : [ "/172.16.101.11:2888", "/172.16.102.11:2888" ],
"election_addresses" : [ "/172.16.101.11:3888", "/172.16.102.11:3888" ],
"client_address" : "/0.0.0.0:2181",
"learner_type" : "participant"
},
"2" : {
"server_addresses" : [ "/172.16.101.22:2888", "/172.16.102.22:2888" ],
"election_addresses" : [ "/172.16.101.22:3888", "/172.16.102.22:3888" ],
"client_address" : "/0.0.0.0:2181",
"learner_type" : "participant"
},
"3" : {
"server_addresses" : [ "/172.16.101.33:2888", "/172.16.102.33:2888" ],
"election_addresses" : [ "/172.16.101.33:3888", "/172.16.102.33:3888" ],
"client_address" : "/0.0.0.0:2181",
"learner_type" : "participant"
}
},
"command" : "voting_view",
"error" : null
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect!
} else { | ||
LOG.error("Couldn't bind to " + self.getQuorumAddress(), e); | ||
} | ||
LOG.error("Couldn't bind to " + self.getQuorumAddress(), e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of self.getQuorumAddress()
, you should print address
, because that's the actual address that you're trying to bind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
} catch (ConfigException e) { | ||
throw new InitialMessageException("Badly formed address: %s", addr); | ||
} | ||
String[] addressStrings = new String(b).split(","); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, we probably need 2 things here:
- Increment PROTOCOL_VERSION, because the patch introduces new InitialMessage format,
- Use JSON format and add Jackson JSON serialiser to enable forward compatibility for future changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
referring to our second offline discussion:
- I incremented the PROTOCOL_VERSION
- did not do the JSON format modification, as it makes the future code complicated to always ensure both forward and backward compatibility of the election protocol during the rolling upgrades. The current solution (simply failing if the protocol versions mismatch) is more simple and still working just fine: as the servers are restarted one-by-one, the nodes with the old protocol version and the nodes with the new protocol version will form two partitions, but any given time only one partition will have the quorum.
retest maven build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a couple of comments from a lazy scan
@@ -49,10 +50,11 @@ public String getQuorumAddress() { | |||
|
|||
public String getLearnerMaster() { | |||
QuorumPeer.QuorumServer learnerMaster = observer.getCurrentLearnerMaster(); | |||
if (learnerMaster == null || learnerMaster.addr == null) { | |||
InetSocketAddress address = learnerMaster.addr.getReachableOrOne(); | |||
if (learnerMaster == null || address == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if learnerMaster
is null
here then you'll get a NPE on the learnerMaster.addr.getReachableOrOne()
above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, nice catch!
(actually beside this, we also got a NoSuchElementException if learnerMaster.addr is empty)
I will fix this in the next commit
executor.submit(new LearnerCnxAcceptorHandler(serverSocket, latch))); | ||
|
||
try { | ||
latch.await(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should there be a timeout here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code basically now starting listener threads on all local addresses, then waiting until all the listeners are dead, and then simply start over again (assuming no stop was requested by Leader.shutdown() or no unexpected failure happened in the Listener threads).
I don't think we need a timeout here... ideally we want to wait forever :)
I merged with master (had a lot's of conflicts because of the checkstyle related changes), and also fixed the issue found by @enixon. I also did a manual test with docker before committing. |
I merged with master again, because of minor conflicts with ZOOKEEPER-3448: Introduce MessageTracker to assist debug leader and leaner connectivity issues. |
nice work on rebase and doing cluster level testing on the patch a few things I'd like to discuss:
My current evaluation of the patch is the risk is fairly low if we merge this, as long as we still stick to single address configuration. Though I'd like to conclude the above discussions so we know what we will be getting into if something goes wrong. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a pass on the patch.
Great work, I left a few little comments
I feel we need to setup some kind of integration tests to ensure that a rolling upgrade of an existing cluster from at least 3.5 is working properly.
We must also add docs about this feature.
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java
Show resolved
Hide resolved
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Leader.java
Outdated
Show resolved
Hide resolved
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java
Show resolved
Hide resolved
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/MultipleAddresses.java
Outdated
Show resolved
Hide resolved
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/MultipleAddresses.java
Show resolved
Hide resolved
break; | ||
} | ||
Thread.sleep(timeout); | ||
} catch (NullPointerException | IOException | InterruptedException ignored) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
catching NullPointerException is a code and usually a bad practice.
please remove this case
catching InterruptedException requires to reset Thread#interrupt flag, please fix or explain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I agree... this one needs to be refactored. I will remove the atomic reference and do a simple parallel stream filtering in the getReachableAddress()
. Please review after I push the new commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@symat why are you talking about parallelStream ? we are not talking about hundreds of addresses, using more that one thread won't help and maybe it could be more expensive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, we are using parallel streams only in two places, where IO operations happen (one for hostname resolution, one for checking if the address is reachable). I agree that we don't have hundreds of addresses, but even with a few addresses, parallelizing can mean that we can find a reachable address for a remote Quorum Server a few seconds quicker.
I agree that it is not a huge deal... but it is a trivial optimization in the code and I don't think it would be unsafe in any ways.
I cleaned up this part in my last commit. If you still think it is unnecessary and would be better to use simple streams, then we can go for it. (You are right that in the rare case of a network error that few seconds will not matter much)
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/MultipleAddresses.java
Outdated
Show resolved
Hide resolved
Thanks for all of you for the comments and review so far! I extended a few exiting test classes (e.g.
An important change: due to the way how we represent the server configs during dynamic reconfig, I had to change the separator character from comma to pipe. So the valid config format now looks like:
regarding the impact on dynamic reconfig:
I try to summarize the open discussions / tasks to see how much work is required to close this issue:
I think if the tasks above are concluded, then it is safe to merge the PR. What do you think? The automated integration tests would be nice to have definitely. These test cases I have in mind:
I would cover these integration tests in follow-up jira ticket(s?), as these seems to be not trivial to automatize. What do you think? |
On Tue, Oct 8, 2019 at 6:49 AM Mate Szalay-Beko ***@***.***> wrote:
*Thanks for all of you for the comments and review so far!*
Nice work to you as well.
... An important change: due to the way how we represent the server
configs during dynamic reconfig, I had to change the separator character
from comma to pipe.
This is the predictable penalty of an ad hoc syntax that isn't consistent
everywhere. Good catch.
-
- ...I actually saw during the tests, that simply adding new
addresses for servers with incremental re-config does not cause a new
leader election (although I haven't created an automated test explicitly
forcing this). I am not sure if it is worth to add a new test for this, as
dynamic reconfigs are relatively rare events during production use I guess.
This feels nice, but it is actually a little worrisome in that it seems to
indicate that the multi connect stuff is more clever than I would expect.
*I try to summarize the open discussions / tasks* to see how much work is
required to close this issue:
- we should add the new config format to the documentation
- we should extend the dynamic reconfig documentation with the
comments above
- we should test (for now I propose only to try out manually) and
document:
- the impact on quorum TLS / kerberos feature
- the impact on rolling upgrades
- and of course we need to resolve any code related conversation what
would be still open (e.g. on parallel streams with @eolivelli
<https://github.com/eolivelli> )
I think that this is a good summary.
*I think if the tasks above are concluded, then it is safe to merge the
PR.* What do you think?
I am only half informed on this, but your summary seems persuasive.
The automated integration tests would be nice to have definitely. These
test cases I have in mind:
- Automated integration test for rolling upgrades (e.g. from 3.5.6 to
master) without changing the config
- Automated integration test for rolling upgrades (e.g. from 3.5.6 to
master) with changing the config to use multiple addresses during the
upgrade
- Automated integration test for simulating the case of network
failure during the use of multiple addresses
- hint: use docker with multiple network interfaces (see:
https://github.com/symat/zookeeper-docker-test )
- I am doing this test manually after each of my commits
*I would cover these integration tests in follow-up jira ticket(s?), as
these seems to be not trivial to automatize.* What do you think?
Without an advanced test framework that can create and control containers,
I think that these kinds of tests are difficult to create and maintain.
To me, the key question is how much value would these automated tests
actually provide. I see very high value in a documented procedure and
outcome for a manual version of these tests, but I am not sure I see a high
value for even repeating these manual tests again. How likely is it that
subsequent changes will cause these tests to fail? Is that probability high
enough to justify the considerable cost of automation?
|
Sounds a great plan. I am fine with parallelStream for dns resolutions. For integration tests we have to do it in a separate ticket, not blocker for merging this patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@symat I experienced something weird with this patch. Tried the following:
I'm having 2 network interfaces in my Mac: wifi + cable, connected with 2 different IPs. Created the following config:
server.1=172.30.64.161:3181:4181|172.30.65.130:3181:4181
server.2=172.30.64.161:3182:4182|172.30.65.130:3182:4182
server.3=172.30.64.161:3183:4183|172.30.65.130:3183:4183
At the beginning both interfaces were up. When I started ZK quorum, it connected successfully and quorum was up within a second. When I disabled the cable interface (pulling the cable out), the nodes started to communicate on the other interface.
After that I swapped the interfaces (cable plugged in, wifi disabled) and nodes went back to looking state and were not able to form a quorum again. Not even after a restart!
Maybe this is not a real life scenario which this patch should be prepared for, but I'm not sure.
...-server/src/test/java/org/apache/zookeeper/server/quorum/QuorumPeerMainMultiAddressTest.java
Outdated
Show resolved
Hide resolved
zookeeper-server/src/test/java/org/apache/zookeeper/test/ReconfigTest.java
Show resolved
Hide resolved
On Tue, Oct 8, 2019 at 8:45 AM Andor Molnár ***@***.***> wrote:
*...*
@symat <https://github.com/symat> I experienced something weird with this
patch. Tried the following:
I'm having 2 network interfaces in my Mac: wifi + cable, connected with 2
different IPs. Created the following config:
server.1=172.30.64.161:3181:4181|172.30.65.130:3181:4181
server.2=172.30.64.161:3182:4182|172.30.65.130:3182:4182
server.3=172.30.64.161:3183:4183|172.30.65.130:3183:4183
At the beginning both interfaces were up. When I started ZK quorum, it
connected successfully and quorum was up within a second. When I disabled
the cable interface (pulling the cable out), the nodes started to
communicate on the other interface.
After that I swapped the interfaces (cable plugged in, wifi disabled) and
nodes went back to looking state and were not able to form a quorum again.
Not even after a restart!
Maybe this is not a real life scenario which this patch should be prepared
for, but I'm not sure.
This is absolutely a real-life scenario and it should work.
If I were doing this, I would be suspicious that something about what I was
seeing wasn't really working the way I thought.
In the zeroth scenario (both interfaces live), how do you verify which link
is being used? What traffic is observed on the wifi side even when it is
ostensibly not used?
In your second scenario (wifi-disabled), can you ssh from one node to
another?
|
I think this patch is in a good shape currently. The issue we found (as said) not a big deal if a single node is failing and I believe it's better not to touch that part of the code just because of this patch. @eolivelli Are you still holding your -1? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to mention that @symat is updating the docs, so hopefully we'll have it soon. |
thanks @anmolnar and @eolivelli for checking the PR! In the last few commits:
I also think that it is safe to merge this PR now. Some parts of the recovery still can be optimized later, but I think it is stable now and also backward compatible. |
retest maven build |
Refer to this link for build results (access rights to CI server needed): |
Merged to master. Thanks @symat and everybody! |
* ZOOKEEPER-3530: add new artifact for compiled c-client code During the old ZooKeeper 3.4 ant builds (ant package-native), there was an artifact (zookeeper-<version>-lib.tar.gz) created just for the C-client, with the following content: ``` usr |--- bin |--- cli_mt |--- cli_st |--- load_gen |--- include |--- zookeeper |--- proto.h |--- recordio.h |--- zookeeper.h |--- zookeeper.jute.h |--- zookeeper_log.h |--- zookeeper_version.h |--- lib |--- libzookeeper_mt.a |--- libzookeeper_mt.la |--- libzookeeper_mt.so |--- libzookeeper_mt.so.2 |--- libzookeeper_mt.so.2.0.0 |--- libzookeeper_st.a |--- libzookeeper_st.la |--- libzookeeper_st.so |--- libzookeeper_st.so.2 |--- libzookeeper_st.so.2.0.0 ``` Currently with maven, when we are generating a tarball during full-build then the C-client is not getting archived. In [PR-1078](apache#1078) we discussed that we should re-introduce the apache-zookeeper-<version>-lib.tar.gz artifact. The goals of this PR are: - re-introduce the 'lib' artifact, with the same structure we had for the older zookeeper 3.4.x ant generated tar file - we should also add the LICENSE.txt file to the archive (it was missing from the 3.4.x version tar.gz file) - the new artifact should be generated only when the full-build profile is set for maven - we should also update the README_packaging.md file Author: Mate Szalay-Beko <mszalay@cloudera.com> Reviewers: nkalmar@apache.org, andor@apache.org Closes apache#1113 from symat/ZOOKEEPER-3530-PR * ZOOKEEPER-3593: fix the default value of jute.maxbuffer in client side and an optimization for the documentation - The default value of `jute.maxbuffer` in client side said it's 4MB, but actually, it was never working, because when the client reads the deserialized znode data, it also calls `checkLength(BinaryInputArchive.java:127)` where `jute.maxbuffer` default value is 1MB. It's easy to reproduce, just read a znode more than 1MB with any special configure client. Look at the stack trace I attached in the JIRA - Users also confused about that the doc said `jute.maxbuffer` must be set on all servers and clients, but their default value is not same in the [ZOOKEEPER-1295](https://issues.apache.org/jira/browse/ZOOKEEPER-1295) - more details in the [ZOOKEEPER-3593](https://issues.apache.org/jira/browse/ZOOKEEPER-3593) Author: maoling <maoling199210191@sina.com> Reviewers: enixon@apache.org, eolivelli@apache.org Closes apache#1129 from maoling/ZOOKEEPER-3593 * ZOOKEEPER-3605: connThrottle needs to be assigned in alternate consructor `connThrottle` needs to be assigned in alternate consructor to avoid NPEs Author: randgalt <jordan@jordanzimmerman.com> Reviewers: Enrico Olivelli, Andor Molnár Closes apache#1132 from Randgalt/ZOOKEEPER-3605 * ZOOKEEPER-1416 - Persistent, recursive watchers ### Background Note: this is a port of apache#136 Implementation for a persistent, recursive watch addition for ZK. These watches are set via a new method, addPersistentWatch() and are removed via the existing watcher removal methods. Persistent, recursive watches have these characteristics: a) Once set, they do not auto-remove when triggered; b) they trigger for all event types (child, data, etc.) on the node they are registered for and any child znode recursively; c) they are efficiently implemented by using the existing watch internals. A new class PathIterator walks up the path parent-by-parent when checking if a watcher applies. ### Implementation Details - A new enum manages the different "modes" for watchers: `WatcherMode`. - For traditional, "standard" watchers, the code path is almost exactly the same. There is very little overhead other than a bit of extra checks in `WatchManager` - Given how this is implemented it was difficult to add support when `WatchManagerOptimized` is used. I'm open to adding it for that version but it will take work. We should consider not supporting persistent/recursive watchers when WatchManagerOptimized is used. I notice that `WatchManagerOptimized` is not even mentioned in the docs. - The mode for a given watcher/path pair is held in a map inside of `WatcherModeManager`. The absence of an entry means Standard. This way, there's no overhead for old, standard watchers. - `PathParentIterator` is the "meat" of the implementation. Rather than set watchers on every ZNode implied by a recursive watcher. WatchManager passes any paths it processes through PathParentIterator which iterates up each parent znode looking for watchers. - The remainder of the changes are scaffolding to match how other watchers are used as well as Jute/API changes to set persistent/recursive watchers ### Testing The tests were written years ago. I think they're comprehensive but reviewers should pay attention to anything that was missed. There is much ZooKeeper knowledge that's only in the heads of ZK committers. - `PersistentWatcherTest` - tests persistent, non-recursive watchers - `PersistentRecursiveWatcherTest` - tests persistent, recursive watchers - `PathParentIteratorTest`- exercises edges of PathParentIterator Author: randgalt <jordan@jordanzimmerman.com> Reviewers: Enrico Olivelli <eolivelli@apache,org>, Norbert Kalmar <nkalmar@apache.org>, Andor Molnár <andor@apache.org>, Justin Mao Ling <maoling199210191@sina.com> Closes apache#1106 from Randgalt/ZOOKEEPER-1416 * ZOOKEEPER-3598: Fix potential data inconsistency issue due to CommitProcessor not gracefully shutdown Note: use exit code 16 for SHUTDOWN_UNGRACEFULLY, since internally we've already using 15 for other exit code, which will be upstreamed later. Author: Fangmin Lyu <fangmin@apache.org> Reviewers: Enrico Olivelli <eolivelli@apache.org>, Michael Han <hanm@apache.org> Closes apache#1130 from lvfangmin/ZOOKEEPER-3598 * ZOOKEEPER-1260: Audit logging in ZooKeeper servers. Author: Mohammad Arshad <arshad@apache.org> Reviewers: Enrico Olivelli <eolivelli@apache.org>, Andor Molnar <andor@apache.org> Closes apache#1133 from arshadmohammad/ZOOKEEPER-1260-AuditLog-master * ZOOKEEPER-3340: Introduce CircularBlockingQueue in QuorumCnxManager.java There are many caveats and comments regarding Exception thrown 'which is safe to ignore'. Added a new data structure that removes all of these comments and will perform without generating exceptions while more clearly implementing the desired behavior without the caveats. Author: Beluga Behr <dam6923@gmail.com> Author: David Mollitor <dam6923@gmail.com> Author: David Mollitor <dmollitor@apache.org> Reviewers: eolivelli@apache.org, andor@apache.org Closes apache#880 from belugabehr/ZOOKEEPER-3340 and squashes the following commits: 7b74dac [David Mollitor] Changed comment from conrete ArrayBlockingQueue to BlockingQueue 38d7e3f [David Mollitor] Shutdown ExecutorService in test f5ea2b6 [David Mollitor] Updated to fix checkstyle ImportOrder: Extra separation b3ac3cc [David Mollitor] Rebased to master c63ab85 [Beluga Behr] Added updates from code review d887780 [Beluga Behr] Introduce new class CircularBlockingQueue 2793db2 [Beluga Behr] Added additional logging ce96b70 [Beluga Behr] ZOOKEEPER-3340: Improve Queue Usage in QuorumCnxManager.java * ZOOKEEPER-2238: Support limiting the maximum number of connections/clients to a zookeeper server Support limiting the maximum number of connections/clients to a zookeeper server Author: sujithsimon22 <sujith.abraham.simon@huawei.com> Reviewers: Justin Mao Ling <maoling199210191@sina.com>, Brian Nixon <enixon@apache.org>, Edward Ribeiro <edward.ribeiro@gmail.com>, Enrico Olivelli <eolivelli@apache.org>, Mohammad Arshad <arshad@apache.org> Closes apache#1108 from sujithsimon22/ZOOKEEPER-2238 * ZOOKEEPER-3571: Ensure test base directory before tests Ensure `build.test.dir` is present as a directory. Otherwise many times it suffers from ``` java.io.IOException: No such file or directory at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2024) at org.apache.zookeeper.test.ClientBase.createTmpDir(ClientBase.java:371) at org.apache.zookeeper.test.ClientBase.setUpWithServerId(ClientBase.java:514) at org.apache.zookeeper.test.ClientBase.setUp(ClientBase.java:491) ``` Author: tison <wander4096@gmail.com> Reviewers: eolivelli@apache.org, andor@apache.org Closes apache#1112 from TisonKun/ZOOKEEPER-3571 * Revert "ZOOKEEPER-3598: Fix potential data inconsistency issue due to CommitProcessor not gracefully shutdown" This reverts commit 79f99af. * ZOOKEEPER-3363: Drop ANT build.xml - drop Ant and Ivy files - drop old Cobertura README file - drop old jdiff file Author: Enrico Olivelli <enrico.olivelli@diennea.com> Author: Enrico Olivelli <eolivelli@apache.org> Reviewers: Norbert Kalmar <nkalmar@apache.org> Closes apache#1139 from eolivelli/fix/drop-ant * ZOOKEEPER-3560: Add response cache to serve get children (2) requests. ZOOKEEPER-3180 introduces response cache but it only covers getData requests. This commit is to extend the response cache based on the infrastructure set up by ZOOKEEPER-3180 to so the response of get children requests can also be served out of cache. Some design decisions: * Only OpCode.getChildren2 is supported, as OpCode.getChildren does not have associated stats and current cache infra relies on stats to invalidate cache. * The children list is stored in a separate response cache object so it does not pollute the existing data cache that's serving getData requests, and this separation also allows potential separate tuning of each cache based on workload characteristics. * As a result of cache object separation, new server metrics is added to measure cache hit / miss for get children requests, that's separated from get data requests. Similar as ZOOKEEPER-3180, the get children response cache is enabled by default, with a default cache size of 400, and can be disabled (together with get data response cache.). Author: Michael Han <lhan@twitter.com> Author: Michael Han <hanm@apache.org> Reviewers: eolivelli@apache.org, andor@apache.org, enixon@apache.org Closes apache#1098 from hanm/cache * ZOOKEEPER-3502: improve the server command: zabstate to have a better observation on the process of leader election - Just as we knew, the rule of LE is that who has the biggest last zxid who will win.when the zxid is the same, chose the myid number bigger one. - Do the following cmd will have a bird's-eye view on LE ``` ➜ bin curl http://localhost:8081/commands/zabstate http://localhost:8082/commands/zabstate http://localhost:8083/commands/zabstate { "myid" : 1, "is_leader" : true, "endpoint" : "/127.0.0.1:2181", "voting" : true, "last_zxid" : "0xfa2c00000000", "zab_epoch" : 64044, "zab_counter" : 0, "zabstate" : "broadcast", "command" : "zabstate", "error" : null }{ "myid" : 2, "is_leader" : false, "endpoint" : "/127.0.0.1:2182", "voting" : true, "last_zxid" : "0xfa2b00000004", "zab_epoch" : 64043, "zab_counter" : 4, "zabstate" : "broadcast", "command" : "zabstate", "error" : null }{ "myid" : 3, "is_leader" : false, "endpoint" : "/127.0.0.1:2183", "voting" : true, "last_zxid" : "0xfa2b00000004", "zab_epoch" : 64043, "zab_counter" : 4, "zabstate" : "broadcast", "command" : "zabstate", "error" : null }% ``` - more details in the [ZOOKEEPER-3502](https://issues.apache.org/jira/browse/ZOOKEEPER-3502) Author: maoling <maoling199210191@sina.com> Reviewers: Brian Nixon <enixon@apache.org>, Andor Molnar <andor@apache.org>, Enrico Olivelli <eolivelli@apache.org> Closes apache#1052 from maoling/ZOOKEEPER-3502 * ZOOKEEPER-3570: make the special client xid constant - more details in the [ZOOKEEPER-3570](https://issues.apache.org/jira/browse/ZOOKEEPER-3570) Author: maoling <maoling199210191@sina.com> Reviewers: Fangmin Lyu <fangmin@apache.org>, Brian Nixon <enixon@apache.org> Closes apache#1136 from maoling/ZOOKEEPER-3570 * ZOOKEEPER-2122: add SSL support for C-client This PR is based on the works of Asnish Amarnath and Suhas Dantkale. Most of the kudos should go to them and those who were reviewing all the previous PRs. **The PR includes the following changes from PR#639:** - OPENSSL 1.1.1 version support in C-client **The PR includes the following changes from PR#990:** - also supporting OPENSSL 1.0.2 - SSL connection on non-blocking socket is handled correctly - Support of Certificate Chains - Fix Memory leaks - Dynamically generated test certificates **The following new changes were added into the PR:** - fix CMake + VisualStudio2019 build with windows - fix C CLI to compile / work both with windows and linux (I tested them manually) - fix (and simplify) the way how the server is started with C unit tests, so it is compatible with maven build now - the test case `testReadOnly` was failing with the previous PR because there was a bug in the C-client code, I fixed that - I also added new test case: `testReadOnlyWithSSL` **Testing this PR on linux:** ``` git clean -xdf # compile ZooKeeper server plus the C-client code mvn clean install -DskipTests -Pfull-build # compile and execute C-client unit tests cd zookeeper-client/ mvn clean install -Pfull-build ``` **Compile the code on windows (only cmake is supported):** - download C-Make: https://cmake.org/download/ - Install community edition of Visual Studio: https://visualstudio.microsoft.com/downloads/ - Download OpenSSL (e.g. 1.0.2): https://slproweb.com/products/Win32OpenSSL.html (e.g. install it to `c:\OpenSSL-Win64`) - compile the java code using: `mvn clean install -DskipTests` - go to the Client folder: `cd zookeeper-client\zookeeper-client-c` - configure the project: `cmake . -D WITH_OPENSSL=c:\OpenSSL-Win64` - build the project: `cmake --build .` **Testing the C-client with SSL manually:** - run the `zookeeper-client/zookeeper-client-c/ssl/gencerts.sh` to generate certificate files (e.g. copy it to an empty folder like `/tmp/ssl/` and start is) - start a ZooKeeper server, using some config file like this one: ``` tickTime=3000 initLimit=10 syncLimit=5 dataDir=/tmp/zkdata secureClientPort=22281 serverCnxnFactory=org.apache.zookeeper.server.NettyServerCnxnFactory ssl.keyStore.location=/tmp/ssl/server.jks ssl.keyStore.password=password ssl.trustStore.location=/tmp/ssl/servertrust.jks ssl.trustStore.password=password ``` - start the command line client (cli.exe on windows, cli_mt or cli_st on linux): `./cli_mt --host localhost:22281 --ssl /tmp/ssl/server.crt,/tmp/ssl/client.crt,/tmp/ssl/clientkey.pem,password` Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com> Author: Mate Szalay-Beko <mszalay@cloudera.com> Reviewers: andor@apache.org Closes apache#1107 from symat/ZOOKEEPER-2122 and squashes the following commits: 08294ce [Mate Szalay-Beko] ZOOKEEPER-2122: update readme + use FQDN in SSL certs during testing 17e504a [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-2122 317241d [Mate Szalay-Beko] ZOOKEEPER-2122: minor fix in SSL certificates used for testing 6f37b66 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into HEAD 9809143 [Mate Szalay-Beko] ZOOKEEPER-2122: add SSL support for C-client * ZOOKEEPER-3590: Zookeeper is unable to set the zookeeper.sasl.client.canonicalize.hostname using system property See https://issues.apache.org/jira/browse/ZOOKEEPER-3590 This is a very small patch that gives sets the option after reading it from the system.getProperty(). There should not be any side effects since this is mostly copy paste. Author: aristotelhs <aristotelhs@lenses.io> Reviewers: eolivelli@apache.org, andor@apache.org Closes apache#1124 from aristotelhs/ZOOKEEPER-3590 * ZOOKEEPER-3627: Update Jackson to 2.9.10.1 Author: Colm O hEigeartaigh <coheigea@apache.org> Reviewers: eolivelli@apache.org, andor@apache.org Closes apache#1150 from coheigea/ZOOKEEPER-3627 * ZOOKEEPER-3473: Improving successful TLS handshake throughput with concurrent control When there are lots of clients trying to re-establish sessions, there might be lots of half finished handshake timed out, and those failed ones keep reconnecting to another server and restarting the handshake from beginning again, which caused herd effect. And the number of total ZK sessions could be supported within session timeout are impacted a lot after enabling TLS. To improve the throughput, we added the TLS concurrent control to reduce the herd effect, and from out benchmark this doubled the sessions we could support within session timeout. E2E test result: Tested performance and correctness from E2E. For correctness, tested both secure and insecure connections, the outstandingHandshakeNum will go to 0 eventually. For performance, tested with 110k sessions with 10s session timeout, there is no session expire when leader election triggered, while before it can only support 50k sessions. Author: Fangmin Lyu <fangmin@apache.org> Reviewers: Enrico Olivelli <eolivelli@apache.org>, Andor Molnar <andor@apache.org> Closes apache#1027 from lvfangmin/ZOOKEEPER-3473 * ZOOKEEPER-3546 - Allow optional deletion of never used Container Nodes Edge cases can cause Container Nodes to never be deleted. i.e. if the container node is created and then the client that create the node crashes the container will not get deleted unless another client creates a node inside of it. This is because the initial implementation does not delete container nodes with a cversion of 0. This PR adds a new system property, "znode.container.maxNeverUsedIntervalMs", that can be set to delete containers with a cversion of 0 that have been retained for a period of time. This is a backward compatible change as the default value for this is Long.MAX_VALUE - i.e. never. Author: randgalt <jordan@jordanzimmerman.com> Reviewers: Enrico Olivelli <eolivelli@apache.org>, Fangmin Lyu <fangmin@apache.org> Closes apache#1138 from Randgalt/ZOOKEEPER-3546-allow-delete-of-never-used-containers * ZOOKEEPER-3620: Allow to override calls to System.exit in server side code - Introduce a way to override calls to System.exit - enable DM_EXIT spotbugs rule see https://issues.apache.org/jira/browse/ZOOKEEPER-3620 for more context. Author: Enrico Olivelli <enrico.olivelli@diennea.com> Author: Enrico Olivelli <eolivelli@apache.org> Reviewers: andor@apache.org Closes apache#1147 from eolivelli/fix/ZOOKEEPER-3620-no-systemexit and squashes the following commits: a234f85 [Enrico Olivelli] Fix checkstyle 4c4fec4 [Enrico Olivelli] Fix spotbugs warning ae339b7 [Enrico Olivelli] Revert changes to VerGen.java 0e5ee07 [Enrico Olivelli] Enable DM_EXIT spotbugs rule for the full code base b05a4bf [Enrico Olivelli] ZOOKEEPER-3620 Allow to override calls to System.exit in server side code: - Use a common utility to call System.exit - Override calls to System.exit to a NO-OP function in tests * ZOOKEEPER-3188: Improve resilience to network This PR is the rebase of the [previous pull request](apache#730), so all the kudos should go to the original authors... In [ZOOKEEPER-3188](https://issues.apache.org/jira/browse/ZOOKEEPER-3188) we add ability to specify several addresses for quorum operations. Also added reconnection attempts if connection to leader lost. In this PR I rebased the changes on the current master, resolving some minor conflicts with: - [ZOOKEEPER-3296](https://issues.apache.org/jira/browse/ZOOKEEPER-3296): Explicitly closing the sslsocket when it failed handshake to prevent issue where peers cannot join quorum - [ZOOKEEPER-3320](https://issues.apache.org/jira/browse/ZOOKEEPER-3320): Leader election port stop listen when hostname unresolvable for some time - [ZOOKEEPER-3385](https://issues.apache.org/jira/browse/ZOOKEEPER-3385): Add admin command to display leader - [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386): Add admin command to display voting view - [ZOOKEEPER-3398](https://issues.apache.org/jira/browse/ZOOKEEPER-3398): Learner.connectToLeader() may take too long to time-out I still want to test the feature manually (e.g. using docker containers with multiple virtual networks / interfaces). The steps to the manual test could be recorded in the [google docs](https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing) as well. Also I think we could add a few more unit tests where we are using multiple addresses. The current tests are using a single address only. Also the Zookeeper documentation needs to be changed (e.g. by a follow-up Jira?) to promote the new feature and the new config format (possibly including also the admin command documentation in relation with [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386) and [ZOOKEEPER-3461](https://issues.apache.org/jira/browse/ZOOKEEPER-3461)) Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com> Author: Mate Szalay-Beko <mszalay@cloudera.com> Reviewers: eolivelli@apache.org, andor@apache.org Closes apache#1048 from symat/ZOOKEEPER-3188 and squashes the following commits: 3c6fc52 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 356882d [Mate Szalay-Beko] ZOOKEEPER-3188: document new configuration format for using multiple addresses 45b6c0f [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 4b6bcea [Mate Szalay-Beko] ZOOKEEPER-3188: MultiAddress unit tests for Quorum TLS and Kerberos/Digest authentication 40bc44c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 f875f5c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 31805e7 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 0f95678 [Mate Szalay-Beko] ZOOKEEPER-3188: skip unreachable addresses when Learner connects to Leader e232c55 [Mate Szalay-Beko] ZOOKEEPER-3188: fix flaky unit MultiAddress unit test e892d8d [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 6f2ab75 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 2eedf26 [Mate Szalay-Beko] ZOOKEEPER-3188: fix PR commits; handle case when Leader can not bind to port on startup 483d2fc [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 a5d6bcb [Mate Szalay-Beko] ZOOKEEPER-3188: support for dynamic reconfig + add more unit tests ed31d2c [Mate Szalay-Beko] ZOOKEEPER-3188: better shutdown for executors (following PR comments) 8713a5b [Mate Szalay-Beko] ZOOKEEPER-3188: add fixes for PR comments 05eae83 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 e823af4 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 de7bad2 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 da98a8d [Mate Szalay-Beko] ZOOKEEPER-3188: fix JDK-13 warning 5bd1f4e [Mate Szalay-Beko] ZOOKEEPER-3188: supress spotbugs warning 42a52a6 [Mate Szalay-Beko] ZOOKEEPER-3188: improve based on code review comments 6c4220a [Mate Szalay-Beko] ZOOKEEPER-3188: fix SendWorker.asyncValidateIfSocketIsStillReachable 5b22432 [Mate Szalay-Beko] ZOOKEEPER-3188: fix LeaderElection to work with multiple election addresses 7bfbe7e [Mate Szalay-Beko] ZOOKEEPER-3188: Improve resilience to network * ZOOKEEPER-3633: AdminServer commands throw NPE when only secure client port is used When only secureClientPort is defined in the config and there is no regular clientPort, then both the stat and the conf commands on the AdminServer result in 500 Server Error caused by NullPointerExceptions. The problem is that no serverCnxFactory is defined in the ZooKeeperServer in this case, we have only secureServerCnxnFactory. In the fix we return info about both the secure and unsecure connections. Example of the stat command output for secure-only configuration: ``` { "version" : "3.6.0-SNAPSHOT-8e8905069f4bff670c0492fe9e28ced0f86bca00, built on 11/29/2019 08:04 GMT", "read_only" : false, "server_stats" : { "packets_sent" : 1, "packets_received" : 1, "fsync_threshold_exceed_count" : 0, "client_response_stats" : { "last_buffer_size" : -1, "min_buffer_size" : -1, "max_buffer_size" : -1 }, "data_dir_size" : 671094270, "log_dir_size" : 671094270, "last_processed_zxid" : 20, "outstanding_requests" : 0, "server_state" : "standalone", "avg_latency" : 5.0, "max_latency" : 5, "min_latency" : 5, "num_alive_client_connections" : 1, "provider_null" : false, "uptime" : 15020 }, "client_response" : { "last_buffer_size" : -1, "min_buffer_size" : -1, "max_buffer_size" : -1 }, "node_count" : 6, "connections" : [ ], "secure_connections" : [ { "remote_socket_address" : "127.0.0.1:57276", "interest_ops" : 1, "outstanding_requests" : 0, "packets_received" : 1, "packets_sent" : 1 } ], "command" : "stats", "error" : null } ``` Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com> Reviewers: Andor Molnar <andor@apache.org>, Enrico Olivelli <eolivelli@apache.org>, Norbert Kalmar <nkalmar@apache.org> Closes apache#1161 from symat/ZOOKEEPER-3633 * ZOOKEEPER-3595: restore the handling of the fsync parameter Author: Jingguo Yao <yaojingguo@gmail.com> Reviewers: enixon@apache.org, eolivelli@apache.org, maoling199210191@sina.com, fangmin@apache.org Closes apache#1146 from yaojingguo/ZOOKEEPER-3595 * ZOOKEEPER-3546: fix missed change, default should be 0 not Long.MAX_VALUE fix missed change, default should be 0 not Long.MAX_VALUE Author: randgalt <jordan@jordanzimmerman.com> Reviewers: eolivelli@apache.org, fangmin@apache.org Closes apache#1158 from Randgalt/ZOOKEEPER-3546-allow-delete-of-never-used-containers-fixit * ZOOKEEPER-3635: Use Docker and Maven Release Plugin to prepare ZooKeeper releases - configure maven-release-plugin - configure maven-scm-plugin - use maven-antrun-plugin in order to alter C client files during release procedure - update Apache Parent pom to 21 - use openjdk8 for building releases Author: Enrico Olivelli <eolivelli@apache.org> Author: Enrico Olivelli <eolivelli@gmail.com> Author: Enrico Olivelli <enrico.olivelli@diennea.com> Reviewers: Norbert Kalmar <nkalmar@apache.org> Closes apache#1162 from eolivelli/fix/ZOOKEEPER-3635-new-release-proc
This PR is the rebase of the [previous pull request](apache#730), so all the kudos should go to the original authors... In [ZOOKEEPER-3188](https://issues.apache.org/jira/browse/ZOOKEEPER-3188) we add ability to specify several addresses for quorum operations. Also added reconnection attempts if connection to leader lost. In this PR I rebased the changes on the current master, resolving some minor conflicts with: - [ZOOKEEPER-3296](https://issues.apache.org/jira/browse/ZOOKEEPER-3296): Explicitly closing the sslsocket when it failed handshake to prevent issue where peers cannot join quorum - [ZOOKEEPER-3320](https://issues.apache.org/jira/browse/ZOOKEEPER-3320): Leader election port stop listen when hostname unresolvable for some time - [ZOOKEEPER-3385](https://issues.apache.org/jira/browse/ZOOKEEPER-3385): Add admin command to display leader - [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386): Add admin command to display voting view - [ZOOKEEPER-3398](https://issues.apache.org/jira/browse/ZOOKEEPER-3398): Learner.connectToLeader() may take too long to time-out I still want to test the feature manually (e.g. using docker containers with multiple virtual networks / interfaces). The steps to the manual test could be recorded in the [google docs](https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing) as well. Also I think we could add a few more unit tests where we are using multiple addresses. The current tests are using a single address only. Also the Zookeeper documentation needs to be changed (e.g. by a follow-up Jira?) to promote the new feature and the new config format (possibly including also the admin command documentation in relation with [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386) and [ZOOKEEPER-3461](https://issues.apache.org/jira/browse/ZOOKEEPER-3461)) Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com> Author: Mate Szalay-Beko <mszalay@cloudera.com> Reviewers: eolivelli@apache.org, andor@apache.org Closes apache#1048 from symat/ZOOKEEPER-3188 and squashes the following commits: 3c6fc52 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 356882d [Mate Szalay-Beko] ZOOKEEPER-3188: document new configuration format for using multiple addresses 45b6c0f [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 4b6bcea [Mate Szalay-Beko] ZOOKEEPER-3188: MultiAddress unit tests for Quorum TLS and Kerberos/Digest authentication 40bc44c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 f875f5c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 31805e7 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 0f95678 [Mate Szalay-Beko] ZOOKEEPER-3188: skip unreachable addresses when Learner connects to Leader e232c55 [Mate Szalay-Beko] ZOOKEEPER-3188: fix flaky unit MultiAddress unit test e892d8d [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 6f2ab75 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 2eedf26 [Mate Szalay-Beko] ZOOKEEPER-3188: fix PR commits; handle case when Leader can not bind to port on startup 483d2fc [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 a5d6bcb [Mate Szalay-Beko] ZOOKEEPER-3188: support for dynamic reconfig + add more unit tests ed31d2c [Mate Szalay-Beko] ZOOKEEPER-3188: better shutdown for executors (following PR comments) 8713a5b [Mate Szalay-Beko] ZOOKEEPER-3188: add fixes for PR comments 05eae83 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 e823af4 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 de7bad2 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 da98a8d [Mate Szalay-Beko] ZOOKEEPER-3188: fix JDK-13 warning 5bd1f4e [Mate Szalay-Beko] ZOOKEEPER-3188: supress spotbugs warning 42a52a6 [Mate Szalay-Beko] ZOOKEEPER-3188: improve based on code review comments 6c4220a [Mate Szalay-Beko] ZOOKEEPER-3188: fix SendWorker.asyncValidateIfSocketIsStillReachable 5b22432 [Mate Szalay-Beko] ZOOKEEPER-3188: fix LeaderElection to work with multiple election addresses 7bfbe7e [Mate Szalay-Beko] ZOOKEEPER-3188: Improve resilience to network
This PR is the rebase of the [previous pull request](apache#730), so all the kudos should go to the original authors... In [ZOOKEEPER-3188](https://issues.apache.org/jira/browse/ZOOKEEPER-3188) we add ability to specify several addresses for quorum operations. Also added reconnection attempts if connection to leader lost. In this PR I rebased the changes on the current master, resolving some minor conflicts with: - [ZOOKEEPER-3296](https://issues.apache.org/jira/browse/ZOOKEEPER-3296): Explicitly closing the sslsocket when it failed handshake to prevent issue where peers cannot join quorum - [ZOOKEEPER-3320](https://issues.apache.org/jira/browse/ZOOKEEPER-3320): Leader election port stop listen when hostname unresolvable for some time - [ZOOKEEPER-3385](https://issues.apache.org/jira/browse/ZOOKEEPER-3385): Add admin command to display leader - [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386): Add admin command to display voting view - [ZOOKEEPER-3398](https://issues.apache.org/jira/browse/ZOOKEEPER-3398): Learner.connectToLeader() may take too long to time-out I still want to test the feature manually (e.g. using docker containers with multiple virtual networks / interfaces). The steps to the manual test could be recorded in the [google docs](https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing) as well. Also I think we could add a few more unit tests where we are using multiple addresses. The current tests are using a single address only. Also the Zookeeper documentation needs to be changed (e.g. by a follow-up Jira?) to promote the new feature and the new config format (possibly including also the admin command documentation in relation with [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386) and [ZOOKEEPER-3461](https://issues.apache.org/jira/browse/ZOOKEEPER-3461)) Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com> Author: Mate Szalay-Beko <mszalay@cloudera.com> Reviewers: eolivelli@apache.org, andor@apache.org Closes apache#1048 from symat/ZOOKEEPER-3188 and squashes the following commits: 3c6fc52 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 356882d [Mate Szalay-Beko] ZOOKEEPER-3188: document new configuration format for using multiple addresses 45b6c0f [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 4b6bcea [Mate Szalay-Beko] ZOOKEEPER-3188: MultiAddress unit tests for Quorum TLS and Kerberos/Digest authentication 40bc44c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 f875f5c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 31805e7 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 0f95678 [Mate Szalay-Beko] ZOOKEEPER-3188: skip unreachable addresses when Learner connects to Leader e232c55 [Mate Szalay-Beko] ZOOKEEPER-3188: fix flaky unit MultiAddress unit test e892d8d [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 6f2ab75 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 2eedf26 [Mate Szalay-Beko] ZOOKEEPER-3188: fix PR commits; handle case when Leader can not bind to port on startup 483d2fc [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 a5d6bcb [Mate Szalay-Beko] ZOOKEEPER-3188: support for dynamic reconfig + add more unit tests ed31d2c [Mate Szalay-Beko] ZOOKEEPER-3188: better shutdown for executors (following PR comments) 8713a5b [Mate Szalay-Beko] ZOOKEEPER-3188: add fixes for PR comments 05eae83 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 e823af4 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 de7bad2 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 da98a8d [Mate Szalay-Beko] ZOOKEEPER-3188: fix JDK-13 warning 5bd1f4e [Mate Szalay-Beko] ZOOKEEPER-3188: supress spotbugs warning 42a52a6 [Mate Szalay-Beko] ZOOKEEPER-3188: improve based on code review comments 6c4220a [Mate Szalay-Beko] ZOOKEEPER-3188: fix SendWorker.asyncValidateIfSocketIsStillReachable 5b22432 [Mate Szalay-Beko] ZOOKEEPER-3188: fix LeaderElection to work with multiple election addresses 7bfbe7e [Mate Szalay-Beko] ZOOKEEPER-3188: Improve resilience to network
Can we backport the fix to 3.5.8? I am having some trouble with zookeeper 3.6.1 https://issues.apache.org/jira/browse/ZOOKEEPER-3828 And this fix is the only reason I need 3.6.1, for now. |
Sorry @aishwaryasoni1991, this is a major new feature (also causing some internal protocol changes), our policy is that we don't backport these with bugfix releases to make sure the 3.5 remains stable. |
What problem you have on 3.6.1? |
@symat @eolivelli Thank you for responding. |
This PR is the rebase of the [previous pull request](apache#730), so all the kudos should go to the original authors... In [ZOOKEEPER-3188](https://issues.apache.org/jira/browse/ZOOKEEPER-3188) we add ability to specify several addresses for quorum operations. Also added reconnection attempts if connection to leader lost. In this PR I rebased the changes on the current master, resolving some minor conflicts with: - [ZOOKEEPER-3296](https://issues.apache.org/jira/browse/ZOOKEEPER-3296): Explicitly closing the sslsocket when it failed handshake to prevent issue where peers cannot join quorum - [ZOOKEEPER-3320](https://issues.apache.org/jira/browse/ZOOKEEPER-3320): Leader election port stop listen when hostname unresolvable for some time - [ZOOKEEPER-3385](https://issues.apache.org/jira/browse/ZOOKEEPER-3385): Add admin command to display leader - [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386): Add admin command to display voting view - [ZOOKEEPER-3398](https://issues.apache.org/jira/browse/ZOOKEEPER-3398): Learner.connectToLeader() may take too long to time-out I still want to test the feature manually (e.g. using docker containers with multiple virtual networks / interfaces). The steps to the manual test could be recorded in the [google docs](https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing) as well. Also I think we could add a few more unit tests where we are using multiple addresses. The current tests are using a single address only. Also the Zookeeper documentation needs to be changed (e.g. by a follow-up Jira?) to promote the new feature and the new config format (possibly including also the admin command documentation in relation with [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386) and [ZOOKEEPER-3461](https://issues.apache.org/jira/browse/ZOOKEEPER-3461)) Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com> Author: Mate Szalay-Beko <mszalay@cloudera.com> Reviewers: eolivelli@apache.org, andor@apache.org Closes apache#1048 from symat/ZOOKEEPER-3188 and squashes the following commits: 3c6fc52 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 356882d [Mate Szalay-Beko] ZOOKEEPER-3188: document new configuration format for using multiple addresses 45b6c0f [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 4b6bcea [Mate Szalay-Beko] ZOOKEEPER-3188: MultiAddress unit tests for Quorum TLS and Kerberos/Digest authentication 40bc44c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 f875f5c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 31805e7 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 0f95678 [Mate Szalay-Beko] ZOOKEEPER-3188: skip unreachable addresses when Learner connects to Leader e232c55 [Mate Szalay-Beko] ZOOKEEPER-3188: fix flaky unit MultiAddress unit test e892d8d [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 6f2ab75 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 2eedf26 [Mate Szalay-Beko] ZOOKEEPER-3188: fix PR commits; handle case when Leader can not bind to port on startup 483d2fc [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 a5d6bcb [Mate Szalay-Beko] ZOOKEEPER-3188: support for dynamic reconfig + add more unit tests ed31d2c [Mate Szalay-Beko] ZOOKEEPER-3188: better shutdown for executors (following PR comments) 8713a5b [Mate Szalay-Beko] ZOOKEEPER-3188: add fixes for PR comments 05eae83 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 e823af4 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 de7bad2 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 da98a8d [Mate Szalay-Beko] ZOOKEEPER-3188: fix JDK-13 warning 5bd1f4e [Mate Szalay-Beko] ZOOKEEPER-3188: supress spotbugs warning 42a52a6 [Mate Szalay-Beko] ZOOKEEPER-3188: improve based on code review comments 6c4220a [Mate Szalay-Beko] ZOOKEEPER-3188: fix SendWorker.asyncValidateIfSocketIsStillReachable 5b22432 [Mate Szalay-Beko] ZOOKEEPER-3188: fix LeaderElection to work with multiple election addresses 7bfbe7e [Mate Szalay-Beko] ZOOKEEPER-3188: Improve resilience to network
This PR is the rebase of the [previous pull request](apache#730), so all the kudos should go to the original authors... In [ZOOKEEPER-3188](https://issues.apache.org/jira/browse/ZOOKEEPER-3188) we add ability to specify several addresses for quorum operations. Also added reconnection attempts if connection to leader lost. In this PR I rebased the changes on the current master, resolving some minor conflicts with: - [ZOOKEEPER-3296](https://issues.apache.org/jira/browse/ZOOKEEPER-3296): Explicitly closing the sslsocket when it failed handshake to prevent issue where peers cannot join quorum - [ZOOKEEPER-3320](https://issues.apache.org/jira/browse/ZOOKEEPER-3320): Leader election port stop listen when hostname unresolvable for some time - [ZOOKEEPER-3385](https://issues.apache.org/jira/browse/ZOOKEEPER-3385): Add admin command to display leader - [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386): Add admin command to display voting view - [ZOOKEEPER-3398](https://issues.apache.org/jira/browse/ZOOKEEPER-3398): Learner.connectToLeader() may take too long to time-out I still want to test the feature manually (e.g. using docker containers with multiple virtual networks / interfaces). The steps to the manual test could be recorded in the [google docs](https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing) as well. Also I think we could add a few more unit tests where we are using multiple addresses. The current tests are using a single address only. Also the Zookeeper documentation needs to be changed (e.g. by a follow-up Jira?) to promote the new feature and the new config format (possibly including also the admin command documentation in relation with [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386) and [ZOOKEEPER-3461](https://issues.apache.org/jira/browse/ZOOKEEPER-3461)) Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com> Author: Mate Szalay-Beko <mszalay@cloudera.com> Reviewers: eolivelli@apache.org, andor@apache.org Closes apache#1048 from symat/ZOOKEEPER-3188 and squashes the following commits: 3c6fc52 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 356882d [Mate Szalay-Beko] ZOOKEEPER-3188: document new configuration format for using multiple addresses 45b6c0f [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 4b6bcea [Mate Szalay-Beko] ZOOKEEPER-3188: MultiAddress unit tests for Quorum TLS and Kerberos/Digest authentication 40bc44c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 f875f5c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 31805e7 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 0f95678 [Mate Szalay-Beko] ZOOKEEPER-3188: skip unreachable addresses when Learner connects to Leader e232c55 [Mate Szalay-Beko] ZOOKEEPER-3188: fix flaky unit MultiAddress unit test e892d8d [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 6f2ab75 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 2eedf26 [Mate Szalay-Beko] ZOOKEEPER-3188: fix PR commits; handle case when Leader can not bind to port on startup 483d2fc [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 a5d6bcb [Mate Szalay-Beko] ZOOKEEPER-3188: support for dynamic reconfig + add more unit tests ed31d2c [Mate Szalay-Beko] ZOOKEEPER-3188: better shutdown for executors (following PR comments) 8713a5b [Mate Szalay-Beko] ZOOKEEPER-3188: add fixes for PR comments 05eae83 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 e823af4 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 de7bad2 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 da98a8d [Mate Szalay-Beko] ZOOKEEPER-3188: fix JDK-13 warning 5bd1f4e [Mate Szalay-Beko] ZOOKEEPER-3188: supress spotbugs warning 42a52a6 [Mate Szalay-Beko] ZOOKEEPER-3188: improve based on code review comments 6c4220a [Mate Szalay-Beko] ZOOKEEPER-3188: fix SendWorker.asyncValidateIfSocketIsStillReachable 5b22432 [Mate Szalay-Beko] ZOOKEEPER-3188: fix LeaderElection to work with multiple election addresses 7bfbe7e [Mate Szalay-Beko] ZOOKEEPER-3188: Improve resilience to network
This PR is the rebase of the [previous pull request](apache#730), so all the kudos should go to the original authors... In [ZOOKEEPER-3188](https://issues.apache.org/jira/browse/ZOOKEEPER-3188) we add ability to specify several addresses for quorum operations. Also added reconnection attempts if connection to leader lost. In this PR I rebased the changes on the current master, resolving some minor conflicts with: - [ZOOKEEPER-3296](https://issues.apache.org/jira/browse/ZOOKEEPER-3296): Explicitly closing the sslsocket when it failed handshake to prevent issue where peers cannot join quorum - [ZOOKEEPER-3320](https://issues.apache.org/jira/browse/ZOOKEEPER-3320): Leader election port stop listen when hostname unresolvable for some time - [ZOOKEEPER-3385](https://issues.apache.org/jira/browse/ZOOKEEPER-3385): Add admin command to display leader - [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386): Add admin command to display voting view - [ZOOKEEPER-3398](https://issues.apache.org/jira/browse/ZOOKEEPER-3398): Learner.connectToLeader() may take too long to time-out I still want to test the feature manually (e.g. using docker containers with multiple virtual networks / interfaces). The steps to the manual test could be recorded in the [google docs](https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing) as well. Also I think we could add a few more unit tests where we are using multiple addresses. The current tests are using a single address only. Also the Zookeeper documentation needs to be changed (e.g. by a follow-up Jira?) to promote the new feature and the new config format (possibly including also the admin command documentation in relation with [ZOOKEEPER-3386](https://issues.apache.org/jira/browse/ZOOKEEPER-3386) and [ZOOKEEPER-3461](https://issues.apache.org/jira/browse/ZOOKEEPER-3461)) Author: Mate Szalay-Beko <szalay.beko.mate@gmail.com> Author: Mate Szalay-Beko <mszalay@cloudera.com> Reviewers: eolivelli@apache.org, andor@apache.org Closes apache#1048 from symat/ZOOKEEPER-3188 and squashes the following commits: 3c6fc52 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 356882d [Mate Szalay-Beko] ZOOKEEPER-3188: document new configuration format for using multiple addresses 45b6c0f [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 4b6bcea [Mate Szalay-Beko] ZOOKEEPER-3188: MultiAddress unit tests for Quorum TLS and Kerberos/Digest authentication 40bc44c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 f875f5c [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 31805e7 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 0f95678 [Mate Szalay-Beko] ZOOKEEPER-3188: skip unreachable addresses when Learner connects to Leader e232c55 [Mate Szalay-Beko] ZOOKEEPER-3188: fix flaky unit MultiAddress unit test e892d8d [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 6f2ab75 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 2eedf26 [Mate Szalay-Beko] ZOOKEEPER-3188: fix PR commits; handle case when Leader can not bind to port on startup 483d2fc [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 a5d6bcb [Mate Szalay-Beko] ZOOKEEPER-3188: support for dynamic reconfig + add more unit tests ed31d2c [Mate Szalay-Beko] ZOOKEEPER-3188: better shutdown for executors (following PR comments) 8713a5b [Mate Szalay-Beko] ZOOKEEPER-3188: add fixes for PR comments 05eae83 [Mate Szalay-Beko] Merge remote-tracking branch 'apache/master' into ZOOKEEPER-3188 e823af4 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 de7bad2 [Mate Szalay-Beko] Merge remote-tracking branch 'origin/master' into ZOOKEEPER-3188 da98a8d [Mate Szalay-Beko] ZOOKEEPER-3188: fix JDK-13 warning 5bd1f4e [Mate Szalay-Beko] ZOOKEEPER-3188: supress spotbugs warning 42a52a6 [Mate Szalay-Beko] ZOOKEEPER-3188: improve based on code review comments 6c4220a [Mate Szalay-Beko] ZOOKEEPER-3188: fix SendWorker.asyncValidateIfSocketIsStillReachable 5b22432 [Mate Szalay-Beko] ZOOKEEPER-3188: fix LeaderElection to work with multiple election addresses 7bfbe7e [Mate Szalay-Beko] ZOOKEEPER-3188: Improve resilience to network
Motivation: Zookeeper 3.6 added a new option that was ability to set mulitple addersss for quorum operations. apache/zookeeper#1048 The multiaddres was not compatible with the old protocol used in ZooKeeper 3.5.x, so the option has been disabled by apache/zookeeper#1251. In apache/zookeeper#1251, the default value of `QuorumPeer.multiAddressEnabled` was changed. https://github.com/apache/zookeeper/blob/e08cc2a782982964a57651f179a468b19e2e6010/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumPeer.java#L174-L180 Since CentralDogma directly extends `QuorumPeer`, `zookeeper.multiadress` is activated automatically and Central Dogma using Zookeeper 3.5.x can't be upgraded with rolling deployment. Modifications: - Set `multiAddressEnabled` to false when creating `EmbeddedZooKeeper` Result: - Fixed an issue where CentralDogma does not perform rolling updates due to Zookeeper invalid protocol version. - Note that this bug affects 0.61.2, 0.61.3 and 0.61.4
Motivation: Zookeeper 3.6 added a new option that was ability to set mulitple addersss for quorum operations. apache/zookeeper#1048 The multiaddres was not compatible with the old protocol used in ZooKeeper 3.5.x, so the option has been disabled by apache/zookeeper#1251. In apache/zookeeper#1251, the default value of `QuorumPeer.multiAddressEnabled` was changed. https://github.com/apache/zookeeper/blob/e08cc2a782982964a57651f179a468b19e2e6010/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumPeer.java#L174-L180 Since CentralDogma directly extends `QuorumPeer`, `zookeeper.multiadress` is activated automatically and Central Dogma using Zookeeper 3.5.x can't be upgraded with rolling deployment. Modifications: - Set `multiAddressEnabled` to false when creating `EmbeddedZooKeeper` Result: - Fixed an issue where CentralDogma does not perform rolling updates due to Zookeeper invalid protocol version. - Note that this bug affects 0.61.2, 0.61.3 and 0.61.4
Motivation: Zookeeper 3.6 added a new option that was ability to set multiple adders for quorum operations. apache/zookeeper#1048 The multi address was not compatible with the old protocol used in ZooKeeper 3.5.x, so apache/zookeeper#1251 has disabled the option. During fixing incompatibility in apache/zookeeper#1251, it omitted to change the default value of `QuorumPeer.multiAddressEnabled` to `false`. https://github.com/apache/zookeeper/blob/e08cc2a782982964a57651f179a468b19e2e6010/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumPeer.java#L174-L180 The default is correctly set to false when a Zookeeper server is used as a standalone mode. However, if a new `QuorumPeer` instance is manually created, it will remain `true.` Since CentralDogma directly extends `QuorumPeer`, `zookeeper.multiadress` is activated automatically and Central Dogma using Zookeeper 3.5.x can't be upgraded with a rolling deployment. Modifications: - Set `multiAddressEnabled` to false when creating `EmbeddedZooKeeper` Result: - Fixed an issue where CentralDogma does not perform rolling updates due to the invalid protocol version of Zookeeper. - Note that this bug affects 0.61.2, 0.61.3 and 0.61.4
This PR is the rebase of the previous pull request, so all the kudos should go to the original authors...
In ZOOKEEPER-3188 we add ability to specify several addresses for quorum operations. Also added reconnection attempts if connection to leader lost.
In this PR I rebased the changes on the current master, resolving some minor conflicts with:
I still want to test the feature manually (e.g. using docker containers with multiple virtual networks / interfaces). The steps to the manual test could be recorded in the google docs as well.
Also I think we could add a few more unit tests where we are using multiple addresses. The current tests are using a single address only.
Also the Zookeeper documentation needs to be changed (e.g. by a follow-up Jira?) to promote the new feature and the new config format (possibly including also the admin command documentation in relation with ZOOKEEPER-3386 and ZOOKEEPER-3461)