Fix v5.0 regression that left zoo pods unreachable #231

Merged: 1 commit, Dec 2, 2018


solsson (Contributor) commented on Dec 2, 2018

The destabilization effort with v5.0.0 is going great :)

I didn't notice this flaw until I experimented with scaling for #228 (comment). The label change in #191 should have been reflected in the zoo service but wasn't. It would have been apparent if I had watched the logs during testing, which showed errors like:

[2018-12-02 20:00:24,587] WARN Failed to resolve address: zoo-1.zoo (org.apache.zookeeper.server.quorum.QuorumPeer)
java.net.UnknownHostException: zoo-1.zoo
	at java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:797)
	at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1505)
	at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1364)
	at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1298)
	at java.base/java.net.InetAddress.getByName(InetAddress.java:1248)
	at org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer.recreateSocketAddresses(QuorumPeer.java:180)
	at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:591)
	at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:534)
	at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:454)
	at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:435)
	at java.base/java.lang.Thread.run(Thread.java:834)
[2018-12-02 20:00:24,587] WARN Cannot open channel to 6 at election address zoo-2.zoo:3888 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
java.net.UnknownHostException: zoo-2.zoo
	at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:220)
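
To illustrate the kind of mismatch that produces errors like these, here is a minimal sketch of how a Service's label selector is compared against pod labels. The label keys and values below are hypothetical and are not copied from #191 or from this repository's manifests; the point is only that a selector still using the old labels matches none of the relabeled pods, so per-pod names such as zoo-1.zoo get no DNS records.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """Return True if every key/value pair in the selector is present in pod_labels.

    This mirrors equality-based label selection: extra labels on the pod
    are fine, but every selector entry must match exactly.
    """
    if not selector:
        # A Service without a selector gets no automatically managed endpoints.
        return False
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Hypothetical labels, for illustration only.
stale_service_selector = {"app": "zookeeper-old"}
updated_service_selector = {"app": "zookeeper-new"}
relabeled_pod = {"app": "zookeeper-new", "storage": "persistent"}

stale_ok = selector_matches(stale_service_selector, relabeled_pod)
updated_ok = selector_matches(updated_service_selector, relabeled_pod)
```

With these made-up values, `stale_ok` is false and `updated_ok` is true, which is the difference between the broken and the fixed service.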

Maybe we should export Prometheus metrics based on the mntr command, see https://zookeeper.apache.org/doc/r3.4.13/zookeeperAdmin.html#sc_zkCommands.
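
As a rough sketch of what such an export could look like (not something this PR implements): `mntr` is sent as a four-letter word to the ZooKeeper client port and answers with one tab-separated key/value pair per line. The function names, the default port 2181, and the decision to drop non-numeric values such as `zk_version` and `zk_server_state` are assumptions made for this example.

```python
import socket


def fetch_mntr(host: str, port: int = 2181, timeout: float = 5.0) -> str:
    """Send the mntr four-letter word and return the raw text response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text: str) -> dict:
    """Parse mntr output into a {metric_name: float} dict, skipping non-numeric values."""
    metrics = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # key, then the rest of the line
        if len(parts) != 2:
            continue
        key, value = parts
        try:
            metrics[key] = float(value)
        except ValueError:
            continue  # e.g. zk_version or zk_server_state
    return metrics


def to_prometheus(metrics: dict) -> str:
    """Render metrics as plain 'name value' lines in the Prometheus text format."""
    return "".join(f"{name} {value}\n" for name, value in sorted(metrics.items()))
```

Usage would be something like `print(to_prometheus(parse_mntr(fetch_mntr("zoo-0.zoo"))))`, run from a pod that can resolve that name. A real exporter would also need to serve the result over HTTP and probably map `zk_server_state` to a labeled gauge.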
