Zookeeper.new blows up if any host cannot be resolved #21

Closed
mrgordon opened this Issue Jul 30, 2012 · 6 comments

Contributor

mrgordon commented Jul 30, 2012

If I pass multiple hosts to Zookeeper.new, I get an exception if any of the hosts cannot be resolved. However, I connect successfully if the hosts can be resolved but do not have Zookeeper running. Finally, connecting to remote hosts that resolve but are unreachable fails (which is what I would expect), but I get back a Zookeeper::Client object if I connect to non-existent Zookeepers on localhost.

For example, if I have Zookeeper running on localhost:2181:

Zookeeper.new('localhost:2181,a:2181')
=> RuntimeError: error connecting to zookeeper: 2

Zookeeper.new('localhost:2181,google.com:2181')
=> #<Zookeeper::Client:0x1106c3228 @host="localhost:2181,google.com:2181"...>

Zookeeper.new("localhost:1,localhost:2")
=> #<Zookeeper::Client:0x110d73578 @host="localhost:1,localhost:2"...>

I'm not sure if this is intended behavior but it caught me by surprise.

Contributor

slyphon commented Aug 6, 2012

I have a feeling the only option here is to raise a specific Zookeeper
error in this case. I'll look into adding that.

syrnick commented Jan 28, 2013

A better behavior would be to accept the config if the quorum is available. It's scary if your cluster is up, but one missing node makes the client library explode.

Contributor

slyphon commented Jan 29, 2013

I hear you, it's shitty behavior and not helpful when you're trying to debug a problem. Unfortunately, I think this is the least-bad choice given the underlying layers.

This is the behavior of the underlying zookeeper client, which doesn't provide any information as to which host it puked on. In order to accomplish this we'd need to parse the host/port list, try each in turn, drop out failures, and then connect with the remaining config. Sessions are expensive, and this would cause a fair amount of extra load on the cluster. This would also put the user in the position of not knowing what their active configuration was, and it could easily put the cluster in a state where you had inconsistent host lists, which is warned against in the docs.

syrnick commented Jan 29, 2013

That sounds like 10 lines of code :). It could happen only if the full list doesn't work.

I totally see the point of having a consistent list of hosts, but a subset should be strictly OK (per that doc).

This behavior significantly changes the contract of ZK deployment with ruby - you essentially have to have 100% of the nodes up (in DNS).
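
A rough sketch of the kind of pre-filtering discussed here, for illustration only. It is a narrower variant than "try each host in turn": it checks DNS resolution only and opens no Zookeeper sessions. `resolvable_host_string` and `DEFAULT_RESOLVER` are hypothetical names, not part of the zookeeper gem, and the sketch assumes plain `host:port` entries (no IPv6 literals).

```ruby
require 'socket'

# Hypothetical helper (not part of the zookeeper gem): true if the hostname
# resolves through the system resolver, false if resolution fails.
DEFAULT_RESOLVER = lambda do |hostname|
  begin
    Addrinfo.getaddrinfo(hostname, nil)
    true
  rescue SocketError
    false
  end
end

# Hypothetical helper: keeps only the "host:port" entries whose hostname
# resolves and returns them as a comma-separated connection string.
# Raises ArgumentError if nothing is left.
def resolvable_host_string(host_string, resolver: DEFAULT_RESOLVER)
  entries = host_string.split(',').map(&:strip).reject(&:empty?)
  kept = entries.select do |entry|
    hostname = entry.split(':').first
    resolver.call(hostname)
  end
  raise ArgumentError, "no resolvable hosts in #{host_string.inspect}" if kept.empty?
  kept.join(',')
end
```

The filtered string could then be passed to Zookeeper.new in place of the full list, for example only after the full list has failed. The caveats raised above still apply: the client ends up running with a host list that differs from the configured one, and different clients may end up with different lists.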

felixb commented Mar 18, 2015

Is there any progress on this issue?

Contributor

slyphon commented Mar 21, 2015

won't fix, sorry.

slyphon closed this Mar 21, 2015

tobiashm added a commit to karnov/zookeeper that referenced this issue Mar 1, 2017

Upgrade C library to version 3.4.9
This fixes a few issues, among those a failure when a bad hostname is given zk-ruby#21
9d4df02

tobiashm added a commit to karnov/zookeeper that referenced this issue Mar 1, 2017

Upgrade C library to version 3.4.9
This fixes a few issues, among those a failure when a bad hostname is given.
See also zk-ruby#21 and https://issues.apache.org/jira/browse/ZOOKEEPER-1029
2b6b459