
zoo-0.zoo is not defined in the deployment #203

Closed
samv opened this issue Sep 21, 2018 · 3 comments

samv commented Sep 21, 2018

The example/default configuration file lists 5 servers, including:

server.4=zoo-0.zoo:2888:3888:participant

This looks like a mistake. It happens to work because, with 5 servers defined, ZooKeeper treats the 3 nodes in the statefulset as a majority and therefore a quorum. However, it is extremely fragile: the loss of any single node drops the ensemble below quorum and takes the ZK cluster (and hence, the kafka deployment) hard down; e.g. bootstrap times out:

$ kafkacat -b k8s.internal.example.com:32401 -L
% ERROR: Failed to acquire metadata: Local: Broker transport failure

Observed log lines from zookeeper:

Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running (org.apache.zookeeper.server.NIOServerCnxn)
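The quorum arithmetic behind this failure mode can be sketched as follows (a minimal illustration, not ZooKeeper's actual code: ZooKeeper requires a strict majority of the *configured* ensemble to be alive, regardless of how many pods actually exist):

```python
# Sketch of ZooKeeper's quorum rule: a strict majority of the
# configured server list must be reachable.
def has_quorum(configured_servers: int, live_servers: int) -> bool:
    return live_servers > configured_servers // 2

# 5 servers configured, 3 pods actually running: quorum holds by luck.
assert has_quorum(5, 3)
# Losing a single pod drops the ensemble below quorum.
assert not has_quorum(5, 2)
# With only the 3 real servers configured, one failure is tolerated.
assert has_quorum(3, 2)
```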

I will be testing this theory out soon by removing these two lines and seeing if zk stays happy with a single statefulset node failure.

While I'm here: the statefulsets should be defined with the Parallel pod management policy, so that if e.g. broker 0 goes down, the statefulset doesn't hold brokers 1+ behind an ordered restart, and the system can recover from multi-node failures faster.
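A minimal sketch of that suggestion, assuming a statefulset manifest like this repo's (the `podManagementPolicy` field is from the Kubernetes StatefulSet API; the name, labels and image below are illustrative, not the repo's actual manifest):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka  # illustrative; the same field applies to the zookeeper statefulsets
spec:
  # Default is OrderedReady, which creates/replaces pods strictly in ordinal
  # order and waits for each to be Ready. Parallel lets the controller act on
  # all pods independently, so multi-node failures recover without queueing.
  podManagementPolicy: Parallel
  serviceName: broker
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: broker
        image: kafka:example  # placeholder image
```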

solsson (Contributor) commented Sep 21, 2018

Does this have to do with #34? Do you have two statefulsets or one? I agree with the argument for parallel pod management, but the dependence on zones through volumes is the essential backing of the argument. #191 is an alternative approach. And given the age of this repository I have to accept legacy.

samv (Author) commented Sep 21, 2018

My mistake; I searched for a 'zoo' deployment but didn't see it because it had been removed from our fork, so yes, I only had one zookeeper statefulset. Thanks for the pointers!

samv closed this as completed Sep 21, 2018
solsson (Contributor) commented Sep 23, 2018

Yeah, this is where templating would do wonders in this repo: generating the server.X list in the zookeeper conf. I also thought of generating it in the init container based on looking up the scale of the two statefulsets, but that would introduce a lot more complexity. Helm would really obscure things, and I haven't yet had a proper look at kustomize. I'd very much appreciate a contribution.
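The generation idea above could be sketched like this (a hypothetical helper, not the repo's actual code; the statefulset/service names `pzoo` and `zoo`, the replica counts, and the port numbers are assumptions matching the `server.4=zoo-0.zoo:...` entry from the issue):

```python
# Hypothetical sketch: derive the zoo.cfg server.X list from the scale of
# two statefulsets, instead of hardcoding it in the config file.
def server_lines(ensembles):
    """ensembles: list of (statefulset_name, replicas) pairs.

    Assumes each statefulset has a headless service of the same name, so pod
    DNS is <statefulset>-<ordinal>.<service>.
    """
    lines = []
    server_id = 1
    for name, replicas in ensembles:
        for ordinal in range(replicas):
            lines.append(
                f"server.{server_id}={name}-{ordinal}.{name}:2888:3888:participant"
            )
            server_id += 1
    return lines

# With 3 persistent and 2 zone-local replicas this reproduces the 5-entry
# list from the issue, including server.4=zoo-0.zoo:2888:3888:participant.
print("\n".join(server_lines([("pzoo", 3), ("zoo", 2)])))
```

Scaling either statefulset would then only require re-running the generator, rather than editing the config by hand.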
