The example/default configuration file lists 5 servers:

kubernetes-kafka/zookeeper/10zookeeper-config.yml, line 27 at 727899a

This looks like a mistake. It happens to work because five servers are defined and, with three nodes in the StatefulSet, ZooKeeper still counts three out of five as a quorum. However, it is extremely fragile: an outage of any single node drops the ensemble to two of five, below quorum, and takes the ZK cluster (and hence the Kafka deployment) hard down; e.g. bootstrap times out:
$ kafkacat -b k8s.internal.example.com:32401 -L
% ERROR: Failed to acquire metadata: Local: Broker transport failure
Observed log lines from zookeeper:
Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running (org.apache.zookeeper.server.NIOServerCnxn)
I will be testing this theory out soon by removing these two lines and seeing if zk stays happy with a single statefulset node failure.
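For illustration, a minimal sketch of the idea, with placeholder hostnames rather than the actual entries in 10zookeeper-config.yml. With five entries the quorum is three, so all three running pods must stay up; with only the three real entries the quorum is two and a single pod failure is survivable:

```
# sketch of a 5-entry ensemble where only 3 pods actually exist (hostnames are placeholders)
server.1=zookeeper-0.zookeeper:2888:3888
server.2=zookeeper-1.zookeeper:2888:3888
server.3=zookeeper-2.zookeeper:2888:3888
# the two entries below have no matching pod; removing them brings the quorum down to 2
server.4=zookeeper-3.zookeeper:2888:3888
server.5=zookeeper-4.zookeeper:2888:3888
```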
While I'm here, the StatefulSets should be defined with a Parallel pod management policy so that if e.g. broker 0 goes down, the StatefulSet doesn't hold back the recreation of brokers 1+ waiting on broker 0, and the system can recover from multi-node failures faster.
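As a sketch of what that change looks like on a broker StatefulSet (the podManagementPolicy field is standard apps/v1; the names and image below are illustrative, not the repo's actual manifest):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka            # illustrative; the repo's broker StatefulSet name may differ
spec:
  serviceName: broker    # illustrative headless service name
  replicas: 3
  # default is OrderedReady, which creates and recovers pods one at a time;
  # Parallel lets the controller act on all pods at once
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
        - name: broker
          image: kafka:placeholder   # placeholder image reference
```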
Does this have to do with #34? Do you have two statefulsets or one? I agree with the argument for parallel pod management, but the dependence on zones through volumes is the essential backing for that argument. #191 is an alternative approach. And given the age of this repository I have to accept some legacy.
My mistake; I searched for a 'zoo' deployment but didn't see it because it had been removed from our fork, so yes, I only had one ZooKeeper StatefulSet. Thanks for the pointers!
Yeah, this is where templating would do wonders in this repo, generating the server.X list in the ZooKeeper config. I also thought of generating it in the init container by looking up the scale of the two StatefulSets, but that would introduce a lot more complexity. Helm would really obscure things, and I haven't yet had a proper look at Kustomize. I'd very much appreciate a contribution.
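For reference, a rough sketch of what the init-container approach could look like, generating the server.X list from the replica counts of two hypothetical StatefulSets named pzoo and zoo (the names, output path, and required RBAC are assumptions, not the repo's actual setup):

```sh
#!/bin/sh
# Look up desired replica counts (needs a service account allowed to get statefulsets)
PZOO_REPLICAS=$(kubectl get statefulset pzoo -o jsonpath='{.spec.replicas}')
ZOO_REPLICAS=$(kubectl get statefulset zoo -o jsonpath='{.spec.replicas}')

# Append one server.N entry per expected pod to the generated config
{
  i=1
  n=0
  while [ "$n" -lt "$PZOO_REPLICAS" ]; do
    echo "server.$i=pzoo-$n.pzoo:2888:3888"
    i=$((i + 1)); n=$((n + 1))
  done
  n=0
  while [ "$n" -lt "$ZOO_REPLICAS" ]; do
    echo "server.$i=zoo-$n.zoo:2888:3888"
    i=$((i + 1)); n=$((n + 1))
  done
} >> /etc/zookeeper/zoo.cfg   # path is a guess; use wherever the init script writes zoo.cfg
```

This illustrates the extra moving parts (kubectl in the init image, RBAC) that templating the list once per environment would avoid.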