Zookeeper persistent data? #26
Comments
Yes you can. That's part of the reason why we use StatefulSet, the other being that we get predictable host names from which we can deduce an identity number. We have automated topic creation, so if all Zookeeper nodes went down we would re-run that automation and Kafka would pick up all topics. If you don't want to run the risk of having to do so, persistent storage the way you suggest is preferable. |
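To illustrate the point about deducing an identity number from a predictable host name, here is a minimal Python sketch. It assumes pod host names of the form `<name>-<ordinal>` (the StatefulSet naming convention) and a hypothetical offset of 1, since Zookeeper expects `myid` values between 1 and 255. The function names and the example host name `zoo-2` are made up for illustration and are not taken from this repository.

```python
import re


def ordinal_from_hostname(hostname: str) -> int:
    """Return the trailing StatefulSet ordinal from a pod host name.

    Accepts either a short name ("zoo-2") or a fully qualified one
    ("zoo-2.zoo.kafka.svc.cluster.local"); only the first label is used.
    """
    short_name = hostname.split(".")[0]
    match = re.search(r"-(\d+)$", short_name)
    if match is None:
        raise ValueError(f"no trailing ordinal in host name: {hostname!r}")
    return int(match.group(1))


def zookeeper_id(hostname: str, offset: int = 1) -> int:
    """Map a zero-based ordinal to a Zookeeper server id.

    The offset of 1 is an assumption: ordinals start at 0, while
    Zookeeper ids are expected to be in the range 1..255.
    """
    server_id = ordinal_from_hostname(hostname) + offset
    if not 1 <= server_id <= 255:
        raise ValueError(f"server id out of range: {server_id}")
    return server_id
```

For example, `zookeeper_id("zoo-2")` maps the third pod to id 3. A second StatefulSet sharing the same ensemble would need a different offset so that ids do not collide.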
Done in https://github.com/Yolean/kubernetes-kafka/releases/tag/v2.0.0. To make Zookeeper more robust in the face of zone outages I made (by default) two pods ephemeral and three with persistent volumes. |
I was wondering about that, looks like an interesting way to handle zone failures (e.g. most cloud block storage is zone-specific). What are the failure scenarios that a mix of persistent and ephemeral nodes helps with, compared to just having, e.g., 3 persistent nodes or 5 persistent nodes? |
Let's start with Kafka. We use three zones. Thus we start with three Kafka brokers, and default replication factor 3. To increase throughput we can scale to 6, 9, 12 ... brokers and add 1 partition per such step. We can run with producer With Zookeeper the maths are less appealing, given that three zones is a good choice. We've accepted the recommendation from the book Kafka: The Definitive Guide: use 5 or 7 instances. Zookeeper is configured statically, so in the event of an extended zone outage for one of the two zones that host two instances it can neither be reconfigured nor rescheduled. To be honest I have neither tested nor studied Zookeeper sufficiently to know which failure modes we can handle. I have just tried to ensure that, as with Kafka, only one instance at a time is gone. |
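The zone arithmetic above can be checked with a small Python sketch. It relies only on the fact that a Zookeeper ensemble needs a strict majority of its configured servers to form a quorum. The zone names and the 2/2/1 layout are assumptions chosen to match the "5 instances over three zones" case described in the comment, not a description of the actual manifests.

```python
from typing import Dict


def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of a statically configured ensemble."""
    return ensemble_size // 2 + 1


def spare_after_zone_loss(instances_per_zone: Dict[str, int]) -> Dict[str, int]:
    """For each zone, how many further instances may fail once that zone is gone.

    A negative value means quorum is already lost when the zone goes down.
    """
    total = sum(instances_per_zone.values())
    needed = quorum_size(total)
    return {
        zone: (total - count) - needed
        for zone, count in instances_per_zone.items()
    }


# Hypothetical layout: five instances spread over three zones.
layout = {"zone-a": 2, "zone-b": 2, "zone-c": 1}
print(spare_after_zone_loss(layout))
```

With this layout, losing either two-instance zone leaves exactly three of five servers, so quorum holds but no further failure can be tolerated until the zone returns. Losing the single-instance zone leaves one spare.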
OK, thanks. |
I have struggled way too much with getting zookeeper and kafka to behave nicely in a cloud environment. |
@solsson I noticed this comment you made earlier:
Is that automated topic creation captured in an open source repo somewhere? I'd be interested in hearing more of your thoughts on that subject. |
@StevenACoffman We run |
Update: With #107 merged we'll no longer maintain these definitions. This means that we can probably not recover topics from kafka volumes alone. |
Hi,
from the Readme:
but looking at Zookeeper, it looks like you are using a StatefulSet with local storage. Maybe I'm missing something, but I could probably use the same persistent storage approach as with Kafka, e.g.:
Haven't tested much, but seems to be working?