Set internal replication factors to match default and min.insync.replicas #140
Conversation
kafka/10broker-config.yml (outdated)

@@ -57,6 +57,7 @@ data:
 num.partitions=1

 default.replication.factor=3
 offsets.topic.replication.factor=3
This property is set to 1 below at line 117.
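To make the gotcha concrete, here is a minimal sketch, assuming the generated server.properties is read with standard Java properties semantics, where the last occurrence of a key wins (the line number and surrounding keys are only illustrative):

```properties
# Earlier in the generated server.properties (the intended value):
offsets.topic.replication.factor=3

# ...and further down, around the line 117 mentioned above, the key is repeated.
# With standard Java properties semantics this later value silently wins:
offsets.topic.replication.factor=1
```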
Ah, explains why we didn't get the documented default.
@shrinandj Given that the documented default is 3, and that this error hasn't been reported before, could it be that in your case a consumer was running when Kafka started for the first time, so the topic was created with 1 replica because there was only one broker? Maybe the config change in e78f1c5 then has no effect? Or will Kafka refuse to create the topic if there is an explicit value of 3? I must find time to test this. May be relevant to #116. Update: I didn't see your review comment above; it explains why we get 1. I've pushed 321189a. Maybe the gotcha can be alleviated by grouping all of these properties.
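A sketch of what that grouping could look like (values are illustrative for the 3-broker, min.insync.replicas=2 setup discussed in this PR, not the final config):

```properties
# Replication-related settings grouped together, so a stray duplicate further
# down the file is easier to spot. Values are illustrative for a 3-broker cluster.
default.replication.factor=3
min.insync.replicas=2
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
```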
I tried to add a repro case in https://github.com/Yolean/kubernetes-kafka/compare/fix-offset-topic-replication...test-consumergroup?expand=1 but I think it'll be too complex for end-to-end testing this way, while it should be rather trivial to verify that all replicas are considered members of the group.
#108 is why I had three replicas for __consumer_offsets in the QA cluster I tested on now. It was created with 1 replica there too. The error message there looks a bit different from https://stackoverflow.com/questions/48536347/kafka-consumer-get-marking-the-coordinator-dead-error-when-using-group-ids, but the cause could be the same.
I also noticed now that our Kafka Streams meta topics have 1 replica. I wonder which of the […]
Looks good.
[…] which match the default.replication.factor and min.insync.replicas that we've changed to now.
Here's what we had in our new cluster:

[topic properties elided]

Anyway, the defaults are […], I think. I've also found the reason why our Kafka Streams topics have 1 replica: it's the replication.factor property. It isn't listed as a broker config, so probably it can only be set on clients.
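For reference, a sketch of how a Streams client could override it; replication.factor is the Kafka Streams client config for its internally created topics, while the application id and bootstrap address below are made-up examples:

```properties
# Kafka Streams application config (client side), not broker config.
# replication.factor here applies to the internal topics the Streams app creates.
# application.id and bootstrap.servers values are made-up examples.
application.id=my-streams-app
bootstrap.servers=bootstrap.kafka:9092
replication.factor=3
```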
I had been quite focused on […]. I can't find a metric for the configured topic replication factor, neither from #125 nor from #128. The JMX MBean […]
It could be argued that applications must check for this as part of QA, but for Kafka Streams this is non-trivial, as you typically have lower replication on test clusters and in production must provision the client with a setting that overrides the default. Hence I added a readiness "test" in e784bca that could spot the problem before application downtime monitoring does.
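Not the actual script from e784bca, just a sketch of that kind of check; it assumes kafka-topics.sh is available on the image, zookeeper is reachable at zookeeper:2181, and the pre-2.2 --describe output format:

```sh
# Readiness check sketch: succeed only once the offsets topic reports 3 replicas.
# The grep pattern assumes the classic kafka-topics.sh --describe output,
# which includes "ReplicationFactor:3" on the topic summary line.
./bin/kafka-topics.sh --zookeeper zookeeper:2181 \
  --describe --topic __consumer_offsets \
  | grep -q 'ReplicationFactor:3' || exit 1
```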
I wanted to try to fix existing topics using Kafka Manager, but it turns out it can add partitions but not increase replication factor: yahoo/CMAK#224. Will use our job instead as in #108 (comment). Will not affect this PR, as #95 was designed for manual topic name change.
“… may be impacting the producer clients, losing messages or causing back-pressure in the application. This is most often a ‘site down’ type of problem and will need to be addressed immediately.” Excerpt from Neha Narkhede, Gwen Shapira, and Todd Palino, “Kafka: The Definitive Guide”. We now export kafka_controller_kafkacontroller_value{name="OfflinePartitionsCount",} and friends. See #140 for why.
reverting #140 to avoid "does not meet the required replication factor '3' for the offsets topic"
I am new to Kafka. I upgraded my cluster from v3.0 to v3.1 and hit this issue. I had run the jobs under the maintenance folder, but […]
I had also run […]
On my Kafka Manager I observed that all my cluster topics are replicated to 3, but only […]. I could not find any clue, and I had to delete the cluster.
@cemo See https://github.com/Yolean/kubernetes-kafka/tree/master/maintenance#increase-a-topics-replication-factor. Kafka maintenance terminology isn't intuitive, IMO. Let us know if you find better tooling. There are many ways to use kafka-reassign-partitions.sh, given the interesting fact that you basically need to craft the reassignment JSON yourself. Look at the […]
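For reference, a sketch of the manual approach described above; the broker ids 0, 1, 2 and the zookeeper address are assumptions, and the real procedure lives in the maintenance folder linked above:

```sh
# Hand-crafted reassignment sketch: give partition 0 of __consumer_offsets three
# replicas on brokers 0,1,2 (assumed broker ids). A real run needs one entry per
# partition of the topic, and a reachable zookeeper (zookeeper:2181 assumed here).
cat > increase-replication.json <<'EOF'
{"version":1,"partitions":[
  {"topic":"__consumer_offsets","partition":0,"replicas":[0,1,2]}
]}
EOF

./bin/kafka-reassign-partitions.sh --zookeeper zookeeper:2181 \
  --reassignment-json-file increase-replication.json --execute
```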
* Scales to 2 brokers + 3 zookeeper instances
* Same as default.replication.factor/min.insync.replicas
* Minimizes the cluster for use with, for example, Minikube
* Configures internal topics for single broker, reverting Yolean#140 to avoid "does not meet the required replication factor '3' for the offsets topic" (see the sketch after this list)
* Ksql rc (#1)
* Burrow's master now handles api v3
* Container fails to start, I see no logs
* This log config looks better, but makes no difference regarding start
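Regarding the single-broker item above, the revert essentially keeps the internal-topic replication settings at 1; a sketch with illustrative values, not the exact commit:

```properties
# Single-broker override sketch: with one broker, 3 replicas are impossible,
# which is what triggers the "does not meet the required replication factor '3'
# for the offsets topic" error. Values illustrative only.
default.replication.factor=1
min.insync.replicas=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
```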
Based on @shrinandj's find in #139.
As explained in #116 (comment) we'd like to keep min.insync.replicas=2.
What's odd is that the default, according to docs, is 3. Also I guess this change won't affect a running kafka cluster.
This property isn't mentioned in https://kafka.apache.org/documentation/#prodconfig.
There was a change in 0.11: "The offsets.topic.replication.factor broker config is now enforced upon auto topic creation."
This PR quite possibly needs (see the sketch after this list):
* transaction.state.log.replication.factor
* config.storage.replication.factor
* status.storage.replication.factor
* A similar fix to replication.factor
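A rough sketch of where each of these lands, with illustrative values only: config.storage.replication.factor and status.storage.replication.factor are Kafka Connect worker settings rather than broker settings, and replication.factor is a client-side Kafka Streams setting as noted earlier in the conversation.

```properties
# Broker (server.properties), sketch only
transaction.state.log.replication.factor=3

# Kafka Connect distributed worker config (not broker config), sketch only
config.storage.replication.factor=3
status.storage.replication.factor=3
```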