Broker unresponsive due to NotEnoughReplicasException #108

Closed
solsson opened this issue Dec 17, 2017 · 2 comments

Contributor

solsson commented Dec 17, 2017

One of our brokers was unresponsive, leading to timeouts in clients. It was busy in a loop that logged:

[2017-12-17 21:33:52,534] INFO [GroupCoordinator 1]: Preparing to rebalance group user-sessions-stream with old generation 779 (__consumer_offsets-32) (kafka.coordinator.group.GroupCoordinator)
[2017-12-17 21:33:52,637] INFO [GroupCoordinator 1]: Stabilized group user-sessions-stream generation 780 (__consumer_offsets-32) (kafka.coordinator.group.GroupCoordinator)
[2017-12-17 21:33:52,637] INFO [GroupCoordinator 1]: Assignment received from leader for group user-sessions-stream for generation 780 (kafka.coordinator.group.GroupCoordinator)
[2017-12-17 21:33:52,638] ERROR [ReplicaManager broker=1] Error processing append operation on partition __consumer_offsets-32 (kafka.server.ReplicaManager)
org.apache.kafka.common.errors.NotEnoughReplicasException: Number of insync replicas for partition __consumer_offsets-32 is [1], below required minimum [2]

Quite possibly due to the configuration change in #107.
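
The exception above reports one in-sync replica for `__consumer_offsets-32` against a required minimum of two. A rough way to spot other partitions in the same state is to filter the output of `kafka-topics.sh --describe`. The sketch below is only an illustration: the helper name `under_min_isr` is hypothetical, and the tab-separated `Topic: … Partition: … Isr: …` line format is assumed from what the tool printed in Kafka versions of this era.

```shell
#!/bin/sh
# Hypothetical helper: list partitions whose in-sync replica count is below
# a minimum. Reads `kafka-topics.sh --describe` style lines on stdin; the
# "Topic: x  Partition: n  ...  Isr: a,b" layout is an assumption.
under_min_isr() {
  awk -v min_isr="$1" '
    {
      topic = ""; partition = ""; isr_count = -1
      for (i = 1; i <= NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") partition = $(i + 1)
        # Count the comma-separated broker ids that follow "Isr:"
        if ($i == "Isr:")       isr_count = split($(i + 1), members, ",")
      }
      # Lines without an "Isr:" field (e.g. the topic summary) are skipped
      if (isr_count >= 0 && isr_count < min_isr)
        print topic "-" partition " isr=" isr_count " min=" min_isr
    }'
}

# Sample input mimicking the situation in the log above (assumed format)
printf '\tTopic: __consumer_offsets\tPartition: 32\tLeader: 1\tReplicas: 1\tIsr: 1\n\tTopic: __consumer_offsets\tPartition: 33\tLeader: 0\tReplicas: 0,1,2\tIsr: 0,1,2\n' \
  | under_min_isr 2
# prints: __consumer_offsets-32 isr=1 min=2
```

Against a live cluster the same filter could be fed from something like `./bin/kafka-topics.sh --zookeeper "$ZOOKEEPER" --describe --topic __consumer_offsets | under_min_isr 2`, assuming the `ZOOKEEPER` variable used by the maintenance job.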

Contributor Author

solsson commented Dec 17, 2017

Solved using #95 with the diff:

diff --git a/maintenance/reassign-paritions-job.yml b/maintenance/reassign-paritions-job.yml
index e9e184e..12e9219 100644
--- a/maintenance/reassign-paritions-job.yml
+++ b/maintenance/reassign-paritions-job.yml
@@ -16,9 +16,9 @@ spec:
           value: zookeeper.kafka:2181
         # the following must be edited per job
         - name: TOPICS
-          value: test-produce-consume,test-kafkacat
+          value: __consumer_offsets
         - name: BROKERS
-          value: 0,2
+          value: 0,1,2
         command:
         - /bin/bash
         - -ce

solsson closed this as completed Dec 17, 2017
Contributor Author

solsson commented Dec 17, 2017

Maybe the above helped temporarily, or maybe it only stopped the flow of logs for a while.
Actually increasing the number of replicas gave better results:

diff --git a/maintenance/reassign-paritions-job.yml b/maintenance/reassign-paritions-job.yml
index e9e184e..0cb4c6a 100644
--- a/maintenance/reassign-paritions-job.yml
+++ b/maintenance/reassign-paritions-job.yml
@@ -16,9 +16,9 @@ spec:
           value: zookeeper.kafka:2181
         # the following must be edited per job
         - name: TOPICS
-          value: test-produce-consume,test-kafkacat
+          value: __consumer_offsets
         - name: BROKERS
-          value: 0,2
+          value: 0,1,2
         command:
         - /bin/bash
         - -ce
@@ -43,6 +43,10 @@ spec:
           echo "# proposed-reassignment.json";
           cat /tmp/proposed-reassignment.json;
 
+          sed -i 's/"replicas":\[.\]/"replicas":[0,1,2]/g' /tmp/proposed-reassignment.json;
+          sed -i 's/,"log_dirs":\["any"\]//g' /tmp/proposed-reassignment.json;
+          cat /tmp/proposed-reassignment.json;
+
           ./bin/kafka-reassign-partitions.sh
           --zookeeper=$ZOOKEEPER
           --execute
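
To see what the two added `sed` lines do, they can be run against a sample reassignment file. This is a minimal sketch: the function name `rewrite_reassignment` and the sample JSON are assumptions made for illustration (the JSON shape is what `kafka-reassign-partitions.sh --generate` is assumed to propose), while the two `sed` expressions are copied from the diff. Note that `\[.\]` matches only a replica list holding a single one-character broker id, so a list such as `[0,2]` would be left untouched. The `log_dirs` entry is presumably dropped because its length would no longer match the new replica list.

```shell
#!/bin/sh
# Sketch: apply the two sed expressions from the diff as a stdin -> stdout
# filter instead of editing /tmp/proposed-reassignment.json in place.
rewrite_reassignment() {
  sed -e 's/"replicas":\[.\]/"replicas":[0,1,2]/g' \
      -e 's/,"log_dirs":\["any"\]//g'
}

# Sample proposed reassignment (assumed shape, not taken from the issue)
sample='{"version":1,"partitions":[{"topic":"__consumer_offsets","partition":32,"replicas":[1],"log_dirs":["any"]}]}'

printf '%s\n' "$sample" | rewrite_reassignment
# prints: {"version":1,"partitions":[{"topic":"__consumer_offsets","partition":32,"replicas":[0,1,2]}]}
```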
