Kafka-Node: Producer never gets notified when a partition moves to a new broker on auto rebalance. #175
Comments
Which producer are you using: low level or high level? The low level producer, I think, doesn't do this detection job. Not sure, though. The high level producer, however, may handle this scenario. The desirable behavior of the high level producer, in my opinion, should be as follows:
@haio what do you say?
Thanks for the response @kadishmal. I tried with both producers (low and high level); unfortunately I couldn't see any error in the send callback. However, the Kafka broker started logging the errors below after auto rebalance.
[2015-03-16 10:22:48,370] WARN [KafkaApi-1] Produce request with correlation id 14878 from client slam.local on partition [TESTING,4] failed due to Leader not local for partition [TESTING,4] on broker 1 (kafka.server.KafkaApis)
[2015-03-16 10:22:52,394] WARN [KafkaApi-1] Produce request with correlation id 14882 from client slam.local on partition [TESTING,6] failed due to Leader not local for partition [TESTING,6] on broker 1 (kafka.server.KafkaApis)
[2015-03-16 10:22:55,416] WARN [KafkaApi-1] Produce request with correlation id 14885 from client slam.local on partition [TESTING,6] failed due to Leader not local for partition [TESTING,6] on broker 1 (kafka.server.KafkaApis)
[2015-03-16 10:22:57,436] WARN [KafkaApi-1] Produce request with correlation id 14887 from client slam.local on partition [TESTING,4] failed due to Leader not local for partition [TESTING,4] on broker 1 (kafka.server.KafkaApis)
I do see an event notification when I initiate a manual preferred replica election for rebalance from the CLI; in that scenario Kafka-Node/lib/Producer works fine. But auto rebalance needs to be fixed, I guess. Please share your thoughts on this…
@lsampathreddy so far the Producer (low and high level) will not refresh metadata when a broker changes. I think this feature can be added in the next version.
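Until such a feature lands, an application-level workaround seems possible: kafka-node's Client exposes a refreshMetadata(topicNames, cb) method that the library uses internally, so a failed send can be followed by an explicit refresh and a retry. A minimal sketch, assuming that method is available in the version in use; the helper name sendWithRefresh is made up for illustration:

var kafka = require('kafka-node'),
    client = new kafka.Client('127.0.0.1:2181', 'clientId'),
    producer = new kafka.Producer(client);

// Call this only after the producer has emitted 'ready'.
function sendWithRefresh(payloads, cb) {
  producer.send(payloads, function (err, data) {
    if (!err) return cb(null, data);
    // On any send error (e.g. a leader change), refresh metadata for the
    // affected topics and retry once.
    var topics = payloads.map(function (p) { return p.topic; });
    client.refreshMetadata(topics, function (refreshErr) {
      if (refreshErr) return cb(refreshErr);
      producer.send(payloads, cb);
    });
  });
}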
@haio a response to your last question: I've checked your PR, and the fix you've committed looks very simple; I doubt it fixes this issue. @lsampathreddy, would you please fetch the latest version of …
As far as I remember, re-balancing can't be detected via watches set by Zookeeper#listBrokers(). Instead, after brokers re-balance topics between each other, the ZooKeeper nodes located in …
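For reference, a minimal sketch of watching partition leadership directly in ZooKeeper with node-zookeeper-client, the client library kafka-node builds on. This is not kafka-node API; it assumes the standard Kafka znode layout /brokers/topics/<topic>/partitions/<n>/state, whose JSON data contains the current leader broker id:

var zookeeper = require('node-zookeeper-client');

var zk = zookeeper.createClient('127.0.0.1:2181');

function watchPartitionLeader(topic, partition) {
  var path = '/brokers/topics/' + topic + '/partitions/' + partition + '/state';
  zk.getData(
    path,
    function onEvent() {
      // ZooKeeper watches are one-shot, so re-register after each notification.
      watchPartitionLeader(topic, partition);
    },
    function (err, data) {
      if (err) return console.error('getData failed:', err);
      var state = JSON.parse(data.toString());
      console.log('Leader for %s-%d is now broker %d', topic, partition, state.leader);
      // Here the application could trigger a metadata refresh in its producer.
    }
  );
}

zk.once('connected', function () {
  watchPartitionLeader('TESTING', 4);
});
zk.connect();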
@estliberitas, @kadishmal & @haio thanks for your responses. @kadishmal I debugged the Kafka Scala code; KafkaController never notifies the changes to ZooKeeper when auto-rebalance is enabled, and I feel the condition in KafkaController.scala is not required. If Kafka itself notified the preferred election changes to ZooKeeper via the path /admin/preferred_replica_election or /admin/reassign_partitions, then Kafka-Node would work perfectly on auto rebalance. I reported an issue on the Kafka JIRA (ISSUE) and am waiting for their response. @estliberitas if I do a manual re-balance with the command below, Kafka-Node is able to identify the changes. bin/kafka-preferred-replica-election.sh
@lsampathreddy great! Maybe my 0.8.0 setup was different. We had an issue making a simple test: stopping one broker, then starting it and waiting for re-balancing.
The client watch … @lsampathreddy, in what case do you still have this issue?
Hi @haio, this imbalance can happen when a broker dies and comes back. Kafka 0.8.1's auto rebalance feature periodically checks for any partition imbalance across brokers and tries to re-elect leaders to balance the partitions among them. @haio, @estliberitas & @kadishmal please share your thoughts on this...
I believe it should detect the rebalanced partitions. At least the high level producer and consumer should handle this.
Hi @lsampathreddy, how can we reproduce the imbalance on the brokers?
@haio here is how I reproduce it. Broker config:
auto.leader.rebalance.enable=true
leader.imbalance.check.interval.seconds=100
Producer script:
var kafka = require('kafka-node'),
    Producer = kafka.Producer,
    KeyedMessage = kafka.KeyedMessage,
    client = new kafka.Client('127.0.0.1:2181', 'clientId'),
    producer = new Producer(client);

// Build a single-message payload for a random partition of the TESTING topic.
var buildPayload = function (msg) {
  var payload = [];
  payload.push({topic: 'TESTING', messages: msg + " ", partition: Math.floor(Math.random() * 8)});
  return payload;
};

producer.on('ready', function () {
  console.log('producer is ready...');
});

// Send one message per second.
var counter = 0;
setInterval(function () {
  counter++;
  var payloads = buildPayload(counter);
  console.log(payloads);
  producer.send(payloads, function (err, data) {
    // if (err) console.log('error' + err);
  });
}, 1000);

producer.on('error', function (err) {
  console.log('err at producer: ' + err);
});
slam:kafka_2.9.2-0.8.1.1 s.lambu$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic TESTING
Topic:TESTING PartitionCount:8 ReplicationFactor:2 Configs:
Topic: TESTING Partition: 0 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: TESTING Partition: 1 Leader: 1 Replicas: 1,0 Isr: 0,1
Topic: TESTING Partition: 2 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: TESTING Partition: 3 Leader: 1 Replicas: 1,0 Isr: 0,1
Topic: TESTING Partition: 4 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: TESTING Partition: 5 Leader: 1 Replicas: 1,0 Isr: 0,1
Topic: TESTING Partition: 6 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: TESTING Partition: 7 Leader: 1 Replicas: 1,0 Isr: 0,1
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic TESTING
Topic:TESTING PartitionCount:8 ReplicationFactor:2 Configs:
Topic: TESTING Partition: 0 Leader: 0 Replicas: 0,1 Isr: 0
Topic: TESTING Partition: 1 Leader: 0 Replicas: 1,0 Isr: 0
Topic: TESTING Partition: 2 Leader: 0 Replicas: 0,1 Isr: 0
Topic: TESTING Partition: 3 Leader: 0 Replicas: 1,0 Isr: 0
Topic: TESTING Partition: 4 Leader: 0 Replicas: 0,1 Isr: 0
Topic: TESTING Partition: 5 Leader: 0 Replicas: 1,0 Isr: 0
Topic: TESTING Partition: 6 Leader: 0 Replicas: 0,1 Isr: 0
Topic: TESTING Partition: 7 Leader: 0 Replicas: 1,0 Isr: 0
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic TESTING
Topic:TESTING PartitionCount:8 ReplicationFactor:2 Configs:
Topic: TESTING Partition: 0 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: TESTING Partition: 1 Leader: 0 Replicas: 1,0 Isr: 0,1
Topic: TESTING Partition: 2 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: TESTING Partition: 3 Leader: 0 Replicas: 1,0 Isr: 0,1
Topic: TESTING Partition: 4 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: TESTING Partition: 5 Leader: 0 Replicas: 1,0 Isr: 0,1
Topic: TESTING Partition: 6 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: TESTING Partition: 7 Leader: 0 Replicas: 1,0 Isr: 0,1
Hence messages are lost… :( Please let me know if I'm doing something wrong.
Thanks, will check this.
@haio @lsampathreddy what I offered seems a bit tricky to implement because …
This should fix issue SOHU-Co#175. When the client gets a `NotLeaderForPartition` error, that means the leader for the partition has changed, so it emits a `brokersChanged` event; the consumer and producer listen for this event and refresh the topic metadata.
I have made a fix for this. When the client gets …
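A simplified, self-contained illustration of the pattern the commit message describes (not the actual kafka-node source): the client emits an event when it sees a leader-change error, and the producer reacts by refreshing metadata. FakeClient and FakeProducer are invented names for this sketch.

var EventEmitter = require('events').EventEmitter;
var util = require('util');

function FakeClient() { EventEmitter.call(this); }
util.inherits(FakeClient, EventEmitter);

FakeClient.prototype.handleSendError = function (err) {
  if (err && err.name === 'NotLeaderForPartition') {
    // Leadership moved to another broker; tell listeners to refresh metadata.
    this.emit('brokersChanged');
  }
};

FakeClient.prototype.refreshMetadata = function (topics, cb) {
  console.log('refreshing metadata for', topics);
  process.nextTick(cb);
};

function FakeProducer(client, topics) {
  client.on('brokersChanged', function () {
    client.refreshMetadata(topics, function () {
      console.log('producer now sends to the new leader');
    });
  });
}

// Usage: simulate the broker returning NotLeaderForPartition after a rebalance.
var client = new FakeClient();
new FakeProducer(client, ['TESTING']);
client.handleSendError({ name: 'NotLeaderForPartition' });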
@haio thank you for your response. I tested with your changes and it's working perfectly now. 👍 Please let me know in which version you are going to include these changes.
@lsampathreddy check kafka-node@0.2.24
Thank you @haio
Scenario:
I have two brokers (broker 1 & broker 2) with auto.leader.rebalance.enable=true.
I create a TEST topic with replication factor 2 and 8 partitions.
I use a producer to push messages to the TEST topic.
When I stopped broker 1, broker 2 became the leader for all the partitions. A few minutes later broker 1 came back online. After leader.imbalance.check.interval.seconds, all partitions were distributed among the two brokers again, but my producer never got notified to refresh its metadata.
After that, some messages keep failing because the payload is sent to a broker with an invalid or non-existent partition.
Is this the correct behavior of the Producer, or am I doing something wrong?
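One way to make the mismatch visible is to compare what kafka-topics.sh --describe reports with the metadata the kafka-node client is still holding. A minimal diagnostic sketch, assuming Client#loadMetadataForTopics(topics, cb), which kafka-node's producer uses internally, can be called from application code; the clientId 'metadata-check' is arbitrary:

var kafka = require('kafka-node');
var client = new kafka.Client('127.0.0.1:2181', 'metadata-check');

client.on('ready', function () {
  client.loadMetadataForTopics(['TEST'], function (err, resp) {
    if (err) return console.error(err);
    // Dump the raw metadata; the per-partition entries include the leader broker id,
    // which can be compared against the kafka-topics.sh output above.
    console.log(JSON.stringify(resp, null, 2));
    client.close();
  });
});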