Re-consuming same messages when partitions in Kafka cluster move #218

MaciekRakowski · 2015-07-07T17:24:42Z

When evaluating the kafka-node high level consumer, I noticed a scenario in which it consumes duplicate messages. This issue occurs during a rebalance. Here are the steps to recreate it:

1. Start a high level consumer with auto-commit set to FALSE (and do NOT commit). The code below (with options properly set) will suffice:

    consumer.on('message', function (message) {
        console.log(message.value);
    });

2. Send a few messages to the broker using any producer. Preferably, number the messages so you can keep track of which ones, and how many, were sent.
3. Stop the Kafka broker, and then restart it.
4. Notice that the messages from step 2 were re-consumed. The Java high level consumer that comes with Kafka does not do this.
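
To make the re-consumption easier to spot than by reading console output, the message handler could record each message's position and flag repeats. The following is only a sketch: `createDuplicateTracker` is a hypothetical helper, not part of kafka-node, and it assumes each message object carries `topic`, `partition` and `offset` fields.

```javascript
// Hypothetical helper (not part of kafka-node): remembers every
// topic/partition/offset it has seen and reports any that show up again.
// Assumes each message has `topic`, `partition` and `offset` fields.
function createDuplicateTracker() {
  var seen = new Set();
  var duplicates = [];

  return {
    // Returns true if this exact position was already recorded.
    record: function (message) {
      var key = message.topic + ':' + message.partition + ':' + message.offset;
      if (seen.has(key)) {
        duplicates.push(key);
        return true;
      }
      seen.add(key);
      return false;
    },
    // Returns a copy of the keys that were delivered more than once.
    duplicates: function () {
      return duplicates.slice();
    }
  };
}
```

Inside the handler from step 1, one would call `tracker.record(message)` and log whenever it returns `true`; after the broker restart, any logged keys are re-consumed messages.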

This issue is worth fixing because some developers may want to commit only every Nth message to improve performance. Note that it also occurs when partitions in the cluster change in other ways, such as when partitions move during a rebalance or when the cluster is expanded. It would be ideal to make this consumer as resilient as the Java one. If any other information would help reproduce or troubleshoot this issue, please let me know.
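
For reference, the "commit every Nth message" pattern could look roughly like the sketch below. `commitEveryN` is a hypothetical helper, not a kafka-node API; it assumes the consumer was created with auto-commit disabled and exposes `on('message', fn)` and `commit(force, callback)`, which is how the high level consumer's commit method is understood to be called.

```javascript
// Hypothetical helper (not part of kafka-node): commit offsets after every
// `n` messages instead of relying on auto-commit.
// Assumes `consumer` exposes on('message', fn) and commit(force, callback).
function commitEveryN(consumer, n, onMessage) {
  var uncommitted = 0;

  consumer.on('message', function (message) {
    onMessage(message);
    uncommitted += 1;

    if (uncommitted >= n) {
      uncommitted = 0;
      // Passing `true` is assumed to force the commit immediately.
      consumer.commit(true, function (err) {
        if (err) {
          console.error('offset commit failed:', err);
        }
      });
    }
  });
}
```

With this pattern, up to N - 1 messages are uncommitted at any moment, so a rebalance that resets the consumer to the last committed offset would replay at most that many messages rather than everything since startup.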

hyperlink · 2016-08-09T18:52:37Z

I'm curious whether this issue still happens when you force the HLC to commit on close, for example:

process.on('SIGINT', function () {
    highLevelConsumer.close(true, function () {
        process.exit();
    });
});

The new version, 0.5.4, will perform a rebalance when the partitions change.
