This repository has been archived by the owner on Apr 2, 2024. It is now read-only.

Enabling buffering on Kafka output leads to infinite repetition of messages upon restart #1749

Closed
trixpan opened this issue Sep 26, 2015 · 3 comments

Comments

@trixpan

trixpan commented Sep 26, 2015

Configuring buffering on a KafkaOutput causes heka to resend all buffered messages every time heka is restarted.

[TestKafkaOutput]
type = "KafkaOutput"
message_matcher = "Type == 'TcpInput'"
topic = "test_topic"
addrs = ["localhost:9092"]
encoder = "PayloadEncoder"
use_buffering = true
    [TestKafkaOutput.buffering]
    max_file_size = 268435456
    max_buffer_size = 1073741824
    full_action = "block"
    cursor_update_count = 1

The reason seems to be that all messages get inserted into the heka buffers but none of them ever seems to come out.

As a consequence, on every single restart all log messages are resent, but the backlog is never cleared.
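To illustrate the symptom, here is a minimal sketch of a disk-backed queue whose cursor is never advanced. It does not use heka's code: `diskBuffer`, `drain` and the `ack` flag are hypothetical names invented for this example, and the second call to `drain` stands in for a restart that finds the same buffer on disk.

```go
package main

import "fmt"

// diskBuffer is a hypothetical stand-in for a disk-backed output queue.
// cursor marks the first record that has not been acknowledged yet.
type diskBuffer struct {
	records []string
	cursor  int
}

// append adds a record to the end of the queue.
func (b *diskBuffer) append(msg string) { b.records = append(b.records, msg) }

// pending returns the records that would be sent on the next run.
func (b *diskBuffer) pending() []string { return b.records[b.cursor:] }

// drain sends every pending record. When ack is true the cursor is advanced
// after each send; when false it is left untouched, as in the reported bug.
// It returns the number of records sent.
func drain(b *diskBuffer, ack bool, send func(string)) int {
	sent := 0
	for _, msg := range b.pending() {
		send(msg)
		sent++
		if ack {
			b.cursor++
		}
	}
	return sent
}

func main() {
	noop := func(string) {}
	for _, ack := range []bool{false, true} {
		b := &diskBuffer{}
		for _, m := range []string{"a", "b", "c"} {
			b.append(m)
		}
		first := drain(b, ack, noop)
		second := drain(b, ack, noop) // simulated restart, same buffer
		fmt.Printf("ack=%v first=%d resent_after_restart=%d\n", ack, first, second)
	}
}
```

Without the cursor update, the second run sends all three records again; with it, nothing is left to resend.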

@xrl
Contributor

xrl commented Nov 3, 2015

What version of heka are you running? 0.10b has some queueing bugs which I believe are in the process of being sorted out.

@trixpan
Author

trixpan commented Nov 3, 2015

@xrl

I run the dev version. The issue is not related to buffering per se but to the way Sarama is called, which never results in a cursor update.

Note how Kafka's output

https://github.com/mozilla-services/heka/blob/dev/plugins/kafka/kafka_output.go#L375

differs from

https://github.com/mozilla-services/heka/blob/dev/plugins/http/http_output.go#L117

Also note that changing the code to update the cursor without modifying the publisher type will lead to message loss in the scenarios described in #1750.
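A rough sketch of why the publisher type matters, assuming the scenario is an asynchronous send whose failure is only reported after the record has been handed off (the details of #1750 are not reproduced here). Nothing below is heka or Sarama code: `result`, `asyncPublish`, `cursorOnEnqueue` and `cursorOnAck` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// result is a hypothetical delivery report from an asynchronous publisher.
type result struct {
	index int
	ok    bool
}

// asyncPublish pretends to hand records to a client that reports delivery
// later. failAt is the index of the record whose delivery fails (-1 for none).
func asyncPublish(records []string, failAt int) <-chan result {
	out := make(chan result, len(records))
	for i := range records {
		out <- result{index: i, ok: i != failAt}
	}
	close(out)
	return out
}

// cursorOnEnqueue advances the cursor as soon as the records are handed off,
// before any delivery report has been read.
func cursorOnEnqueue(records []string, failAt int) (cursor, lost int) {
	results := asyncPublish(records, failAt)
	cursor = len(records)
	for r := range results {
		if !r.ok {
			lost++ // already behind the cursor, so it is never retried
		}
	}
	return cursor, lost
}

// cursorOnAck advances the cursor only across a contiguous run of confirmed
// deliveries and stops at the first failure.
func cursorOnAck(records []string, failAt int) (cursor, lost int) {
	for r := range asyncPublish(records, failAt) {
		if !r.ok {
			break // this record and everything after it stay pending
		}
		cursor = r.index + 1
	}
	return cursor, 0
}

func main() {
	records := []string{"a", "b", "c"}
	c1, l1 := cursorOnEnqueue(records, 1)
	c2, l2 := cursorOnAck(records, 1)
	fmt.Printf("on enqueue: cursor=%d lost=%d\n", c1, l1)
	fmt.Printf("on ack:     cursor=%d lost=%d\n", c2, l2)
}
```

In this sketch, advancing on enqueue moves the cursor past the failed record, so it is lost. Advancing only on confirmed delivery keeps the failed record and everything after it in the buffer, at the cost of possibly resending records that were delivered after the failure.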

@rafrombrc
Contributor

Fixed in #1887.
