
sarama drops message when either gzip/snappy is used #71

Closed

amalakar opened this issue Mar 11, 2014 · 4 comments

@amalakar

I wrote a benchmark program to measure the performance of the library. I noticed that it drops a significant number of messages when either gzip or snappy is used with waitForLocal.
I get around 59.5 MB/s without compression.

I see the following error message when gzip/snappy is used:

[Sarama] 2014/03/11 18:39:06 kafka: Dropped 35497 messages
[Sarama] 2014/03/11 18:39:06 kafka: Dropped 35497 messages
[Sarama] 2014/03/11 18:39:06 kafka: Dropped 35497 messages
[Sarama] 2014/03/11 18:39:06 kafka: Dropped 35497 messages

I am sending 10 million messages with gzip/snappy, a 65k buffer, and a 1 second buffer timeout. Eventually the benchmarking program gets killed.
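For context, the settings described above map roughly onto a producer configuration like the one below. This is a minimal sketch using the present-day sarama config API (the API at the time this issue was filed was different); the broker address, topic name, and payload are placeholders, not values from the original benchmark.

```go
package main

import (
	"log"
	"time"

	"github.com/Shopify/sarama"
)

func main() {
	cfg := sarama.NewConfig()
	cfg.Producer.RequiredAcks = sarama.WaitForLocal   // ack once the leader has written the message
	cfg.Producer.Compression = sarama.CompressionGZIP // or sarama.CompressionSnappy
	cfg.Producer.Flush.Bytes = 65536                  // ~65k buffer
	cfg.Producer.Flush.Frequency = time.Second        // 1 second buffer timeout

	// "localhost:9092" and "benchmark" are placeholder values.
	producer, err := sarama.NewAsyncProducer([]string{"localhost:9092"}, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer producer.Close()

	// Drain the error channel so delivery failures are visible
	// rather than silently piling up.
	go func() {
		for perr := range producer.Errors() {
			log.Printf("delivery failed: %v", perr.Err)
		}
	}()

	for i := 0; i < 10000000; i++ {
		producer.Input() <- &sarama.ProducerMessage{
			Topic: "benchmark",
			Value: sarama.StringEncoder("payload"),
		}
	}
}
```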

@patricklucas

We're seeing this as well. Compression seems to work fine for our low-volume logs, but as soon as we hit a few thousand messages per second it starts to drop lines.

Worse, the messages appear to be re-enqueued and begin to stack up, eventually OOMing the process.

@eapache
Contributor

eapache commented Jul 2, 2014

It's not clear to me whether this is simply a performance issue (where our compression path is slow and we fall behind) or an actual error. The fact that messages are re-enqueued is what's throwing me: that only happens when the Kafka broker returns us an error, and if it doesn't like the messages we're sending I would expect that to show up even at low volumes.

Can somebody experiencing this take a brief traffic capture when it occurs and see what the broker is actually sending us on the wire? The 1.12 pre-releases of Wireshark are capable of dissecting Kafka traffic if you tell them which port your broker is on.

@eapache
Contributor

eapache commented Jul 2, 2014

Additionally, the change I just merged should give us more detailed log messages, which might help pinpoint the problem.
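For anyone trying to reproduce this: sarama's client-side logging is opt-in, so those messages only show up if a logger is wired in. A minimal sketch, assuming the present-day sarama.Logger hook (which is a no-op logger by default):

```go
package main

import (
	"log"
	"os"

	"github.com/Shopify/sarama"
)

func main() {
	// sarama.Logger discards output by default; point it at stderr so the
	// "Dropped N messages" lines and the more detailed errors become visible.
	sarama.Logger = log.New(os.Stderr, "[Sarama] ", log.LstdFlags)

	// ... construct the producer and run the benchmark as before ...
}
```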

@eapache
Contributor

eapache commented Nov 13, 2014

Just merged a new producer design which should fix this issue among many others.

@eapache closed this as completed Nov 13, 2014