I've seen this issue with segmentio's kafka-go.
It's something in the settings: the writer holds the produce buffer for up to a second before sending to the broker (to accumulate a batch).
I couldn't figure out the right settings at the time.
I'm also using Sarama now.
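For reference, kafka-go's `Writer` does batch messages before flushing, and its default `BatchTimeout` is one second, which matches the "one message per second" behaviour described above. A minimal sketch of tuning those knobs (the broker address and topic here are placeholders, not from this thread):

```go
package main

import (
	"context"
	"time"

	"github.com/segmentio/kafka-go"
)

func main() {
	// By default kafka-go's Writer waits up to 1s (BatchTimeout) to
	// accumulate a batch before flushing, so serial single-message
	// writes can stall at roughly 1 msg/sec.
	w := &kafka.Writer{
		Addr:         kafka.TCP("localhost:9092"), // placeholder broker
		Topic:        "my-topic",                  // placeholder topic
		BatchSize:    1000,                  // flush once 1000 messages are queued...
		BatchTimeout: 10 * time.Millisecond, // ...or after 10ms, whichever comes first
		RequiredAcks: kafka.RequireOne,
	}
	defer w.Close()

	_ = w.WriteMessages(context.Background(),
		kafka.Message{Value: []byte("hello")},
	)
}
```

Writing many messages per `WriteMessages` call, rather than one, also lets the writer fill batches without waiting out the timeout.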
Yes, I'd imagine this is largely due to different defaults for producer buffering and queue sizes, much like the Java client, where you can tune linger time and batch sizes to optimise for throughput vs latency and memory usage.
However, obviously we'd recommend you use Sarama rather than segmentio kafka-go anyway 😀
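For comparison, Sarama exposes similar batching knobs on its producer config; the `Flush` settings below are rough equivalents of the Java client's `linger.ms` / `batch.size`. A sketch, with placeholder broker and topic names:

```go
package main

import (
	"log"
	"time"

	"github.com/IBM/sarama"
)

func main() {
	cfg := sarama.NewConfig()
	// Rough equivalents of the Java client's linger.ms / batch.size:
	cfg.Producer.Flush.Frequency = 10 * time.Millisecond // linger before flushing a batch
	cfg.Producer.Flush.Messages = 1000                   // flush after this many messages
	cfg.Producer.Return.Successes = true                 // required by SyncProducer

	p, err := sarama.NewSyncProducer([]string{"localhost:9092"}, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer p.Close()

	_, _, err = p.SendMessage(&sarama.ProducerMessage{
		Topic: "my-topic", // placeholder topic
		Value: sarama.StringEncoder("hello"),
	})
	if err != nil {
		log.Fatal(err)
	}
}
```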
This is not an issue, but more of a question:
I am trying to send 100GB of data into a Kafka topic by breaking it into batches of 100 lines.
Using kafka-go, I saw that it writes about 1 message per second, as per https://www.gitmemory.com/issue/segmentio/kafka-go/326/519375403. To work around this, I created a goroutine for each write, which immediately improved the throughput.
But the application was quickly killed by the OOM killer: the data was being generated faster than it could be written to Kafka, and the backlog accumulated by kafka-go exhausted the memory.
But when I ran the same experiment using Sarama, the 100GB of data was moved into Kafka in 2h10m, with no concurrent goroutines; the writes were done serially, one after the other.
Why is this? How is Sarama able to move data so much faster, and is there any tradeoff that comes with this? Can anyone please explain?