
Performance issue #147

Closed · l15k4 opened this issue May 17, 2016 · 11 comments

l15k4 commented May 17, 2016

Hey,

I was following the samples in the README and they work, but they are insanely slow: it takes 17 seconds to pipe 100 messages from topic1 to topic2, and 125 seconds for 1000 messages.

https://gist.github.com/l15k4/5cdd6be19ec2fcb9a3a32c23f0f6866c

I tried that with docker run --rm -p 9092:9092 -p 2181:2181 spotify/kafka instead of the embedded one...

Am I doing something wrong, or is it a matter of configuring Kafka differently?
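
For reference, the topic1 -> topic2 pipe from the gist looks roughly like this. This is a sketch against the current akka-stream-kafka (Alpakka Kafka) API, so the exact signatures and settings are assumptions rather than the 2016-era code:

```scala
import akka.actor.ActorSystem
import akka.kafka.scaladsl.{Consumer, Producer}
import akka.kafka.{ConsumerSettings, ProducerSettings, Subscriptions}
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.{ByteArrayDeserializer, ByteArraySerializer, StringDeserializer, StringSerializer}

implicit val system: ActorSystem = ActorSystem("pipe")

val consumerSettings =
  ConsumerSettings(system, new ByteArrayDeserializer, new StringDeserializer)
    .withBootstrapServers("localhost:9092")
    .withGroupId("pipe-group")

val producerSettings =
  ProducerSettings(system, new ByteArraySerializer, new StringSerializer)
    .withBootstrapServers("localhost:9092")

// Read every record from topic1 and republish its value to topic2.
Consumer.plainSource(consumerSettings, Subscriptions.topics("topic1"))
  .map(record => new ProducerRecord[Array[Byte], String]("topic2", record.value))
  .runWith(Producer.plainSink(producerSettings))
```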

patriknw (Member) commented

Thanks for trying that and reporting. Validating performance is indeed something we must do before a final release; see #123.

l15k4 (Author) commented May 17, 2016

@patriknw I found that the bottleneck is the consumer:
https://gist.github.com/l15k4/5cdd6be19ec2fcb9a3a32c23f0f6866c#file-reactivekafkasuite-scala-L54-L58

which consumes about 5 messages per second. The producer and the "pipe" are fast and finish quickly, but the consumer is very slow; even after all messages have been emitted it still processes only about 5 messages per second.

However, if I switch from Source.atMostOnceSource to Source.plainSource, it consumes about 125 messages per second, which is acceptable. I didn't expect offset committing to have such an extreme impact.
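
For comparison, the two consumer variants look roughly like this in the current Alpakka Kafka API, where they live on the Consumer object (a sketch, not the gist code; run one variant at a time to compare):

```scala
import akka.actor.ActorSystem
import akka.kafka.scaladsl.Consumer
import akka.kafka.{ConsumerSettings, Subscriptions}
import akka.stream.scaladsl.Sink
import org.apache.kafka.common.serialization.{ByteArrayDeserializer, StringDeserializer}

implicit val system: ActorSystem = ActorSystem("consumers")

val consumerSettings =
  ConsumerSettings(system, new ByteArrayDeserializer, new StringDeserializer)
    .withBootstrapServers("localhost:9092")
    .withGroupId("group1")

// Slow variant: atMostOnceSource commits the offset of every single record
// before emitting it downstream, so each element pays a full commit round-trip.
Consumer.atMostOnceSource(consumerSettings, Subscriptions.topics("topic2"))
  .runWith(Sink.ignore)

// Fast variant: plainSource only polls and never commits offsets at all.
Consumer.plainSource(consumerSettings, Subscriptions.topics("topic2"))
  .runWith(Sink.ignore)
```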

patriknw (Member) commented

I can understand that offset committing is slow, and that is why commits should be batched. Still, single commits need to perform decently: 5 messages per second is not decent, and 125 is also really bad.
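
A sketch of what batched commits look like with the committableSource API. This uses Committer.sink from later Alpakka Kafka versions, which accumulates offsets and commits them in batches; storeRecord is a hypothetical processing step, not something from the gist:

```scala
import scala.concurrent.Future
import akka.actor.ActorSystem
import akka.kafka.scaladsl.{Committer, Consumer}
import akka.kafka.{CommitterSettings, ConsumerSettings, Subscriptions}
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.kafka.common.serialization.{ByteArrayDeserializer, StringDeserializer}

implicit val system: ActorSystem = ActorSystem("committing")
import system.dispatcher

val consumerSettings =
  ConsumerSettings(system, new ByteArrayDeserializer, new StringDeserializer)
    .withBootstrapServers("localhost:9092")
    .withGroupId("group1")

// Hypothetical per-record processing step.
def storeRecord(record: ConsumerRecord[Array[Byte], String]): Future[Unit] =
  Future.successful(())

Consumer.committableSource(consumerSettings, Subscriptions.topics("topic2"))
  .mapAsync(parallelism = 1) { msg =>
    storeRecord(msg.record).map(_ => msg.committableOffset)
  }
  // Committer.sink groups offsets and commits them in batches
  // (controlled by max-batch / max-interval in CommitterSettings)
  // instead of issuing one commit per message.
  .runWith(Committer.sink(CommitterSettings(system)))
```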

kciesielski (Contributor) commented

Just for the record: I modified the gist a bit to make sure we're only measuring the final consumer: https://gist.github.com/kciesielski/8a07f1562d25e497bdb5b22d1c71b55c
Running on my local machine gave the following results:

With embedded Kafka:
plainSource: 100 msgs in 2 s, 1000 msgs in 2 s
atMostOnceSource: 100 msgs in 5.5 s, 1000 msgs in 13 s

With local Kafka:
plainSource: 100 msgs in 0.4 s, 1000 msgs in 0.4 s
atMostOnceSource: 100 msgs in 2 s, 1000 msgs in 12-13 s

Note that these times include the time needed to connect.

avakhrenev commented May 18, 2016

@l15k4 Note that you can't fetch more than max.partition.fetch.bytes * partitionCount / pollInterval bytes per unit of time. I would suggest setting pollInterval to zero and trying again.
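
Roughly, that tweak looks like this on akka-stream-kafka's ConsumerSettings (a sketch; max.partition.fetch.bytes is the standard Kafka consumer config, and the 5 MiB value is just an illustrative example):

```scala
import scala.concurrent.duration._
import akka.actor.ActorSystem
import akka.kafka.ConsumerSettings
import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.{ByteArrayDeserializer, StringDeserializer}

implicit val system: ActorSystem = ActorSystem("tuning")

val tunedSettings =
  ConsumerSettings(system, new ByteArrayDeserializer, new StringDeserializer)
    .withBootstrapServers("localhost:9092")
    .withGroupId("group1")
    // Poll continuously instead of waiting between polls.
    .withPollInterval(Duration.Zero)
    // Allow bigger fetches per partition per poll (example value: 5 MiB).
    .withProperty(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, (5 * 1024 * 1024).toString)
```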

l15k4 (Author) commented May 18, 2016

OK, I changed to flozano/kafka:0.9.0.0 and the IN -> topic1 -> topic2 -> OUT pipeline now handles 25,000 msgs/s with plainSources... sorted out, thank you!

l15k4 closed this as completed on May 18, 2016
kciesielski (Contributor) commented

@l15k4 Is flozano/kafka:0.9.0.0 a custom Docker image? By the way, the newest version of Kafka is 0.9.0.1; it may contain some performance fixes.

l15k4 (Author) commented May 19, 2016

@kciesielski Yeah, https://github.com/flozano/docker-kafka ... There is a bug in their latest/0.9.0.1 build. I'm building my own image, as there is no image out there that bundles ZooKeeper and lets you easily change server.properties, the way it is implemented here: https://github.com/wurstmeister/kafka-docker/blob/master/start-kafka.sh#L29-L40

kciesielski (Contributor) commented

@l15k4 Interesting, is that a bug in Kafka itself or just in the Docker image?

l15k4 (Author) commented May 19, 2016

@kciesielski In the image: spotify/docker-kafka#34 (comment)

kciesielski (Contributor) commented

I see, thanks :)

ennru added this to the invalid milestone on Jun 7, 2018