
Output processing is slower than expected #2313

Closed
hc4 opened this Issue May 31, 2016 · 7 comments

hc4 (Contributor) commented May 31, 2016

Problem description

Output speed appears to be capped at roughly output_batch_size * 8 messages per second, and it is unclear why. (A quick arithmetic check of these numbers follows the environment details below.)

For example, my ES node (I use a single node) can process up to 6k messages per second.
But if I set output_batch_size=100 and outputbuffer_processors=4, I get only 800 output messages per second.
With output_batch_size=1000 it outputs 5-6k messages per second, as expected.

Environment

  • Graylog Version: 2.0.2
  • Elasticsearch Version: 2.3.3
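
To make the suspected cap concrete, here is a back-of-the-envelope check of the reported numbers (illustrative arithmetic only, not Graylog code; the ~8 flushes per second figure is inferred from the observations above):

    # Observed pattern: throughput ~= output_batch_size * 8 flushes/sec
    for batch_size in (100, 1000):
        implied = batch_size * 8
        print(f"output_batch_size={batch_size}: ~{implied} msg/s")
    # 100  -> ~800 msg/s  (matches the observed 800 msg/s)
    # 1000 -> ~8000 msg/s (the node tops out earlier, at its ~6k msg/s ES limit)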
hc4 (Contributor) commented May 31, 2016

Just checked with 2 processors and a queue size of 100; output speed is approximately the same (800 msg/s).
So the formula isn't fully correct...

joschi (Contributor) commented May 31, 2016

@hc4 Yes, the two settings output_batch_size (max. number of messages sent to Elasticsearch in a batch) and outputbuffer_processors (number of threads working on the output buffer) have direct influence on the throughput of messages to Elasticsearch.

Do you have any specific question about that?
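
For reference, both settings live in graylog.conf; the excerpt below mirrors the reporter's first scenario (values taken from this thread, shown only to anchor the discussion, not as a recommendation):

    # graylog.conf (excerpt)
    output_batch_size = 100        # max. messages per Elasticsearch bulk request
    outputbuffer_processors = 4    # threads draining the output buffer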

hc4 (Contributor) commented May 31, 2016

Why is it capped?
I don't understand why I can't send 10 bulks of 100 messages each in one second.

Checked with 8 processors and bulk size = 100. Speed remains the same, ~800 msg/s.
What is the bottleneck: ES bulk processing or some Graylog internals?
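
One hypothesis that would fit the 100x4 numbers is a roughly fixed wall-clock cost per flush, independent of batch size. A minimal sketch of that model (the 0.5 s latency is a hypothetical value chosen to reproduce the observed 800 msg/s, not a measurement):

    def throughput(batch_size, processors, flush_latency_s):
        # Each processor completes 1/flush_latency_s flushes per second,
        # each flush carrying at most batch_size messages.
        return processors * batch_size / flush_latency_s

    print(throughput(100, 4, 0.5))   # 800.0 msg/s: matches the observed cap
    print(throughput(100, 8, 0.5))   # 1600.0 msg/s predicted, yet ~800 observed

Notably, this model predicts that doubling the processors should double the throughput, while the observation above shows it does not; that discrepancy is exactly what makes the bottleneck question worth answering.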

dennisoelkers (Member) commented May 31, 2016

I think you're raising valid questions, @hc4. Could you please supply the following metrics of your Graylog server for the different scenarios:

org.graylog2.outputs.BlockingBatchedESOutput.batchSize
org.graylog2.outputs.BlockingBatchedESOutput.bufferFlushes
org.graylog2.outputs.BlockingBatchedESOutput.bufferFlushesRequested
org.graylog2.outputs.BlockingBatchedESOutput.processTime

Comparing them would be of great help in validating your concerns and possibly improving the throughput.
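
For anyone reproducing this, the metrics above can be read via the Graylog REST API. A sketch using Python's requests library, assuming Graylog 2.x's GET /system/metrics/&lt;metricName&gt; endpoint; host, port, and credentials are placeholders:

    import requests

    GRAYLOG_API = "http://graylog.example.com:12900"   # hypothetical API address
    AUTH = ("admin", "password")                       # hypothetical credentials

    metric_names = [
        "org.graylog2.outputs.BlockingBatchedESOutput.batchSize",
        "org.graylog2.outputs.BlockingBatchedESOutput.bufferFlushes",
        "org.graylog2.outputs.BlockingBatchedESOutput.bufferFlushesRequested",
        "org.graylog2.outputs.BlockingBatchedESOutput.processTime",
    ]

    for name in metric_names:
        # GET /system/metrics/<metricName> returns the metric as JSON
        response = requests.get(f"{GRAYLOG_API}/system/metrics/{name}", auth=AUTH)
        response.raise_for_status()
        print(name, response.json())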


hc4 (Contributor) commented May 31, 2016

For 100x4:
[screenshot: requested metrics]

For 2000x4:
[screenshot: requested metrics]

In the second case, the actual output speed:
[screenshot: output rate]


hc4 (Contributor) commented Jun 1, 2016

The needs-input label is still set on this issue.
Should I provide more info?


jalogisch (Member) commented Oct 10, 2016

This issue is fairly old and there hasn't been much activity on it. Closing, but please re-open if it still occurs.

jalogisch closed this Oct 10, 2016
