New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send data in batches #3

Merged
merged 12 commits into from Feb 11, 2018

Conversation

Projects
None yet
2 participants
@patrobinson
Contributor

patrobinson commented Feb 2, 2018

While using this plugin in production I discovered performance was extremely bad when the volume of events was more than a dozen per second. Output plugins are, by default, single threaded and since the multi_receive method isn't implemented it's just a single thread sending one message at a time to firehose. That basically limits this plugin to 10-20 events per second.

Firehose supports batch sends, which allows sending up to 500 at a time. This PR implements the multi_receive method, which Logstash automatically takes advantage of, to send up to 500 records at a time. We also check to ensure other limits, such as the maximum size of a single payload or single record, are not exceeded. Finally we refactor the tests to provide better guarantees it's working as required.

@chupakabr

thanks for your PR mate. Just curious, does logstash have an official repo of this plugin? I think I made PR to their repo long time ago..

@@ -43,12 +43,19 @@ class LogStash::Outputs::Firehose < LogStash::Outputs::Base
TEMPFILE_EXTENSION = "txt"
FIREHOSE_STREAM_VALID_CHARACTERS = /[\w\-]/
# These are hard limits
FIREHOSE_PUT_BATCH_SIZE_LIMIT = 4_000_000 # 4MB

This comment has been minimized.

@chupakabr

chupakabr Feb 8, 2018

Owner

would be nice to have these properties configurable

This comment has been minimized.

@patrobinson

patrobinson Feb 8, 2018

Contributor

This is a hard limit from AWS, you cannot change it so I'm not sure the value in making it configurable.

@patrobinson

This comment has been minimized.

Contributor

patrobinson commented Feb 8, 2018

@chupakabr there is, but your original PR has not been merged https://github.com/logstash-plugins/logstash-output-firehose

I figured this is as close to the upstream as I can get my changes.

Alternatively I can release this as a gem on Rubygems and become the unofficial maintainer

@chupakabr

This comment has been minimized.

Owner

chupakabr commented Feb 9, 2018

@patrobinson if you are fancy becoming an unofficial maintainer - feel free! :) I'll approve your PRs meanwhile here.
Cheers!

@chupakabr

This comment has been minimized.

Owner

chupakabr commented Feb 9, 2018

@patrobinson please resolve merge conflicts first, happened after merging your first PR ;)

@chupakabr chupakabr merged commit 7b34b8f into chupakabr:master Feb 11, 2018

@chupakabr

This comment has been minimized.

Owner

chupakabr commented Feb 11, 2018

cheers mate! @patrobinson

@patrobinson patrobinson deleted the envato:send-data-in-batches branch Jun 22, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment