Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send data in batches #3

Merged
merged 12 commits into from Feb 11, 2018
Merged

Send data in batches #3

merged 12 commits into from Feb 11, 2018

Conversation

@patrobinson
Copy link
Contributor

@patrobinson patrobinson commented Feb 2, 2018

While using this plugin in production I discovered performance was extremely bad when the volume of events was more than a dozen per second. Output plugins are, by default, single threaded and since the multi_receive method isn't implemented it's just a single thread sending one message at a time to firehose. That basically limits this plugin to 10-20 events per second.

Firehose supports batch sends, which allows sending up to 500 at a time. This PR implements the multi_receive method, which Logstash automatically takes advantage of, to send up to 500 records at a time. We also check to ensure other limits, such as the maximum size of a single payload or single record, are not exceeded. Finally we refactor the tests to provide better guarantees it's working as required.

Copy link
Owner

@chupakabr chupakabr left a comment

thanks for your PR mate. Just curious, does logstash have an official repo of this plugin? I think I made PR to their repo long time ago..

@@ -43,12 +43,19 @@ class LogStash::Outputs::Firehose < LogStash::Outputs::Base
TEMPFILE_EXTENSION = "txt"
FIREHOSE_STREAM_VALID_CHARACTERS = /[\w\-]/

# These are hard limits
FIREHOSE_PUT_BATCH_SIZE_LIMIT = 4_000_000 # 4MB

This comment has been minimized.

@chupakabr

chupakabr Feb 8, 2018
Owner

would be nice to have these properties configurable

This comment has been minimized.

@patrobinson

patrobinson Feb 8, 2018
Author Contributor

This is a hard limit from AWS, you cannot change it so I'm not sure the value in making it configurable.

@patrobinson
Copy link
Contributor Author

@patrobinson patrobinson commented Feb 8, 2018

@chupakabr there is, but your original PR has not been merged https://github.com/logstash-plugins/logstash-output-firehose

I figured this is as close to the upstream as I can get my changes.

Alternatively I can release this as a gem on Rubygems and become the unofficial maintainer

@chupakabr
Copy link
Owner

@chupakabr chupakabr commented Feb 9, 2018

@patrobinson if you are fancy becoming an unofficial maintainer - feel free! :) I'll approve your PRs meanwhile here.
Cheers!

@chupakabr
Copy link
Owner

@chupakabr chupakabr commented Feb 9, 2018

@patrobinson please resolve merge conflicts first, happened after merging your first PR ;)

@chupakabr chupakabr merged commit 7b34b8f into chupakabr:master Feb 11, 2018
@chupakabr
Copy link
Owner

@chupakabr chupakabr commented Feb 11, 2018

cheers mate! @patrobinson

@patrobinson patrobinson deleted the envato:send-data-in-batches branch Jun 22, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants