New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Send data in batches #3
Send data in batches #3
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for your PR mate. Just curious, does logstash have an official repo of this plugin? I think I made PR to their repo long time ago..
@@ -43,12 +43,19 @@ class LogStash::Outputs::Firehose < LogStash::Outputs::Base | |||
TEMPFILE_EXTENSION = "txt" | |||
FIREHOSE_STREAM_VALID_CHARACTERS = /[\w\-]/ | |||
|
|||
# These are hard limits | |||
FIREHOSE_PUT_BATCH_SIZE_LIMIT = 4_000_000 # 4MB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice to have these properties configurable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a hard limit from AWS, you cannot change it so I'm not sure the value in making it configurable.
@chupakabr there is, but your original PR has not been merged https://github.com/logstash-plugins/logstash-output-firehose I figured this is as close to the upstream as I can get my changes. Alternatively I can release this as a gem on Rubygems and become the unofficial maintainer |
@patrobinson if you are fancy becoming an unofficial maintainer - feel free! :) I'll approve your PRs meanwhile here. |
@patrobinson please resolve merge conflicts first, happened after merging your first PR ;) |
cheers mate! @patrobinson |
While using this plugin in production I discovered performance was extremely bad when the volume of events was more than a dozen per second. Output plugins are, by default, single threaded and since the
multi_receive
method isn't implemented it's just a single thread sending one message at a time to firehose. That basically limits this plugin to 10-20 events per second.Firehose supports batch sends, which allows sending up to 500 at a time. This PR implements the
multi_receive
method, which Logstash automatically takes advantage of, to send up to 500 records at a time. We also check to ensure other limits, such as the maximum size of a single payload or single record, are not exceeded. Finally we refactor the tests to provide better guarantees it's working as required.