More memory limiting for HttpPostEmitter #5300
Conversation
6d59e8a to 4181785
docs/content/configuration/index.md
```diff
@@ -222,7 +222,7 @@ The Druid servers [emit various metrics](../operations/metrics.html) and alerts
 |`druid.emitter.http.basicAuthentication`|Login and password for authentification in "login:password" form, e. g. `druid.emitter.http.basicAuthentication=admin:adminpassword`|not specified = no authentification|
 |`druid.emitter.http.flushTimeOut`|The timeout after which an event should be sent to the endpoint, even if internal buffers are not filled, in milliseconds.|not specified = no timeout|
 |`druid.emitter.http.batchingStrategy`|The strategy of how the batch is formatted. "ARRAY" means `[event1,event2]`, "NEWLINES" means `event1\nevent2`, ONLY_EVENTS means `event1event2`.|ARRAY|
-|`druid.emitter.http.maxBatchSize`|The maximum batch size, in bytes.|5191680 (i. e. 5 MB)|
+|`druid.emitter.http.maxBatchSize`|The maximum batch size, in bytes.|the minimum of (10% of JVM heap size divided by 50) or (5191680 (i. e. 5 MB))|
```
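The new default value in the table above can be read as a simple formula. A minimal illustrative sketch (the constant names and the interpretation of the divisor are assumptions for illustration, not identifiers from Druid's actual code):

```python
# Illustrative sketch of the new default maxBatchSize; names are assumptions,
# not Druid's actual identifiers.
OLD_DEFAULT_MAX_BATCH_SIZE = 5191680  # the previous fixed 5 MB default
HEAP_FRACTION = 0.10                  # budget: 10% of the JVM heap
ASSUMED_DIVISOR = 50                  # divisor from the updated docs table

def default_max_batch_size(jvm_heap_bytes: int) -> int:
    """min(5 MB, 10% of heap / 50), per the updated docs table."""
    heap_derived = int(jvm_heap_bytes * HEAP_FRACTION / ASSUMED_DIVISOR)
    return min(OLD_DEFAULT_MAX_BATCH_SIZE, heap_derived)

# With a large heap (e.g. 4 GiB) the old 5 MB default still applies:
print(default_max_batch_size(4 * 1024**3))   # 5191680
# With a small 64 MiB middleManager heap the batch size shrinks:
print(default_max_batch_size(64 * 1024**2))  # 134217
```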
In memory-constrained settings, I think it would be better to limit `batchQueueSizeLimit` rather than `maxBatchSize`, because very small batches might be an inefficient way to transfer metrics to the HTTP server. We might want to limit both, though. It's common to set Xmx to something small (like 64 MB) for middleManagers, and in that case 2 batches in the queue * 2.5 MB per batch probably makes sense.
Adjusted this to lower the queue limit first, while also keeping a minimum of 2 queue items.
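The adjustment described above can be sketched as: spend the 10%-of-heap budget by shrinking the queue limit first, down to a floor of 2 entries, and only then shrinking the batch size. A hedged sketch under assumed defaults (constants and function names are illustrative, not Druid's actual code):

```python
# Illustrative sketch only; constants and names are assumptions,
# not taken from HttpPostEmitter's actual implementation.
DEFAULT_BATCH_SIZE = 5 * 1024 * 1024  # assumed ~5 MB default batch
DEFAULT_QUEUE_LIMIT = 50              # assumed default queue limit
MIN_QUEUE_LIMIT = 2                   # floor kept by this PR

def fit_to_budget(heap_bytes: int) -> tuple[int, int]:
    budget = heap_bytes // 10  # 10% of the JVM heap
    # First try reducing the number of queued batches, keeping full-size
    # batches, but never go below 2 queue items.
    queue_limit = min(DEFAULT_QUEUE_LIMIT,
                      max(MIN_QUEUE_LIMIT, budget // DEFAULT_BATCH_SIZE))
    # If even the minimum queue of full batches exceeds the budget,
    # shrink the batch size as well.
    batch_size = min(DEFAULT_BATCH_SIZE, budget // queue_limit)
    return batch_size, queue_limit

# 64 MiB heap: budget ~6.4 MiB -> 2 queue slots of ~3.2 MB batches,
# in the spirit of the reviewer's "2 batches * 2.5 MB" suggestion.
print(fit_to_budget(64 * 1024**2))
```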
2b0fe96 to 7eaac35
LGTM
* More memory limiting for HttpPostEmitter
* Less aggressive large events test
* Fix tests
* Restrict batch queue size first, keep minimum of 2 queue items
This PR adjusts the default `maxBatchSize` for the HttpPostEmitter such that the default configuration (no `maxBatchSize` or `batchQueueSizeLimit` specified) will limit the product of `maxBatchSize` * `batchQueueSizeLimit` to 10% of the available JVM heap. This can be useful since not all Druid nodes are provisioned with the same amount of memory. In particular, the middle manager is often provisioned with a much smaller heap than the other nodes, which can result in OOM errors from buffered events when the default values are used cluster-wide.
This PR also adjusts large event handling so that large events increment and decrement the `approximateBuffersToEmitCount` counter. Previously, it was theoretically possible for the large event queue to keep growing past the configured queue limit.
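The effect of counting large events against the shared counter can be illustrated with a toy model (a hedged sketch only; the real HttpPostEmitter logic and field names differ, apart from `approximateBuffersToEmitCount`, which is from the PR description):

```python
from collections import deque

class ToyEmitter:
    """Toy model: large events now increment the shared buffers-to-emit
    counter, so the large-event queue can no longer grow past the limit."""

    def __init__(self, queue_limit: int):
        self.queue_limit = queue_limit
        self.approximate_buffers_to_emit_count = 0
        self.large_events = deque()

    def offer_large_event(self, event) -> bool:
        # Count the large event against the shared limit before queueing it.
        if self.approximate_buffers_to_emit_count >= self.queue_limit:
            return False  # rejected: queue limit reached
        self.approximate_buffers_to_emit_count += 1
        self.large_events.append(event)
        return True

    def emit_one(self):
        # Decrement the counter once the event is handed off for emission.
        event = self.large_events.popleft()
        self.approximate_buffers_to_emit_count -= 1
        return event

e = ToyEmitter(queue_limit=2)
assert e.offer_large_event("a") and e.offer_large_event("b")
assert not e.offer_large_event("c")  # previously the queue could keep growing
assert e.emit_one() == "a"
assert e.offer_large_event("c")      # room again after one event is emitted
```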