-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added merge/buffer metric to print buffer pool size every minute #5541
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution @Sarvesh1990!
In addition to the comments on the diff, I've got one more comment: please add a line to the metrics.md
documentation page explaining the new metric and what it means.
@@ -66,6 +71,7 @@ public void emitMetrics(ServiceEmitter emitter, ServiceMetricEvent.Builder metri | |||
{ | |||
if (delegate instanceof PrioritizedExecutorService) { | |||
emitter.emit(metricBuilder.build("segment/scan/pending", ((PrioritizedExecutorService) delegate).getQueueSize())); | |||
emitter.emit(metricBuilder.build("merge/buffers/size", ((DefaultBlockingPool) mergeBufferPool).getPoolSize())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest calling this query/merge/buffersUsed
. Presumably, the other metric (merge buffer wait time) would be something like query/merge/wait
.
@@ -66,6 +71,7 @@ public void emitMetrics(ServiceEmitter emitter, ServiceMetricEvent.Builder metri | |||
{ | |||
if (delegate instanceof PrioritizedExecutorService) { | |||
emitter.emit(metricBuilder.build("segment/scan/pending", ((PrioritizedExecutorService) delegate).getQueueSize())); | |||
emitter.emit(metricBuilder.build("merge/buffers/size", ((DefaultBlockingPool) mergeBufferPool).getPoolSize())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getPoolSize()
is not the right method to use here. It's the number of buffers in the pool that have not been handed out, but we want this metric to be the number that are handed out. Try adding a new method to the BlockingPool interface like getUsedBufferCount()
and then implementing that in DefaultBlockingPool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Regarding metrics.md, I am not sure what to write in the Dimensions part. Also Normal value should ideally be the number of buffers allocated but that value will vary depending on cores. So did not write anything in that.
Let me know if this works fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Sarvesh1990 thanks for your PR! It looks good to me and I left some trivial comments.
@@ -25,6 +25,8 @@ | |||
{ | |||
int maxSize(); | |||
|
|||
int getUsedBufferCount(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a simple javadoc.
@@ -97,7 +97,7 @@ public ExecutorService getProcessingExecutorService( | |||
PrioritizedExecutorService.create( | |||
lifecycle, | |||
config | |||
), | |||
), getMergeBufferPool(config), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be like
PrioritizedExecutorService.create(
lifecycle,
config
),
getMergeBufferPool(config),
executorServiceMonitor
@@ -66,6 +71,7 @@ public void emitMetrics(ServiceEmitter emitter, ServiceMetricEvent.Builder metri | |||
{ | |||
if (delegate instanceof PrioritizedExecutorService) { | |||
emitter.emit(metricBuilder.build("segment/scan/pending", ((PrioritizedExecutorService) delegate).getQueueSize())); | |||
emitter.emit(metricBuilder.build("query/merge/buffersUsed", ((DefaultBlockingPool) mergeBufferPool).getUsedBufferCount())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks that the casting is not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Let me know if its fine.
Hey @gianm @jihoonson - Does the fix looks okay to merge? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Sarvesh1990 - just read over this again. There is an issue with the implementation in getProcessingExecutorService, but also, I suggest we side-step that anyway. See the comment for more details.
@@ -98,6 +98,7 @@ public ExecutorService getProcessingExecutorService( | |||
lifecycle, | |||
config | |||
), | |||
getMergeBufferPool(config), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be getMergeBufferPool(config)
- it's going to create a new merge buffer pool rather than use the existing one. (Check the implementation of getMergeBufferPool.) So the metric would always be zero since it wouldn't be checking the "real" pool. Instead, the patch should have a @Merging BlockingPool<ByteBuffer> mergePool
as an argument to getProcessingExecutorService
. Then the right one would get injected.
However! I think we should go a different path entirely. The Processing ExecutorService is not really the right place to add this metric. The metric is unrelated to executor processing. I think it'd be better to have the Merging BlockingPool do this itself, in much the same way that the Processing ExecutorService does. By that I mean: define a MetricsEmittingBlockingPool and have that emit the buffersUsed metric. Then, that should be returned by the existing getMergeBufferPool
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @gianm.
Thanks for your comments. I also had doubt while using this function but then I guessed that @LazySingleton annotation will take care of that.
No worries. I will fix this thing and let you know in sometime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds great, I'll keep an eye on this pr.
Hi @Sarvesh1990, are you still interested in working on this feature? |
@Sarvesh1990 can you please take this to the finish line? thanks |
Hey. I am really sorry for delaying this. Actually I had to go out of staion because of some important work. I will finish this feature by next week for sure. Also I will start contributing more seriously from next week. |
f8179c1
to
ddff801
Compare
ddff801
to
689792a
Compare
Please find the changes as @gianm wanted in his previous comment. Have moved the responsibility to print merge pool to the pool emitter. Have also renamed ExecutorServiceMonitor to ProcessingMonitor so that we can use the same monitor to print all processing related data. Let me know if it looks okay. Sorry for delaying this. Will be regular from now. |
@@ -37,6 +37,7 @@ Available Metrics | |||
|`query/success/count`|number of queries successfully processed|This metric is only available if the QueryCountStatsMonitor module is included.|| | |||
|`query/failed/count`|number of failed queries|This metric is only available if the QueryCountStatsMonitor module is included.|| | |||
|`query/interrupted/count`|number of queries interrupted due to cancellation or timeout|This metric is only available if the QueryCountStatsMonitor module is included.|| | |||
|`query/merge/buffersUsed`|number of merge buffers allocated to broker while performing groupBy merge queries||| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the DruidProcessingModule.getMergeBufferPool() is called GroupByStrategyV2, and GroupByStrategyV2 is called in historical and realtime when group by query, so this metric also should be documented in historical and realtime section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kaijianding is right - it should be documented in those other sections as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Sarvesh1990 for the contribution! This looks generally good but please take a look at the couple of comments.
@@ -37,6 +37,7 @@ Available Metrics | |||
|`query/success/count`|number of queries successfully processed|This metric is only available if the QueryCountStatsMonitor module is included.|| | |||
|`query/failed/count`|number of failed queries|This metric is only available if the QueryCountStatsMonitor module is included.|| | |||
|`query/interrupted/count`|number of queries interrupted due to cancellation or timeout|This metric is only available if the QueryCountStatsMonitor module is included.|| | |||
|`query/merge/buffersUsed`|number of merge buffers allocated to broker while performing groupBy merge queries||| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kaijianding is right - it should be documented in those other sections as well.
public MetricsEmittingMergingBlockingPool( | ||
Supplier<T> generator, | ||
int limit, | ||
ProcessingMonitor processingMonitor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code style is a bit off here (the ending paren should be on its own line; also, indentation). If you are using IntelliJ, you can apply Druid's code style from the codestyle config in the repo. Also, our checkstyle should catch some things like this.
Hey @Sarvesh1990 - hope the review was helpful, let us know if you have any questions about further implementation! |
Hey @Sarvesh1990, do you think you'll have an opportunity to pick this up in the future? |
Hi @Sarvesh1990, I'm going to close this, but thanks for the interest in contributing, and please reopen it if you have the time to work on it some more. |
No description provided.