Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added merge/buffer metric to print buffer pool size every minute #5541

Closed
wants to merge 4 commits into from

Conversation

Sarvesh1990
Copy link

No description provided.

Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @Sarvesh1990!

In addition to the comments on the diff, I've got one more comment: please add a line to the metrics.md documentation page explaining the new metric and what it means.

@@ -66,6 +71,7 @@ public void emitMetrics(ServiceEmitter emitter, ServiceMetricEvent.Builder metri
{
if (delegate instanceof PrioritizedExecutorService) {
emitter.emit(metricBuilder.build("segment/scan/pending", ((PrioritizedExecutorService) delegate).getQueueSize()));
emitter.emit(metricBuilder.build("merge/buffers/size", ((DefaultBlockingPool) mergeBufferPool).getPoolSize()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest calling this query/merge/buffersUsed. Presumably, the other metric (merge buffer wait time) would be something like query/merge/wait.

@@ -66,6 +71,7 @@ public void emitMetrics(ServiceEmitter emitter, ServiceMetricEvent.Builder metri
{
if (delegate instanceof PrioritizedExecutorService) {
emitter.emit(metricBuilder.build("segment/scan/pending", ((PrioritizedExecutorService) delegate).getQueueSize()));
emitter.emit(metricBuilder.build("merge/buffers/size", ((DefaultBlockingPool) mergeBufferPool).getPoolSize()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getPoolSize() is not the right method to use here. It's the number of buffers in the pool that have not been handed out, but we want this metric to be the number that are handed out. Try adding a new method to the BlockingPool interface like getUsedBufferCount() and then implementing that in DefaultBlockingPool.

Copy link
Author

@Sarvesh1990 Sarvesh1990 Mar 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Regarding metrics.md, I am not sure what to write in the Dimensions part. Also Normal value should ideally be the number of buffers allocated but that value will vary depending on cores. So did not write anything in that.

Let me know if this works fine.

Copy link
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sarvesh1990 thanks for your PR! It looks good to me and I left some trivial comments.

@@ -25,6 +25,8 @@
{
int maxSize();

int getUsedBufferCount();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a simple javadoc.

@@ -97,7 +97,7 @@ public ExecutorService getProcessingExecutorService(
PrioritizedExecutorService.create(
lifecycle,
config
),
), getMergeBufferPool(config),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be like

        PrioritizedExecutorService.create(
            lifecycle,
            config
        ),
        getMergeBufferPool(config),
        executorServiceMonitor

@@ -66,6 +71,7 @@ public void emitMetrics(ServiceEmitter emitter, ServiceMetricEvent.Builder metri
{
if (delegate instanceof PrioritizedExecutorService) {
emitter.emit(metricBuilder.build("segment/scan/pending", ((PrioritizedExecutorService) delegate).getQueueSize()));
emitter.emit(metricBuilder.build("query/merge/buffersUsed", ((DefaultBlockingPool) mergeBufferPool).getUsedBufferCount()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks that the casting is not needed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Let me know if its fine.

@Sarvesh1990
Copy link
Author

Hey @gianm @jihoonson - Does the fix looks okay to merge?

Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sarvesh1990 - just read over this again. There is an issue with the implementation in getProcessingExecutorService, but also, I suggest we side-step that anyway. See the comment for more details.

@@ -98,6 +98,7 @@ public ExecutorService getProcessingExecutorService(
lifecycle,
config
),
getMergeBufferPool(config),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be getMergeBufferPool(config) - it's going to create a new merge buffer pool rather than use the existing one. (Check the implementation of getMergeBufferPool.) So the metric would always be zero since it wouldn't be checking the "real" pool. Instead, the patch should have a @Merging BlockingPool<ByteBuffer> mergePool as an argument to getProcessingExecutorService. Then the right one would get injected.

However! I think we should go a different path entirely. The Processing ExecutorService is not really the right place to add this metric. The metric is unrelated to executor processing. I think it'd be better to have the Merging BlockingPool do this itself, in much the same way that the Processing ExecutorService does. By that I mean: define a MetricsEmittingBlockingPool and have that emit the buffersUsed metric. Then, that should be returned by the existing getMergeBufferPool method.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @gianm.

Thanks for your comments. I also had doubt while using this function but then I guessed that @LazySingleton annotation will take care of that.

No worries. I will fix this thing and let you know in sometime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds great, I'll keep an eye on this pr.

@gianm
Copy link
Contributor

gianm commented Apr 12, 2018

Hi @Sarvesh1990, are you still interested in working on this feature?

@b-slim
Copy link
Contributor

b-slim commented May 3, 2018

@Sarvesh1990 can you please take this to the finish line? thanks

@Sarvesh1990
Copy link
Author

Hey. I am really sorry for delaying this. Actually I had to go out of staion because of some important work. I will finish this feature by next week for sure. Also I will start contributing more seriously from next week.

@Sarvesh1990
Copy link
Author

Hey @gianm @b-slim

Please find the changes as @gianm wanted in his previous comment. Have moved the responsibility to print merge pool to the pool emitter. Have also renamed ExecutorServiceMonitor to ProcessingMonitor so that we can use the same monitor to print all processing related data.

Let me know if it looks okay. Sorry for delaying this. Will be regular from now.

@Sarvesh1990
Copy link
Author

Hey @gianm @b-slim

Does this PR looks okay? Can I start working on the other one?

@Sarvesh1990
Copy link
Author

Hey @gianm @b-slim

Can you please review this PR if it looks okay?

@@ -37,6 +37,7 @@ Available Metrics
|`query/success/count`|number of queries successfully processed|This metric is only available if the QueryCountStatsMonitor module is included.||
|`query/failed/count`|number of failed queries|This metric is only available if the QueryCountStatsMonitor module is included.||
|`query/interrupted/count`|number of queries interrupted due to cancellation or timeout|This metric is only available if the QueryCountStatsMonitor module is included.||
|`query/merge/buffersUsed`|number of merge buffers allocated to broker while performing groupBy merge queries|||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the DruidProcessingModule.getMergeBufferPool() is called GroupByStrategyV2, and GroupByStrategyV2 is called in historical and realtime when group by query, so this metric also should be documented in historical and realtime section?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaijianding is right - it should be documented in those other sections as well.

@gianm gianm self-assigned this Sep 28, 2018
Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Sarvesh1990 for the contribution! This looks generally good but please take a look at the couple of comments.

@@ -37,6 +37,7 @@ Available Metrics
|`query/success/count`|number of queries successfully processed|This metric is only available if the QueryCountStatsMonitor module is included.||
|`query/failed/count`|number of failed queries|This metric is only available if the QueryCountStatsMonitor module is included.||
|`query/interrupted/count`|number of queries interrupted due to cancellation or timeout|This metric is only available if the QueryCountStatsMonitor module is included.||
|`query/merge/buffersUsed`|number of merge buffers allocated to broker while performing groupBy merge queries|||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaijianding is right - it should be documented in those other sections as well.

public MetricsEmittingMergingBlockingPool(
Supplier<T> generator,
int limit,
ProcessingMonitor processingMonitor)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code style is a bit off here (the ending paren should be on its own line; also, indentation). If you are using IntelliJ, you can apply Druid's code style from the codestyle config in the repo. Also, our checkstyle should catch some things like this.

@gianm
Copy link
Contributor

gianm commented Dec 4, 2018

Hey @Sarvesh1990 - hope the review was helpful, let us know if you have any questions about further implementation!

@gianm
Copy link
Contributor

gianm commented Dec 20, 2018

Hey @Sarvesh1990, do you think you'll have an opportunity to pick this up in the future?

@gianm
Copy link
Contributor

gianm commented Jan 8, 2019

Hi @Sarvesh1990, I'm going to close this, but thanks for the interest in contributing, and please reopen it if you have the time to work on it some more.

@gianm gianm closed this Jan 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants