interval chunk query runner now processes individual chunk in a threadpool by himanshug · Pull Request #1150 · apache/druid

himanshug · 2015-02-23T21:07:10Z

This patch makes the following changes:

The interval chunking query runner now processes individual chunks in parallel inside the "Processor" executor service.
Adds a "chunkPeriod" parameter to the query context.
Removes the "druid.query.chunkPeriod" and "druid.query.<query-type>.chunkPeriod" configuration, as chunking should really be tuned on a per-query (interval) basis.
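
As a rough illustration of the idea in this description, the sketch below splits a query interval into chunks of a given period and runs each chunk on a thread pool. It is not the code in this patch: `ChunkedQuerySketch`, `Chunk`, `split`, `queryChunk`, and `runAll` are hypothetical names, `java.time` is used in place of Joda-Time so the example has no dependencies, and the "query" is a stand-in that only returns the chunk length in days.

```java
import java.time.Duration;
import java.time.Period;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from Druid.
final class ChunkedQuerySketch {

  // A half-open time range [start, end).
  static final class Chunk {
    final ZonedDateTime start;
    final ZonedDateTime end;

    Chunk(ZonedDateTime start, ZonedDateTime end) {
      this.start = start;
      this.end = end;
    }
  }

  // Split [start, end) into consecutive chunks no longer than `period`.
  // A zero or negative period is treated as "no chunking": one chunk is returned.
  static List<Chunk> split(ZonedDateTime start, ZonedDateTime end, Period period) {
    if (period.isZero() || period.isNegative()) {
      return Collections.singletonList(new Chunk(start, end));
    }
    List<Chunk> chunks = new ArrayList<>();
    ZonedDateTime cursor = start;
    while (cursor.isBefore(end)) {
      ZonedDateTime next = cursor.plus(period);
      if (next.isAfter(end)) {
        next = end;
      }
      chunks.add(new Chunk(cursor, next));
      cursor = next;
    }
    return chunks;
  }

  // Stand-in for running the query over one chunk: returns the chunk length in days.
  static long queryChunk(Chunk chunk) {
    return Duration.between(chunk.start, chunk.end).toDays();
  }

  // Submit every chunk to the executor, then merge the partial results in submission order.
  static long runAll(List<Chunk> chunks, ExecutorService executor) {
    List<Future<Long>> futures = new ArrayList<>();
    for (Chunk chunk : chunks) {
      Callable<Long> task = () -> queryChunk(chunk);
      futures.add(executor.submit(task));
    }
    long total = 0;
    try {
      for (Future<Long> future : futures) {
        total += future.get();
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("chunk failed", e);
    }
    return total;
  }

  public static void main(String[] args) {
    ZonedDateTime start = ZonedDateTime.of(2015, 1, 1, 0, 0, 0, 0, ZoneOffset.UTC);
    ZonedDateTime end = start.plusDays(10);
    ExecutorService executor = Executors.newFixedThreadPool(4);
    try {
      List<Chunk> chunks = split(start, end, Period.ofDays(3));
      System.out.println(chunks.size() + " chunks, " + runAll(chunks, executor) + " days covered");
    } finally {
      executor.shutdown();
    }
  }
}
```

In this sketch the speedup depends on the pool having enough free threads for the chunks; with a single thread the chunks simply run one after another.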

fjy · 2015-02-25T21:17:03Z

docs/content/Querying.md

Can you double-check the broker docs for druid.processing.numThreads? I think they will need to be updated as well.

Can we document how the chunkPeriod context parameter interacts with the existing druid.query.chunkPeriod and druid.query.<queryType>.chunkPeriod configuration parameters?

Actually, it replaces druid.query.chunkPeriod and druid.query.<queryType>.chunkPeriod.

They are no longer valid after this pull request. In our experience, the chunking behavior really needs to be tuned per query (sometimes based on the size of its interval).

In that case, can we remove the old configs from the docs and code as well, instead of keeping unused config around?

himanshug · 2015-02-26T00:49:12Z

@fjy updated as per the review comments.

fjy · 2015-02-26T00:50:10Z

It also looks like we are ignoring the query config chunkPeriod, which changes behavior. We should document that in the PR description and in the docs.

drcrallen · 2015-02-26T00:50:31Z

processing/src/main/java/io/druid/query/QueryContextKeys.java

Is this supposed to be a repository for "context":{.....} reserved words?

Yes, I created it because those strings were slowly getting hard-coded in different places.
This pull request does not refactor the old code, though.

drcrallen · 2015-02-26T00:53:36Z

Why do we need another means of adding a decorator to the tool chest as part of the constructor? Did the needed manipulators not fit in the pre/post merge decorators for some reason?

drcrallen · 2015-02-26T00:55:41Z

Or a better thing to say is that most of the other runner decorators are expressed as a titled method to make them a little more descriptive and a little more generic-use. Is there a particular reason this decorator cannot follow that workflow?

drcrallen · 2015-02-26T01:01:43Z

processing/src/main/java/io/druid/query/QueryContextKeys.java

Can you please add a comment something along the lines of "Static strings in this class should be considered 'reserved words' for the context of a query."

We'll slowly move over the other reserved words to this class.

Yeah, but the class name already conveys that intent in this case, so the comment really seems redundant.

himanshug · 2015-02-26T01:29:16Z

> Why do we need another means of adding a decorator to the tool chest as part of the constructor? Did the needed manipulators not fit in the pre/post merge decorators for some reason?

@drcrallen The query toolchest needs to instantiate IntervalChunkingQueryRunner, which needs an ExecutorService, a QueryWatcher, and a ServiceEmitter. One option was to add all three to the constructor of each query toolchest and have the toolchests instantiate IntervalChunkingQueryRunner directly. IntervalChunkingQueryRunnerDecorator let me do that more cleanly by adding just one thing to the query toolchest constructors. Also, if one more thing gets added to the IntervalChunkingQueryRunner constructor, only the decorator needs to be updated instead of all the toolchest constructors.
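
A minimal sketch of the trade-off described above, using hypothetical stand-in types rather than Druid's real classes (`SketchRunner`, `SketchWatcher`, `SketchEmitter`, `SketchChunkingRunner`, `SketchChunkingDecorator`, and `SketchToolChest` are all invented for illustration). The decorator holds the three dependencies, so the toolchest constructor takes a single argument.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-ins for a query runner, a query watcher, and a metrics emitter.
interface SketchRunner {
  String run(String query);
}

interface SketchWatcher {}

interface SketchEmitter {}

// Stand-in for the chunking runner, which needs three collaborators besides the base runner.
final class SketchChunkingRunner implements SketchRunner {
  private final SketchRunner base;
  private final ExecutorService executor;
  private final SketchWatcher watcher;
  private final SketchEmitter emitter;

  SketchChunkingRunner(SketchRunner base, ExecutorService executor,
                       SketchWatcher watcher, SketchEmitter emitter) {
    this.base = base;
    this.executor = executor;
    this.watcher = watcher;
    this.emitter = emitter;
  }

  @Override
  public String run(String query) {
    // Real chunking is omitted; the wrapper only marks that it was applied.
    return "chunked(" + base.run(query) + ")";
  }
}

// The decorator owns the three dependencies and knows how to build the runner.
final class SketchChunkingDecorator {
  private final ExecutorService executor;
  private final SketchWatcher watcher;
  private final SketchEmitter emitter;

  SketchChunkingDecorator(ExecutorService executor, SketchWatcher watcher, SketchEmitter emitter) {
    this.executor = executor;
    this.watcher = watcher;
    this.emitter = emitter;
  }

  SketchRunner decorate(SketchRunner base) {
    return new SketchChunkingRunner(base, executor, watcher, emitter);
  }
}

// Each toolchest takes one constructor argument instead of three.
final class SketchToolChest {
  private final SketchChunkingDecorator decorator;

  SketchToolChest(SketchChunkingDecorator decorator) {
    this.decorator = decorator;
  }

  SketchRunner wrap(SketchRunner base) {
    return decorator.decorate(base);
  }

  public static void main(String[] args) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    SketchChunkingDecorator decorator =
        new SketchChunkingDecorator(executor, new SketchWatcher() {}, new SketchEmitter() {});
    SketchRunner runner = new SketchToolChest(decorator).wrap(q -> "rows for " + q);
    System.out.println(runner.run("groupBy"));
    executor.shutdown();
  }
}
```

If the runner later needs a fourth dependency, only `SketchChunkingDecorator` changes in this sketch; the toolchest constructors stay the same.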

> Or a better thing to say is that most of the other runner decorators are expressed as a titled method to make them a little more descriptive and a little more generic-use. Is there a particular reason this decorator cannot follow that workflow?

Can you further explain the alternative you're suggesting?

nishantmonu51 · 2015-02-26T15:25:09Z

IIRC, the purpose of interval chunking was to handle memory pressure on bards when running queries over very long intervals. I wonder how processing the chunks in parallel affects memory usage on bards.

nishantmonu51 · 2015-02-26T15:45:41Z

processing/src/main/java/io/druid/query/IntervalChunkingQueryRunner.java

Should this be toolChest.makeMetricBuilder(input) instead?

Also, can you check that the emitted query/time contains the correct chunked interval instead of the full query interval?

That's a good catch. I haven't checked it yet, but it looks like it should indeed be toolChest.makeMetricBuilder(input).

drcrallen · 2015-02-26T19:40:49Z

@himanshug: The alternative would be to have another decorator method that handles whatever part of the query execution chain the IntervalChunkingQueryRunner needs to be inserted into. It just feels weird to me to pass a very specific query runner instance instead of having a method that properly handles that logical step of the query. For example, there could be a "queryPlanningRunnerDecorator" method (similar to the pre- or post-merge decorators), or something else that properly describes the step the IntervalChunkingQueryRunner performs. I'm not convinced that's the way to go by any means.

In general, if possible, I'd rather have IntervalChunkingQueryRunner be one option for a particular logical query pipeline step, and have the hooks into the toolchest help ensure that the query runner decorator gets applied at the proper step in the query pipeline.

drcrallen · 2015-02-26T19:44:44Z

@nishantmonu51 also has a very good point. Can you please get some JVM / performance stats on this patch for inclusion in this PR?

himanshug · 2015-02-27T03:13:56Z

@drcrallen @nishantmonu51
We tested this patch on a group-by query over about 8 months of our production data and were able to reduce the response time from ~29 secs to ~11 secs in the best case. Please see below how the response time varied with the chunk period. (We had enough threads in the processor executor, of course, to process all chunks in parallel.)
We mostly have nested group-by queries, and this does not change the memory usage because the result set of the inner query is kept in memory to process the outer query whether you use chunking or not. Also, in general I did not see a noticeable difference (though I admit that, at the time, I did not collect many numbers on memory usage, mostly on response times).

Chunk Period	Response Time
P0D	0m29.134s
P1D	0m16.367s
P2D	0m11.060s
P3D	0m12.788s
P4D	0m13.292s
P5D	0m11.064s
P5D	0m12.218s
P10D	0m11.832s
P15D	0m12.459s
P20D	0m11.478s
P25D	0m12.295s
P30D	0m12.156s
P35D	0m11.939s
P40D	0m12.194s
P45D	0m12.802s
P50D	0m12.573s
P55D	0m13.395s
P60D	0m14.474s
P65D	0m13.374s
P70D	0m15.168s
P75D	0m14.285s
P80D	0m15.309s
P85D	0m14.950s
P90D	0m15.475s
P95D	0m15.440s
P100D	0m17.679s
P105D	0m15.611s
P110D	0m16.409s
P115D	0m19.854s
P120D	0m16.849s
P125D	0m16.818s
P130D	0m17.176s
P135D	0m17.979s
P140D	0m17.399s
P145D	0m17.024s
P150D	0m19.705s
P155D	0m17.996s
P160D	0m17.007s
P165D	0m17.982s
P170D	0m17.555s
P175D	0m19.196s
P180D	0m20.846s

Also, to tell you the truth, it seems that interval chunking query runner never really did any chunking :P
If you look at the code closely at
https://github.com/druid-io/druid/blob/6e315ddcd2eaff70ccda3786ede9a6d36394a15f/processing/src/main/java/io/druid/query/IntervalChunkingQueryRunner.java#L51
    if (period.getMillis() == 0) {
      return baseRunner.run(query, responseContext);
    }

"period.getMillis()" will pretty much always be 0 for periods such as P1D or P1M. It should have been "period.toStandardDuration().getMillis()".

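To make the pitfall above concrete: in Joda-Time, Period.getMillis() returns only the milliseconds field of the period, not its total length. The sketch below is a dependency-free model of that behavior, not Joda-Time itself; `FieldPeriodSketch` and its methods are hypothetical, and days are assumed to be exactly 24 hours long.

```java
// Hypothetical, dependency-free model of a field-based period (not Joda-Time's API).
final class FieldPeriodSketch {
  private static final long MILLIS_PER_DAY = 86_400_000L;

  private final int days;
  private final int millis;

  FieldPeriodSketch(int days, int millis) {
    this.days = days;
    this.millis = millis;
  }

  // Mirrors the pitfall: returns only the milliseconds field, so a one-day period yields 0.
  int getMillis() {
    return millis;
  }

  // Total length in milliseconds, assuming 24-hour days.
  long toStandardMillis() {
    return days * MILLIS_PER_DAY + millis;
  }

  // Check modeled on the old code path: any whole-day period looks like "no chunking".
  boolean chunkingDisabledBuggy() {
    return getMillis() == 0;
  }

  // Check based on the total length: only a truly empty period disables chunking.
  boolean chunkingDisabledFixed() {
    return toStandardMillis() == 0;
  }

  public static void main(String[] args) {
    FieldPeriodSketch oneDay = new FieldPeriodSketch(1, 0);
    System.out.println("buggy check disables chunking: " + oneDay.chunkingDisabledBuggy());
    System.out.println("fixed check disables chunking: " + oneDay.chunkingDisabledFixed());
  }
}
```

With the field-based check, a one-day period is indistinguishable from an empty one, which matches the observation that the runner never really chunked.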
fjy · 2015-03-02T21:30:08Z

@himanshug Can you squash the commits and I will merge

himanshug · 2015-03-02T21:45:46Z

@fjy squashing done.

xvrl · 2015-04-13T21:58:59Z

server/src/main/java/io/druid/server/ClientQuerySegmentWalker.java

@himanshug any reason this got removed? It looks like we lost a bunch of metrics when upgrading to 0.7.1

@xvrl Sorry to keep you waiting; I have been away.
Anyway, it used to report pretty much the same metric as request/time from QueryResource. Instead, we moved it to IntervalChunkingQueryRunner, which now reports query/time for each chunk (if chunking is used).

I see, the problem is that IntervalChunkingQueryRunner is not always used by default, so we lost those metrics when upgrading.

Losing that metric should be OK (when there is no chunking), as the numbers reported in query/time and request/time were pretty much the same.

I agree that we are not losing too much information, but for systems that rely on existing metrics to be present that can be an issue. Either we should notify users of backwards incompatible changes, or try to maintain backwards compatibility whenever possible.

I think notifying the users makes sense, release-notes for 0.7.1 would be the best place. Is there a way for me to update those?

I updated the unofficial metrics doc though https://docs.google.com/spreadsheets/d/15XxGrGv2ggdt4SnCoIsckqUNJBZ9ludyhC9hNQa3l-M/edit#gid=0

fjy reviewed Feb 25, 2015

drcrallen reviewed Feb 26, 2015

nishantmonu51 reviewed Feb 26, 2015

fjy added the Feature label Feb 27, 2015

fjy added this to the 0.7.1 milestone Feb 27, 2015

interval chunk query runner now processes individual chunk in a threa…

29039fd

…d pool and prints metrics query/time per chunk

himanshug force-pushed the broker-parallel-chunk-process branch from 4e16e80 to 29039fd Compare March 2, 2015 21:45

fjy added a commit that referenced this pull request Mar 2, 2015

Merge pull request #1150 from himanshug/broker-parallel-chunk-process

e8605c6

interval chunk query runner now processes individual chunk in a threadpool

fjy merged commit e8605c6 into apache:master Mar 2, 2015

himanshug deleted the broker-parallel-chunk-process branch March 2, 2015 21:58

xvrl reviewed Apr 13, 2015

gianm mentioned this pull request Mar 4, 2017

Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods. #4004

Merged

5 participants