Use GroupBy V2 as default #3953
@@ -318,6 +318,7 @@ private static ValueExtractFunction makeValueExtractFunction(
)
{
final boolean includeTimestamp = GroupByStrategyV2.getUniversalTimestamp(query) == null;
//final boolean includeTimestamp = true;
typo ?
thanks, deleted this
)
);
} catch (Exception e) {
Assert.fail(e.getMessage());
Assert.fail inside a callback like this seems kinda weird -- usually asserts run inside the main test code. Would it work if this is just a re-throw? Like throw Throwables.propagate(e)
Changed to Throwables.propagate(e)
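A minimal sketch of the rethrow pattern discussed above, written without Guava so it is self-contained. The local `propagate` helper only approximates what Guava's `Throwables.propagate` does (unchecked exceptions and errors are rethrown as-is, checked exceptions are wrapped in a `RuntimeException`); `riskyOperation` and `makeCallback` are hypothetical names, not code from this PR.

```java
import java.io.IOException;

public class CallbackRethrowSketch {
    /** Hypothetical stand-in for the work done inside the callback; always fails. */
    static void riskyOperation() throws Exception {
        throw new IOException("simulated failure");
    }

    /**
     * Local approximation of Guava's Throwables.propagate: rethrow unchecked
     * throwables unchanged, wrap checked exceptions in a RuntimeException.
     */
    static RuntimeException propagate(Throwable t) {
        if (t instanceof RuntimeException) {
            throw (RuntimeException) t;
        }
        if (t instanceof Error) {
            throw (Error) t;
        }
        throw new RuntimeException(t);
    }

    /** Builds a callback that rethrows failures instead of calling Assert.fail. */
    static Runnable makeCallback() {
        return () -> {
            try {
                riskyOperation();
            } catch (Exception e) {
                // Let the failure surface in the main test code, where asserts normally run.
                throw propagate(e);
            }
        };
    }

    public static void main(String[] args) {
        try {
            makeCallback().run();
        } catch (RuntimeException e) {
            System.out.println("caught: " + e.getCause());
        }
    }
}
```

With this shape the test method itself sees the exception and fails with the original cause attached, rather than the failure being raised from inside the callback.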
@@ -112,6 +133,45 @@ public GroupByTimeseriesQueryRunnerTest(QueryRunner runner)
super(runner, false);
}

// GroupBy handles timestamps differently when granularity is ALL
Why is this override needed? Do groupBy v1 and v2 behave differently here?
GroupBy v1 and v2 behave the same in this case. When the query granularity is set to ALL, the result timestamps contain the earliest timestamp in the query interval, which differs from Timeseries.
Prior to this patch, testFullOnTimeseriesMaxMin() was not executing mergeResults() on GroupBy V1, so the timestamp wasn't truncated to the ALL granularity and the results matched the Timeseries results exactly.
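To make the timestamp difference concrete, here is a small sketch of how a test override might pick the expected groupBy timestamp. It only encodes the behavior described in the comment above (with ALL granularity the result carries the start of the query interval); the class, method, and parameter names are hypothetical and not taken from the Druid code base.

```java
import java.time.Instant;

public class AllGranularityTimestampSketch {
    /**
     * Returns the timestamp a groupBy result row is expected to carry.
     * Assumption: with ALL granularity the row is stamped with the start of the
     * query interval; otherwise the caller-supplied bucket timestamp is used.
     */
    static Instant expectedGroupByTimestamp(
            boolean granularityAll,
            Instant queryIntervalStart,
            Instant bucketTimestamp
    ) {
        return granularityAll ? queryIntervalStart : bucketTimestamp;
    }

    public static void main(String[] args) {
        Instant intervalStart = Instant.parse("2011-01-01T00:00:00Z");
        Instant firstDataTimestamp = Instant.parse("2011-01-12T00:00:00Z");
        System.out.println(expectedGroupByTimestamp(true, intervalStart, firstDataTimestamp));
    }
}
```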
LGTM
@gianm @jon-wei after this PR our builds for hadoop indexer test failed to run the groupBy query and failed with exception message "Grouping resources exhausted" from https://github.com/druid-io/druid/blob/master/processing/src/main/java/io/druid/query/groupby/epinephelinae/GroupByMergingQueryRunnerV2.java#L301 with no indication of what the problem was. I added some disk space by
Yeah that message could be better. I think nothing else other than temporary-storage-full could cause that, so might as well add that to the error message.
There's a similar message in GroupByRowProcessor that could be changed too
@gianm wondering why would we modify the message at both places instead of throwing exception with correct message in SpillingGrouper ... is there a case when throwing exception there is not desirable?
@himanshug I guess I avoided exceptions because How about changing the return type from
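One way to picture the trade-off in this thread is the sketch below: a grouper that reports exhaustion through its return value, and a caller that turns that into an exception whose message names the likely cause. `BoundedGrouper`, `aggregateOrFail`, and the message wording are all hypothetical; this is not the Druid `SpillingGrouper` API, and the capacity counter merely stands in for temporary storage.

```java
public class GroupingErrorMessageSketch {
    /** Hypothetical grouper; a fixed capacity stands in for temporary storage. */
    static final class BoundedGrouper {
        private final int capacity;
        private int used;

        BoundedGrouper(int capacity) {
            this.capacity = capacity;
        }

        /** Returns false when there is no room left, instead of throwing. */
        boolean aggregate(String key) {
            if (used >= capacity) {
                return false;
            }
            used++;
            return true;
        }
    }

    /** Caller-side check that attaches a descriptive message to the failure. */
    static void aggregateOrFail(BoundedGrouper grouper, String key) {
        if (!grouper.aggregate(key)) {
            // Hypothetical wording: name the probable cause rather than a bare "exhausted".
            throw new IllegalStateException(
                "Grouping resources exhausted: temporary storage is probably full"
            );
        }
    }

    public static void main(String[] args) {
        BoundedGrouper grouper = new BoundedGrouper(1);
        aggregateOrFail(grouper, "a");
        try {
            aggregateOrFail(grouper, "b");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Keeping the grouper free of exceptions leaves the decision about how to report the failure with each caller, at the cost of repeating the message wherever the return value is checked.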
…ods. (#4004)
* Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods.
Includes two fixes:
- groupBy v2 now ignores chunkPeriod, since it wouldn't have helped anyway (its mergeResults returns a lazy sequence) and it generates incorrect results.
- Fix chunkPeriod handling for periods of irregular length, like "P1M" or "P1Y".
Also includes doc and test fixes:
- groupBy v1 was no longer being tested by GroupByQueryRunnerTest since #3953, now it is once again.
- chunkPeriod documentation was misleading due to its checkered past. Updated it to be more accurate.
* Remove unused import.
* Restore buffer size.
…ods. (#4004) (#4015)
* Ignore chunkPeriod for groupBy v2, fix chunkPeriod for irregular periods.
Includes two fixes:
- groupBy v2 now ignores chunkPeriod, since it wouldn't have helped anyway (its mergeResults returns a lazy sequence) and it generates incorrect results.
- Fix chunkPeriod handling for periods of irregular length, like "P1M" or "P1Y".
Also includes doc and test fixes:
- groupBy v1 was no longer being tested by GroupByQueryRunnerTest since #3953, now it is once again.
- chunkPeriod documentation was misleading due to its checkered past. Updated it to be more accurate.
* Remove unused import.
* Restore buffer size.
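The irregular-period issue mentioned in the commit message above can be illustrated with a short sketch: splitting an interval by a calendar period such as "P1M" has to add the period to the start, because months do not have a fixed length in milliseconds. This uses `java.time` and hypothetical names (`ChunkPeriodSketch`, `chunk`); it is not the Druid implementation, which the source does not show.

```java
import java.time.LocalDateTime;
import java.time.Period;
import java.util.ArrayList;
import java.util.List;

public class ChunkPeriodSketch {
    /**
     * Splits [start, end) into consecutive chunks of the given calendar period.
     * Each boundary is computed as start + n * period, so chunk lengths follow
     * the calendar (31 days, 28 days, ...) instead of a fixed duration.
     */
    static List<LocalDateTime[]> chunk(LocalDateTime start, LocalDateTime end, Period period) {
        if (period.isZero() || period.isNegative()) {
            throw new IllegalArgumentException("period must be positive");
        }
        List<LocalDateTime[]> chunks = new ArrayList<>();
        LocalDateTime current = start;
        int n = 0;
        while (current.isBefore(end)) {
            n++;
            LocalDateTime next = start.plus(period.multipliedBy(n));
            if (!next.isAfter(current)) {
                throw new IllegalArgumentException("period does not advance the interval");
            }
            if (next.isAfter(end)) {
                next = end; // the last chunk may be shorter than a full period
            }
            chunks.add(new LocalDateTime[] {current, next});
            current = next;
        }
        return chunks;
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2017, 1, 1, 0, 0);
        LocalDateTime end = LocalDateTime.of(2017, 4, 1, 0, 0);
        for (LocalDateTime[] c : chunk(start, end, Period.parse("P1M"))) {
            System.out.println(c[0] + " / " + c[1]);
        }
    }
}
```

Computing each boundary from the original start, rather than repeatedly adding the period to the previous boundary, avoids drift when a day-of-month gets clamped (for example, a start on the 31st).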
Makes GroupBy V2 the default and updates some unit tests to work with V2