Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect result of nested groupBy query on Join of subqueries #9866

Closed
jihoonson opened this issue May 13, 2020 · 1 comment
Closed

Incorrect result of nested groupBy query on Join of subqueries #9866

jihoonson opened this issue May 13, 2020 · 1 comment

Comments

@jihoonson
Copy link
Contributor

Affected Version

0.18.x

Description

As reported in #9792, a nested groupBy query can result in an incorrect result when these conditions are met:

  • The nested groupBy is on top of a Join of subqueries
  • Inner and outer groupBys have different filters.

In this case, the Join execution engine will use the filter of the outer groupBy query when it processes the inner groupBy query. For example, given a query as below,

WITH abc AS (
  SELECT dim1, m2
  FROM druid.foo 
  WHERE "__time" >= '2001-01-02'
),
def AS(
  SELECT t1.dim1, SUM(t2.m2) AS "metricSum" 
  FROM abc AS t1 INNER JOIN abc AS t2 ON t1.dim1 = t2.dim1
  WHERE t1.dim1='def'
  GROUP BY 1
)
SELECT count(*) FROM def

Druid will make a query plan for this query as below:

 groupBy (outer)
    |
 groupBy (inner)
    |
   join
  /    \
scan  scan
 |      |
foo    foo

For this query plan, the broker will execute the two scan queries at leaf, materialize the results in memory, and then execute the join and groupBys. The join plan will be converted into a joinSegment and executed with the inner groupBy together. Due to this bug, the broker will ignore the filter t1.dim1 = 'def' on the inner groupBy query since there is no filter on the outer groupBy.

@jon-wei
Copy link
Contributor

jon-wei commented Jul 9, 2020

Fixed by #10015

@jon-wei jon-wei closed this as completed Jul 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants