Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[multistage] Filter Clause Not Pushed Down In Some Join Queries #10392

Closed
ankitsultana opened this issue Mar 7, 2023 · 1 comment
Closed
Labels
multi-stage Related to the multi-stage query engine

Comments

@ankitsultana
Copy link
Contributor

ankitsultana commented Mar 7, 2023

Repro-1: In this case the query has two not-in. With 1 not-in the filters are pushed down but not when there are two or more.

EXPLAIN PLAN FOR 
  SELECT COUNT(*) FROM userAttributes_OFFLINE 
    WHERE deviceOS = 'mac-os' 
    AND userUUID NOT IN (SELECT userUUID FROM userGroups_OFFLINE WHERE groupUUID = 'group-1') 
    AND userUUID NOT IN (SELECT userUUID FROM userGroups_OFFLINE WHERE groupUUID = 'group-2')
Execution Plan
LogicalAggregate(group=[{}], EXPR$0=[$SUM0($0)])
  LogicalExchange(distribution=[hash])
    LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
      LogicalFilter(condition=[IS NOT TRUE($5)])
        LogicalJoin(condition=[=($3, $4)], joinType=[left])
          LogicalExchange(distribution=[hash[3]])
            LogicalProject(deviceOS=[$0], userUUID0=[$2], $f1=[$4], userUUID2=[$1])
              LogicalFilter(condition=[AND(=($0, 'mac-os'), IS NOT TRUE($4))])
                LogicalJoin(condition=[=($2, $3)], joinType=[left])
                  LogicalExchange(distribution=[hash[2]])
                    LogicalProject(deviceOS=[$4], userUUID=[$5], userUUID0=[$5])
                      LogicalTableScan(table=[[userAttributes_OFFLINE]])
                  LogicalExchange(distribution=[hash[0]])
                    LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
                      LogicalExchange(distribution=[hash[0]])
                        LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
                          LogicalProject(userUUID=[$3], $f1=[true])
                            LogicalFilter(condition=[=($0, 'group-1')])
                              LogicalTableScan(table=[[userGroups_OFFLINE]])
          LogicalExchange(distribution=[hash[0]])
            LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
              LogicalExchange(distribution=[hash[0]])
                LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
                  LogicalProject(userUUID=[$3], $f1=[true])
                    LogicalFilter(condition=[=($0, 'group-2')])
                      LogicalTableScan(table=[[userGroups_OFFLINE]])

Repro-2: In this case a broadcast is done on the right side, but still the filters should have been pushed down to the left side

EXPLAIN PLAN FOR
  SELECT COUNT(*) FROM userAttributes_OFFLINE 
    WHERE deviceOS = 'mac-os' 
	AND (
	  userUUID IN (SELECT userUUID FROM userGroups_OFFLINE WHERE groupUUID = 'group-1') 
	  OR userUUID IN (SELECT userUUID FROM userGroups_OFFLINE WHERE groupUUID = 'group-2')
	)


Execution Plan
LogicalAggregate(group=[{}], EXPR$0=[$SUM0($0)])
  LogicalExchange(distribution=[hash])
    LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
      LogicalFilter(condition=[OR(CAST(OR(AND(IS NOT NULL($4), <>($1, 0)), AND(<($2, $1), null, <>($1, 0), IS NULL($4)))):BOOLEAN NOT NULL, CAST(OR(AND(IS NOT NULL($9), <>($5, 0)), AND(<($6, $5), null, <>($5, 0), IS NULL($9)))):BOOLEAN NOT NULL)])
        LogicalJoin(condition=[=($7, $8)], joinType=[left])
          LogicalExchange(distribution=[hash[7]])
            LogicalProject(deviceOS=[$0], $f0=[$2], $f1=[$3], userUUID0=[$4], $f10=[$5], $f00=[$6], $f11=[$7], userUUID2=[$1])
              LogicalFilter(condition=[=($0, 'mac-os')])
                LogicalJoin(condition=[true], joinType=[inner])
                  LogicalExchange(distribution=[random])
                    LogicalProject(deviceOS=[$0], userUUID=[$1], $f0=[$2], $f1=[$3], userUUID0=[$4], $f10=[$6])
                      LogicalJoin(condition=[=($4, $5)], joinType=[left])
                        LogicalExchange(distribution=[hash[4]])
                          LogicalProject(deviceOS=[$0], userUUID=[$1], $f0=[$2], $f1=[$3], userUUID0=[$1])
                            LogicalJoin(condition=[true], joinType=[inner])
                              LogicalExchange(distribution=[random])
                                LogicalProject(deviceOS=[$4], userUUID=[$5])
                                  LogicalTableScan(table=[[userAttributes_OFFLINE]])
                              LogicalExchange(distribution=[broadcast])
                                LogicalProject($f0=[$0], $f00=[$0])
                                  LogicalAggregate(group=[{}], agg#0=[$SUM0($0)])
                                    LogicalExchange(distribution=[hash])
                                      LogicalAggregate(group=[{}], agg#0=[COUNT()])
                                        LogicalProject(groupUUID=[$0], userUUID=[$3])
                                          LogicalFilter(condition=[=($0, 'group-1')])
                                            LogicalTableScan(table=[[userGroups_OFFLINE]])
                        LogicalExchange(distribution=[hash[0]])
                          LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
                            LogicalExchange(distribution=[hash[0]])
                              LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
                                LogicalProject(userUUID=[$3], $f1=[true])
                                  LogicalFilter(condition=[=($0, 'group-1')])
                                    LogicalTableScan(table=[[userGroups_OFFLINE]])
                  LogicalExchange(distribution=[broadcast])
                    LogicalProject($f0=[$0], $f00=[$0])
                      LogicalAggregate(group=[{}], agg#0=[$SUM0($0)])
                        LogicalExchange(distribution=[hash])
                          LogicalAggregate(group=[{}], agg#0=[COUNT()])
                            LogicalProject(groupUUID=[$0], userUUID=[$3])
                              LogicalFilter(condition=[=($0, 'group-2')])
                                LogicalTableScan(table=[[userGroups_OFFLINE]])
          LogicalExchange(distribution=[hash[0]])
            LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
              LogicalExchange(distribution=[hash[0]])
                LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
                  LogicalProject(userUUID=[$3], $f1=[true])
                    LogicalFilter(condition=[=($0, 'group-2')])
                      LogicalTableScan(table=[[userGroups_OFFLINE]])
@ankitsultana ankitsultana changed the title [multistage] Filter Clause Not Pushed Down When More Than 1 Not-In [multistage] Filter Clause Not Pushed Down In Some Join Queries Mar 7, 2023
@Jackie-Jiang Jackie-Jiang added the multi-stage Related to the multi-stage query engine label Apr 18, 2023
@ankitsultana
Copy link
Contributor Author

This was fixed by #10409

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
multi-stage Related to the multi-stage query engine
Projects
None yet
Development

No branches or pull requests

2 participants