[Enhancement] Push down topn with left and right outer join #30128

Merged

@stephen-shelby (Contributor) commented Aug 30, 2023

What type of PR is this:

BASE PLAN:

                                    TOPN
                                     |
                             LEFT OUTER JOIN 
                                     |
             -------------------------------------------------
             |                                                |
            LEFT SCAN                                      RIGHT SCAN

OPTIMIZED PLAN:

                                    TOPN
                                     |
                             LEFT OUTER JOIN 
                                     |
             -------------------------------------------------
             |                                                |
             TOPN                                             |
             |                                                |
            LEFT SCAN                                      RIGHT SCAN
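
The rewrite is safe because a left outer join emits at least one output row for every left row, so the final top N rows ordered by left-side columns can only come from the top N left rows. The Python sketch below illustrates that argument on toy data. It is not StarRocks code: the helpers left_outer_join and top_n and the sample customer/part rows are made up for illustration.

```python
import heapq

def left_outer_join(left, right, left_key, right_key):
    """Naive LEFT OUTER JOIN over lists of dicts (illustration only)."""
    right_cols = sorted({col for row in right for col in row})
    out = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[right_key] == l_row[left_key]]
        if matches:
            out.extend({**l_row, **r_row} for r_row in matches)
        else:
            # An unmatched left row is still emitted, padded with NULLs.
            out.append({**l_row, **{col: None for col in right_cols}})
    return out

def top_n(rows, order_col, n):
    """ORDER BY order_col DESC LIMIT n."""
    return heapq.nlargest(n, rows, key=lambda row: row[order_col])

# Toy data (hypothetical): 50 customers, 30 parts, some keys match twice.
customer = [{"c_custkey": i, "c_nationkey": (i * 7) % 25} for i in range(50)]
part = [{"p_partkey": (i * 3) % 60, "p_name": f"part{i}"} for i in range(30)]

# Base plan: join everything, then TopN.
base = top_n(left_outer_join(customer, part, "c_custkey", "p_partkey"),
             "c_nationkey", 5)

# Optimized plan: TopN on the left input, then join, then TopN again.
pruned_left = top_n(customer, "c_nationkey", 5)
optimized = top_n(left_outer_join(pruned_left, part, "c_custkey", "p_partkey"),
                  "c_nationkey", 5)
```

Both plans yield the same sequence of sort-key values. When several rows tie on the sort key, the exact rows chosen may differ, which is the usual non-determinism of ORDER BY ... LIMIT. The outer TopN is still required because one left row can match several right rows.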

Test cases:

  • Case 1
    SELECT *
    FROM (
        SELECT c.*, p.p_name, p.p_brand
        FROM customer c
        LEFT JOIN part p ON c.c_custkey = p.p_partkey AND p.p_size = 32
    ) AS mocktable
    ORDER BY mocktable.c_nationkey DESC
    LIMIT 20

TPC-H 100G OLAP table

| concurrency | base  | patched |
|-------------|-------|---------|
| 1           | 0.289 | 0.068   |
| 10          | 1.515 | 0.238   |
| 20          | 2.794 | 0.429   |
| 50          | 7     | 0.831   |

TPC-H 1T OLAP table

| concurrency | base   | patched |
|-------------|--------|---------|
| 1           | 1.688  | 0.067   |
| 10          | 14.692 | 0.236   |
| 20          | 29.314 | 0.4     |
| 50          | 73.629 | 0.793   |

  • Case 2
    SELECT *
    FROM (
        SELECT c.*, p.p_name, p.p_brand
        FROM customer c
        RIGHT JOIN part p ON c.c_custkey = p.p_partkey
    ) AS mocktable
    ORDER BY mocktable.p_brand DESC
    LIMIT 20

TPC-H 100G OLAP table

| concurrency | base   | patched |
|-------------|--------|---------|
| 1           | 0.565  | 0.121   |
| 10          | 3.015  | 0.548   |
| 20          | 5.923  | 0.975   |
| 50          | 15.143 | 2.201   |

TPC-H 1T OLAP table

| concurrency | base   | patched |
|-------------|--------|---------|
| 1           | 5.097  | 0.463   |
| 10          | 34.284 | 3.202   |
| 20          | 36.41  | 6.371   |
| 50          | 44.626 | 16.142  |

  • Case 3
    SELECT *
    FROM (
        SELECT *
        FROM (
            SELECT *
            FROM (
                SELECT c.*, p.p_name, p.p_brand
                FROM customer c
                RIGHT JOIN part p ON c.c_custkey = p.p_partkey
            ) AS one_level
            ORDER BY one_level.p_brand
            LIMIT 20
        ) AS two_level
        LEFT JOIN orders o ON two_level.c_custkey = o.o_orderkey
    ) twice_join
    ORDER BY twice_join.c_nationkey
    LIMIT 10;

TPC-H 100G OLAP table

| concurrency | base   | patched |
|-------------|--------|---------|
| 1           | 0.528  | 0.16    |
| 10          | 2.924  | 0.634   |
| 20          | 5.627  | 1.146   |
| 50          | 14.379 | 2.788   |

TPC-H 1T OLAP table

| concurrency | base   | patched |
|-------------|--------|---------|
| 1           | 5.202  | 0.529   |
| 10          | 35.047 | 3.385   |
| 20          | 34.313 | 6.745   |
| 50          | 49.015 | 16.769  |

  • Case 4
    SELECT *
    FROM (
        SELECT *
        FROM (
            SELECT *
            FROM (
                SELECT c.*, p.p_name, p.p_brand
                FROM customer c
                RIGHT JOIN part p ON c.c_custkey = p.p_partkey
            ) AS one_level
            ORDER BY one_level.p_brand
            LIMIT 20
        ) AS two_level
        RIGHT JOIN orders o ON two_level.c_custkey = o.o_orderkey
    ) twice_join
    ORDER BY twice_join.o_orderdate
    LIMIT 10;

TPC-H 100G OLAP table

| concurrency | base   | patched |
|-------------|--------|---------|
| 1           | 1.606  | 0.195   |
| 10          | 10.884 | 0.726   |
| 20          | 21.692 | 1.284   |
| 50          | 54.978 | 3.075   |

TPC-H 1T OLAP table

| concurrency | base   | patched |
|-------------|--------|---------|
| 1           | 17.49  | 0.555   |
| 10          | 87.895 | 3.481   |
| 20          | 71.467 | 7.029   |
| 50          | 148.5  | 17.342  |

TPC-H 1T Hive external table

| case  | base   | patched |
|-------|--------|---------|
| case1 | 9.094  | 5.392   |
| case2 | 13.037 | 6.365   |
| case3 | 32.064 | 22.31   |
| case4 | 39.492 | 26.205  |

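
To put the Hive numbers in perspective, the short Python sketch below computes the base-to-patched ratio for each case. It assumes the figures are response times where lower is better; the PR does not state the unit, and the ratio does not depend on it. The names hive_1t and speedup are made up for this example.

```python
# (base, patched) pairs copied from the TPCH 1T HIVE external table.
hive_1t = {
    "case1": (9.094, 5.392),
    "case2": (13.037, 6.365),
    "case3": (32.064, 22.31),
    "case4": (39.492, 26.205),
}

def speedup(base, patched):
    """How many times faster the patched run is than the base run."""
    return base / patched

for name, (base, patched) in hive_1t.items():
    print(f"{name}: {speedup(base, patched):.2f}x")
```

On these numbers the ratios fall between roughly 1.4x and 2.0x, which is much smaller than the gains in the OLAP-table runs (for example 1.688 versus 0.067 for case 1 on 1T at concurrency 1, about 25x).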
  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.1
    • 3.0
    • 2.5
    • 2.4

@stephen-shelby stephen-shelby changed the title [Feature] push down topn with left and right outer join [Feature] Push down topn with left and right outer join Aug 30, 2023
@stephen-shelby stephen-shelby changed the title [Feature] Push down topn with left and right outer join [Enhancement] Push down topn with left and right outer join Aug 30, 2023
packy92 previously approved these changes Aug 31, 2023
@stdpain (Contributor) commented Aug 31, 2023

What will happen for a query of the form: A colocate join B, then AGG, then order by A.x?

@stdpain (Contributor) commented Aug 31, 2023

What does the detailed plan look like for the TOPN: one stage or two stages? A two-stage TOPN would not be as expected.
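
For context on the one-stage versus two-stage distinction: in general, a two-stage top-N computes a partial top-N per partition and then merges the partial results in a final step, while a one-stage top-N gathers all rows and sorts once. The Python sketch below shows only this general technique. How StarRocks splits the pushed-down TopN into phases is not described in this thread, and the function names are hypothetical.

```python
import heapq

def one_stage_top_n(partitions, n):
    """Gather every row in one place, then sort and limit once."""
    all_rows = [row for part in partitions for row in part]
    return heapq.nsmallest(n, all_rows)

def two_stage_top_n(partitions, n):
    """Stage 1: each partition keeps only its local top n.
    Stage 2: a final step merges the partial results and limits again."""
    partial = [heapq.nsmallest(n, part) for part in partitions]
    return heapq.nsmallest(n, (row for rows in partial for row in rows))

# Hypothetical data spread over four partitions, one of them empty.
partitions = [[5, 1, 9], [7, 3], [8, 2, 6, 4], []]
```

Both functions return the same rows. The difference is how much data reaches the final step: at most n rows per partition in the two-stage form, versus every row in the one-stage form.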

@stdpain (Contributor) commented Aug 31, 2023

What will happen for l left join r on l.a = r.a where l.a + l.b < 10 order by l.a limit 10?
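
One possible reading of the concern behind this question is that a TopN must not be evaluated below a filter that is applied later, because ORDER BY ... LIMIT and WHERE do not commute. The Python sketch below demonstrates that on a single table. It leaves the join out to keep the example small, and the rows and the helper order_by_limit are made up. It does not show what the rule in this PR actually does with such a predicate.

```python
def order_by_limit(rows, col, n):
    """ORDER BY col ASC LIMIT n."""
    return sorted(rows, key=lambda r: r[col])[:n]

# Hypothetical rows of table l with columns a and b.
l = [{"a": a, "b": b} for a, b in [(1, 20), (2, 30), (3, 1), (4, 2), (5, 0)]]

def predicate(row):
    """WHERE l.a + l.b < 10"""
    return row["a"] + row["b"] < 10

# Correct order: filter first, then ORDER BY l.a LIMIT 2.
filter_then_limit = order_by_limit([r for r in l if predicate(r)], "a", 2)

# Incorrect order: a LIMIT placed below the filter drops qualifying rows.
limit_then_filter = [r for r in order_by_limit(l, "a", 2) if predicate(r)]
```

Here filter_then_limit keeps the rows with a = 3 and a = 4, while limit_then_filter is empty because the two smallest values of a both fail the predicate.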

Signed-off-by: stephen <stephen5217@163.com>
@stdpain (Contributor) commented Aug 31, 2023

After this change, the join may be executed on only one machine.

@stephen-shelby (Contributor, Author) commented:

> what will happen if l left join r on l.a = r.a where l.a + l.b < 10 order by l.a limit 10?

Did you mean to write l.a + r.b < 10?

@stephen-shelby (Contributor, Author) commented:

> what will happen if l left join r on l.a = r.a where l.a + l.b < 10 order by l.a limit 10?

Fixed.

@stephen-shelby (Contributor, Author) commented:

> After this change, the join may be executed on only one machine.

Yes, but I only push down one layer of TopN nodes.

@@ -401,6 +402,7 @@ private OptExpression logicalRuleRewrite(ConnectContext connectContext,
         // After this rule, we shouldn't generate logical project operator
         ruleRewriteIterative(tree, rootTaskContext, new MergeProjectWithChildRule());
 
+        ruleRewriteOnlyOnce(tree, rootTaskContext, new OuterJoinAddRedundantTopNRule());
Contributor (review comment):

Execute it before MergeProjectWithChildRule.

Contributor (review reply):

If it runs before MergeProjectWithChildRule, it may need two patterns. I discussed this with stephen: the check ensures that all ORDER BY columns come from the child of the join, and the added TopN operator does not set a projection. So is it OK to run it after that rule? Is there any other problem here?
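
The check described in this reply can be sketched as follows. This is a simplified Python illustration, not the code of OuterJoinAddRedundantTopNRule: the JoinType enum, the function push_down_side, and the representation of columns as plain strings are all assumptions made for the example.

```python
from enum import Enum

class JoinType(Enum):
    LEFT_OUTER = "LEFT OUTER"
    RIGHT_OUTER = "RIGHT OUTER"
    INNER = "INNER"

def push_down_side(join_type, order_by_cols, left_cols, right_cols):
    """Return which join child may receive an extra TopN, or None.

    The extra TopN is only considered when every ORDER BY column comes
    from the row-preserving child of the outer join.
    """
    cols = set(order_by_cols)
    if not cols:
        return None
    if join_type is JoinType.LEFT_OUTER and cols <= set(left_cols):
        return "left"
    if join_type is JoinType.RIGHT_OUTER and cols <= set(right_cols):
        return "right"
    return None

# Column sets loosely based on the test queries in the PR description.
customer_cols = {"c_custkey", "c_nationkey"}
part_cols = {"p_partkey", "p_name", "p_brand"}
```

With these inputs, a LEFT OUTER join ordered by c_nationkey qualifies for the left child and a RIGHT OUTER join ordered by p_brand qualifies for the right child, while ordering a LEFT OUTER join by p_brand does not qualify.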

Contributor (review reply):

OK, only a few scenarios are supported for now.

Signed-off-by: stephen <stephen5217@163.com>
@sonarcloud (bot) commented Sep 1, 2023

SonarCloud Quality Gate failed.

Bugs: 5 (rating C)
Vulnerabilities: 0 (rating A)
Security Hotspots: 3 (rating E)
Code Smells: 415 (rating B)

Coverage: 0.0%
Duplication: 2.2%

@wanpengfei-git (Collaborator) commented:

[FE Incremental Coverage Report]

😍 pass : 49 / 51 (96.08%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|------|--------------|----------|----------|-------------------------|
| 🔵 com/starrocks/qe/SessionVariable.java | 2 | 4 | 50.00% | [1057, 1058] |
| 🔵 com/starrocks/sql/optimizer/Optimizer.java | 1 | 1 | 100.00% | [] |
| 🔵 com/starrocks/sql/optimizer/rule/RuleType.java | 1 | 1 | 100.00% | [] |
| 🔵 com/starrocks/sql/optimizer/rule/transformation/OuterJoinAddRedundantTopNRule.java | 45 | 45 | 100.00% | [] |

@wanpengfei-git (Collaborator) commented:

[BE Incremental Coverage Report]

😍 pass : 0 / 0 (0%)

@stephen-shelby stephen-shelby merged commit 489728d into StarRocks:main Sep 1, 2023
31 of 32 checks passed
@stephen-shelby stephen-shelby deleted the pushdown_topn_outerjoin branch September 1, 2023 10:49
Moonm3n pushed a commit to Moonm3n/starrocks that referenced this pull request Sep 2, 2023
…s#30128)

Signed-off-by: stephen <stephen5217@163.com>
Signed-off-by: Moonm3n <saxonzhan@gmail.com>
Jay-ju pushed a commit to Jay-ju/starrocks that referenced this pull request Sep 7, 2023

7 participants