[Fix](Nereids) fix column statistic derive in outer join estimation #25586
Proposed changes
Problem:
During join estimation, the NDV (number of distinct values) statistic of the join's output slots could be derived incorrectly.
Example:
We have two tables:
tableA (a1[ndv = 10.0]) tableB(b1[ndv = 0.0], b2[ndv = 10.0])
tableA left join tableB on A.a1 = B.b1, where B.b1 has an NDV of zero.
The problem is that after join estimation, the NDV of B.b2 changed to 1.0.
Reason:
When estimating an outer join, we can assume it behaves like an inner join. However, we also estimated the output slot statistics the same way an inner join does.
Solved:
When estimating an outer join, the output slot statistics are now updated separately.
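The effect in the example can be illustrated with a small, self-contained sketch. It is not the Nereids code: the class `OuterJoinNdvSketch`, the helpers `capNdv` and `leftOuterRowCount`, the row counts (10 left rows, an inner-join estimate of 1 row), and the rule "cap each slot's NDV at the row count" are all assumptions chosen only to reproduce the numbers above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OuterJoinNdvSketch {

    /**
     * Hypothetical helper: caps every slot's NDV at the given row count,
     * since a column cannot have more distinct values than rows.
     */
    public static Map<String, Double> capNdv(Map<String, Double> ndvBySlot, double rowCount) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : ndvBySlot.entrySet()) {
            result.put(e.getKey(), Math.min(e.getValue(), rowCount));
        }
        return result;
    }

    /** A left outer join keeps every left row, so it returns at least leftRowCount rows. */
    public static double leftOuterRowCount(double innerRowCount, double leftRowCount) {
        return Math.max(innerRowCount, leftRowCount);
    }

    public static void main(String[] args) {
        // Slot NDVs from the example in this description.
        Map<String, Double> slots = new LinkedHashMap<>();
        slots.put("a1", 10.0);
        slots.put("b1", 0.0);
        slots.put("b2", 10.0);

        double leftRowCount = 10.0;  // assumed row count of tableA
        double innerRowCount = 1.0;  // assumed inner-join row estimate

        // Inner-join-style update: slots are capped by the inner-join row count.
        Map<String, Double> innerStyle = capNdv(slots, innerRowCount);

        // Separate update: slots are capped by the outer-join row count instead.
        double outerRowCount = leftOuterRowCount(innerRowCount, leftRowCount);
        Map<String, Double> separate = capNdv(slots, outerRowCount);

        System.out.println("inner-style b2 ndv = " + innerStyle.get("b2")); // 1.0
        System.out.println("separate b2 ndv = " + separate.get("b2"));      // 10.0
    }
}
```

Under these assumptions, reusing the inner-join update shrinks B.b2 to 1.0, while deriving the slot statistics from the outer join's own row count leaves it at 10.0.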