Optimizer: Fix Merge In and Not Equal ranges #39676 #42229
/retest
Other than the comments, looks good to me.
if _, ok := args[1].(*expression.Constant); !ok {
	return nil, otherPredicate
}
return col, notEqualPredicate
We don't need this predicateType enum; instead, we can just use v.FuncName.L as the return value.
The check covers more than the function name: it also looks at the form of the predicate, such as col in (list of constants) and col <> constant.
	return newPred
}

func indexInSlice(index int, list []int) bool {
You can use slices.Contains by importing golang.org/x/exp/slices.
Sounds good. Will be done in next round.
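A minimal sketch of the suggested replacement, for illustration only. It uses the standard-library slices package (Go 1.21 and later), which provides the same Contains function as golang.org/x/exp/slices. The body of the hand-written helper is an assumption, since the thread quotes only its signature; indexInSliceLoop is a hypothetical name for it.

```go
package main

import (
	"fmt"
	"slices"
)

// indexInSliceLoop is a guess at the hand-written helper: a linear scan
// that reports whether index appears in list.
func indexInSliceLoop(index int, list []int) bool {
	for _, v := range list {
		if v == index {
			return true
		}
	}
	return false
}

// indexInSlice keeps the original signature but delegates to slices.Contains.
func indexInSlice(index int, list []int) bool {
	return slices.Contains(list, index)
}

func main() {
	removed := []int{0, 2}
	// Both helpers agree on a value that is present and one that is absent.
	fmt.Println(indexInSliceLoop(2, removed), indexInSlice(2, removed))
	fmt.Println(indexInSliceLoop(1, removed), indexInSlice(1, removed))
}
```

With slices.Contains available, the helper could also be dropped entirely and the call inlined at each use site.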
inListPredicate predicateType = 0x00
notEqualPredicate predicateType = 0x01
otherPredicate predicateType = 0x02
- inListPredicate predicateType = 0x00
- notEqualPredicate predicateType = 0x01
- otherPredicate predicateType = 0x02
+ inListPredicate predicateType = iota
+ notEqualPredicate
+ otherPredicate
In golang, this syntactic sugar should work
Sounds good. Will be done in next round.
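A small self-contained sketch of the iota form, to show that it yields the same values as the explicit 0x00, 0x01, 0x02 constants. The underlying type of predicateType is not shown in the thread; byte is an assumption made here.

```go
package main

import "fmt"

// predicateType classifies a predicate. The underlying type (byte) is an
// assumption; the thread does not show the type declaration.
type predicateType byte

const (
	// iota starts at 0 and increases by 1 for each constant in the block;
	// the type and expression are repeated implicitly on the lines below.
	inListPredicate predicateType = iota
	notEqualPredicate
	otherPredicate
)

func main() {
	fmt.Println(inListPredicate, notEqualPredicate, otherPredicate) // 0 1 2
}
```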
/retest
+1
"SQL": "select f from t use index() where f <> 3 and f in (1,2,3) and f <> 1 and f <> 2 -- Multiple <> values and cover whole inlist. We keep at least one in inlist",
"Plan": [
  "TableReader 10.00 root data:Selection",
  "└─Selection 10.00 cop[tikv] in(test.t.f, 2)",
  "  └─TableFullScan 10000.00 cop[tikv] table:t keep order:false, stats:pseudo"
]
},
We get the wrong result in this case? If we keep f in (2), we should also keep f != 2.
You are right and we need to keep both. Good catch. Fixed.
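The point can be checked by brute force. The sketch below is only an illustration: it evaluates the WHERE clause from the test case and the two candidate rewrites over a small range of integers. All function names are hypothetical, and nothing here is taken from the TiDB code.

```go
package main

import "fmt"

// in reports whether f is one of vals.
func in(f int, vals ...int) bool {
	for _, v := range vals {
		if v == f {
			return true
		}
	}
	return false
}

// original is the WHERE clause from the test case.
func original(f int) bool { return f != 3 && in(f, 1, 2, 3) && f != 1 && f != 2 }

// inListOnly is the rewrite questioned in the review: only in(f, 2) survives.
func inListOnly(f int) bool { return in(f, 2) }

// bothKept keeps the last in-list value together with its <> condition.
func bothKept(f int) bool { return in(f, 2) && f != 2 }

// equivalent compares two predicates on a small range of sample values.
func equivalent(p, q func(int) bool) bool {
	for f := -5; f <= 5; f++ {
		if p(f) != q(f) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(equivalent(original, inListOnly)) // false: f = 2 passes in(f, 2)
	fmt.Println(equivalent(original, bothKept))   // true: both reject every value
}
```

The original condition rejects every row, so a rewrite that keeps only f in (2) would wrongly return rows where f = 2.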
/retest
/merge
This pull request has been accepted and is ready to merge. Commit hash: efc79eb
What problem does this PR solve?
Issue Number: ref #39676
Problem Summary:
Background
The TiDB optimizer supports only a limited set of predicate simplifications and misses many cases. One example is removing redundant conditions across <> and IN on the same column. For example, a <> 1 and a in (1,2,3) can be simplified to a in (2,3), and a <> 5 and a in (1,2,3) can be simplified to a in (1,2,3).
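A minimal sketch of this simplification on plain integer constants. It is not the TiDB implementation: mergeNotEqualIntoIn is a hypothetical helper that models the IN list and the <> constants as []int and removes the excluded values from the list.

```go
package main

import "fmt"

// mergeNotEqualIntoIn drops from inList every value that is excluded by a
// "col <> constant" condition on the same column. Both arguments are plain
// ints here; real predicates carry typed constants.
func mergeNotEqualIntoIn(inList, notEqual []int) []int {
	excluded := make(map[int]struct{}, len(notEqual))
	for _, v := range notEqual {
		excluded[v] = struct{}{}
	}
	merged := make([]int, 0, len(inList))
	for _, v := range inList {
		if _, drop := excluded[v]; !drop {
			merged = append(merged, v)
		}
	}
	return merged
}

func main() {
	// a <> 1 and a in (1,2,3)  =>  a in (2,3)
	fmt.Println(mergeNotEqualIntoIn([]int{1, 2, 3}, []int{1})) // [2 3]
	// a <> 5 and a in (1,2,3)  =>  a in (1,2,3)
	fmt.Println(mergeNotEqualIntoIn([]int{1, 2, 3}, []int{5})) // [1 2 3]
}
```

This sketch returns an empty list when the <> conditions cover the whole IN list. That case needs separate handling, as the review discussion of the test case with f <> 1, f <> 2 and f <> 3 shows.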
Solution
We added a new placeholder logical rewrite optimization for predicate simplification. The general solution is:
1- Process all predicates as a set of ranges for all columns
2- Build equivalence classes for columns based on col = col
3- Merge all ranges of each equivalence class
4- Build range predicates for each column in each equivalence class
This solution is on the optimizer roadmap and is similar to what Teradata did: https://docs.teradata.com/r/8mHBBLGP88~HK9Auie2QvQ/dinTQRLUmuWh_KUBr7ffdg
In this PR, we only do the <> and IN list merge to address the customer issue. The algorithm is simple and is applied only to table scan predicates. The main idea is that, for each pair of conditions of the form a <> constant_0 and a in (constant_1, ..., constant_n), we
Tests
Side effects
N/A
Documentation
N/A
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.