Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizer: Fix Merge In and Not Equal ranges #39676 #42229

Merged
merged 2 commits into from Mar 21, 2023

Conversation

ghazalfamilyusa
Copy link
Contributor

@ghazalfamilyusa ghazalfamilyusa commented Mar 14, 2023

What problem does this PR solve?

Issue Number: ref #39676

Problem Summary:

Background

TiDB optimizer has limited predicate simplifications and does not support many cases. One example is filtering redundant conditions across <> and In for same column. For example, a <> 1 and a in (1,2,3) can be simplified to a in (2,3). Also, a <> 5 and a in (1,2,3) can be simplified to a in (1,2,3)

Solution

We added a new place holder logical rewrite optimization for predicate simplification. The general solution is
1- Process all predicates as a set of ranges for all columns
2- build equivalence classes for columns based col = col
3- Merge all ranges of each equivalence class
4- Build range predicates for each column in each equivalence class
This solution is on the optimizer roadmap and similar to what Teradata did https://docs.teradata.com/r/8mHBBLGP88~HK9Auie2QvQ/dinTQRLUmuWh_KUBr7ffdg

In this PR, we only do <> and In list merge to address the customer issue. The algorithm is simple and applied only to table scan predicates. The main idea is for each pair of conditions of a <> constant_0 and a in (constant_1, ... constant_n) we

  • Remove the a <> constant_0 since it is always reduntant
  • Remove constant_0 from the in list if any. We keep at least one element in the in list and postpone the (no table access) for later on.

Tests

  • Unit test

Side effects
N/A

Documentation

N/A

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot
Copy link
Member

ti-chi-bot commented Mar 14, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • fixdb
  • winoros

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added do-not-merge/invalid-title do-not-merge/needs-linked-issue release-note-none size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 14, 2023
@ghazalfamilyusa ghazalfamilyusa changed the title Optimize:r Fix Merge In and Not Equal ranges #39676 Optimizer Fix Merge In and Not Equal ranges #39676 Mar 14, 2023
@ghazalfamilyusa ghazalfamilyusa changed the title Optimizer Fix Merge In and Not Equal ranges #39676 Optimizer: Fix Merge In and Not Equal ranges #39676 Mar 14, 2023
@ghazalfamilyusa ghazalfamilyusa marked this pull request as draft March 14, 2023 20:40
@ti-chi-bot ti-chi-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 14, 2023
@ghazalfamilyusa ghazalfamilyusa marked this pull request as ready for review March 14, 2023 21:06
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 14, 2023
@ghazalfamilyusa ghazalfamilyusa force-pushed the predicate_simplification branch 2 times, most recently from 6f52795 to f780f94 Compare March 14, 2023 21:39
@ghazalfamilyusa
Copy link
Contributor Author

/retest

Copy link
Contributor

@fixdb fixdb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the comments, looks good to me.

if _, ok := args[1].(*expression.Constant); !ok {
return nil, otherPredicate
}
return col, notEqualPredicate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need these predicateType enum, instead we can just use v.FuncName.L as the return value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check is for more than the function name like col in (list of constants) and col <> constant.

return newPred
}

func indexInSlice(index int, list []int) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use slices.Contains by importing golang.org/x/exp/slices.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Will be done in next round.

Comment on lines 32 to 34
inListPredicate predicateType = 0x00
notEqualPredicate predicateType = 0x01
otherPredicate predicateType = 0x02
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
inListPredicate predicateType = 0x00
notEqualPredicate predicateType = 0x01
otherPredicate predicateType = 0x02
inListPredicate predicateType = iota
notEqualPredicate
otherPredicate

In golang, this syntactic sugar should work

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Will be done in next round.

@ghazalfamilyusa
Copy link
Contributor Author

/retest

Copy link
Contributor

@fixdb fixdb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Mar 16, 2023
planner/core/rule_predicate_simplification.go Outdated Show resolved Hide resolved
Comment on lines 129 to 143
"SQL": "select f from t use index() where f <> 3 and f in (1,2,3) and f <> 1 and f <> 2 -- Multiple <> values and cover whole inlist. We keep at least one in inlist",
"Plan": [
"TableReader 10.00 root data:Selection",
"└─Selection 10.00 cop[tikv] in(test.t.f, 2)",
" └─TableFullScan 10000.00 cop[tikv] table:t keep order:false, stats:pseudo"
]
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We get the wrong result in this case? If we keep f in (2), we should also keep f != 2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right and we need to keep both. Good catch. Fixed.

@ti-chi-bot ti-chi-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 20, 2023
@ghazalfamilyusa
Copy link
Contributor Author

/retest

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Mar 21, 2023
@winoros
Copy link
Member

winoros commented Mar 21, 2023

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: efc79eb

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Mar 21, 2023
@ti-chi-bot ti-chi-bot merged commit b96c55d into pingcap:master Mar 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants