Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide settings to prefer index scan over table scan #18808

Closed
breezewish opened this issue Jul 27, 2020 · 16 comments · Fixed by #18996
Closed

Provide settings to prefer index scan over table scan #18808

breezewish opened this issue Jul 27, 2020 · 16 comments · Fixed by #18996
Labels
epic/access-path feature/accepted This feature request is accepted by product managers good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. sig/planner SIG: Planner type/feature-request Categorizes issue or PR as related to a new feature.

Comments

@breezewish
Copy link
Member

breezewish commented Jul 27, 2020

Feature Request

Is your feature request related to a problem? Please describe:

There is a notable number of stability problems caused by a SQL statement becomes slow suddenly due to a wrong execution plan that uses table scan instead of index scan. The reason is that in some cases table scan can be faster than index scan, due to no table lookup. CBO may choose wrong plan due to outdated or inaccurate statistics, which results in the risk of 10x+ slower.

Describe the feature you'd like:

Provide a global variable, a session variable (and optionally a statement hint / bind syntax) that allows optimizer to always prefer index scan over table scan, ignoring their costs (since costs can be inaccurate). The name can be tidb_opt_always_prefer_index.

Note that prefer index scan also means prefer TiKV over TiFlash when there IS corresponding index hit. So it could impact analytical queries significantly. This should be reminded in the documentation if this feature is implemented.

MongoDB provides a similar but more aggressive option: No Table Scan

This idea is previously raised in SIG planner

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:

@breezewish breezewish added the type/feature-request Categorizes issue or PR as related to a new feature. label Jul 27, 2020
@SunRunAway SunRunAway added the sig/planner SIG: Planner label Jul 27, 2020
@winoros
Copy link
Member

winoros commented Jul 27, 2020

@zz-jason If we implemented this variable, should it be picked up to 4.*?

@winoros
Copy link
Member

winoros commented Jul 27, 2020

Also for implementation notes, this can be implemented in skyline pruning phase. The design doc is this. And the entry point is skylinePruning method of struct DataSource.

@winoros winoros added good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels Jul 27, 2020
@winoros
Copy link
Member

winoros commented Jul 27, 2020

If we limit the affected session/server only executes the queries which are expected to return in seconds, we may treat this option as an always good option.

@winoros
Copy link
Member

winoros commented Jul 27, 2020

Implement it as a variable is the most clean way that user don't need to change any part of their production env(like SQL text or their codes) just modify the variable is enough.

If we implement it as a hint, maybe directly a hint /*+ TP_QUERY_TYPE() */ is ok, and we do some other heuristic optimization for TP type queries or just limit planner's behavior, not consider the optimization for analytical queries.

@zz-jason
Copy link
Member

duplicated with #15623

@ghost
Copy link

ghost commented Jul 27, 2020

I think having a generic setting to disable table scans is error prone. What if it is a very small table (such as a list of category definitions)?

If it is an OLTP workload, my experience has been when users ask for such a setting, what they really want is a way of refusing really expensive queries. So for example if the estimated query cost is really high return an error and don't execute the statement. Switching from an expensive table scan to a more expensive index scan (with a double read) could make this situation worse.

@breezewish
Copy link
Member Author

breezewish commented Jul 27, 2020

@nullnotnil This feature is not disabling table scan in all cases. It only disables table scan when there is an available index for current statement.

So for example if the estimated query cost is really high return an error and don't execute the statement

For users there might be no difference for: a normal SQL statement suddently begin refused to be executed , vs a normal SQL statement suddenly takes a long time to be executed -- business logic cannot continue anyway. Additionally the estimated query cost may never be correct, i.e. very low, when TiDB incorrectly selected a table scan over index scan, because that's why TiDB choose it.

This option is not a performance optimization. The execution of some sql statements can be slower after this option is enabled due to double read. It is a stability improvement though, by trading off such edge case performance (when force index scan lead to double read) with stability: you will not suffer from more than 10x slower surprise in production.

@zz-jason
Copy link
Member

@zz-jason If we implemented this variable, should it be picked up to 4.*?

Sure. We can even backport it to release 3.0.

@zz-jason
Copy link
Member

Implement it as a variable is the most clean way that user don't need to change any part of their production env(like SQL text or their codes) just modify the variable is enough.

If we implement it as a hint, maybe directly a hint /*+ TP_QUERY_TYPE() */ is ok, and we do some other heuristic optimization for TP type queries or just limit planner's behavior, not consider the optimization for analytical queries.

I prefer using an instance-level config to control this behavior:

  • Making the optimizer to choose whether to use index range scan or table full scan by default
  • If the user found that the index range scan is a better choice for their applications, they can config that value
  • Making this config able to be changed online to avoid restart the tidb-server

@zz-jason zz-jason added the feature/accepted This feature request is accepted by product managers label Jul 29, 2020
@qw4990
Copy link
Contributor

qw4990 commented Jul 30, 2020

Implement it as a variable is the most clean way that user don't need to change any part of their production env(like SQL text or their codes) just modify the variable is enough.
If we implement it as a hint, maybe directly a hint /*+ TP_QUERY_TYPE() */ is ok, and we do some other heuristic optimization for TP type queries or just limit planner's behavior, not consider the optimization for analytical queries.

I prefer using an instance-level config to control this behavior:

  • Making the optimizer to choose whether to use index range scan or table full scan by default
  • If the user found that the index range scan is a better choice for their applications, they can config that value
  • Making this config able to be changed online to avoid restart the tidb-server

Is an instance-level config enough?
If the user found that the index range scan is a better choice for their applications, they have to config this for all instances in their cluster, so should it be at cluster-level?
Besides, if we want to change it online, it's better to implement it as a SQL variable.
Considering these two points, is it better to implement it as a global SQL variable?

@breezewish
Copy link
Member Author

We are grown-up and can we have all? (session + instance + global)

@ghost
Copy link

ghost commented Jul 31, 2020

We are grown-up and can we have all? (session + instance + global)

There is a good use-case for all three:

  • Session for debugging / proof of concept
  • Instance for a canary deployment across a fleet of TiDB servers (try 5% first).
  • Global if it is an effective change

Most variables should actually follow this pattern. Global (cluster-wide) is quite risky.

@zz-jason
Copy link
Member

zz-jason commented Aug 12, 2020

setting it globally is risky.

we should allow an instance-level setting to avoid changes to the existing applications. @xiaodong-ji PTAL

@breezewish
Copy link
Member Author

breezewish commented Aug 12, 2020

@SunRunAway Can we formally provide some infrastructures / features for such A/B testing & gradually spread pattern as proposed in #18808 (comment) ? As I know there are some limitations to implement a session+instance+global variable now.

@SunRunAway
Copy link
Contributor

@SunRunAway Can we formally provide some infrastructures / features for such A/B testing & gradually spread pattern as proposed in #18808 (comment) ? As I know there are some limitations to implement a session+instance+global variable now.

see #19332

@ghost
Copy link

ghost commented Sep 11, 2020

@SunRunAway Can we formally provide some infrastructures / features for such A/B testing & gradually spread pattern as proposed in #18808 (comment) ? As I know there are some limitations to implement a session+instance+global variable now.

There are a lot of different ways to do this, we might have to spend some time gathering the use cases. For example, they might choose 5% randomly, or apply it to internal users first (most likely to give feedback). The easiest way for us to offer is per-instance. Second easiest would be user-specific variables, but I think it is easiest to keep out of scope for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
epic/access-path feature/accepted This feature request is accepted by product managers good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. sig/planner SIG: Planner type/feature-request Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants