Add new order ratio being added to 8.0.0 #16871

Merged
merged 26 commits into from
Apr 8, 2024
Changes from 5 commits
Commits
bbea3d9
commit-message: Add new order ratio
terry1purcell Mar 27, 2024
2d9ab6e
Merge branch 'pingcap:master' into orderratio
terry1purcell Mar 28, 2024
13c910f
Merge branch 'pingcap:master' into orderratio
terry1purcell Mar 29, 2024
6e0a7fb
review comments march 29
terry1purcell Mar 30, 2024
f4b5b6b
Merge branch 'orderratio' of github.com:terry1purcell/docs into order…
terry1purcell Mar 30, 2024
848f50f
add tidb_opt_ordering_index_selectivity_ratio in release notes 8.0.0
hfxsd Apr 1, 2024
aeb3b8c
review comments april 1
terry1purcell Apr 1, 2024
efb99ca
Merge branch 'orderratio' of github.com:terry1purcell/docs into order…
terry1purcell Apr 1, 2024
6859d3d
Merge branch 'pingcap:master' into orderratio
terry1purcell Apr 1, 2024
2c41583
Merge branch 'orderratio' of github.com:terry1purcell/docs into order…
terry1purcell Apr 1, 2024
ec69e8a
moved to an appropriate position
hfxsd Apr 2, 2024
a5ca664
Apply suggestions from code review
hfxsd Apr 2, 2024
c2abf0b
Apply suggestions from code review
hfxsd Apr 2, 2024
8feb112
Merge branch 'pingcap:master' into orderratio
terry1purcell Apr 2, 2024
ee1e851
Merge branch 'pingcap:master' into orderratio
terry1purcell Apr 3, 2024
615e926
commit-message: Add new order ratio
terry1purcell Mar 27, 2024
6c68ade
review comments march 29
terry1purcell Mar 30, 2024
ddaa8f9
review comments april 1
terry1purcell Apr 1, 2024
f65d818
add tidb_opt_ordering_index_selectivity_ratio in release notes 8.0.0
hfxsd Apr 1, 2024
508c9a8
moved to an appropriate position
hfxsd Apr 2, 2024
7687bb0
Apply suggestions from code review
hfxsd Apr 2, 2024
082b2fe
Apply suggestions from code review
hfxsd Apr 2, 2024
1109ce5
review comments april 3
terry1purcell Apr 3, 2024
8ff0114
review comments april 3 merge
terry1purcell Apr 3, 2024
fcb5ef4
Merge branch 'pingcap:master' into orderratio
terry1purcell Apr 7, 2024
3952f10
review comments april 7
terry1purcell Apr 7, 2024
102 changes: 102 additions & 0 deletions system-variables.md
@@ -4284,6 +4284,108 @@ mysql> desc select count(distinct a) from test.t;
+----------------------------------+---------+-----------+----------------------+-------------------------------------+
```

### tidb_opt_ordering_index_selectivity_ratio <span class="version-mark">New in v8.0.0</span>

- Scope: SESSION | GLOBAL
- Persists to cluster: Yes
- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): Yes
- Type: Float
- Default value: `-1`
- Range: `[-1, 1]`
- This variable influences the estimated number of rows for an index that matches the `ORDER BY` clause of a SQL statement, when the statement has both `ORDER BY` and `LIMIT` clauses along with filter conditions that are not covered by the index.
- This variable addresses the same query patterns as the system variable [tidb_opt_ordering_index_selectivity_threshold](#tidb_opt_ordering_index_selectivity_threshold-new-in-v700).
- It differs in implementation: it applies a ratio (a percentage) to the possible range within which the qualified rows are expected to be found.
- The default value `-1`, or any other value less than `0`, disables this ratio, so the optimizer uses its existing formula to estimate the target rows. Any value between `0` and `1` applies a ratio of 0% to 100% (for example, `0.5` corresponds to 50%).
- In the following examples, the table `t` has a total of 1,000,000 rows. Each example runs the same query with a different value of `tidb_opt_ordering_index_selectivity_ratio`. The `WHERE` clause predicate in the query qualifies a small percentage of the rows (9,000 out of 1,000,000). There is an index that supports `ORDER BY a` (index `ia`), but the filter on `b` is not included in this index. Depending on the actual data distribution, the row that matches the `WHERE` clause and `LIMIT 1` might be the first row accessed when scanning the non-filtering index or, in the worst case, might be found only after nearly all of the rows have been processed.
- Each example uses an index hint to demonstrate the impact on `estRows`. The impact on the final plan choice depends on the availability and cost of other plans.
- The first example uses the default value `-1`, which keeps the existing estimation formula. By default, the optimizer estimates that only a small percentage of the rows in the index are scanned before a row that satisfies the filter outside that index is found.

```sql
> SET SESSION tidb_opt_ordering_index_selectivity_ratio = -1;

> EXPLAIN SELECT * FROM t USE INDEX (ia) WHERE b <= 9000 ORDER BY a LIMIT 1;
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
| id                                | estRows | task      | access object         | operator info                   |
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
| Limit_12                          | 1.00    | root      |                       | offset:0, count:1               |
| └─Projection_22                   | 1.00    | root      |                       | test.t.a, test.t.b, test.t.c    |
|   └─IndexLookUp_21                | 1.00    | root      |                       |                                 |
|     ├─IndexFullScan_18(Build)     | 109.20  | cop[tikv] | table:t, index:ia(a)  | keep order:true                 |
|     └─Selection_20(Probe)         | 1.00    | cop[tikv] |                       | le(test.t.b, 9000)              |
|       └─TableRowIDScan_19         | 109.20  | cop[tikv] | table:t               | keep order:false                |
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
```
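
To reproduce plans of this shape, a table along the following lines should be enough. This is only a sketch: the column names `a`, `b`, `c` and the index name `ia` are taken from the plan output, while the `INT` column types are an assumption. The data distribution behind the numbers shown is not given here, so `estRows` on a different data set will differ.

```sql
-- Hypothetical schema matching the names in the plan output
-- (test.t.a, test.t.b, test.t.c, index ia(a)).
-- The column types are assumed for illustration only.
CREATE TABLE t (
    a INT,
    b INT,
    c INT,
    INDEX ia (a)  -- supports ORDER BY a, but does not cover the filter on b
);
```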

- The second example uses `0`, which assumes that 0% of the rows are scanned before the qualified row is found.

```sql
> SET SESSION tidb_opt_ordering_index_selectivity_ratio = 0;

> EXPLAIN SELECT * FROM t USE INDEX (ia) WHERE b <= 9000 ORDER BY a LIMIT 1;
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
| id                                | estRows | task      | access object         | operator info                   |
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
| Limit_12                          | 1.00    | root      |                       | offset:0, count:1               |
| └─Projection_22                   | 1.00    | root      |                       | test.t.a, test.t.b, test.t.c    |
|   └─IndexLookUp_21                | 1.00    | root      |                       |                                 |
|     ├─IndexFullScan_18(Build)     | 1.00    | cop[tikv] | table:t, index:ia(a)  | keep order:true                 |
|     └─Selection_20(Probe)         | 1.00    | cop[tikv] |                       | le(test.t.b, 9000)              |
|       └─TableRowIDScan_19         | 1.00    | cop[tikv] | table:t               | keep order:false                |
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
```

- The third example uses `0.1`, which assumes that 10% of the possible range is scanned before the qualified row is found. Because the filter is highly selective (fewer than 1% of the rows qualify), the worst case is that the other 99% of the rows need to be scanned before a qualified row is found. 10% of that 99% is approximately 9.9% of the table (about 99,000 of the 1,000,000 rows), which is reflected in `estRows`.

```sql
> SET SESSION tidb_opt_ordering_index_selectivity_ratio = 0.1;

> EXPLAIN SELECT * FROM t USE INDEX (ia) WHERE b <= 9000 ORDER BY a LIMIT 1;
+-----------------------------------+----------+-----------+-----------------------+---------------------------------+
| id                                | estRows  | task      | access object         | operator info                   |
+-----------------------------------+----------+-----------+-----------------------+---------------------------------+
| Limit_12                          | 1.00     | root      |                       | offset:0, count:1               |
| └─Projection_22                   | 1.00     | root      |                       | test.t.a, test.t.b, test.t.c    |
|   └─IndexLookUp_21                | 1.00     | root      |                       |                                 |
|     ├─IndexFullScan_18(Build)     | 99085.21 | cop[tikv] | table:t, index:ia(a)  | keep order:true                 |
|     └─Selection_20(Probe)         | 1.00     | cop[tikv] |                       | le(test.t.b, 9000)              |
|       └─TableRowIDScan_19         | 99085.21 | cop[tikv] | table:t               | keep order:false                |
+-----------------------------------+----------+-----------+-----------------------+---------------------------------+
```
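
Because this variable can be set with the `SET_VAR` hint, the same ratio can also be scoped to a single statement instead of the whole session. The following is a minimal sketch that reuses the query from this example. Its plan output is not shown, because the resulting `estRows` depends on the statistics of the table.

```sql
-- Apply a ratio of 0.1 to this statement only; the session value is left unchanged.
EXPLAIN SELECT /*+ SET_VAR(tidb_opt_ordering_index_selectivity_ratio=0.1) */ *
FROM t USE INDEX (ia)
WHERE b <= 9000
ORDER BY a
LIMIT 1;
```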

- The fourth example uses `1.0`, which assumes that 100% of the possible range is scanned before the qualified row is found.

```sql
> SET SESSION tidb_opt_ordering_index_selectivity_ratio = 1;

> EXPLAIN SELECT * FROM t USE INDEX (ia) WHERE b <= 9000 ORDER BY a LIMIT 1;
+-----------------------------------+-----------+-----------+-----------------------+---------------------------------+
| id                                | estRows   | task      | access object         | operator info                   |
+-----------------------------------+-----------+-----------+-----------------------+---------------------------------+
| Limit_12                          | 1.00      | root      |                       | offset:0, count:1               |
| └─Projection_22                   | 1.00      | root      |                       | test.t.a, test.t.b, test.t.c    |
|   └─IndexLookUp_21                | 1.00      | root      |                       |                                 |
|     ├─IndexFullScan_18(Build)     | 990843.14 | cop[tikv] | table:t, index:ia(a)  | keep order:true                 |
|     └─Selection_20(Probe)         | 1.00      | cop[tikv] |                       | le(test.t.b, 9000)              |
|       └─TableRowIDScan_19         | 990843.14 | cop[tikv] | table:t               | keep order:false                |
+-----------------------------------+-----------+-----------+-----------------------+---------------------------------+
```

- The fifth example also uses `1.0`, but adds a predicate on `a` that limits the worst-case scan range: `WHERE a <= 9000` matches the index, so approximately 9,000 rows qualify for the index scan in total. Because the filtering predicate on `b` is not in the index, all of these approximately 9,000 rows are estimated to be scanned before a row that satisfies `b <= 9000` is found.

```sql
> EXPLAIN SELECT * FROM t USE INDEX (ia) WHERE a <= 9000 AND b <= 9000 ORDER BY a LIMIT 1;
+------------------------------------+---------+-----------+-----------------------+------------------------------------+
| id                                 | estRows | task      | access object         | operator info                      |
+------------------------------------+---------+-----------+-----------------------+------------------------------------+
| Limit_12                           | 1.00    | root      |                       | offset:0, count:1                  |
| └─Projection_22                    | 1.00    | root      |                       | test.t.a, test.t.b, test.t.c       |
|   └─IndexLookUp_21                 | 1.00    | root      |                       |                                    |
|     ├─IndexRangeScan_18(Build)     | 9074.99 | cop[tikv] | table:t, index:ia(a)  | range:[-inf,9000], keep order:true |
|     └─Selection_20(Probe)          | 1.00    | cop[tikv] |                       | le(test.t.b, 9000)                 |
|       └─TableRowIDScan_19          | 9074.99 | cop[tikv] | table:t               | keep order:false                   |
+------------------------------------+---------+-----------+-----------------------+------------------------------------+
```
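
The examples change the session value, and the last one shows no `SET` statement, so it appears to rely on the ratio of `1` set in the example before it. After experimenting, the default behavior can be restored by setting the variable back to `-1`. Because the scope is SESSION | GLOBAL, a cluster-wide value can be set in the same way. The value `0.5` in the following sketch is purely illustrative and not a recommendation.

```sql
-- Restore the default for the current session (ratio disabled).
SET SESSION tidb_opt_ordering_index_selectivity_ratio = -1;

-- Inspect the value currently in effect for this session.
SHOW VARIABLES LIKE 'tidb_opt_ordering_index_selectivity_ratio';

-- Illustrative only: set a cluster-wide value.
-- Sessions that are already open keep their own session value.
SET GLOBAL tidb_opt_ordering_index_selectivity_ratio = 0.5;
```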

### tidb_opt_prefer_range_scan <span class="version-mark">New in v5.0</span>

- Scope: SESSION | GLOBAL