Add new order ratio being added to 8.0.0 #16871

Merged
merged 26 commits into from
Apr 8, 2024
26 commits
bbea3d9
commit-message: Add new order ratio
terry1purcell Mar 27, 2024
2d9ab6e
Merge branch 'pingcap:master' into orderratio
terry1purcell Mar 28, 2024
13c910f
Merge branch 'pingcap:master' into orderratio
terry1purcell Mar 29, 2024
6e0a7fb
review comments march 29
terry1purcell Mar 30, 2024
f4b5b6b
Merge branch 'orderratio' of github.com:terry1purcell/docs into order…
terry1purcell Mar 30, 2024
848f50f
add tidb_opt_ordering_index_selectivity_ratio in release notes 8.0.0
hfxsd Apr 1, 2024
aeb3b8c
review comments april 1
terry1purcell Apr 1, 2024
efb99ca
Merge branch 'orderratio' of github.com:terry1purcell/docs into order…
terry1purcell Apr 1, 2024
6859d3d
Merge branch 'pingcap:master' into orderratio
terry1purcell Apr 1, 2024
2c41583
Merge branch 'orderratio' of github.com:terry1purcell/docs into order…
terry1purcell Apr 1, 2024
ec69e8a
moved to an appropriate position
hfxsd Apr 2, 2024
a5ca664
Apply suggestions from code review
hfxsd Apr 2, 2024
c2abf0b
Apply suggestions from code review
hfxsd Apr 2, 2024
8feb112
Merge branch 'pingcap:master' into orderratio
terry1purcell Apr 2, 2024
ee1e851
Merge branch 'pingcap:master' into orderratio
terry1purcell Apr 3, 2024
615e926
commit-message: Add new order ratio
terry1purcell Mar 27, 2024
6c68ade
review comments march 29
terry1purcell Mar 30, 2024
ddaa8f9
review comments april 1
terry1purcell Apr 1, 2024
f65d818
add tidb_opt_ordering_index_selectivity_ratio in release notes 8.0.0
hfxsd Apr 1, 2024
508c9a8
moved to an appropriate position
hfxsd Apr 2, 2024
7687bb0
Apply suggestions from code review
hfxsd Apr 2, 2024
082b2fe
Apply suggestions from code review
hfxsd Apr 2, 2024
1109ce5
review comments april 3
terry1purcell Apr 3, 2024
8ff0114
review comments april 3 merge
terry1purcell Apr 3, 2024
fcb5ef4
Merge branch 'pingcap:master' into orderratio
terry1purcell Apr 7, 2024
3952f10
review comments april 7
terry1purcell Apr 7, 2024
1 change: 1 addition & 0 deletions releases/release-8.0.0.md
@@ -333,6 +333,7 @@ Quick access: [Quick start](https://docs.pingcap.com/tidb/v8.0/quick-start-with-
| [`tidb_enable_fast_create_table`](/system-variables.md#tidb_enable_fast_create_table-new-in-v800) | Newly added | Controls whether to enable [TiDB Accelerates Table Creation](/accelerated-table-creation.md). Set the value to `ON` to enable it and `OFF` to disable it. The default value is `ON`. When this variable is enabled, TiDB accelerates table creation by using [`CREATE TABLE`](/sql-statements/sql-statement-create-table.md). |
| [`tidb_load_binding_timeout`](/system-variables.md#tidb_load_binding_timeout-new-in-v800) | Newly added | Controls the timeout of loading bindings. If the execution time of loading bindings exceeds this value, the loading will stop. |
| [`tidb_low_resolution_tso_update_interval`](/system-variables.md#tidb_low_resolution_tso_update_interval-new-in-v800) | Newly added | Controls the interval for updating TiDB [cache timestamp](/system-variables.md#tidb_low_resolution_tso). |
| [`tidb_opt_ordering_index_selectivity_ratio`](/system-variables.md#tidb_opt_ordering_index_selectivity_ratio-new-in-v800) | Newly added | Controls the estimated number of rows for an index that matches the `ORDER BY` clause when a SQL statement contains `ORDER BY` and `LIMIT` clauses but some filter conditions are not covered by the index. The default value is `-1`, which disables this system variable. |
| [`tidb_opt_use_invisible_indexes`](/system-variables.md#tidb_opt_use_invisible_indexes-new-in-v800) | Newly added | Controls whether the optimizer can select [invisible indexes](/sql-statements/sql-statement-create-index.md#invisible-index) for query optimization in the current session. When the variable is set to `ON`, the optimizer can select invisible indexes for query optimization in the session. |
| [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-new-in-v800) | Newly added | Controls the upper limit of memory that can be used for caching the schema information to avoid occupying too much memory. When this feature is enabled, the LRU algorithm is used to cache the required tables, effectively reducing the memory occupied by the schema information. |

104 changes: 104 additions & 0 deletions system-variables.md
@@ -4243,6 +4243,110 @@ mysql> desc select count(distinct a) from test.t;
- The real-time statistics are the total number of rows and the number of modified rows that are automatically updated based on DML statements. When this variable is set to `moderate` (default), TiDB generates the execution plan based on real-time statistics. When this variable is set to `determinate`, TiDB does not use real-time statistics for generating the execution plan, which will make execution plans more stable.
- For long-term stable OLTP workloads, or if you are confident in the existing execution plans, it is recommended to use the `determinate` mode to reduce the possibility of unexpected execution plan changes. Additionally, you can use the [`LOCK STATS`](/sql-statements/sql-statement-lock-stats.md) statement to prevent the statistics from being modified and further stabilize the execution plan.

### tidb_opt_ordering_index_selectivity_ratio <span class="version-mark">New in v8.0.0</span>

- Scope: SESSION | GLOBAL
- Persists to cluster: Yes
- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): Yes
- Type: Float
- Default value: `-1`
- Range: `[-1, 1]`
- This variable controls the estimated number of rows for an index that matches the `ORDER BY` clause when a SQL statement contains `ORDER BY` and `LIMIT` clauses and the index does not cover some of the filter conditions.
- This addresses the same query patterns as the system variable [tidb_opt_ordering_index_selectivity_threshold](#tidb_opt_ordering_index_selectivity_threshold-new-in-v700).
- It differs in implementation by applying a ratio (a percentage) of the possible range within which the qualified rows are expected to be found.
- A value of `-1` (default) or less than `0` disables this ratio. Any value between `0` and `1` applies a ratio of 0% to 100% (for example, `0.5` corresponds to `50%`).
- In the following examples, the table `t` has a total of 1,000,000 rows. The same query is used, but different values for `tidb_opt_ordering_index_selectivity_ratio` are used. The query in the example contains a `WHERE` clause predicate that qualifies a small percentage of rows (9,000 out of 1,000,000). There is an index that supports the `ORDER BY a` (index `ia`), but the filter on `b` is not included in this index. Depending on the actual data distribution, the rows matching the `WHERE` clause and `LIMIT 1` might be found as the first row accessed when scanning the non-filtering index, or at worst, after nearly all the rows have been processed.
- Each example uses an index hint to demonstrate the impact on estRows. The final plan selection depends on the availability and cost of other plans.
- The first example uses the default value `-1`, which uses the existing estimation formula. By default, the optimizer estimates that only a small percentage of rows is scanned before the qualified rows are found.

```sql
> SET SESSION tidb_opt_ordering_index_selectivity_ratio = -1;

> EXPLAIN SELECT * FROM t USE INDEX (ia) WHERE b <= 9000 ORDER BY a LIMIT 1;
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
| id | estRows | task | access object | operator info |
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
| Limit_12 | 1.00 | root | | offset:0, count:1 |
| └─Projection_22 | 1.00 | root | | test.t.a, test.t.b, test.t.c |
| └─IndexLookUp_21 | 1.00 | root | | |
| ├─IndexFullScan_18(Build) | 109.20 | cop[tikv] | table:t, index:ia(a) | keep order:true |
| └─Selection_20(Probe) | 1.00 | cop[tikv] | | le(test.t.b, 9000) |
| └─TableRowIDScan_19 | 109.20 | cop[tikv] | table:t | keep order:false |
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
```

- The second example uses `0`, which assumes that 0% of rows will be scanned before the qualified rows are found.

```sql
> SET SESSION tidb_opt_ordering_index_selectivity_ratio = 0;

> EXPLAIN SELECT * FROM t USE INDEX (ia) WHERE b <= 9000 ORDER BY a LIMIT 1;
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
| id | estRows | task | access object | operator info |
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
| Limit_12 | 1.00 | root | | offset:0, count:1 |
| └─Projection_22 | 1.00 | root | | test.t.a, test.t.b, test.t.c |
| └─IndexLookUp_21 | 1.00 | root | | |
| ├─IndexFullScan_18(Build) | 1.00 | cop[tikv] | table:t, index:ia(a) | keep order:true |
| └─Selection_20(Probe) | 1.00 | cop[tikv] | | le(test.t.b, 9000) |
| └─TableRowIDScan_19 | 1.00 | cop[tikv] | table:t | keep order:false |
+-----------------------------------+---------+-----------+-----------------------+---------------------------------+
```

- The third example uses `0.1`, which assumes that 10% of rows will be scanned before the qualified rows are found. The condition is highly selective: only about 1% of rows meet it. Therefore, in the worst-case scenario, it might be necessary to scan 99% of rows before finding the 1% that qualify. 10% of that 99% is approximately 9.9%, which is reflected in the estRows.

```sql
> SET SESSION tidb_opt_ordering_index_selectivity_ratio = 0.1;

> EXPLAIN SELECT * FROM t USE INDEX (ia) WHERE b <= 9000 ORDER BY a LIMIT 1;
+-----------------------------------+----------+-----------+-----------------------+---------------------------------+
| id | estRows | task | access object | operator info |
+-----------------------------------+----------+-----------+-----------------------+---------------------------------+
| Limit_12 | 1.00 | root | | offset:0, count:1 |
| └─Projection_22 | 1.00 | root | | test.t.a, test.t.b, test.t.c |
| └─IndexLookUp_21 | 1.00 | root | | |
| ├─IndexFullScan_18(Build) | 99085.21 | cop[tikv] | table:t, index:ia(a) | keep order:true |
| └─Selection_20(Probe) | 1.00 | cop[tikv] | | le(test.t.b, 9000) |
| └─TableRowIDScan_19 | 99085.21 | cop[tikv] | table:t | keep order:false |
+-----------------------------------+----------+-----------+-----------------------+---------------------------------+
```
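
- The percentage reasoning in the third example can be checked with a few lines of arithmetic. This is a rough sketch of the explanation above, not TiDB's estimation code: the 1% selectivity is the rounded figure from the text, and all variable names are illustrative.

```python
# Rough arithmetic behind the third example (illustrative names, not TiDB internals).
total_rows = 1_000_000
selectivity = 0.01   # rounded share of rows matching b <= 9000, as stated in the text
ratio = 0.1          # value of tidb_opt_ordering_index_selectivity_ratio

# Worst case: every non-qualifying row is scanned before a qualifying row is found.
worst_case_fraction = 1 - selectivity            # 0.99
scanned_fraction = ratio * worst_case_fraction   # about 0.099, that is, 9.9%
approx_scanned_rows = round(scanned_fraction * total_rows)

print(f"{scanned_fraction:.1%} of the table, about {approx_scanned_rows:,} rows")
```

- The sketch yields about 99,000 rows, which is close to the 99085.21 shown in estRows. A small gap is expected because the text rounds the selectivity to 1%.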

- The fourth example uses `1.0`, which assumes that 100% of rows will be scanned before the qualified rows are found.

```sql
> SET SESSION tidb_opt_ordering_index_selectivity_ratio = 1;

> EXPLAIN SELECT * FROM t USE INDEX (ia) WHERE b <= 9000 ORDER BY a LIMIT 1;
+-----------------------------------+-----------+-----------+-----------------------+---------------------------------+
| id | estRows | task | access object | operator info |
+-----------------------------------+-----------+-----------+-----------------------+---------------------------------+
| Limit_12 | 1.00 | root | | offset:0, count:1 |
| └─Projection_22 | 1.00 | root | | test.t.a, test.t.b, test.t.c |
| └─IndexLookUp_21 | 1.00 | root | | |
| ├─IndexFullScan_18(Build) | 990843.14 | cop[tikv] | table:t, index:ia(a) | keep order:true |
| └─Selection_20(Probe) | 1.00 | cop[tikv] | | le(test.t.b, 9000) |
| └─TableRowIDScan_19 | 990843.14 | cop[tikv] | table:t | keep order:false |
+-----------------------------------+-----------+-----------+-----------------------+---------------------------------+
```

- The fifth example also uses `1.0`, but adds a predicate on `a`, which limits the scan range in the worst-case scenario. This is because `WHERE a <= 9000` matches the index, and approximately 9,000 rows qualify. Given that the filter predicate on `b` is not in the index, all of the approximately 9,000 rows are considered to be scanned before finding a row that matches `b <= 9000`.

```sql
> SET SESSION tidb_opt_ordering_index_selectivity_ratio = 1;

> EXPLAIN SELECT * FROM t USE INDEX (ia) WHERE a <= 9000 AND b <= 9000 ORDER BY a LIMIT 1;
+------------------------------------+---------+-----------+-----------------------+------------------------------------+
| id | estRows | task | access object | operator info |
+------------------------------------+---------+-----------+-----------------------+------------------------------------+
| Limit_12 | 1.00 | root | | offset:0, count:1 |
| └─Projection_22 | 1.00 | root | | test.t.a, test.t.b, test.t.c |
| └─IndexLookUp_21 | 1.00 | root | | |
| ├─IndexRangeScan_18(Build) | 9074.99 | cop[tikv] | table:t, index:ia(a) | range:[-inf,9000], keep order:true |
| └─Selection_20(Probe) | 1.00 | cop[tikv] | | le(test.t.b, 9000) |
| └─TableRowIDScan_19 | 9074.99 | cop[tikv] | table:t | keep order:false |
+------------------------------------+---------+-----------+-----------------------+------------------------------------+
```
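
- The estRows values in the second through fifth examples are consistent with a linear interpolation between a best case (only the `LIMIT` count is scanned) and a worst case (the value that estRows takes when the ratio is `1`). The sketch below reproduces the index scan estimates under that assumption. The function name `estimate_scanned_rows` and the formula are inferred from the example output for illustration only; they are not taken from the TiDB source code.

```python
def estimate_scanned_rows(ratio: float, limit_count: float, worst_case_rows: float) -> float:
    """Interpolate between the best case (limit_count) and the worst case.

    Assumed model, inferred from the EXPLAIN output above; not TiDB's implementation.
    A negative ratio disables the variable, so this model does not apply to it.
    """
    if not -1 <= ratio <= 1:
        raise ValueError("ratio must be in the range [-1, 1]")
    if ratio < 0:
        raise ValueError("a negative ratio disables the variable; the default formula applies")
    return limit_count + ratio * (worst_case_rows - limit_count)


# Worst-case row counts are copied from the estRows of the examples that use ratio = 1.
FULL_SCAN_WORST_CASE = 990843.14   # fourth example: IndexFullScan
RANGE_SCAN_WORST_CASE = 9074.99    # fifth example: IndexRangeScan with a <= 9000

for ratio in (0, 0.1, 1):
    rows = estimate_scanned_rows(ratio, 1, FULL_SCAN_WORST_CASE)
    print(f"ratio={ratio}: {rows:.2f}")
```

- For the full index scan, the loop prints 1.00, 99085.21, and 990843.14, which match the `IndexFullScan` estRows in the second, third, and fourth examples. Passing `RANGE_SCAN_WORST_CASE` with a ratio of `1` returns 9074.99, the `IndexRangeScan` estimate in the fifth example.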

### tidb_opt_ordering_index_selectivity_threshold <span class="version-mark">New in v7.0.0</span>

- Scope: SESSION | GLOBAL