planner: introduce a configurable rule to adjust the optimizer's tendency between `large IndexLookup` and `large FullScan` to make index selection more stable #45132

qw4990 · 2023-07-03T10:26:25Z

Enhancement

The current optimizer's cost model tends to avoid large IndexLookup plans, because they may trigger a massive number of requests, which is resource-consuming and may slow down the whole system.

But this tendency is not always correct; it can lead to unreasonably large FullScan plans:

```sql
create table t (a int, b int, key(a));
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (2, 2);  -- 100 : 1

insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
analyze table t;
```

```sql
mysql> explain select * from t where a=2;
+-------------------------+------------+-----------+---------------+------------------+
| id                      | estRows    | task      | access object | operator info    |
+-------------------------+------------+-----------+---------------+------------------+
| TableReader_7           | 16305.14   | root      |               | data:Selection_6 |
| └─Selection_6           | 16305.14   | cop[tikv] |               | eq(test.t.a, 2)  |
|   └─TableFullScan_5     | 1654784.00 | cop[tikv] | table:t       | keep order:false |
+-------------------------+------------+-----------+---------------+------------------+
3 rows in set (0.00 sec)

mysql> explain select /*+ use_index(t, a) */ * from t where a=2;
+-------------------------------+----------+-----------+---------------------+-------------------------------+
| id                            | estRows  | task      | access object       | operator info                 |
+-------------------------------+----------+-----------+---------------------+-------------------------------+
| IndexLookUp_7                 | 16305.14 | root      |                     |                               |
| ├─IndexRangeScan_5(Build)     | 16305.14 | cop[tikv] | table:t, index:a(a) | range:[2,2], keep order:false |
| └─TableRowIDScan_6(Probe)     | 16305.14 | cop[tikv] | table:t             | keep order:false              |
+-------------------------------+----------+-----------+---------------------+-------------------------------+
3 rows in set (0.01 sec)
```

In the case above, the optimizer selects a 1654784-row FullScan instead of a 16305-row IndexLookup.
We believe that in most cases a 16305-row IndexLookup should be better than a 1654784-row FullScan, but the optimizer tends to use FullScan here.

To make index selection more stable, we decided to introduce a rule to adjust this tendency.
This rule is based on the estRows of FullScan and IndexLookup: if their ratio is larger than a threshold (e.g. FullScan-Rows / IndexLookup-Rows > 100), we bypass the cost model and use IndexLookup directly.
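
The rule described above can be sketched roughly as follows. This is an illustrative Python sketch only, not TiDB code (TiDB is written in Go, and the actual implementation may differ). The function name `prefer_index_lookup`, its parameters, and the default threshold of 100 are assumptions, taken from the example threshold in the description.

```python
def prefer_index_lookup(full_scan_rows: float,
                        index_lookup_rows: float,
                        threshold: float = 100.0) -> bool:
    """Hypothetical helper: True if the rule would bypass the cost model.

    The rule fires when FullScan-Rows / IndexLookup-Rows > threshold.
    The comparison is written as a multiplication so that an
    IndexLookup estimate of zero rows does not cause a division by zero.
    """
    return full_scan_rows > threshold * index_lookup_rows


# estRows taken from the two EXPLAIN outputs above.
full_scan_rows = 1654784.00
index_lookup_rows = 16305.14

ratio = full_scan_rows / index_lookup_rows
print(f"FullScan-Rows / IndexLookup-Rows = {ratio:.2f}")
print("use IndexLookup directly:",
      prefer_index_lookup(full_scan_rows, index_lookup_rows))
```

With the estimates from the plans above, the ratio is roughly 101.5, just over the example threshold of 100, so this sketch would choose IndexLookup without consulting the cost model.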


qw4990 added type/enhancement sig/planner SIG: Planner labels Jul 3, 2023

qw4990 assigned qw4990 and time-and-fate Jul 3, 2023

qw4990 mentioned this issue Aug 31, 2023

planner: introduce a new empirical rule into the Skyline pruning #46559

Merged

12 tasks

ti-chi-bot bot closed this as completed in #46559 Sep 1, 2023

ti-chi-bot bot pushed a commit that referenced this issue Sep 1, 2023

planner: introduce a new empirical rule into the Skyline pruning (#46559) close #45132

80da849

yibin87 pushed a commit to yibin87/tidb that referenced this issue Oct 31, 2023

planner: introduce a new empirical rule into the Skyline pruning (pingcap#46559) close pingcap#45132

5c477bd
