Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

planner: introduce a configurable rule to adjust the optimizer's tendency between large IndexLookup and large FullScan to make index selection more stable #45132

Closed
qw4990 opened this issue Jul 3, 2023 · 0 comments · Fixed by #46559
Assignees

Comments

@qw4990
Copy link
Contributor

qw4990 commented Jul 3, 2023

Enhancement

The current optimizer's cost model tends to avoid using large IndexLookup, because it may trigger massive requests, which is resource-consuming and may slow down the whole system.

But this tendency is not always correct, it can lead to some unreasonable large FullScan plans:

create table t (a int, b int, key(a));
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1);
insert into t values (2, 2);  -- 100 : 1

insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
insert into t select * from t;
analyze table t;

mysql> explain select * from t where a=2;
+-------------------------+------------+-----------+---------------+------------------+
| id                      | estRows    | task      | access object | operator info    |
+-------------------------+------------+-----------+---------------+------------------+
| TableReader_7           | 16305.14   | root      |               | data:Selection_6 |
| └─Selection_6           | 16305.14   | cop[tikv] |               | eq(test.t.a, 2)  |
|   └─TableFullScan_5     | 1654784.00 | cop[tikv] | table:t       | keep order:false |
+-------------------------+------------+-----------+---------------+------------------+
3 rows in set (0.00 sec)

mysql> explain select /*+ use_index(t, a) */ * from t where a=2;
+-------------------------------+----------+-----------+---------------------+-------------------------------+
| id                            | estRows  | task      | access object       | operator info                 |
+-------------------------------+----------+-----------+---------------------+-------------------------------+
| IndexLookUp_7                 | 16305.14 | root      |                     |                               |
| ├─IndexRangeScan_5(Build)     | 16305.14 | cop[tikv] | table:t, index:a(a) | range:[2,2], keep order:false |
| └─TableRowIDScan_6(Probe)     | 16305.14 | cop[tikv] | table:t             | keep order:false              |
+-------------------------------+----------+-----------+---------------------+-------------------------------+
3 rows in set (0.01 sec)

In the case above, the optimizer selects a 1654784-row FullScan instead of a 16305-row IndexLookup.
We believe in most cases 16305-row IndexLookup should be better than 1654784-row FullScan, but the optimizer tends to use FullScan here.

To make index selection more stable, we decided to introduce a rule to adjust this tendency.
This rule is based on the estRows of FullScan and IndexLookup, if their ratio is larger than a threshold (e.g. FullScan-Rows / IndexLookup-Rows > 100, we bypass the cost model and use IndexLookup directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants