-
Notifications
You must be signed in to change notification settings - Fork 705
index: add document for partial index #21903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
@@ -53,6 +53,7 @@ IndexOption ::= | |||||||||
| 'COMMENT' stringLit | ||||||||||
| ("VISIBLE" | "INVISIBLE") | ||||||||||
| ("GLOBAL" | "LOCAL") | ||||||||||
| 'WHERE' Expression | ||||||||||
|
||||||||||
IndexTypeName ::= | ||||||||||
'BTREE' | ||||||||||
|
@@ -366,6 +367,138 @@ See [Index Selection - Use multi-valued indexes](/choose-index.md#use-multi-valu | |||||||||
- If a table uses multi-valued indexes, you cannot back up, replicate, or import the table using BR, TiCDC, or TiDB Lightning to a TiDB cluster earlier than v6.6.0. | ||||||||||
- For a query with complex conditions, TiDB might not be able to select multi-valued indexes. For information on the condition patterns supported by multi-valued indexes, refer to [Use multi-valued indexes](/choose-index.md#use-multi-valued-indexes). | ||||||||||
|
||||||||||
## Partial indexes | ||||||||||
|
||||||||||
A partial index is an index built on a subset of rows in a table, defined by a conditional expression (called the predicate of the partial index). The index contains entries only for those rows that satisfy the predicate. | ||||||||||
|
||||||||||
### Create partial indexes | ||||||||||
|
||||||||||
You can create partial indexes by adding a `WHERE` clause to the index definition. For example: | ||||||||||
|
||||||||||
```sql | ||||||||||
CREATE TABLE t1 (c1 INT, c2 INT, c3 TEXT); | ||||||||||
CREATE INDEX idx1 ON t1 (c1) WHERE c2 > 10; | ||||||||||
``` | ||||||||||
|
||||||||||
You can also create partial indexes using `ALTER TABLE`: | ||||||||||
|
||||||||||
```sql | ||||||||||
ALTER TABLE t1 ADD INDEX idx2 (c1, c2) WHERE c3 = 'abc'; | ||||||||||
``` | ||||||||||
|
||||||||||
Or specify the partial index when creating the table: | ||||||||||
|
||||||||||
```sql | ||||||||||
CREATE TABLE t2 ( | ||||||||||
id INT PRIMARY KEY, | ||||||||||
status VARCHAR(20), | ||||||||||
created_at DATETIME, | ||||||||||
INDEX idx_active_status (status) WHERE status = 'active' | ||||||||||
); | ||||||||||
``` | ||||||||||
|
||||||||||
### Usage examples | ||||||||||
|
||||||||||
The following examples demonstrate how to use partial indexes effectively: | ||||||||||
|
||||||||||
```sql | ||||||||||
-- Create a table with user data | ||||||||||
CREATE TABLE users ( | ||||||||||
id INT PRIMARY KEY AUTO_INCREMENT, | ||||||||||
name VARCHAR(100), | ||||||||||
status varchar(20), | ||||||||||
created_at DATETIME, | ||||||||||
score INT | ||||||||||
); | ||||||||||
|
||||||||||
-- Create partial indexes for common query patterns | ||||||||||
CREATE INDEX idx_active_users ON users (name) WHERE status = 'active'; | ||||||||||
CREATE INDEX idx_high_score_users ON users (created_at) WHERE score > 1000; | ||||||||||
CREATE INDEX idx_pending_status ON users (created_at) WHERE status = 'pending'; | ||||||||||
``` | ||||||||||
|
||||||||||
Then the following queries can use the partial index: | ||||||||||
|
||||||||||
``` | ||||||||||
mysql> explain SELECT * FROM users WHERE status = 'active' AND name = 'John'; | ||||||||||
+-------------------------------+---------+-----------+-------------------------------------------+-------------------------------------------------------+ | ||||||||||
| id | estRows | task | access object | operator info | | ||||||||||
+-------------------------------+---------+-----------+-------------------------------------------+-------------------------------------------------------+ | ||||||||||
| IndexLookUp_9 | 1.00 | root | | | | ||||||||||
| ├─IndexRangeScan_6(Build) | 10.00 | cop[tikv] | table:users, index:idx_active_users(name) | range:["John","John"], keep order:false, stats:pseudo | | ||||||||||
| └─Selection_8(Probe) | 1.00 | cop[tikv] | | eq(test.users.status, "active") | | ||||||||||
| └─TableRowIDScan_7 | 10.00 | cop[tikv] | table:users | keep order:false, stats:pseudo | | ||||||||||
+-------------------------------+---------+-----------+-------------------------------------------+-------------------------------------------------------+ | ||||||||||
4 rows in set (0.00 sec) | ||||||||||
|
||||||||||
mysql> explain SELECT * FROM users WHERE status = 'active' ORDER BY name; | ||||||||||
+-------------------------------+----------+-----------+-------------------------------------------+---------------------------------+ | ||||||||||
| id | estRows | task | access object | operator info | | ||||||||||
+-------------------------------+----------+-----------+-------------------------------------------+---------------------------------+ | ||||||||||
| IndexLookUp_18 | 10.00 | root | | | | ||||||||||
| ├─IndexFullScan_15(Build) | 10000.00 | cop[tikv] | table:users, index:idx_active_users(name) | keep order:true, stats:pseudo | | ||||||||||
| └─Selection_17(Probe) | 10.00 | cop[tikv] | | eq(test.users.status, "active") | | ||||||||||
| └─TableRowIDScan_16 | 10000.00 | cop[tikv] | table:users | keep order:false, stats:pseudo | | ||||||||||
+-------------------------------+----------+-----------+-------------------------------------------+---------------------------------+ | ||||||||||
4 rows in set (0.00 sec) | ||||||||||
|
||||||||||
mysql> explain SELECT * FROM users WHERE score > 10000 ORDER BY created_at; | ||||||||||
+-------------------------------+----------+-----------+-----------------------------------------------------+--------------------------------+ | ||||||||||
| id | estRows | task | access object | operator info | | ||||||||||
+-------------------------------+----------+-----------+-----------------------------------------------------+--------------------------------+ | ||||||||||
| IndexLookUp_18 | 3333.33 | root | | | | ||||||||||
| ├─IndexFullScan_15(Build) | 10000.00 | cop[tikv] | table:users, index:idx_high_score_users(created_at) | keep order:true, stats:pseudo | | ||||||||||
| └─Selection_17(Probe) | 3333.33 | cop[tikv] | | gt(test.users.score, 10000) | | ||||||||||
| └─TableRowIDScan_16 | 10000.00 | cop[tikv] | table:users | keep order:false, stats:pseudo | | ||||||||||
+-------------------------------+----------+-----------+-----------------------------------------------------+--------------------------------+ | ||||||||||
4 rows in set (0.00 sec) | ||||||||||
|
||||||||||
mysql> explain SELECT * FROM users WHERE status = 'pending'; | ||||||||||
+-------------------------+----------+-----------+---------------+----------------------------------+ | ||||||||||
| id | estRows | task | access object | operator info | | ||||||||||
+-------------------------+----------+-----------+---------------+----------------------------------+ | ||||||||||
| TableReader_8 | 10.00 | root | | data:Selection_7 | | ||||||||||
| └─Selection_7 | 10.00 | cop[tikv] | | eq(test.users.status, "pending") | | ||||||||||
| └─TableFullScan_6 | 10000.00 | cop[tikv] | table:users | keep order:false, stats:pseudo | | ||||||||||
+-------------------------+----------+-----------+---------------+----------------------------------+ | ||||||||||
3 rows in set (0.00 sec) | ||||||||||
``` | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This example correctly shows that the optimizer might not choose the partial index. To help users understand why, I suggest adding a note explaining that since the query doesn't filter or order by
Suggested change
|
||||||||||
|
||||||||||
If the predicates in query don't meet the index definition, the index will not be chosen even with hint: | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The phrase "don't meet the index definition" is a bit vague. For better clarity, I suggest rephrasing to explain that the query's
Suggested change
|
||||||||||
|
||||||||||
``` | ||||||||||
mysql> explain SELECT * FROM users use index(idx_high_score_users) WHERE score > 100 ORDER BY created_at; | ||||||||||
+---------------------------+----------+-----------+---------------+--------------------------------+ | ||||||||||
| id | estRows | task | access object | operator info | | ||||||||||
+---------------------------+----------+-----------+---------------+--------------------------------+ | ||||||||||
| Sort_5 | 3333.33 | root | | test.users.created_at | | ||||||||||
| └─TableReader_11 | 3333.33 | root | | data:Selection_10 | | ||||||||||
| └─Selection_10 | 3333.33 | cop[tikv] | | gt(test.users.score, 100) | | ||||||||||
| └─TableFullScan_9 | 10000.00 | cop[tikv] | table:users | keep order:false, stats:pseudo | | ||||||||||
+---------------------------+----------+-----------+---------------+--------------------------------+ | ||||||||||
``` | ||||||||||
|
||||||||||
### When to use partial indexes | ||||||||||
|
||||||||||
Partial indexes are particularly useful in the following scenarios: | ||||||||||
|
||||||||||
- **Selective filtering**: When you frequently query a small subset of rows based on specific conditions | ||||||||||
- **Conditional uniqueness**: When you need unique constraints that only apply under certain conditions | ||||||||||
|
||||||||||
### Limitations | ||||||||||
|
||||||||||
- The `WHERE` clause in partial indexes supports basic comparison operators (`=`, `!=`, `<`, `<=`, `>`, `>=`) and `IN` predicates with constant values | ||||||||||
- The types of the column and constant value should be the same | ||||||||||
- The predicate can only reference columns from the same table | ||||||||||
- Partial indexes cannot be used on expression indexes | ||||||||||
|
||||||||||
### Performance benefits | ||||||||||
|
||||||||||
Partial indexes offer several advantages: | ||||||||||
|
||||||||||
1. **Reduced storage**: Only rows matching the predicate are indexed, saving storage space | ||||||||||
2. **Faster DML**: It'll be faster to maintain the index of a subset of data during INSERT, UPDATE, and DELETE operations | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The use of "It'll" is informal. I suggest rephrasing for a more professional tone and improved clarity, and adding backticks to DML operations as per the style guide.
Suggested change
|
||||||||||
|
||||||||||
## Invisible index | ||||||||||
|
||||||||||
By default, invisible indexes are indexes that are ignored by the query optimizer: | ||||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current phrasing "Then the following queries can use the partial index:" is slightly misleading, as the last example (
SELECT * FROM users WHERE status = 'pending';
) does not use the partial index. To improve clarity, I suggest explaining that for a partial index to be used, the query must benefit from the indexed columns.