diff --git a/TOC.md b/TOC.md
index 0b0a72ab9675e..0a43f3a6e098a 100644
--- a/TOC.md
+++ b/TOC.md
@@ -106,6 +106,8 @@
       - [User-Defined Variables](/reference/sql/language-structure/user-defined-variables.md)
       - [Expression Syntax](/reference/sql/language-structure/expression-syntax.md)
       - [Comment Syntax](/reference/sql/language-structure/comment-syntax.md)
+    + Attributes
+      - [`AUTO_RANDOM`](/reference/sql/attributes/auto-random.md)
     + Data Types
       - [Overview](/reference/sql/data-types/overview.md)
       - [Default Values](/reference/sql/data-types/default-values.md)
@@ -510,4 +512,4 @@
     - [RC3](/releases/rc3.md)
     - [RC2](/releases/rc2.md)
     - [RC1](/releases/rc1.md)
-+ [Glossary](/glossary.md)
\ No newline at end of file
++ [Glossary](/glossary.md)
diff --git a/reference/best-practices/high-concurrency.md b/reference/best-practices/high-concurrency.md
index 42c452bc0465f..fe6eae9b2fa8b 100644
--- a/reference/best-practices/high-concurrency.md
+++ b/reference/best-practices/high-concurrency.md
@@ -176,7 +176,9 @@ You can see that the apparent hotspot problem has been resolved now.
 
 In this case, the table is simple. In other cases, you might also need to consider the hotspot problem of index. For more details on how to pre-split the index Region, refer to [Split Region](/reference/sql/statements/split-region.md).
 
-## Complex hotspot problem
+## Complex hotspot problems
+
+**Problem one:**
 
 If a table does not have a primary key, or the primary key is not the `Int` type and you do not want to generate a randomly distributed primary key ID, TiDB provides an implicit `_tidb_rowid` column as the row ID. Generally, when you do not use the `SHARD_ROW_ID_BITS` parameter, the values of the `_tidb_rowid` column are also monotonically increasing, which might causes hotspots too. Refer to [`SHARD_ROW_ID_BITS` description](/reference/configuration/tidb-server/tidb-specific-variables.md#shard_row_id_bits) for more details.
 
@@ -201,6 +203,12 @@ create table t (a int, b int) shard_row_id_bits = 4 pre_split_regions=3;
 
 When data starts to be written into table `t`, the data is written into the pre-split 8 Regions, which avoids the hotspot problem that might be caused if only one Region exists after table creation.
 
+**Problem two:**
+
+If a table's primary key is an integer type, and if the table uses `AUTO_INCREMENT` to ensure the uniqueness of the primary key (not necessarily continuous or incremental), you cannot use `SHARD_ROW_ID_BITS` to scatter the hotspot on this table because TiDB directly uses the row values of the primary key as `_tidb_rowid`.
+
+To address the problem in this scenario, you can replace `AUTO_INCREMENT` with [`AUTO_RANDOM`](/reference/sql/attributes/auto-random.md) (a column attribute) when inserting data. Then TiDB automatically assigns values to the integer primary key column, which eliminates the continuity of the row ID and scatters the hotspot.
+
 ## Parameter configuration
 
 In v2.1, the [latch mechanism](/reference/configuration/tidb-server/configuration-file.md#txn-local-latches) is introduced in TiDB to identify transaction conflicts in advance in scenarios where write conflicts frequently appear. The aim is to reduce the retry of transaction commits in TiDB and TiKV caused by write conflicts. Generally, batch tasks use the data already stored in TiDB, so the write conflicts of transaction do not exist. In this situation, you can disable the latch in TiDB to reduce memory allocation for small objects:
diff --git a/reference/configuration/tidb-server/configuration-file.md b/reference/configuration/tidb-server/configuration-file.md
index bb4844f87d00f..314b3fce74cfe 100644
--- a/reference/configuration/tidb-server/configuration-file.md
+++ b/reference/configuration/tidb-server/configuration-file.md
@@ -390,8 +390,8 @@ Configuration related to the status of TiDB service
 
 ### `report-status`
 
-- Enables or disables the HTTP API service
-- Default value: true
+- Enables or disables the HTTP API service.
+- Default value: `true`
 
 ### `record-db-qps`
 
@@ -404,10 +404,20 @@ Configurations related to the `events_statement_summary_by_digest` table
 
 ### max-stmt-count
 
-- The maximum number of SQL categories allowed to be saved in the `events_statement_summary_by_digest` table
-- Default value: 100
+- The maximum number of SQL categories allowed to be saved in the `events_statement_summary_by_digest` table.
+- Default value: `100`
 
 ### max-sql-length
 
-- The longest display length for the `DIGEST_TEXT` and `QUERY_SAMPLE_TEXT` columns in the `events_statement_summary_by_digest` table
-- Default value: 4096
+- The longest display length for the `DIGEST_TEXT` and `QUERY_SAMPLE_TEXT` columns in the `events_statement_summary_by_digest` table.
+- Default value: `4096`
+
+## experimental
+
+The `experimental` section describes configurations related to the experimental features of TiDB. This section is introduced in v3.1.0.
+
+### `allow-auto-random` New in v3.1.0
+
+- Determines whether to allow using `AUTO_RANDOM`.
+- Default value: `false`
+- By default, TiDB does not support using `AUTO_RANDOM`. When the value is `true`, you cannot set `alter-primary-key` to `true` at the same time.
diff --git a/reference/sql/attributes/auto-random.md b/reference/sql/attributes/auto-random.md
new file mode 100644
index 0000000000000..79d8c5b9f27a1
--- /dev/null
+++ b/reference/sql/attributes/auto-random.md
@@ -0,0 +1,108 @@
+---
+title: AUTO_RANDOM
+summary: Learn the AUTO_RANDOM attribute.
+category: reference
+---
+
+# AUTO_RANDOM New in v3.1.0
+
+> **Warning:**
+>
+> `AUTO_RANDOM` is still an experimental feature. It is **NOT** recommended that you use this attribute in the production environment. In later TiDB versions, the syntax or semantics of `AUTO_RANDOM` might change.
+
+Before using the `AUTO_RANDOM` attribute, set `allow-auto-random = true` in the `experimental` section of the TiDB configuration file. Refer to [`allow-auto-random`](/reference/configuration/tidb-server/configuration-file.md#allow-auto-random) for details.
+
+## User scenario
+
+When you write data intensively into a TiDB table whose primary key is an auto-increment integer, a hotspot issue might occur. To solve the hotspot issue, you can use the `AUTO_RANDOM` attribute. Refer to [Highly Concurrent Write Best Practices](/reference/best-practices/high-concurrency.md#complex-hotspot-problems) for details.
+
+Take the following created table as an example:
+
+{{< copyable "sql" >}}
+
+```sql
+create table t (a int primary key auto_increment, b varchar(255))
+```
+
+On this `t` table, you execute a large number of `INSERT` statements that do not specify the values of the primary key as below:
+
+{{< copyable "sql" >}}
+
+```sql
+insert into t(b) values ('a'), ('b'), ('c')
+```
+
+In the above statement, values of the primary key (column `a`) are not specified, so TiDB uses the continuous auto-increment row values as the row IDs, which might cause a write hotspot on a single TiKV node and affect performance. To avoid such performance decrease, you can specify the `AUTO_RANDOM` attribute rather than the `AUTO_INCREMENT` attribute for the column `a` when you create the table. See the following examples:
+
+{{< copyable "sql" >}}
+
+```sql
+create table t (a int primary key auto_random, b varchar(255))
+```
+
+or
+
+{{< copyable "sql" >}}
+
+```sql
+create table t (a int auto_random, b varchar(255), primary key (a))
+```
+
+Then execute the `INSERT` statement such as `INSERT INTO t(b) values...`. Now the results will be as follows:
+
++ If the `INSERT` statement does not specify the values of the integer primary key column (column `a`), TiDB automatically assigns values to this column. These values are not necessarily auto-increment or continuous but are unique, which avoids the hotspot problem caused by continuous row IDs.
++ If the `INSERT` statement explicitly specifies the values of the integer primary key column, TiDB saves these values, which works similarly to the `AUTO_INCREMENT` attribute.
+
+TiDB automatically assigns values in the following way:
+
+The highest five digits of the row value in binary (namely, shard bits) are determined by the starting time of the current transaction. The remaining digits are assigned values in an auto-increment order.
+
+To use a different number of shard bits, append a pair of parentheses to `AUTO_RANDOM` and specify the desired number of shard bits in the parentheses. See the following example:
+
+{{< copyable "sql" >}}
+
+```sql
+create table t (a int primary key auto_random(3), b varchar(255))
+```
+
+In the above `CREATE TABLE` statement, `3` shard bits are specified. The range of the number of shard bits is `[1, field_max_bits)`. `field_max_bits` is the number of bits occupied by the primary key column.
+
+For tables with the `AUTO_RANDOM` attribute, the value of the corresponding `TIDB_ROW_ID_SHARDING_INFO` column in the `information_schema.tables` system table is `PK_AUTO_RANDOM_BITS=x`. `x` is the number of shard bits.
+
+## Compatibility
+
+TiDB supports parsing the version comment syntax. See the following example:
+
+{{< copyable "sql" >}}
+
+```sql
+create table t (a int primary key /*T!30100 auto_random */)
+```
+
+{{< copyable "sql" >}}
+
+```sql
+create table t (a int primary key auto_random)
+```
+
+The above two statements have the same meaning.
+
+In the result of `show create table`, the `AUTO_RANDOM` attribute is commented out. This comment includes a version number (for example, `/*T!30100 auto_random */`). Here `30100` indicates that the `AUTO_RANDOM` attribute is introduced in v3.1.0. TiDB of a lower version ignores the `AUTO_RANDOM` attribute in the above comment.
+
+This attribute supports forward compatibility, namely, downgrade compatibility. TiDB earlier than v3.1.0 ignores the `AUTO_RANDOM` attribute of a table (with the above comment), so TiDB of earlier versions can also use the table with the attribute.
+
+## Restrictions
+
+Pay attention to the following restrictions when you use `AUTO_RANDOM`:
+
+- Specify this attribute **ONLY** for a primary key column of integer type. Otherwise, an error might occur. Refer to [Notes for `alter-primary-key`](#notes-for-alter-primary-key) for the exception.
+- You cannot use `ALTER TABLE` to modify the `AUTO_RANDOM` attribute, including adding or removing this attribute.
+- You cannot change the column type of the primary key column that is specified with the `AUTO_RANDOM` attribute.
+- You cannot specify `AUTO_RANDOM` and `AUTO_INCREMENT` for the same column at the same time.
+- You cannot specify `AUTO_RANDOM` and `DEFAULT` (the default value of a column) for the same column at the same time.
+- It is **not** recommended that you explicitly specify a value for the column with the `AUTO_RANDOM` attribute when you insert data. Otherwise, the numeric values that can be automatically assigned for this table might be used up in advance.
+
+### Notes for `alter-primary-key`
+
+- When `alter-primary-key = true`, the `AUTO_RANDOM` attribute is not supported even if the primary key is the integer type.
+- In the configuration file, `alter-primary-key` and `allow-auto-random` cannot be set to `true` at the same time.
diff --git a/reference/sql/language-structure/comment-syntax.md b/reference/sql/language-structure/comment-syntax.md
index 7dfad3fe94917..befc1d9a7c379 100644
--- a/reference/sql/language-structure/comment-syntax.md
+++ b/reference/sql/language-structure/comment-syntax.md
@@ -83,7 +83,7 @@ In TiDB, you can also use another version:
 SELECT STRAIGHT_JOIN col1 FROM table1,table2 WHERE ...
 ```
 
-If the server version number is specified in the comment, for example, `/*!50110 KEY_BLOCK_SIZE=1024 */`, in MySQL it means that the contents in this comment is processed only when the MySQL version is or higher than 5.1.10. But in TiDB, the version number does not work and all contents in the comment are processed.
+If the server version number is specified in the comment, for example, `/*!50110 KEY_BLOCK_SIZE=1024 */`, in MySQL it means that the contents of this comment are processed only when the MySQL version is 5.1.10 or higher. But in TiDB, the MySQL version number does not work and all contents in the comment are processed. TiDB has its own comment syntax for the version number. The format of this syntax is `/*T!30100 XXX */`.
 
 Another type of comment is specially treated as the Hint optimizer:
 
diff --git a/reference/sql/language-structure/keywords-and-reserved-words.md b/reference/sql/language-structure/keywords-and-reserved-words.md
index fe649661ca64b..d608512c30d87 100644
--- a/reference/sql/language-structure/keywords-and-reserved-words.md
+++ b/reference/sql/language-structure/keywords-and-reserved-words.md
@@ -50,6 +50,7 @@ The following list shows the keywords and reserved words in TiDB. Most of the re
 - ASC (R)
 - ASCII
 - AUTO_INCREMENT
+- AUTO_RANDOM
 - AVG
 - AVG_ROW_LENGTH
 
diff --git a/reference/sql/statements/create-table.md b/reference/sql/statements/create-table.md
index d222e3698f178..85d49d977df93 100644
--- a/reference/sql/statements/create-table.md
+++ b/reference/sql/statements/create-table.md
@@ -96,7 +96,8 @@ The `FULLTEXT` and `FOREIGN KEY` in `create_definition` are currently only suppo
 ```sql
 column_definition:
     data_type [NOT NULL | NULL] [DEFAULT default_value]
-      [AUTO_INCREMENT] [UNIQUE [KEY] | [PRIMARY] KEY]
+      [AUTO_INCREMENT | AUTO_RANDOM [(length)]]
+      [UNIQUE [KEY] | [PRIMARY] KEY]
       [COMMENT 'string']
       [reference_definition]
   | data_type [GENERATED ALWAYS] AS (expression)