From 75fad64af816983345ff4aafedc29e47111e3b8b Mon Sep 17 00:00:00 2001
From: Ran
Date: Fri, 9 Jun 2023 09:40:47 +0800
Subject: [PATCH 1/2] This is an automated cherry-pick of #13734

Signed-off-by: ti-chi-bot
---
 tiflash/use-tiflash-mpp-mode.md | 192 ++++++++++++++++++++++++++++++++
 1 file changed, 192 insertions(+)
 create mode 100644 tiflash/use-tiflash-mpp-mode.md

diff --git a/tiflash/use-tiflash-mpp-mode.md b/tiflash/use-tiflash-mpp-mode.md
new file mode 100644
index 0000000000000..ee11423ed79e2
--- /dev/null
+++ b/tiflash/use-tiflash-mpp-mode.md
@@ -0,0 +1,192 @@
---
title: Use TiFlash MPP Mode
summary: Learn the MPP mode of TiFlash and how to use it.
---

# Use TiFlash MPP Mode

This document introduces the MPP mode of TiFlash and how to use it.

TiFlash supports using the MPP mode to execute queries, which introduces cross-node data exchange (the data shuffle process) into the computation. TiDB automatically determines whether to select the MPP mode based on the optimizer's cost estimation. You can change the selection strategy by modifying the values of [`tidb_allow_mpp`](/system-variables.md#tidb_allow_mpp-new-in-v50) and [`tidb_enforce_mpp`](/system-variables.md#tidb_enforce_mpp-new-in-v51).

## Control whether to select the MPP mode

The `tidb_allow_mpp` variable controls whether TiDB can select the MPP mode to execute queries. The `tidb_enforce_mpp` variable controls whether the optimizer's cost estimation is ignored and the MPP mode of TiFlash is forcibly used to execute queries.

The results corresponding to all values of these two variables are as follows:

|                                   | tidb_allow_mpp=off        | tidb_allow_mpp=on (by default)                                          |
| --------------------------------- | ------------------------- | ----------------------------------------------------------------------- |
| tidb_enforce_mpp=off (by default) | The MPP mode is not used. | The optimizer selects the MPP mode based on cost estimation. (by default)|
| tidb_enforce_mpp=on               | The MPP mode is not used. | TiDB ignores the cost estimation and selects the MPP mode.              |

For example, if you do not want to use the MPP mode, you can execute the following statement:

{{< copyable "sql" >}}

```sql
set @@session.tidb_allow_mpp=0;
```

If you want TiDB's cost-based optimizer to automatically decide whether to use the MPP mode (the default behavior), you can execute the following statements:

{{< copyable "sql" >}}

```sql
set @@session.tidb_allow_mpp=1;
set @@session.tidb_enforce_mpp=0;
```

If you want TiDB to ignore the optimizer's cost estimation and forcibly select the MPP mode, you can execute the following statements:

{{< copyable "sql" >}}

```sql
set @@session.tidb_allow_mpp=1;
set @@session.tidb_enforce_mpp=1;
```

The initial value of the `tidb_enforce_mpp` session variable is equal to the [`enforce-mpp`](/tidb-configuration-file.md#enforce-mpp) configuration value of the tidb-server instance (`false` by default). If multiple tidb-server instances in a TiDB cluster only perform analytical queries and you want to make sure that the MPP mode is used on these instances, you can change their [`enforce-mpp`](/tidb-configuration-file.md#enforce-mpp) configuration values to `true`.

> **Note:**
>
> When `tidb_enforce_mpp=1` takes effect, the TiDB optimizer ignores the cost estimation and chooses the MPP mode. However, if other factors block the MPP mode, TiDB does not select it. These factors include the absence of TiFlash replicas, unfinished replication of TiFlash replicas, and statements containing operators or functions that are not supported by the MPP mode.
>
> If the TiDB optimizer cannot select the MPP mode due to reasons other than cost estimation, a warning explaining the reason is returned when you use the `EXPLAIN` statement to check the execution plan. For example:
>
> ```sql
> set @@session.tidb_enforce_mpp=1;
> create table t(a int);
> explain select count(*) from t;
> show warnings;
> ```
>
> ```
> +---------+------+-----------------------------------------------------------------------------+
> | Level   | Code | Message                                                                     |
> +---------+------+-----------------------------------------------------------------------------+
> | Warning | 1105 | MPP mode may be blocked because there aren't tiflash replicas of table `t`. |
> +---------+------+-----------------------------------------------------------------------------+
> ```

## Algorithm support for the MPP mode

The MPP mode supports these physical algorithms: Broadcast Hash Join, Shuffled Hash Join, Shuffled Hash Aggregation, Union All, TopN, and Limit. The optimizer automatically determines which algorithm to use in a query. To check the specific query execution plan, you can execute the `EXPLAIN` statement. If the result of the `EXPLAIN` statement shows ExchangeSender and ExchangeReceiver operators, the MPP mode has taken effect.

The following statement takes the table structure in the TPC-H test set as an example:

```sql
explain select count(*) from customer c join nation n on c.c_nationkey=n.n_nationkey;
+------------------------------------------+------------+-------------------+---------------+----------------------------------------------------------------------------+
| id                                       | estRows    | task              | access object | operator info                                                              |
+------------------------------------------+------------+-------------------+---------------+----------------------------------------------------------------------------+
| HashAgg_23                               | 1.00       | root              |               | funcs:count(Column#16)->Column#15                                          |
| └─TableReader_25                         | 1.00       | root              |               | data:ExchangeSender_24                                                     |
|   └─ExchangeSender_24                    | 1.00       | batchCop[tiflash] |               | ExchangeType: PassThrough                                                  |
|     └─HashAgg_12                         | 1.00       | batchCop[tiflash] |               | funcs:count(1)->Column#16                                                  |
|       └─HashJoin_17                      | 3000000.00 | batchCop[tiflash] |               | inner join, equal:[eq(tpch.nation.n_nationkey, tpch.customer.c_nationkey)] |
|         ├─ExchangeReceiver_21(Build)     | 25.00      | batchCop[tiflash] |               |                                                                            |
|         │ └─ExchangeSender_20            | 25.00      | batchCop[tiflash] |               | ExchangeType: Broadcast                                                    |
|         │   └─TableFullScan_18           | 25.00      | batchCop[tiflash] | table:n       | keep order:false                                                           |
|         └─TableFullScan_22(Probe)        | 3000000.00 | batchCop[tiflash] | table:c       | keep order:false                                                           |
+------------------------------------------+------------+-------------------+---------------+----------------------------------------------------------------------------+
9 rows in set (0.00 sec)
```

In the example execution plan, the `ExchangeReceiver` and `ExchangeSender` operators are included. The execution plan indicates that after the `nation` table is read, the `ExchangeSender` operator broadcasts the table to each node, the `HashJoin` and `HashAgg` operations are performed on the `nation` and `customer` tables, and then the results are returned to TiDB.
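As described above, the MPP mode has taken effect when both `ExchangeSender` and `ExchangeReceiver` appear in the `EXPLAIN` output. If you inspect plans programmatically, the check can be sketched as follows (a minimal illustration; the `is_mpp_plan` helper is hypothetical and not part of any TiDB client library):

```python
def is_mpp_plan(plan_rows):
    """Return True if the EXPLAIN output indicates that the MPP mode
    took effect, i.e. both exchange operators appear in the plan."""
    text = "\n".join(plan_rows)
    return "ExchangeSender" in text and "ExchangeReceiver" in text

# Operator ids as they appear in the `id` column of EXPLAIN output;
# only the operator names matter for this check.
mpp_plan = [
    "HashAgg_23", "TableReader_25", "ExchangeSender_24",
    "HashAgg_12", "HashJoin_17", "ExchangeReceiver_21(Build)",
]
non_mpp_plan = ["HashAgg_18", "TableReader_19", "TableFullScan_5"]

print(is_mpp_plan(mpp_plan))      # True
print(is_mpp_plan(non_mpp_plan))  # False
```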
TiFlash provides the following two global/session variables to control whether to use Broadcast Hash Join:

- [`tidb_broadcast_join_threshold_size`](/system-variables.md#tidb_broadcast_join_threshold_size-new-in-v50): The unit of the value is bytes. If the table size (in bytes) is less than the value of this variable, the Broadcast Hash Join algorithm is used. Otherwise, the Shuffled Hash Join algorithm is used.
- [`tidb_broadcast_join_threshold_count`](/system-variables.md#tidb_broadcast_join_threshold_count-new-in-v50): The unit of the value is rows. If the objects of the join operation belong to a subquery, the optimizer cannot estimate the size of the subquery result set, so the size is determined by the number of rows in the result set. If the estimated number of rows in the subquery is less than the value of this variable, the Broadcast Hash Join algorithm is used. Otherwise, the Shuffled Hash Join algorithm is used.

## Access partitioned tables in the MPP mode

To access partitioned tables in the MPP mode, you need to enable [dynamic pruning mode](https://docs.pingcap.com/tidb/stable/partitioned-table#dynamic-pruning-mode) first.

Example:

```sql
mysql> DROP TABLE IF EXISTS test.employees;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE TABLE test.employees
(id int(11) NOT NULL,
 fname varchar(30) DEFAULT NULL,
 lname varchar(30) DEFAULT NULL,
 hired date NOT NULL DEFAULT '1970-01-01',
 separated date DEFAULT '9999-12-31',
 job_code int DEFAULT NULL,
 store_id int NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
PARTITION BY RANGE (store_id)
(PARTITION p0 VALUES LESS THAN (6),
 PARTITION p1 VALUES LESS THAN (11),
 PARTITION p2 VALUES LESS THAN (16),
 PARTITION p3 VALUES LESS THAN (MAXVALUE));
Query OK, 0 rows affected (0.10 sec)

mysql> ALTER TABLE test.employees SET TIFLASH REPLICA 1;
Query OK, 0 rows affected (0.09 sec)

mysql> SET tidb_partition_prune_mode=static;
Query OK, 0 rows affected (0.00 sec)

mysql> EXPLAIN SELECT count(*) FROM test.employees;
+----------------------------------+----------+-------------------+-------------------------------+-----------------------------------+
| id                               | estRows  | task              | access object                 | operator info                     |
+----------------------------------+----------+-------------------+-------------------------------+-----------------------------------+
| HashAgg_18                       | 1.00     | root              |                               | funcs:count(Column#10)->Column#9  |
| └─PartitionUnion_20              | 4.00     | root              |                               |                                   |
|   ├─StreamAgg_35                 | 1.00     | root              |                               | funcs:count(Column#12)->Column#10 |
|   │ └─TableReader_36             | 1.00     | root              |                               | data:StreamAgg_26                 |
|   │   └─StreamAgg_26             | 1.00     | batchCop[tiflash] |                               | funcs:count(1)->Column#12         |
|   │     └─TableFullScan_34       | 10000.00 | batchCop[tiflash] | table:employees, partition:p0 | keep order:false, stats:pseudo    |
|   ├─StreamAgg_52                 | 1.00     | root              |                               | funcs:count(Column#14)->Column#10 |
|   │ └─TableReader_53             | 1.00     | root              |                               | data:StreamAgg_43                 |
|   │   └─StreamAgg_43             | 1.00     | batchCop[tiflash] |                               | funcs:count(1)->Column#14         |
|   │     └─TableFullScan_51       | 10000.00 | batchCop[tiflash] | table:employees, partition:p1 | keep order:false, stats:pseudo    |
|   ├─StreamAgg_69                 | 1.00     | root              |                               | funcs:count(Column#16)->Column#10 |
|   │ └─TableReader_70             | 1.00     | root              |                               | data:StreamAgg_60                 |
|   │   └─StreamAgg_60             | 1.00     | batchCop[tiflash] |                               | funcs:count(1)->Column#16         |
|   │     └─TableFullScan_68       | 10000.00 | batchCop[tiflash] | table:employees, partition:p2 | keep order:false, stats:pseudo    |
|   └─StreamAgg_86                 | 1.00     | root              |                               | funcs:count(Column#18)->Column#10 |
|     └─TableReader_87             | 1.00     | root              |                               | data:StreamAgg_77                 |
|       └─StreamAgg_77             | 1.00     | batchCop[tiflash] |                               | funcs:count(1)->Column#18         |
|         └─TableFullScan_85       | 10000.00 | batchCop[tiflash] | table:employees, partition:p3 | keep order:false, stats:pseudo    |
+----------------------------------+----------+-------------------+-------------------------------+-----------------------------------+
18 rows in set (0.00 sec)

mysql> SET tidb_partition_prune_mode=dynamic;
Query OK, 0 rows affected (0.00 sec)

mysql> EXPLAIN SELECT count(*) FROM test.employees;
+------------------------------+----------+--------------+-----------------+---------------------------------------------------------+
| id                           | estRows  | task         | access object   | operator info                                           |
+------------------------------+----------+--------------+-----------------+---------------------------------------------------------+
| HashAgg_17                   | 1.00     | root         |                 | funcs:count(Column#11)->Column#9                        |
| └─TableReader_19             | 1.00     | root         | partition:all   | data:ExchangeSender_18                                  |
|   └─ExchangeSender_18        | 1.00     | mpp[tiflash] |                 | ExchangeType: PassThrough                               |
|     └─HashAgg_8              | 1.00     | mpp[tiflash] |                 | funcs:count(1)->Column#11                               |
|       └─TableFullScan_16     | 10000.00 | mpp[tiflash] | table:employees | keep order:false, stats:pseudo, PartitionTableScan:true |
+------------------------------+----------+--------------+-----------------+---------------------------------------------------------+
5 rows in set (0.00 sec)
```

## Known issues of MPP

In the current version, TiFlash uses the `start_ts` of a query as the unique key of the query. In most cases, the `start_ts` of each query can uniquely identify a query, but in the following cases, different queries have the same `start_ts`:

- All queries in the same transaction have the same `start_ts`.
- When you use [`tidb_snapshot`](/system-variables.md#tidb_snapshot) to read data at a specific historical time point, the same time point is manually specified.
- When [Stale Read](/stale-read.md) is enabled, the same time point is manually specified.

When `start_ts` cannot uniquely represent the MPP query, if TiFlash detects that different queries have the same `start_ts` at a given time, TiFlash might report an error. Typical error cases are as follows:

- When multiple queries with the same `start_ts` are sent to TiFlash at the same time, you might encounter the `task has been registered` error.
- When multiple simple queries with `LIMIT` are executed continuously in the same transaction, once the `LIMIT` condition is met, TiDB sends a cancel request to TiFlash to cancel the query. This request also uses `start_ts` to identify the query to be canceled. If there are other queries with the same `start_ts` in TiFlash, these queries might be canceled by mistake. An example of this issue can be found in [#43426](https://github.com/pingcap/tidb/issues/43426).

This issue is fixed in TiDB v6.6.0. It is recommended to use the [latest LTS version](https://docs.pingcap.com/tidb/stable).
From 7fc0fe7449aac18ec03db6bea3f5e1b0b750b80e Mon Sep 17 00:00:00 2001
From: Ran
Date: Fri, 9 Jun 2023 09:48:45 +0800
Subject: [PATCH 2/2] resolve conflict

---
 tiflash/use-tiflash-mpp-mode.md | 192 --------------------------------
 tiflash/use-tiflash.md          |  17 ++-
 2 files changed, 16 insertions(+), 193 deletions(-)
 delete mode 100644 tiflash/use-tiflash-mpp-mode.md

diff --git a/tiflash/use-tiflash-mpp-mode.md b/tiflash/use-tiflash-mpp-mode.md
deleted file mode 100644
index ee11423ed79e2..0000000000000
--- a/tiflash/use-tiflash-mpp-mode.md
+++ /dev/null
@@ -1,192 +0,0 @@
(The 192 deleted lines are identical to the file content added in PATCH 1/2 above.)

diff --git a/tiflash/use-tiflash.md b/tiflash/use-tiflash.md
index be94259049cba..6bd08a2072985 100644
--- a/tiflash/use-tiflash.md
+++ b/tiflash/use-tiflash.md
@@ -98,7 +98,7 @@ Before TiFlash replicas are added, each TiKV instance performs a full table scan
 
     ```sql
     SET CONFIG tikv `server.snap-max-write-bytes-per-sec` = '300MiB';
     ```
-    
+
     After adjusting the preceding configurations, you cannot observe the acceleration for now, as the replication speed is still restricted by the PD limit globally.
 
 2. Use [PD Control](https://docs.pingcap.com/tidb/stable/pd-control) to progressively ease the new replica speed limit.
@@ -499,6 +499,21 @@ TiFlash provides the following two global/session variables to control whether t
 - [`tidb_broadcast_join_threshold_size`](/system-variables.md#tidb_broadcast_join_threshold_size-new-in-v50): The unit of the value is bytes. If the table size (in bytes) is less than the value of this variable, the Broadcast Hash Join algorithm is used. Otherwise, the Shuffled Hash Join algorithm is used.
 - [`tidb_broadcast_join_threshold_count`](/system-variables.md#tidb_broadcast_join_threshold_count-new-in-v50): The unit of the value is rows. If the objects of the join operation belong to a subquery, the optimizer cannot estimate the size of the subquery result set, so the size is determined by the number of rows in the result set. If the estimated number of rows in the subquery is less than the value of this variable, the Broadcast Hash Join algorithm is used. Otherwise, the Shuffled Hash Join algorithm is used.
 
+### Known issues of MPP
+
+In the current version, TiFlash uses the `start_ts` of a query as the unique key of the query. In most cases, the `start_ts` of each query can uniquely identify a query, but in the following cases, different queries have the same `start_ts`:
+
+- All queries in the same transaction have the same `start_ts`.
+- When you use [`tidb_snapshot`](/system-variables.md#tidb_snapshot) to read data at a specific historical time point, the same time point is manually specified.
+- When [Stale Read](/stale-read.md) is enabled, the same time point is manually specified.
+
+When `start_ts` cannot uniquely represent the MPP query, if TiFlash detects that different queries have the same `start_ts` at a given time, TiFlash might report an error. Typical error cases are as follows:
+
+- When multiple queries with the same `start_ts` are sent to TiFlash at the same time, you might encounter the `task has been registered` error.
+- When multiple simple queries with `LIMIT` are executed continuously in the same transaction, once the `LIMIT` condition is met, TiDB sends a cancel request to TiFlash to cancel the query. This request also uses `start_ts` to identify the query to be canceled. If there are other queries with the same `start_ts` in TiFlash, these queries might be canceled by mistake. An example of this issue can be found in [#43426](https://github.com/pingcap/tidb/issues/43426).
+
+This issue is fixed in TiDB v6.6.0. It is recommended to use the [latest LTS version](https://docs.pingcap.com/tidb/stable).
+
 ## Data validation
 
 ### User scenarios