2 changes: 1 addition & 1 deletion configure-time-zone.md
@@ -108,4 +108,4 @@ In this example, no matter how you adjust the value of the time zone, the value
> **Note:**
>
> - Time zone is involved during the conversion of the value of Timestamp and Datetime, which is handled based on the current `time_zone` of the session.
> - For data migration, you need to pay special attention to the time zone setting of the master database and the slave database.
> - For data migration, you need to pay special attention to the time zone setting of the primary database and the secondary database.
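
For illustration, a minimal sketch of this conversion behavior (the table name and values below are hypothetical):

```sql
-- TIMESTAMP values are converted according to the session time_zone,
-- while DATETIME values are returned exactly as stored.
SET time_zone = '+08:00';
CREATE TABLE tz_demo (ts TIMESTAMP, dt DATETIME);
INSERT INTO tz_demo VALUES ('2019-06-28 10:00:00', '2019-06-28 10:00:00');
SET time_zone = '+00:00';
-- ts is shifted to 2019-06-28 02:00:00; dt remains 2019-06-28 10:00:00.
SELECT ts, dt FROM tz_demo;
```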
6 changes: 3 additions & 3 deletions faq/tidb-faq.md
@@ -649,9 +649,9 @@ This is because the address in the startup parameter has been registered in the

To solve this problem, use the [`store delete`](https://github.com/pingcap/pd/tree/55db505e8f35e8ab4e00efd202beb27a8ecc40fb/tools/pd-ctl#store-delete--label--weight-store_id----jqquery-string) function to delete the previous store and then restart TiKV.
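
For reference, a hedged sketch of this procedure (the PD endpoint and store ID below are hypothetical):

```shell
# Assuming PD listens on 127.0.0.1:2379 and the stale store's ID is 1:
./pd-ctl -u http://127.0.0.1:2379 store delete 1
# Check that the store state has changed before restarting TiKV with the new address:
./pd-ctl -u http://127.0.0.1:2379 store 1
```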

#### TiKV master and slave use the same compression algorithm, why the results are different?
#### TiKV leader replicas and follower replicas use the same compression algorithm. Why is the amount of disk space they occupy different?

Currently, some files of TiKV master have a higher compression rate, which depends on the underlying data distribution and RocksDB implementation. It is normal that the data size fluctuates occasionally. The underlying storage engine adjusts data as needed.
TiKV stores data in an LSM tree, in which each level can use a different compression algorithm. If two replicas of the same data are located at different levels on two TiKV nodes, the two replicas might occupy different amounts of disk space.
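
For illustration, the per-level compression algorithms are configurable in the TiKV configuration file; a hedged sketch follows (the values mirror common defaults, but verify them against your TiKV version):

```toml
# Per-level compression for one RocksDB column family in TiKV:
# top levels uncompressed, middle levels lz4, bottom levels zstd.
[rocksdb.defaultcf]
compression-per-level = ["no", "no", "lz4", "lz4", "lz4", "zstd", "zstd"]
```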

#### What are the features of TiKV block cache?

@@ -755,7 +755,7 @@ At the beginning, many users tend to do a benchmark test or a comparison test be
#### What's the relationship between the TiDB cluster capacity (QPS) and the number of nodes? How does TiDB compare to MySQL?

- Within 10 nodes, TiDB write capacity (Insert TPS) scales roughly linearly with the number of nodes, gaining about 40% per added node. Because MySQL uses single-node writes, its write capacity cannot be scaled.
- In MySQL, the read capacity can be increased by adding slave, but the write capacity cannot be increased except using sharding, which has many problems.
- In MySQL, the read capacity can be increased by adding replicas, but the write capacity cannot be increased except by sharding, which introduces many problems.
- In TiDB, both the read and write capacity can be easily increased by adding more nodes.

#### The performance test of MySQL and TiDB by our DBA shows that the performance of a standalone TiDB is not as good as MySQL
24 changes: 12 additions & 12 deletions functions-and-operators/expressions-pushed-down.md
@@ -6,7 +6,7 @@ aliases: ['/docs/v3.0/functions-and-operators/expressions-pushed-down/','/docs/v

# List of Expressions for Pushdown

When TiDB reads data from TiKV, TiDB tries to push down some expressions (including calculations of functions or operators) to be processed to TiKV. This reduces the amount of transferred data and offloads processing from a single TiDB node. This document introduces the expressions that TiDB already supports pushing down and how to prohibit specific expressions from being pushed down using blacklist.
When TiDB reads data from TiKV, TiDB tries to push down some expressions (including calculations of functions or operators) to TiKV for processing. This reduces the amount of transferred data and offloads processing from a single TiDB node. This document introduces the expressions that TiDB already supports pushing down and how to prohibit specific expressions from being pushed down using the blocklist.

## Supported expressions for pushdown

@@ -19,31 +19,31 @@ When TiDB reads data from TiKV, TiDB tries to push down some expressions (includ
| [JSON functions](/functions-and-operators/json-functions.md) | [JSON_TYPE(json_val)][json_type],<br/> [JSON_EXTRACT(json_doc, path[, path] ...)][json_extract],<br/> [JSON_UNQUOTE(json_val)][json_unquote],<br/> [JSON_OBJECT(key, val[, key, val] ...)][json_object],<br/> [JSON_ARRAY([val[, val] ...])][json_array],<br/> [JSON_MERGE(json_doc, json_doc[, json_doc] ...)][json_merge],<br/> [JSON_SET(json_doc, path, val[, path, val] ...)][json_set],<br/> [JSON_INSERT(json_doc, path, val[, path, val] ...)][json_insert],<br/> [JSON_REPLACE(json_doc, path, val[, path, val] ...)][json_replace],<br/> [JSON_REMOVE(json_doc, path[, path] ...)][json_remove] |
| [Date and time functions](/functions-and-operators/date-and-time-functions.md) | [`DATE_FORMAT()`](https://dev.mysql.com/doc/refman/5.7/en/date-and-time-functions.html#function_date-format) |

## Blacklist specific expressions
## Blocklist specific expressions

If unexpected behavior occurs during the calculation of a function caused by its pushdown, you can quickly restore the application by blacklisting that function. Specifically, you can prohibit an expression from being pushed down by adding the corresponding functions or operator to the blacklist `mysql.expr_pushdown_blacklist`.
If a function's pushdown causes unexpected behavior during its calculation, you can quickly restore the application by blocklisting that function. Specifically, you can prohibit an expression from being pushed down by adding the corresponding function or operator to the blocklist `mysql.expr_pushdown_blacklist`.

### Add to the blacklist
### Add to the blocklist

To add one or more functions or operators to the blacklist, perform the following steps:
To add one or more functions or operators to the blocklist, perform the following steps:

1. Insert the function or operator name into `mysql.expr_pushdown_blacklist`.

2. Execute the `admin reload expr_pushdown_blacklist;` command.
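
Both steps are ordinary SQL statements. A minimal sketch, assuming you want to stop `date_format` from being pushed down (the function name is only an example; in TiDB 3.0 the table has a single `name` column):

```sql
-- Step 1: register the function name in the blocklist table.
INSERT INTO mysql.expr_pushdown_blacklist VALUES ('date_format');
-- Step 2: make the change take effect on this TiDB server.
admin reload expr_pushdown_blacklist;
```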

### Remove from the blacklist
### Remove from the blocklist

To remove one or more functions or operators from the blacklist, perform the following steps:
To remove one or more functions or operators from the blocklist, perform the following steps:

1. Delete the function or operator name from `mysql.expr_pushdown_blacklist`.

2. Execute the `admin reload expr_pushdown_blacklist;` command.
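
A minimal sketch mirroring the example above, under the same assumptions:

```sql
-- Step 1: remove the function name from the blocklist table.
DELETE FROM mysql.expr_pushdown_blacklist WHERE name = 'date_format';
-- Step 2: reload so this TiDB server picks up the change.
admin reload expr_pushdown_blacklist;
```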

### Blacklist usage examples
### Blocklist usage examples

The following example demonstrates how to add the `<` and `>` operators to the blacklist, then remove `>` from the blacklist.
The following example demonstrates how to add the `<` and `>` operators to the blocklist, then remove `>` from the blocklist.

You can see whether the blacklist takes effect by checking the results returned by `EXPLAIN` statement (See [Understanding `EXPLAIN` results](/query-execution-plan.md)).
You can see whether the blocklist takes effect by checking the results returned by the `EXPLAIN` statement (see [Understanding `EXPLAIN` results](/query-execution-plan.md)).

```sql
tidb> create table t(a int);
@@ -97,8 +97,8 @@ tidb> explain select * from t where a < 2 and a > 2;
> **Note:**
>
> - `admin reload expr_pushdown_blacklist` only takes effect on the TiDB server that executes this SQL statement. To make it apply to all TiDB servers, execute the SQL statement on each TiDB server.
> - The feature of blacklisting specific expressions is supported in TiDB 3.0.0 or later versions.
> - TiDB 3.0.3 or earlier versions does not support adding some of the operators (such as ">", "+", "is null") to the blacklist by using their original names. You need to use their aliases (case-sensitive) instead, as shown in the following table:
> - The feature of blocklisting specific expressions is supported in TiDB 3.0.0 or later versions.
> - TiDB 3.0.3 or earlier versions do not support adding some of the operators (such as ">", "+", "is null") to the blocklist by using their original names. You need to use their aliases (case-sensitive) instead, as shown in the following table:

| Operator Name | Aliases |
| :-------- | :---------- |
10 changes: 5 additions & 5 deletions geo-redundancy-deployment.md
@@ -49,23 +49,23 @@ However, the disadvantage is that if the 2 DCs within the same city goes down, w

## 2-DC + Binlog replication deployment solution

The 2-DC + Binlog replication is similar to the MySQL Master-Slave solution. 2 complete sets of TiDB clusters (each complete set of the TiDB cluster includes TiDB, PD and TiKV) are deployed in 2 DCs, one acts as the Master and one as the Slave. Under normal circumstances, the Master DC handles all the requests and the data written to the Master DC is asynchronously written to the Slave DC via Binlog.
The 2-DC + Binlog replication solution is similar to the MySQL source-replica architecture. Two complete sets of TiDB clusters (each including TiDB, PD, and TiKV) are deployed in two DCs: one acts as the primary and the other as the secondary. Under normal circumstances, the primary DC handles all the requests, and the data written to the primary DC is asynchronously replicated to the secondary DC via Binlog.

![Data Replication in 2-DC in 2 Cities Deployment](/media/deploy-binlog.png)

If the Master DC goes down, the requests can be switched to the slave cluster. Similar to MySQL, some data might be lost. But different from MySQL, this solution can ensure the high availability within the same DC: if some nodes within the DC are down, the online workloads won’t be impacted and no manual efforts are needed because the cluster will automatically re-elect leaders to provide services.
If the primary DC goes down, the requests can be switched to the secondary cluster. Similar to MySQL, some data might be lost. But unlike MySQL, this solution ensures high availability within the same DC: if some nodes within the DC go down, the online workloads are not impacted and no manual effort is needed, because the cluster automatically re-elects leaders to continue providing services.

![2-DC as a Mutual Backup Deployment](/media/deploy-backup.png)

Some of our production users also adopt the 2-DC multi-active solution, which means:

1. The application requests are separated and dispatched into 2 DCs.
2. Each DC has 1 cluster and each cluster has two databases: A Master database to serve part of the application requests and a Slave database to act as the backup of the other DC’s Master database. Data written into the Master database is replicated via Binlog to the Slave database in the other DC, forming a loop of backup.
2. Each DC has 1 cluster, and each cluster has two databases: a primary database to serve part of the application requests and a secondary database to act as the backup of the other DC’s primary database. Data written into the primary database is replicated via Binlog to the secondary database in the other DC, forming a backup loop.

Please be noted that for the 2-DC + Binlog replication solution, data is asynchronously replicated via Binlog. If the network latency between 2 DCs is too high, the data in the Slave cluster will fall much behind of the Master cluster. If the Master cluster goes down, some data will be lost and it cannot be guaranteed the lost data is within 5 minutes.
Note that for the 2-DC + Binlog replication solution, data is asynchronously replicated via Binlog. If the network latency between the 2 DCs is too high, the data in the secondary cluster falls far behind the primary cluster. If the primary cluster goes down, some data will be lost, and there is no guarantee that the lost data is within 5 minutes.

## Overall analysis for HA and DR

For the 3-DC deployment solution and the 3-DC in 2 cities solution, we can guarantee that the cluster recovers automatically and needs no human intervention, and that the data remains strongly consistent even if any one of the 3 DCs goes down. All the scheduling policies are there to tune performance, but in case of an outage, availability is the top priority rather than performance.

For 2-DC + Binlog replication solution, we can guarantee that the cluster will automatically recover, no human interference is needed and that the data is strongly consistent even if any some of the nodes within the Master cluster go down. When the entire Master cluster goes down, manual efforts will be needed to switch to the Slave and some data will be lost. The amount of the lost data depends on the network latency and is decided by the network condition.
For the 2-DC + Binlog replication solution, we can guarantee that the cluster recovers automatically and needs no human intervention, and that the data remains strongly consistent even if some of the nodes within the primary cluster go down. When the entire primary cluster goes down, manual effort is needed to switch to the secondary cluster, and some data will be lost. The amount of lost data depends on the network latency and the network condition.
4 changes: 2 additions & 2 deletions key-features.md
@@ -52,7 +52,7 @@ Failure and self-healing operations are also transparent to applications. TiDB s

The storage in TiKV is automatically rebalanced to match changes in your workload. For example, if part of your data is more frequently accessed, this hotspot will be detected and may trigger the data to be rebalanced among other TiKV servers. Chunks of data ("Regions" in TiDB terminology) will automatically be split or merged as needed.

This helps remove some of the headaches associated with maintaining a large database cluster and also leads to better utilization over traditional master-slave read-write splitting that is commonly used with MySQL deployments.
This helps remove some of the headaches associated with maintaining a large database cluster and also leads to better utilization than the traditional source-replica read-write splitting commonly used with MySQL deployments.

## Deployment and orchestration with Ansible, Kubernetes, Docker

@@ -96,4 +96,4 @@ TiDB has been released under the Apache 2.0 license since its initial launch in

TiDB implements the _Online, Asynchronous Schema Change_ algorithm first described in [Google's F1 paper](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41376.pdf).

In simplified terms, this means that TiDB is able to make changes to the schema across its distributed architecture without blocking either read or write operations. There is no need to use an external schema change tool or flip between masters and slaves as is common in large MySQL deployments.
In simplified terms, this means that TiDB is able to make changes to the schema across its distributed architecture without blocking either read or write operations. There is no need to use an external schema change tool or flip between sources and replicas as is common in large MySQL deployments.
12 changes: 6 additions & 6 deletions migrate-from-aurora-mysql-database.md
@@ -106,20 +106,20 @@ mysql-instances:
-
# ID of the upstream instance or the replication group. Refer to the configuration of `source_id` in the `inventory.ini` file or configuration of `source-id` in the `dm-master.toml` file.
source-id: "mysql-replica-01"
# The configuration item name of the black and white lists of the schema or table to be replicated, used to quote the global black and white lists configuration. For global configuration, see the `black-white-list` below.
black-white-list: "global"
# The configuration item name of the block and allow lists of the schema or table to be replicated, used to reference the global block and allow lists configuration. For global configuration, see the `block-allow-list` below.
block-allow-list: "global"
# The configuration item name of Mydumper, used to quote the global Mydumper configuration.
mydumper-config-name: "global"

-
source-id: "mysql-replica-02"
black-white-list: "global"
block-allow-list: "global"
mydumper-config-name: "global"

# The global configuration of black and white lists. Each instance can quote it by the configuration item name.
black-white-list:
# The global configuration of block and allow lists. Each instance can reference it by the configuration item name.
block-allow-list:
global:
do-tables: # The white list of the upstream table to be replicated
do-tables: # The allow list of the upstream table to be replicated
- db-name: "test_db" # The database name of the table to be replicated
tbl-name: "test_table" # The name of the table to be replicated

2 changes: 1 addition & 1 deletion online-deployment-using-ansible.md
@@ -628,7 +628,7 @@ To enable the following control variables, use the capitalized `True`. To disabl
| tidb_version | the version of TiDB, configured by default in TiDB Ansible branches |
| process_supervision | the supervision way of processes, systemd by default, supervise optional |
| timezone | the global default time zone configured when a new TiDB cluster bootstrap is initialized; you can edit it later using the global `time_zone` system variable and the session `time_zone` system variable as described in [Time Zone Support](/configure-time-zone.md); the default value is `Asia/Shanghai` and see [the list of time zones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) for more optional values |
| enable_firewalld | to enable the firewall, closed by default; to enable it, add the ports in [network requirements](/hardware-and-software-requirements.md#network-requirements) to the white list |
| enable_firewalld | to enable the firewall, disabled by default; to enable it, add the ports in [network requirements](/hardware-and-software-requirements.md#network-requirements) to the allowlist |
| enable_ntpd | to monitor the NTP service of the managed node, True by default; do not close it |
| set_hostname | to edit the hostname of the managed node based on the IP, False by default |
| enable_binlog | whether to deploy Pump and enable the binlog, False by default, dependent on the Kafka cluster; see the `zookeeper_addrs` variable |
2 changes: 1 addition & 1 deletion releases/release-3.0-ga.md
@@ -64,7 +64,7 @@ On June 28, 2019, TiDB 3.0 GA is released. The corresponding TiDB Ansible versio
- Improve the performance of `admin show ddl jobs` by supporting scanning data in reverse order
- Add the `split table region` statement to manually split the table Region to alleviate hotspot issues
- Add the `split index region` statement to manually split the index Region to alleviate hotspot issues
- Add a blacklist to prohibit pushing down expressions to Coprocessor
- Add a blocklist to prohibit pushing down expressions to Coprocessor
- Optimize the `Expensive Query` log to print the SQL query in the log when it exceeds the configured limit of execution time or memory
+ DDL
- Support migrating from character set `utf8` to `utf8mb4`
2 changes: 1 addition & 1 deletion releases/release-3.0.0-rc.3.md
@@ -36,7 +36,7 @@ On June 21, 2019, TiDB 3.0.0-rc.3 is released. The corresponding TiDB Ansible ve
- Add the `split table region` statement to manually split the table Region to alleviate the hotspot issue [#10765](https://github.com/pingcap/tidb/pull/10765)
- Add the `split index region` statement to manually split the index Region to alleviate the hotspot issue [#10764](https://github.com/pingcap/tidb/pull/10764)
- Fix the incorrect execution issue when you execute multiple statements such as `create user`, `grant`, or `revoke` consecutively [#10737](https://github.com/pingcap/tidb/pull/10737)
- Add a blacklist to prohibit pushing down expressions to Coprocessor [#10791](https://github.com/pingcap/tidb/pull/10791)
- Add a blocklist to prohibit pushing down expressions to Coprocessor [#10791](https://github.com/pingcap/tidb/pull/10791)
- Add the feature of printing the `expensive query` log when a query exceeds the memory configuration limit [#10849](https://github.com/pingcap/tidb/pull/10849)
- Add the `bind-info-lease` configuration item to control the update time of the modified binding execution plan [#10727](https://github.com/pingcap/tidb/pull/10727)
- Fix the OOM issue in high concurrent scenarios caused by the failure to quickly release Coprocessor resources, resulted from the `execdetails.ExecDetails` pointer [#10832](https://github.com/pingcap/tidb/pull/10832)