diff --git a/configure-time-zone.md b/configure-time-zone.md index 73e439684255e..adec7bec4ab42 100644 --- a/configure-time-zone.md +++ b/configure-time-zone.md @@ -69,4 +69,4 @@ In this example, no matter how you adjust the value of the time zone, the value > **Note:** > > - Time zone is involved during the conversion of the value of Timestamp and Datetime, which is handled based on the current `time_zone` of the session. -> - For data migration, you need to pay special attention to the time zone setting of the master database and the slave database. +> - For data migration, you need to pay special attention to the time zone setting of the primary database and the secondary database. diff --git a/faq/tidb-faq.md b/faq/tidb-faq.md index 2cbb9185ff901..7d1a7e65045bd 100644 --- a/faq/tidb-faq.md +++ b/faq/tidb-faq.md @@ -649,9 +649,9 @@ This is because the address in the startup parameter has been registered in the To solve this problem, use the [`store delete`](https://github.com/pingcap/pd/tree/55db505e8f35e8ab4e00efd202beb27a8ecc40fb/tools/pd-ctl#store-delete--label--weight-store_id----jqquery-string) function to delete the previous store and then restart TiKV. -#### TiKV master and slave use the same compression algorithm, why the results are different? +#### TiKV leader replicas and follower replicas use the same compression algorithm. Why is the amount of disk space occupied different? -Currently, some files of TiKV master have a higher compression rate, which depends on the underlying data distribution and RocksDB implementation. It is normal that the data size fluctuates occasionally. The underlying storage engine adjusts data as needed. +TiKV stores data in an LSM tree, in which each layer can use a different compression algorithm. If two replicas of the same data are located in different layers on two TiKV nodes, the two replicas might occupy different amounts of disk space. #### What are the features of TiKV block cache? @@ -755,7 +755,7 @@ At the beginning, many users tend to do a benchmark test or a comparison test be #### What's the relationship between the TiDB cluster capacity (QPS) and the number of nodes? How does TiDB compare to MySQL? - Within 10 nodes, the relationship between TiDB write capacity (Insert TPS) and the number of nodes is roughly 40% linear increase. Because MySQL uses single-node write, its write capacity cannot be scaled. -- In MySQL, the read capacity can be increased by adding slave, but the write capacity cannot be increased except using sharding, which has many problems. +- In MySQL, the read capacity can be increased by adding replicas, but the write capacity cannot be increased except by using sharding, which has many problems. - In TiDB, both the read and write capacity can be easily increased by adding more nodes. #### The performance test of MySQL and TiDB by our DBA shows that the performance of a standalone TiDB is not as good as MySQL diff --git a/functions-and-operators/expressions-pushed-down.md b/functions-and-operators/expressions-pushed-down.md index e4ceb7a910239..c46ae713d3e95 100644 --- a/functions-and-operators/expressions-pushed-down.md +++ b/functions-and-operators/expressions-pushed-down.md @@ -6,7 +6,7 @@ aliases: ['/docs/v3.1/functions-and-operators/expressions-pushed-down/','/docs/v # List of Expressions for Pushdown -When TiDB reads data from TiKV, TiDB tries to push down some expressions (including calculations of functions or operators) to be processed to TiKV. 
This reduces the amount of transferred data and offloads processing from a single TiDB node. This document introduces the expressions that TiDB already supports pushing down and how to prohibit specific expressions from being pushed down using blacklist. +When TiDB reads data from TiKV, TiDB tries to push down some expressions (including calculations of functions or operators) to be processed to TiKV. This reduces the amount of transferred data and offloads processing from a single TiDB node. This document introduces the expressions that TiDB already supports pushing down and how to prohibit specific expressions from being pushed down using the blocklist. ## Supported expressions for pushdown @@ -19,31 +19,31 @@ When TiDB reads data from TiKV, TiDB tries to push down some expressions (includ | [JSON functions](/functions-and-operators/json-functions.md) | [JSON_TYPE(json_val)][json_type],
[JSON_EXTRACT(json_doc, path[, path] ...)][json_extract],
[JSON_UNQUOTE(json_val)][json_unquote],
[JSON_OBJECT(key, val[, key, val] ...)][json_object],
[JSON_ARRAY([val[, val] ...])][json_array],
[JSON_MERGE(json_doc, json_doc[, json_doc] ...)][json_merge],
[JSON_SET(json_doc, path, val[, path, val] ...)][json_set],
[JSON_INSERT(json_doc, path, val[, path, val] ...)][json_insert],
[JSON_REPLACE(json_doc, path, val[, path, val] ...)][json_replace],
[JSON_REMOVE(json_doc, path[, path] ...)][json_remove] | | [Date and time functions](/functions-and-operators/date-and-time-functions.md) | [`DATE_FORMAT()`](https://dev.mysql.com/doc/refman/5.7/en/date-and-time-functions.html#function_date-format) | -## Blacklist specific expressions +## Blocklist specific expressions -If unexpected behavior occurs during the calculation of a function caused by its pushdown, you can quickly restore the application by blacklisting that function. Specifically, you can prohibit an expression from being pushed down by adding the corresponding functions or operator to the blacklist `mysql.expr_pushdown_blacklist`. +If unexpected behavior occurs during the calculation of a function caused by its pushdown, you can quickly restore the application by blocklisting that function. Specifically, you can prohibit an expression from being pushed down by adding the corresponding functions or operator to the blocklist `mysql.expr_pushdown_blacklist`. -### Add to the blacklist +### Add to the blocklist -To add one or more functions or operators to the blacklist, perform the following steps: +To add one or more functions or operators to the blocklist, perform the following steps: 1. Insert the function or operator name to `mysql.expr_pushdown_blacklist`. 2. Execute the `admin reload expr_pushdown_blacklist;` command. -### Remove from the blacklist +### Remove from the blocklist -To remove one or more functions or operators from the blacklist, perform the following steps: +To remove one or more functions or operators from the blocklist, perform the following steps: 1. Delete the function or operator name in `mysql.expr_pushdown_blacklist`. 2. Execute the `admin reload expr_pushdown_blacklist;` command. -### Blacklist usage examples +### Blocklist usage examples -The following example demonstrates how to add the `<` and `>` operators to the blacklist, then remove `>` from the blacklist. +The following example demonstrates how to add the `<` and `>` operators to the blocklist, then remove `>` from the blocklist. -You can see whether the blacklist takes effect by checking the results returned by `EXPLAIN` statement (See [Understanding `EXPLAIN` results](/query-execution-plan.md)). +You can see whether the blocklist takes effect by checking the results returned by the `EXPLAIN` statement (see [Understanding `EXPLAIN` results](/query-execution-plan.md)). ```sql tidb> create table t(a int); @@ -98,7 +98,7 @@ tidb> explain select * from t where a < 2 and a > 2; > > - `admin reload expr_pushdown_blacklist` only takes effect on the TiDB server that executes this SQL statement. To make it apply to all TiDB servers, execute the SQL statement on each TiDB server. > - The feature of blacklisting specific expressions is supported in TiDB 3.0.0 or later versions. -> - TiDB 3.0.3 or earlier versions does not support adding some of the operators (such as ">", "+", "is null") to the blacklist by using their original names. You need to use their aliases (case-sensitive) instead, as shown in the following table: +> - TiDB 3.0.3 or earlier versions do not support adding some of the operators (such as ">", "+", "is null") to the blocklist by using their original names. 
You need to use their aliases (case-sensitive) instead, as shown in the following table: | Operator Name | Aliases | | :-------- | :---------- | diff --git a/geo-redundancy-deployment.md b/geo-redundancy-deployment.md index 1119a9886dacb..abe9f32fcfd41 100644 --- a/geo-redundancy-deployment.md +++ b/geo-redundancy-deployment.md @@ -49,23 +49,23 @@ However, the disadvantage is that if the 2 DCs within the same city goes down, w ## 2-DC + Binlog replication deployment solution -The 2-DC + Binlog replication is similar to the MySQL Master-Slave solution. 2 complete sets of TiDB clusters (each complete set of the TiDB cluster includes TiDB, PD and TiKV) are deployed in 2 DCs, one acts as the Master and one as the Slave. Under normal circumstances, the Master DC handles all the requests and the data written to the Master DC is asynchronously written to the Slave DC via Binlog. +The 2-DC + Binlog replication solution is similar to the MySQL source-replica solution. 2 complete sets of TiDB clusters (each complete set of the TiDB cluster includes TiDB, PD and TiKV) are deployed in 2 DCs, one acts as the primary and one as the secondary. Under normal circumstances, the primary DC handles all the requests and the data written to the primary DC is asynchronously written to the secondary DC via Binlog. ![Data Replication in 2-DC in 2 Cities Deployment](/media/deploy-binlog.png) -If the Master DC goes down, the requests can be switched to the slave cluster. Similar to MySQL, some data might be lost. But different from MySQL, this solution can ensure the high availability within the same DC: if some nodes within the DC are down, the online workloads won’t be impacted and no manual efforts are needed because the cluster will automatically re-elect leaders to provide services. +If the primary DC goes down, the requests can be switched to the secondary cluster. Similar to MySQL, some data might be lost. But unlike MySQL, this solution can ensure high availability within the same DC: if some nodes within the DC are down, the online workloads won’t be impacted and no manual efforts are needed because the cluster will automatically re-elect leaders to provide services. ![2-DC as a Mutual Backup Deployment](/media/deploy-backup.png) Some of our production users also adopt the 2-DC multi-active solution, which means: 1. The application requests are separated and dispatched into 2 DCs. -2. Each DC has 1 cluster and each cluster has two databases: A Master database to serve part of the application requests and a Slave database to act as the backup of the other DC’s Master database. Data written into the Master database is replicated via Binlog to the Slave database in the other DC, forming a loop of backup. +2. Each DC has 1 cluster and each cluster has two databases: a primary database to serve part of the application requests and a secondary database to act as the backup of the other DC’s primary database. Data written into the primary database is replicated via Binlog to the secondary database in the other DC, forming a loop of backup. -Please be noted that for the 2-DC + Binlog replication solution, data is asynchronously replicated via Binlog. If the network latency between 2 DCs is too high, the data in the Slave cluster will fall much behind of the Master cluster. If the Master cluster goes down, some data will be lost and it cannot be guaranteed the lost data is within 5 minutes. +Note that for the 2-DC + Binlog replication solution, data is asynchronously replicated via Binlog. 
If the network latency between 2 DCs is too high, the data in the secondary cluster will fall far behind the primary cluster. If the primary cluster goes down, some data will be lost and it cannot be guaranteed that the data loss is within 5 minutes. ## Overall analysis for HA and DR For the 3-DC deployment solution and 3-DC in 2 cities solution, we can guarantee that the cluster will automatically recover, no human interference is needed and that the data is strongly consistent even if any one of the 3 DCs goes down. All the scheduling policies are to tune the performance, but availability is the top 1 priority instead of performance in case of an outage. -For 2-DC + Binlog replication solution, we can guarantee that the cluster will automatically recover, no human interference is needed and that the data is strongly consistent even if any some of the nodes within the Master cluster go down. When the entire Master cluster goes down, manual efforts will be needed to switch to the Slave and some data will be lost. The amount of the lost data depends on the network latency and is decided by the network condition. +For the 2-DC + Binlog replication solution, we can guarantee that the cluster will automatically recover, no human interference is needed and that the data is strongly consistent even if some of the nodes within the primary cluster go down. When the entire primary cluster goes down, manual efforts will be needed to switch to the secondary and some data will be lost. The amount of lost data depends on the network latency and is determined by the network condition. diff --git a/key-features.md b/key-features.md index 88a22dcf7067d..f3eeb08334563 100644 --- a/key-features.md +++ b/key-features.md @@ -52,7 +52,7 @@ Failure and self-healing operations are also transparent to applications. TiDB s The storage in TiKV is automatically rebalanced to match changes in your workload. For example, if part of your data is more frequently accessed, this hotspot will be detected and may trigger the data to be rebalanced among other TiKV servers. Chunks of data ("Regions" in TiDB terminology) will automatically be split or merged as needed. -This helps remove some of the headaches associated with maintaining a large database cluster and also leads to better utilization over traditional master-slave read-write splitting that is commonly used with MySQL deployments. +This helps remove some of the headaches associated with maintaining a large database cluster and also leads to better utilization than traditional source-replica read-write splitting that is commonly used with MySQL deployments. ## Deployment and orchestration with Ansible, Kubernetes, Docker diff --git a/migrate-from-aurora-mysql-database.md b/migrate-from-aurora-mysql-database.md index 3eea26b4b65ed..359b388949d0f 100644 --- a/migrate-from-aurora-mysql-database.md +++ b/migrate-from-aurora-mysql-database.md @@ -106,20 +106,20 @@ mysql-instances: - # ID of the upstream instance or the replication group. Refer to the configuration of `source_id` in the `inventory.ini` file or configuration of `source-id` in the `dm-master.toml` file. source-id: "mysql-replica-01" - # The configuration item name of the black and white lists of the schema or table to be replicated, used to quote the global black and white lists configuration. For global configuration, see the `black-white-list` below. 
- black-white-list: "global" + # The configuration item name of the block and allow lists of the schema or table to be replicated, used to reference the global block and allow lists configuration. For global configuration, see the `block-allow-list` below. + block-allow-list: "global" # The configuration item name of Mydumper, used to quote the global Mydumper configuration. mydumper-config-name: "global" - source-id: "mysql-replica-02" - black-white-list: "global" + block-allow-list: "global" mydumper-config-name: "global" -# The global configuration of black and white lists. Each instance can quote it by the configuration item name. -black-white-list: +# The global configuration of block and allow lists. Each instance can reference it by the configuration item name. +block-allow-list: global: - do-tables: # The white list of the upstream table to be replicated + do-tables: # The allow list of the upstream table to be replicated - db-name: "test_db" # The database name of the table to be replicated tbl-name: "test_table" # The name of the table to be replicated diff --git a/online-deployment-using-ansible.md b/online-deployment-using-ansible.md index 3aac2e6f17563..bf5f7c4908f50 100644 --- a/online-deployment-using-ansible.md +++ b/online-deployment-using-ansible.md @@ -628,7 +628,7 @@ To enable the following control variables, use the capitalized `True`. To disabl | tidb_version | the version of TiDB, configured by default in TiDB Ansible branches | | process_supervision | the supervision way of processes, systemd by default, supervise optional | | timezone | the global default time zone configured when a new TiDB cluster bootstrap is initialized; you can edit it later using the global `time_zone` system variable and the session `time_zone` system variable as described in [Time Zone Support](/configure-time-zone.md); the default value is `Asia/Shanghai` and see [the list of time zones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones) for more optional values | -| enable_firewalld | to enable the firewall, closed by default; to enable it, add the ports in [network requirements](/hardware-and-software-requirements.md#network-requirements) to the white list | +| enable_firewalld | to enable the firewall, closed by default; to enable it, add the ports in [network requirements](/hardware-and-software-requirements.md#network-requirements) to the allowlist | | enable_ntpd | to monitor the NTP service of the managed node, True by default; do not close it | | set_hostname | to edit the hostname of the managed node based on the IP, False by default | | enable_binlog | whether to deploy Pump and enable the binlog, False by default, dependent on the Kafka cluster; see the `zookeeper_addrs` variable | diff --git a/releases/release-3.0-ga.md b/releases/release-3.0-ga.md index 5d9bffd5d0701..d71583aeb1723 100644 --- a/releases/release-3.0-ga.md +++ b/releases/release-3.0-ga.md @@ -62,7 +62,7 @@ On June 28, 2019, TiDB 3.0 GA is released. 
The corresponding TiDB Ansible versio - Improve the performance of `admin show ddl jobs` by supporting scanning data in reverse order - Add the `split table region` statement to manually split the table Region to alleviate hotspot issues - Add the `split index region` statement to manually split the index Region to alleviate hotspot issues - - Add a blacklist to prohibit pushing down expressions to Coprocessor + - Add a blocklist to prohibit pushing down expressions to Coprocessor - Optimize the `Expensive Query` log to print the SQL query in the log when it exceeds the configured limit of execution time or memory + DDL - Support migrating from character set `utf8` to `utf8mb4` diff --git a/releases/release-3.0.0-rc.3.md b/releases/release-3.0.0-rc.3.md index 9019e4415ffd9..c05763fc7aafc 100644 --- a/releases/release-3.0.0-rc.3.md +++ b/releases/release-3.0.0-rc.3.md @@ -36,7 +36,7 @@ On June 21, 2019, TiDB 3.0.0-rc.3 is released. The corresponding TiDB Ansible ve - Add the `split table region` statement to manually split the table Region to alleviate the hotspot issue [#10765](https://github.com/pingcap/tidb/pull/10765) - Add the `split index region` statement to manually split the index Region to alleviate the hotspot issue [#10764](https://github.com/pingcap/tidb/pull/10764) - Fix the incorrect execution issue when you execute multiple statements such as `create user`, `grant`, or `revoke` consecutively [#10737](https://github.com/pingcap/tidb/pull/10737) - - Add a blacklist to prohibit pushing down expressions to Coprocessor [#10791](https://github.com/pingcap/tidb/pull/10791) + - Add a blocklist to prohibit pushing down expressions to Coprocessor [#10791](https://github.com/pingcap/tidb/pull/10791) - Add the feature of printing the `expensive query` log when a query exceeds the memory configuration limit [#10849](https://github.com/pingcap/tidb/pull/10849) - Add the `bind-info-lease` configuration item to control the update time of the modified binding execution plan [#10727](https://github.com/pingcap/tidb/pull/10727) - Fix the OOM issue in high concurrent scenarios caused by the failure to quickly release Coprocessor resources, resulted from the `execdetails.ExecDetails` pointer [#10832](https://github.com/pingcap/tidb/pull/10832) diff --git a/releases/release-3.0.4.md b/releases/release-3.0.4.md index 3d14f181fadd4..39354f1f80afb 100644 --- a/releases/release-3.0.4.md +++ b/releases/release-3.0.4.md @@ -63,7 +63,7 @@ TiDB Ansible version: 3.0.4 - Support using aliases for tables in the point queries (for example, `select * from t tmp where a = "aa"`) [#12282](https://github.com/pingcap/tidb/pull/12282) - Fix the error occurred when not handling negative values as unsigned when inserting negative numbers into BIT type columns [#12423](https://github.com/pingcap/tidb/pull/12423) - Fix the incorrectly rounding of time (for example, `2019-09-11 11:17:47.999999666` should be rounded to `2019-09-11 11:17:48`.) [#12258](https://github.com/pingcap/tidb/pull/12258) - - Refine the usage of expression blacklist (for example, `<` is equivalent to `It`.) [#11975](https://github.com/pingcap/tidb/pull/11975) + - Refine the usage of expression blocklist (for example, `<` is equivalent to `lt`.) 
[#11975](https://github.com/pingcap/tidb/pull/11975) - Add the database prefix to the message of non-existing function error (for example, `[expression:1305]FUNCTION test.std_samp does not exist`) [#12111](https://github.com/pingcap/tidb/pull/12111) - Server - Add the `Prev_stmt` field in slow query logs to output the previous statement when the last statement is `COMMIT` [#12180](https://github.com/pingcap/tidb/pull/12180) diff --git a/sql-mode.md b/sql-mode.md index f2699eba61750..399d4f9b16d55 100644 --- a/sql-mode.md +++ b/sql-mode.md @@ -31,7 +31,7 @@ Ensure that you have `SUPER` privilege when setting SQL mode at `GLOBAL` level, | `IGNORE_SPACE` | If this mode is enabled, the system ignores space. For example: "user" and "user " are the same. (full support)| | `ONLY_FULL_GROUP_BY` | If a non-aggregated column that is referred to in `SELECT`, `HAVING`, or `ORDER BY` is absent in `GROUP BY`, this SQL statement is invalid, because it is abnormal for a column to be absent in `GROUP BY` but displayed by query. (full support) | | `NO_UNSIGNED_SUBTRACTION` | Does not mark the result as `UNSIGNED` if an operand has no symbol in subtraction. (full support)| -| `NO_DIR_IN_CREATE` | Ignores all `INDEX DIRECTORY` and `DATA DIRECTORY` directives when a table is created. This option is only useful for slave replication servers (syntax support only) | +| `NO_DIR_IN_CREATE` | Ignores all `INDEX DIRECTORY` and `DATA DIRECTORY` directives when a table is created. This option is only useful for secondary replication servers (syntax support only) | | `NO_KEY_OPTIONS` | When you use the `SHOW CREATE TABLE` statement, MySQL-specific syntaxes such as `ENGINE` are not exported. Consider this option when migrating across DB types using mysqldump. (syntax support only)| | `NO_FIELD_OPTIONS` | When you use the `SHOW CREATE TABLE` statement, MySQL-specific syntaxes such as `ENGINE` are not exported. Consider this option when migrating across DB types using mysqldump. (syntax support only) | | `NO_TABLE_OPTIONS` | When you use the `SHOW CREATE TABLE` statement, MySQL-specific syntaxes such as `ENGINE` are not exported. Consider this option when migrating across DB types using mysqldump. (syntax support only)| diff --git a/sql-statements/sql-statement-recover-table.md b/sql-statements/sql-statement-recover-table.md index 692cfcdcb91bc..07de3bb90e0be 100644 --- a/sql-statements/sql-statement-recover-table.md +++ b/sql-statements/sql-statement-recover-table.md @@ -32,7 +32,7 @@ RECOVER TABLE BY JOB ddl_job_id > > - Binglog version is 3.0.1 or later. > - TiDB 3.0 is used both in the upstream cluster and the downstream cluster. -> - The GC life time of the slave cluster must be longer than that of the master cluster. However, as latency occurs during data replication between upstream and downstream databases, data recovery might fail in the downstream. +> - The GC life time of the secondary cluster must be longer than that of the primary cluster. However, as latency occurs during data replication between upstream and downstream databases, data recovery might fail in the downstream. 
### Troubleshoot errors during TiDB Binlog replication diff --git a/syncer-overview.md b/syncer-overview.md index af6d29fe5ab76..b786c1c340296 100644 --- a/syncer-overview.md +++ b/syncer-overview.md @@ -78,7 +78,7 @@ Usage of syncer: -safe-mode to specify and enable the safe mode to make Syncer reentrant -server-id int - to specify MySQL slave sever-id (default 101) + to specify MySQL replica sever-id (default 101) -status-addr string to specify Syncer metrics (default :8271), such as `--status-addr 127:0.0.1:8271` -timezone string @@ -361,7 +361,7 @@ Before replicating data using Syncer, check the following items: > **Note:** > - > If there is a master-slave replication structure between the upstream MySQL/MariaDB servers, then choose the following version. + > If there is a source/replica replication structure between the upstream MySQL/MariaDB servers, then choose the following version. > > - 5.7.1 < MySQL version < 8.0 > - MariaDB version >= 10.1.3 diff --git a/tidb-binlog/tidb-binlog-faq.md b/tidb-binlog/tidb-binlog-faq.md index 7511030803f22..29c681a104035 100644 --- a/tidb-binlog/tidb-binlog-faq.md +++ b/tidb-binlog/tidb-binlog-faq.md @@ -122,9 +122,9 @@ If the data in the downstream is not affected, you can redeploy Drainer on the n 2. To restore the latest data of the backup file, use Reparo to set `start-tso` = {snapshot timestamp of the full backup + 1} and `end-ts` = 0 (or you can specify a point in time). -## How to redeploy Drainer when enabling `ignore-error` in Master-Slave replication triggers a critical error? +## How to redeploy Drainer when enabling `ignore-error` in Primary-Secondary replication triggers a critical error? -If a critical error is trigged when TiDB fails to write binlog after enabling `ignore-error`, TiDB stops writing binlog and binlog data loss occurs. To resume replication, perform the following steps: +If a critical error is triggered when TiDB fails to write binlog after enabling `ignore-error`, TiDB stops writing binlog and binlog data loss occurs. To resume replication, perform the following steps: 1. Stop the Drainer instance. diff --git a/tidb-lightning/tidb-lightning-table-filter.md b/tidb-lightning/tidb-lightning-table-filter.md deleted file mode 100644 index e8592ce094216..0000000000000 --- a/tidb-lightning/tidb-lightning-table-filter.md +++ /dev/null @@ -1,129 +0,0 @@ ---- -title: TiDB Lightning Table Filter -summary: Use black and white lists to filter out tables, ignoring them during import. -aliases: ['/docs/v3.1/tidb-lightning/tidb-lightning-table-filter/','/docs/v3.1/reference/tools/tidb-lightning/table-filter/'] ---- - -# TiDB Lightning Table Filter - -TiDB Lightning supports setting up black and white lists to ignore certain databases and tables. This can be used to skip cache tables, or manually partition the data source on a shared storage to allow multiple Lightning instances work together without interfering each other. - -The filtering rule is similar to MySQL `replication-rules-db`/`replication-rules-table`. - -## Filtering databases - -```toml -[black-white-list] -do-dbs = ["pattern1", "pattern2", "pattern3"] -ignore-dbs = ["pattern4", "pattern5"] -``` - -* If the `do-dbs` array in the `[black-white-list]` section is not empty, - * If the name of a database matches *any* pattern in the `do-dbs` array, the database is included. - * Otherwise, the database is skipped. -* Otherwise, if the name matches *any* pattern in the `ignore-dbs` array, the database is skipped. 
-* If a database’s name matches *both* the `do-dbs` and `ignore-dbs` arrays, the database is included. - -The pattern can either be a simple name, or a regular expression in [Go dialect](https://golang.org/pkg/regexp/syntax/#hdr-syntax) if it starts with a `~` character. - -> **Note:** -> -> The system databases `INFORMATION_SCHEMA`, `PERFORMANCE_SCHEMA`, `mysql` and `sys` are always black-listed regardless of the table filter settings. - -## Filtering tables - -```toml -[[black-white-list.do-tables]] -db-name = "db-pattern-1" -tbl-name = "table-pattern-1" - -[[black-white-list.do-tables]] -db-name = "db-pattern-2" -tbl-name = "table-pattern-2" - -[[black-white-list.do-tables]] -db-name = "db-pattern-3" -tbl-name = "table-pattern-3" - -[[black-white-list.ignore-tables]] -db-name = "db-pattern-4" -tbl-name = "table-pattern-4" - -[[black-white-list.ignore-tables]] -db-name = "db-pattern-5" -tbl-name = "table-pattern-5" -``` - -* If the `do-tables` array is not empty, - * If the qualified name of a table matched *any* pair of patterns in the `do-tables` array, the table is included. - * Otherwise, the table is skipped -* Otherwise, if the qualified name matched *any* pair of patterns in the `ignore-tables` array, the table is skipped. -* If a table’s qualified name matched *both* the `do-tables` and `ignore-tables` arrays, the table is included. - -Note that the database filtering rules are applied before Lightning considers the table filtering rules. This means if a database is ignored by `ignore-dbs`, all tables inside this database are not considered even if they matches any `do-tables` array. - -## Example - -To illustrate how these rules work, suppose the data source contains the following tables: - -``` -`logs`.`messages_2016` -`logs`.`messages_2017` -`logs`.`messages_2018` -`forum`.`users` -`forum`.`messages` -`forum_backup_2016`.`messages` -`forum_backup_2017`.`messages` -`forum_backup_2018`.`messages` -`admin`.`secrets` -``` - -Using this configuration: - -```toml -[black-white-list] -do-dbs = [ - "forum_backup_2018", # rule A - "~^(logs|forum)$", # rule B -] -ignore-dbs = [ - "~^forum_backup_", # rule C -] - -[[black-white-list.do-tables]] # rule D -db-name = "logs" -tbl-name = "~_2018$" - -[[black-white-list.ignore-tables]] # rule E -db-name = "~.*" -tbl-name = "~^messages.*" - -[[black-white-list.do-tables]] # rule F -db-name = "~^forum.*" -tbl-name = "messages" -``` - -First apply the database rules: - -| Database | Outcome | -|---------------------------|--------------------------------------------| -| `` `logs` `` | Included by rule B | -| `` `forum` `` | Included by rule B | -| `` `forum_backup_2016` `` | Skipped by rule C | -| `` `forum_backup_2017` `` | Skipped by rule C | -| `` `forum_backup_2018` `` | Included by rule A (rule C will not apply) | -| `` `admin` `` | Skipped since `do-dbs` is not empty and this does not match any pattern | - -Then apply the table rules: - -| Table | Outcome | -|--------------------------------------|--------------------------------------------| -| `` `logs`.`messages_2016` `` | Skipped by rule E | -| `` `logs`.`messages_2017` `` | Skipped by rule E | -| `` `logs`.`messages_2018` `` | Included by rule D (rule E will not apply) | -| `` `forum`.`users` `` | Skipped, since `do-tables` is not empty and this does not match any pattern | -| `` `forum`.`messages` `` | Included by rule F (rule E will not apply) | -| `` `forum_backup_2016`.`messages` `` | Skipped, since database is already skipped | -| `` `forum_backup_2017`.`messages` `` | Skipped, 
since database is already skipped | -| `` `forum_backup_2018`.`messages` `` | Included by rule F (rule E will not apply) | -| `` `admin`.`secrets` `` | Skipped, since database is already skipped |