*: change "synchronize" to "replicate" for migration related description (#1203)

* *: change wording for data migration related description

* rename a file

* capitalize some words

* fix an alias

* fix a dead link

* fix the dead link

* fix a typo

* capitalize two words

* try to fix a possible dead link

* capitalize a word

* add an alias and capitalize some words

* fix wording

* update wrong wording

* capitalize two words

* update the cases of some words

* resolve morgan's comments

* toc: remove unnecessary /
yikeke authored and lilin90 committed Jun 15, 2019
1 parent 87fceb3 commit 159bdf98d32ccb1d7c5fbff7a44ee908e31d91c1
Showing with 761 additions and 761 deletions.
  1. +2 −2 TOC.md
  2. +1 −1 dev/architecture.md
  3. +1 −1 dev/etc/Drainer.json
  4. +1 −1 dev/etc/Syncer.json
  5. +11 −11 dev/faq/tidb.md
  6. +6 −6 dev/how-to/deploy/data-migration-with-ansible.md
  7. +4 −4 dev/how-to/get-started/data-migration.md
  8. +6 −6 dev/how-to/migrate/incrementally-from-mysql.md
  9. +1 −1 dev/how-to/migrate/overview.md
  10. +9 −9 dev/how-to/troubleshoot/data-migration.md
  11. +13 −13 dev/reference/tools/data-migration/cluster-operations.md
  12. +5 −5 dev/reference/tools/data-migration/configure/overview.md
  13. +8 −8 dev/reference/tools/data-migration/configure/task-configuration-file.md
  14. +22 −22 dev/reference/tools/data-migration/deploy.md
  15. +6 −6 dev/reference/tools/data-migration/dm-worker-intro.md
  16. +18 −18 dev/reference/tools/data-migration/features/manually-handling-sharding-ddl-locks.md
  17. +34 −34 dev/reference/tools/data-migration/features/overview.md
  18. +31 −31 dev/reference/tools/data-migration/manage-tasks.md
  19. +1 −1 dev/reference/tools/data-migration/monitor.md
  20. +24 −24 dev/reference/tools/data-migration/overview.md
  21. +14 −14 dev/reference/tools/data-migration/query-status.md
  22. +16 −16 dev/reference/tools/data-migration/relay-log.md
  23. +14 −14 dev/reference/tools/data-migration/usage-scenarios/shard-merge.md
  24. +18 −18 ...eference/tools/data-migration/usage-scenarios/{simple-synchronization.md → simple-replication.md}
  25. +3 −3 dev/reference/tools/loader.md
  26. +3 −3 dev/reference/tools/sync-diff-inspector.md
  27. +1 −1 dev/reference/tools/syncer.md
  28. +1 −1 dev/reference/tools/tidb-binlog/binlog-slave-client.md
  29. +6 −6 dev/reference/tools/tidb-binlog/deploy.md
  30. +4 −4 dev/reference/tools/tidb-binlog/monitor.md
  31. +5 −5 dev/reference/tools/tidb-binlog/overview.md
  32. +8 −8 dev/reference/tools/tidb-binlog/tidb-binlog-kafka.md
  33. +6 −6 dev/reference/tools/tidb-binlog/tidb-binlog-local.md
  34. +3 −3 dev/reference/tools/tidb-binlog/upgrade.md
  35. +1 −1 dev/releases/2.0ga.md
  36. +1 −1 dev/releases/2.1.8.md
  37. +1 −1 dev/releases/201.md
  38. +1 −1 dev/releases/21rc1.md
  39. +1 −1 dev/releases/2rc3.md
  40. +2 −2 dev/releases/3.0.0-rc.1.md
  41. +5 −5 v1.0/FAQ.md
  42. +1 −1 v1.0/etc/Drainer.json
  43. +3 −3 v1.0/op-guide/migration-overview.md
  44. +12 −12 v1.0/op-guide/migration.md
  45. +3 −3 v1.0/tools/loader.md
  46. +43 −43 v1.0/tools/syncer.md
  47. +8 −8 v1.0/tools/tidb-binlog-kafka.md
  48. +6 −6 v1.0/tools/tidb-binlog.md
  49. +10 −10 v2.0/FAQ.md
  50. +1 −1 v2.0/_index.md
  51. +1 −1 v2.0/etc/Drainer.json
  52. +3 −3 v2.0/op-guide/migration-overview.md
  53. +12 −12 v2.0/op-guide/migration.md
  54. +1 −1 v2.0/releases/2.0ga.md
  55. +1 −1 v2.0/releases/21rc1.md
  56. +1 −1 v2.0/releases/2rc3.md
  57. +3 −3 v2.0/tools/loader.md
  58. +43 −43 v2.0/tools/syncer.md
  59. +8 −8 v2.0/tools/tidb-binlog-kafka.md
  60. +6 −6 v2.0/tools/tidb-binlog.md
  61. +9 −9 v2.1/FAQ.md
  62. +1 −1 v2.1/README.md
  63. +1 −1 v2.1/architecture.md
  64. +1 −1 v2.1/etc/Drainer.json
  65. +6 −6 v2.1/op-guide/cross-dc-deployment.md
  66. +3 −3 v2.1/op-guide/migration-overview.md
  67. +12 −12 v2.1/op-guide/migration.md
  68. +1 −1 v2.1/releases/2.0ga.md
  69. +1 −1 v2.1/releases/21rc1.md
  70. +1 −1 v2.1/releases/2rc3.md
  71. +1 −1 v2.1/sql/admin.md
  72. +1 −1 v2.1/tools/binlog-slave-client.md
  73. +15 −15 v2.1/tools/data-migration-cluster-operations.md
  74. +34 −34 v2.1/tools/data-migration-manage-task.md
  75. +18 −18 v2.1/tools/data-migration-overview.md
  76. +18 −18 v2.1/tools/data-migration-practice.md
  77. +10 −10 v2.1/tools/data-migration-troubleshooting.md
  78. +5 −5 v2.1/tools/dm-configuration-file-overview.md
  79. +2 −2 v2.1/tools/dm-monitor.md
  80. +22 −22 v2.1/tools/dm-sharding-solution.md
  81. +5 −5 v2.1/tools/dm-task-config-argument-description.md
  82. +7 −7 v2.1/tools/dm-task-configuration-file-intro.md
  83. +6 −6 v2.1/tools/dm-worker-intro.md
  84. +3 −3 v2.1/tools/loader.md
  85. +3 −3 v2.1/tools/sync-diff-inspector.md
  86. +43 −43 v2.1/tools/syncer.md
  87. +15 −15 v2.1/tools/tidb-binlog-cluster.md
  88. +8 −8 v2.1/tools/tidb-binlog-kafka.md
  89. +4 −4 v2.1/tools/tidb-binlog-monitor.md
  90. +6 −6 v2.1/tools/tidb-binlog.md
  91. +10 −10 v2.1/tools/troubleshooting-sharding-ddl-locks.md
  92. +5 −5 v2.1/tools/upgrade-loader-or-syncer-to-dm.md
TOC.md (+2 −2)
@@ -266,13 +266,13 @@
- [Black and White Lists](dev/reference/tools/data-migration/features/overview.md#black-and-white-table-lists)
- [Binlog Event Filter](dev/reference/tools/data-migration/features/overview.md#binlog-event-filter)
- [Column Mapping](dev/reference/tools/data-migration/features/overview.md#column-mapping)
- - [Synchronization Delay Monitoring](dev/reference/tools/data-migration/features/overview.md#synchronization-delay-monitoring)
+ - [Replication Delay Monitoring](dev/reference/tools/data-migration/features/overview.md#replication-delay-monitoring)
+ Sharding Support
- [Introduction](dev/reference/tools/data-migration/features/shard-merge.md)
- [Restrictions](dev/reference/tools/data-migration/features/shard-merge.md#restrictions)
- [Handle Sharding DDL Locks Manually](dev/reference/tools/data-migration/features/manually-handling-sharding-ddl-locks.md)
+ Usage Scenarios
- - [Simple Scenario](dev/reference/tools/data-migration/usage-scenarios/simple-synchronization.md)
+ - [Simple Scenario](dev/reference/tools/data-migration/usage-scenarios/simple-replication.md)
- [Shard Merge Scenario](dev/reference/tools/data-migration/usage-scenarios/shard-merge.md)
- [Deploy](dev/reference/tools/data-migration/deploy.md)
+ Configure
dev/architecture.md (+1 −1)
@@ -45,4 +45,4 @@ The TiKV server is responsible for storing data. From an external view, TiKV is

## TiSpark

- TiSpark deals with the complex OLAP requirements. TiSpark makes Spark SQL directly run on the storage layer of the TiDB cluster, combines the advantages of the distributed TiKV cluster, and integrates into the big data ecosystem. With TiSpark, TiDB can support both OLTP and OLAP scenarios in one cluster, so the users never need to worry about data synchronization.
+ TiSpark deals with the complex OLAP requirements. TiSpark makes Spark SQL directly run on the storage layer of the TiDB cluster, combines the advantages of the distributed TiKV cluster, and integrates into the big data ecosystem. With TiSpark, TiDB can support both OLTP and OLAP scenarios in one cluster, so the users never need to worry about data replication.
dev/etc/Drainer.json (+1 −1)
@@ -661,7 +661,7 @@
"thresholds": [],
"timeFrom": null,
"timeShift": null,
"title": "synchronization delay",
"title": "replication delay",
"tooltip": {
"msResolution": false,
"shared": true,
dev/etc/Syncer.json (+1 −1)
@@ -765,7 +765,7 @@
"dashLength": 10,
"dashes": false,
"datasource": "${DS_TIDB-CLUSTER}",
"description": "he total number of SQL statements that Syncer skips when the upstream synchronizes binlog files with the downstream; you can configure the format of SQL statements skipped by Syncer using the `skip-ddls` and `skip-dmls` parameters in the `syncer.toml` file.",
"description": "The total number of SQL statements that Syncer skips when the upstream replicates binlog files with the downstream; you can configure the format of SQL statements skipped by Syncer using the `skip-ddls` and `skip-dmls` parameters in the `syncer.toml` file.",
"fill": 1,
"id": 3,
"legend": {
dev/faq/tidb.md (+11 −11)
@@ -620,7 +620,7 @@ TiKV implements the Column Family (CF) feature of RocksDB. By default, the KV da

#### If a node is down, will the service be affected? If yes, how long?

- TiDB uses Raft to synchronize data among multiple replicas and guarantees the strong consistency of data. If one replica goes wrong, the other replicas can guarantee data security. The default number of replicas in each Region is 3. Based on the Raft protocol, a leader is elected in each Region, and if a single leader fails, a follower is soon elected as Region leader after a maximum of 2 * lease time (lease time is 10 seconds).
+ TiDB uses Raft to replicate data among multiple replicas and guarantees the strong consistency of data. If one replica goes wrong, the other replicas can guarantee data security. The default number of replicas in each Region is 3. Based on the Raft protocol, a leader is elected in each Region, and if a single leader fails, a follower is soon elected as Region leader after a maximum of 2 * lease time (lease time is 10 seconds).

#### What are the TiKV scenarios that take up high I/O, memory, CPU, and exceed the parameter configuration?

@@ -834,37 +834,37 @@ Download and import [Syncer Json](https://github.com/pingcap/docs/blob/master/de

Restart Prometheus.

- ##### Is there a current solution to synchronizing data from TiDB to other databases like HBase and Elasticsearch?
+ ##### Is there a current solution to replicating data from TiDB to other databases like HBase and Elasticsearch?

- No. Currently, the data synchronization depends on the application itself.
+ No. Currently, the data replication depends on the application itself.

- ##### Does Syncer support synchronizing only some of the tables when Syncer is synchronizing data?
+ ##### Does Syncer support replicating only some of the tables when Syncer is replicating data?

Yes. For details, see [Syncer User Guide](/dev/reference/tools/syncer.md).
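A rough sketch of such table filters in `syncer.toml` (the schema and table names are examples; see the linked guide for the full rule syntax):

```toml
# syncer.toml -- replicate only selected schemas/tables; names are examples
replicate-do-db = ["shop"]

[[replicate-do-table]]
db-name = "shop"
tbl-name = "orders"
```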

- ##### Do frequent DDL operations affect the synchronization speed of Syncer?
+ ##### Do frequent DDL operations affect the replication speed of Syncer?

- Frequent DDL operations may affect the synchronization speed. For Sycner, DDL operations are executed serially. When DDL operations are executed during data synchronization, data will be synchronized serially and thus the synchronization speed will be slowed down.
+ Frequent DDL operations may affect the replication speed. For Syncer, DDL operations are executed serially. When DDL operations are executed during data replication, data will be replicated serially and thus the replication speed will be slowed down.

##### If the machine that Syncer is in is broken and the directory of the `syncer.meta` file is lost, what should I do?

- When you synchronize data using Syncer GTID, the `syncer.meta` file is constantly updated during the synchronization process. The current version of Syncer does not contain the design for high availability. The `syncer.meta` configuration file of Syncer is directly stored on the hard disks, which is similar to other tools in the MySQL ecosystem, such as mydumper.
+ When you replicate data using Syncer GTID, the `syncer.meta` file is constantly updated during the replication process. The current version of Syncer does not contain the design for high availability. The `syncer.meta` configuration file of Syncer is directly stored on the hard disks, which is similar to other tools in the MySQL ecosystem, such as mydumper.

Two solutions:

- Put the `syncer.meta` file in a relatively secure disk. For example, use disks with RAID 1.
- - Restore the location information of history synchronization according to the monitoring data that Syncer reports to Prometheus regularly. But the location information might be inaccurate due to the delay when a large amount of data is synchronized.
+ - Restore the location information of history replication according to the monitoring data that Syncer reports to Prometheus regularly. But the location information might be inaccurate due to the delay when a large amount of data is replicated.

- ##### If the downstream TiDB data is not consistent with the MySQL data during the synchronization process of Syncer, will DML operations cause exits?
+ ##### If the downstream TiDB data is not consistent with the MySQL data during the replication process of Syncer, will DML operations cause exits?

- - If the data exists in the upstream MySQL but does not exist in the downstream TiDB, when the upstream MySQL performs the `UPDATE` or `DELETE` operation on this row of data, Syncer will not report an error and the synchronization process will not exit, and this row of data does not exist in the downstream.
+ - If the data exists in the upstream MySQL but does not exist in the downstream TiDB, when the upstream MySQL performs the `UPDATE` or `DELETE` operation on this row of data, Syncer will not report an error and the replication process will not exit, and this row of data does not exist in the downstream.
- If a conflict exists in the primary key indexes or the unique indexes in the downstream, performing the `UPDATE` operation will cause an exit and performing the `INSERT` operation will not cause an exit.

### Migrate the traffic

#### How to migrate the traffic quickly?

- It is recommended to build a multi-source MySQL -> TiDB real-time synchronization environment using Syncer tool. You can migrate the read and write traffic in batches by editing the network configuration as needed. Deploy a stable network LB (HAproxy, LVS, F5, DNS, etc.) on the upper layer, in order to implement seamless migration by directly editing the network configuration.
+ It is recommended to build a multi-source MySQL -> TiDB real-time replication environment using Syncer tool. You can migrate the read and write traffic in batches by editing the network configuration as needed. Deploy a stable network LB (HAproxy, LVS, F5, DNS, etc.) on the upper layer, in order to implement seamless migration by directly editing the network configuration.

#### Is there a limit for the total write and read capacity in TiDB?

dev/how-to/deploy/data-migration-with-ansible.md (+6 −6)
@@ -316,7 +316,7 @@ VjX8cEeTX+qcvZ3bPaO4h0C80pe/1aU=

## Step 8: Edit variables in the `inventory.ini` file

- This step shows how to edit the variable of the deployment directory, how to configure the relay log synchronization position and the relay log GTID synchronization mode, and explains the global variables in the `inventory.ini` file.
+ This step shows how to make configuration changes to the `inventory.ini` file.

### Configure the deployment directory

@@ -336,7 +336,7 @@ Edit the `deploy_dir` variable to configure the deployment directory.
dm-master ansible_host=172.16.10.71 deploy_dir=/data1/deploy
```

- ### Configure the relay log synchronization position
+ ### Configure the relay log position

When you start DM-worker for the first time, you need to configure `relay_binlog_name` to specify the position where DM-worker starts to pull the corresponding upstream MySQL or MariaDB binlog.

@@ -349,15 +349,15 @@ dm-worker2 ansible_host=172.16.10.73 source_id="mysql-replica-02" server_id=102

> **Note:**
>
- > If `relay_binlog_name` is not set, DM-worker pulls the binlog starting from the earliest existing binlog file of the upstream MySQL or MariaDB. In this case, it can take a long period of time to pull the latest binlog for the data synchronization task.
+ > If `relay_binlog_name` is not set, DM-worker pulls the binlog starting from the earliest existing binlog file of the upstream MySQL or MariaDB. In this event, it may take a significant amount of time to retrieve all of the binlog files.

- ### Enable the relay log GTID synchronization mode
+ ### Enable the relay log GTID replication mode

In a DM cluster, the relay log processing unit of DM-worker communicates with the upstream MySQL or MariaDB to pull its binlog to the local file system.

- You can enable the relay log GTID synchronization mode by configuring the following items. Currently, DM supports MySQL GTID and MariaDB GTID.
+ You can enable the relay log GTID replication mode by configuring the following items. Currently, DM supports MySQL GTID and MariaDB GTID.

- - `enable_gtid`: to enable the relay log GTID synchronization mode to deal with scenarios like master-slave switch
+ - `enable_gtid`: to enable the GTID mode. This helps improve the handling of replication topology changes, such as a switch between master and slave
- `relay_binlog_gtid`: to specify the position where DM-worker starts to pull the corresponding upstream MySQL or MariaDB binlog

```yaml
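# Example values only -- replace the host, source_id, and GTID set with your own.
dm-worker1 ansible_host=172.16.10.72 source_id="mysql-replica-01" server_id=101 enable_gtid=true relay_binlog_gtid="aae3683d-f77b-11e7-9e3b-02a495f8993c:1-282967971"
```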
dev/how-to/get-started/data-migration.md (+4 −4)
@@ -8,7 +8,7 @@ category: how-to

TiDB DM (Data Migration) is a platform that supports migrating large, complex, production data sets from MySQL or MariaDB to TiDB.

- DM supports creating and importing an initial dump of data, as well as keeping data synchronized during migration by reading and applying binary logs from the source data store. DM can migrate sharded topologies from in-production databases by merging tables from multiple separate upstream MySQL/MariaDB instances/clusters. In addition to its use for migrations, DM is often used on an ongoing basis by existing MySQL or MariaDB users who deploy a TiDB cluster as a slave, to either provide improved horizontal scalability or run real-time analytical workloads on TiDB without needing to manage an ETL pipeline.
+ DM supports creating and importing an initial dump of data, as well as keeping data replicated during migration by reading and applying binary logs from the source data store. DM can migrate sharded topologies from in-production databases by merging tables from multiple separate upstream MySQL/MariaDB instances/clusters. In addition to its use for migrations, DM is often used on an ongoing basis by existing MySQL or MariaDB users who deploy a TiDB cluster as a slave, to either provide improved horizontal scalability or run real-time analytical workloads on TiDB without needing to manage an ETL pipeline.

In this tutorial, we'll see how to migrate a sharded table from multiple upstream MySQL instances. We'll do this a couple of different ways. First, we'll merge several tables/shards that do not conflict; that is, they're partitioned using a scheme that does not result in conflicting unique key values. Then, we'll merge several tables that **do** have conflicting unique key values.

@@ -24,8 +24,8 @@ This tutorial assumes you're using a new, clean CentOS 7 instance. You can virtu

The TiDB DM (Data Migration) platform consists of 3 components: DM-master, DM-worker, and dmctl.

- * DM-master manages and schedules the operation of data synchronization tasks.
- * DM-worker executes specific data synchronization tasks.
+ * DM-master manages and schedules the operation of data replication tasks.
+ * DM-worker executes specific data replication tasks.
* dmctl is the command line tool used to control the DM cluster.

Individual tasks are defined in .yaml files that are read by dmctl and submitted to DM-master. DM-master then informs each instance of DM-worker of its responsibilities for a given task.
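As a rough sketch of what such a task file can look like (the hosts, password, and rule names below are illustrative, not the tutorial's exact file):

```yaml
# dmtest1.yaml -- minimal task sketch; all values are examples
name: "dmtest1"
task-mode: all              # full dump + load, then continuous binlog replication
target-database:
  host: "127.0.0.1"
  port: 4000
  user: "root"
  password: ""
mysql-instances:
  - source-id: "mysql-replica-01"
    black-white-list: "bw-rule"
black-white-list:
  bw-rule:
    do-dbs: ["dmtest1"]
```

dmctl would then submit it with something like `start-task ./dmtest1.yaml`.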
@@ -427,7 +427,7 @@ Expect this output:
6328 b294504229c668e750dfcc4ea9617f0a 3309
```

- As long as the DM master and workers are running the "dmtest1" task, they'll continue to keep the downstream TiDB server synchronized with the upstream MySQL server instances.
+ As long as the DM master and workers are running the "dmtest1" task, they'll continue to keep the downstream TiDB server replicated with the upstream MySQL server instances.


### Overlapping shards
dev/how-to/migrate/incrementally-from-mysql.md (+6 −6)
@@ -11,9 +11,9 @@ The [previous guide](/dev/how-to/migrate/from-mysql.md) introduces how to import

Syncer can be [downloaded as part of Enterprise Tools](/dev/reference/tools/download.md).

- Assuming the data from `t1` and `t2` is already imported to TiDB using `mydumper`/`loader`. Now we hope that any updates to these two tables are synchronized to TiDB in real time.
+ Assuming the data from `t1` and `t2` is already imported to TiDB using `mydumper`/`loader`. Now we hope that any updates to these two tables are replicated to TiDB in real time.

- ### Obtain the position to synchronize
+ ### Obtain the position to replicate

The data exported from MySQL contains a metadata file which includes the position information. Take the following metadata information as an example:
```
@@ -26,7 +26,7 @@ SHOW MASTER STATUS:
Finished dump at: 2017-04-28 10:48:11
```
- The position information (`Pos: 930143241`) needs to be stored in the `syncer.meta` file for `syncer` to synchronize:
+ The position information (`Pos: 930143241`) needs to be stored in the `syncer.meta` file for `syncer` to replicate:

```bash
# cat syncer.meta
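binlog-name = "mysql-bin.000003"  # example file name; use the Log value from your dump's metadata file
binlog-pos = 930143241
```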
@@ -36,7 +36,7 @@ binlog-pos = 930143241

> **Note:**
>
- > The `syncer.meta` file only needs to be configured once when it is first used. The position will be automatically updated when binlog is synchronized.
+ > The `syncer.meta` file only needs to be configured once when it is first used. The position will be automatically updated when binlog is replicated.

### Start `syncer`

@@ -177,7 +177,7 @@ mysql> select * from t1;
+----+------+
```

- `syncer` outputs the current synchronized data statistics every 30 seconds:
+ `syncer` outputs the current replicated data statistics every 30 seconds:

```bash
2017/06/08 01:18:51 syncer.go:934: [info] [syncer]total events = 15, total tps = 130, recent tps = 4,
@@ -188,4 +188,4 @@ master-binlog = (ON.000001, 11992), master-binlog-gtid=53ea0ed1-9bf8-11e6-8bea-6
syncer-binlog = (ON.000001, 2504), syncer-binlog-gtid = 53ea0ed1-9bf8-11e6-8bea-64006a897c73:1-35
```

- You can see that by using `syncer`, the updates in MySQL are automatically synchronized in TiDB.
+ You can see that by using `syncer`, the updates in MySQL are automatically replicated in TiDB.
dev/how-to/migrate/overview.md (+1 −1)
@@ -16,7 +16,7 @@ Migrations will often make use of the following tools. The following is a brief
- [`mydumper`](/dev/reference/tools/mydumper.md) exports data from MySQL. It is recommended over using mysqldump.
- [`loader`](/dev/reference/tools/loader.md) imports data in mydumper format into TiDB.
- [`syncer`](/dev/reference/tools/syncer.md) acts like a MySQL replication slave and pushes data from MySQL into TiDB.
- - [DM](/dev/reference/tools/data-migration/overview.md) (Data Migration) integrates the functions of mydumper, Loader and syncer to support the export and import of full-size data, as well as incremental synchronization of MySQL Binlog data, and supports data synchronization of a more complete pooled table scenario.
+ - [DM](/dev/reference/tools/data-migration/overview.md) (Data Migration) integrates the functions of mydumper, Loader and syncer to support the export and import of full-size data, as well as incremental replication of MySQL Binlog data, and supports data replication of a more complete pooled table scenario.
- [TiDB-Lightning](/dev/reference/tools/tidb-lightning/overview.md) imports data to TiDB in an optimized way. For example, a 1TiB backup could take 24+ hours to import with loader, while it will typically complete at least 3 times faster in TiDB-Lightning.

## Scenarios
