Cleaned up and added TiDB Server info (#1076)
* Cleaned up and added TiDB Server info

* More doc cleanup

* Fixed table of node mappings

* Clarified command line parameters

* Fixed letter casing and indenting

* Clarified need to download binlogctl
kolbe committed Apr 23, 2019
1 parent 9365012 commit 395c183
tools/tidb-binlog-cluster.md

# TiDB-Binlog Cluster User Guide

This document introduces the architecture and the deployment of the cluster version of TiDB-Binlog.

TiDB-Binlog is a tool used to collect binlog data from TiDB and provide real-time backup and synchronization to downstream platforms.

TiDB-Binlog has the following features:

* Data synchronization: synchronize the data in the TiDB cluster to other databases
* Real-time backup and recovery: back up the data in the TiDB cluster, and recover in case of cluster outages

## Architecture

### Pump

Pump is used to record the binlogs generated in TiDB, sort the binlogs based on the commit time of the transaction, and send binlogs to Drainer for consumption.

### Drainer

Drainer collects and merges binlogs from each Pump, converts the binlog to SQL or data of a specific format, and synchronizes the data to a specific downstream platform.

## Main features

* Multiple Pumps form a cluster which can scale out horizontally
* TiDB uses the built-in Pump Client to send the binlog to each Pump
* Pump stores binlogs and sends the binlogs to Drainer in order
* Drainer reads binlogs of each Pump, merges and sorts the binlogs, and sends the binlogs downstream

## Hardware requirements


## Notes

* You need to use TiDB v2.0.8-binlog, v2.1.0-rc.5 or a later version. Older versions of TiDB cluster are not compatible with the cluster version of TiDB-Binlog.
* When TiDB is running, you need to guarantee that at least one Pump is running normally.
* To enable the TiDB-Binlog service in TiDB server, use the `-enable-binlog` startup parameter in TiDB, or add `enable=true` to the `[binlog]` section of the TiDB server configuration file.
* Make sure that the TiDB-Binlog service is enabled in all TiDB instances in the same cluster, otherwise upstream and downstream data inconsistency might occur during data synchronization. If you want to temporarily run a TiDB instance where the TiDB-Binlog service is not enabled, set `run_ddl=false` in the TiDB configuration file.
* Drainer does not support the `rename` DDL operation on tables in `ignore schemas` (the schemas in the filter list).
* If you want to start Drainer in an existing TiDB cluster, generally you need to make a full backup of the cluster data, obtain `savepoint`, import the data to the target database, and then start Drainer to synchronize the incremental data from `savepoint`.
* Drainer supports synchronizing binlogs to MySQL, TiDB, Kafka or local files. If you need to synchronize binlogs to other destinations, you can set Drainer to synchronize the binlog to Kafka and read the data in Kafka for customized processing. See [Binlog Slave Client User Guide](../tools/binlog-slave-client.md).
* To use TiDB-Binlog for recovering incremental data, set the downstream to `pb` (local files in the proto buffer format). Drainer converts the binlog to data in the specified proto buffer format and writes the data to local files. In this way, you can use [Reparo](../tools/reparo.md) to recover data incrementally (see the configuration sketch after this list).
* Pump and Drainer have several states, including `online`, `paused`, and `offline`. If you press Ctrl + C or kill the process, both Pump and Drainer become `paused`. A paused Pump does not need to send all of its binlog data to Drainer. If you need to take Pump out of service for a long period of time (or are permanently removing Pump from the cluster), use `binlogctl` to make Pump offline. The same goes for Drainer.
* If the downstream is MySQL, MariaDB, or another TiDB cluster, you can use [sync-diff-inspector](../tools/sync-diff-inspector.md) to verify the data after data synchronization.
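
For the `pb` case, the relevant Drainer settings might look like the following sketch (section and key names are assumptions based on the TiDB-Binlog docs; verify against the `drainer.toml` shipped with your release):

```toml
# Sketch of a drainer.toml fragment for incremental backup output.
# Key names are assumptions; check the drainer.toml in your release.
[syncer]
# Write binlogs as proto buffer files instead of executing SQL downstream.
db-type = "pb"

[syncer.to]
# Directory where Drainer writes the proto buffer binlog files.
dir = "/data/drainer-binlog"
```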

## TiDB-Binlog deployment

This section shows two methods of deploying TiDB-Binlog:

- [Deploy TiDB-Binlog using TiDB-Ansible](#deploy-tidb-binlog-using-tidb-ansible)
- [Deploy TiDB-Binlog using a Binary package](#deploy-tidb-binlog-using-binary)

It is recommended to deploy TiDB-Binlog using TiDB-Ansible. If you just want to do simple testing, you can deploy TiDB-Binlog using a Binary package.

### Deploy TiDB-Binlog using TiDB-Ansible

To start Drainer:

```bash
$ ansible-playbook start_drainer.yml
```
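
Deploying with TiDB-Ansible assumes the Pump hosts are declared in tidb-ansible's `inventory.ini`; a sketch, with the group name assumed from tidb-ansible conventions and the addresses taken from the example topology below:

```ini
# Hypothetical inventory.ini fragment (group name per tidb-ansible
# conventions): the two Pump nodes from the example topology.
[pump_servers]
192.168.0.11
192.168.0.12
```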

### Deploy TiDB-Binlog using a Binary package

#### Download the official Binary package

Run the following commands to download the packages:

```bash
version=v2.1.8 # or "latest" for nightly builds
wget https://download.pingcap.org/tidb-$version-linux-amd64.{tar.gz,sha256}

# Check the file integrity. If the result is OK, the file is correct.
sha256sum -c tidb-$version-linux-amd64.sha256
```

For TiDB v2.1.0 GA or later versions, Pump and Drainer are already included in the TiDB download package. For other TiDB versions, you need to download Pump and Drainer separately using the following command:

```bash
wget https://download.pingcap.org/tidb-binlog-$version-linux-amd64.{tar.gz,sha256}

# Check the file integrity. If the result is OK, the file is correct.
sha256sum -c tidb-binlog-$version-linux-amd64.sha256
```

#### Usage example

Assuming that you have three PD nodes, one TiDB node, two Pump nodes, and one Drainer node, the information of each node is as follows:

| Node | IP |
| ---------|:------------:|
| TiDB | 192.168.0.10 |
| PD1 | 192.168.0.16 |
| PD2 | 192.168.0.15 |
| PD3 | 192.168.0.14 |
| Pump | 192.168.0.11 |
| Pump | 192.168.0.12 |
| Drainer | 192.168.0.13 |

The following part shows how to use Pump and Drainer based on the nodes above.

1. Deploy Pump using the binary.

- To view the command line parameters of Pump, execute `./bin/pump -help`:

```
Usage of Pump:
```
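
Based on the flags above, a minimal Pump start might look like the following sketch (addresses from the node table above; flag names as printed by `./bin/pump -help`, not a canonical command):

```bash
# A sketch, not the canonical start command: addresses come from the
# node table above; flag names are taken from the -help output.
./bin/pump -addr=192.168.0.11:8250 \
    -pd-urls=http://192.168.0.16:2379,http://192.168.0.15:2379,http://192.168.0.14:2379 \
    -data-dir=data.pump \
    -log-file=pump.log &
```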

2. Deploy Drainer using the binary.

- To view the command line parameters of Drainer, execute `./bin/drainer -help`:

```
Usage of Drainer:
```

- Starting Drainer:

> **Note:** If the downstream is MySQL/TiDB, to guarantee data integrity, you need to make a full backup of the data, restore it to the downstream, and obtain the `initial-commit-ts` value before the initial start of Drainer. For details, see [Deploy Drainer](#step-3-deploy-drainer).
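A minimal start command might look like this sketch (assumes a `drainer.toml` next to the binary; `-initial-commit-ts` is the savepoint from the full backup described in the note):

```bash
# Sketch: replace {initial-commit-ts} with the savepoint obtained
# from the full backup described in the note above.
./bin/drainer -config drainer.toml -initial-commit-ts {initial-commit-ts} &
```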

If the same parameter is set both on the command line and in the configuration file, the command-line value is used.
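
For example (a hypothetical illustration; `-addr` is one of the flags listed in the `-help` output):

```bash
# Even if drainer.toml also sets addr, the value given here wins.
./bin/drainer -config drainer.toml -addr 192.168.0.13:8249
```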

3. Starting TiDB server:

- After starting Pump and Drainer, start TiDB server with binlog enabled by adding this section to the TiDB server configuration file:

```
[binlog]
enable=true
```

- TiDB server obtains the addresses of registered Pumps from PD and streams binlog data to all of them. If there are no registered Pump instances, TiDB server either refuses to start or blocks until a Pump instance comes online.
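
For example, a start command might look like this sketch (assumes a `tidb.toml` containing the `[binlog]` section above; `-config`, `-store`, and `-path` are standard tidb-server flags):

```bash
# Sketch: tidb.toml contains the [binlog] section shown above;
# -path points at the PD endpoints from the node table.
./bin/tidb-server -config tidb.toml \
    -store tikv \
    -path "192.168.0.16:2379,192.168.0.15:2379,192.168.0.14:2379"
```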


## TiDB-Binlog operations

### Pump/Drainer state
For how to pause, close, check, and modify the state of Drainer, see the binlogctl guide below.

#### Download `binlogctl`

Your distribution of TiDB or TiDB-Binlog may already include `binlogctl`. If not, download it:

```bash
wget https://download.pingcap.org/binlogctl-new-linux-amd64.{tar.gz,sha256}

# Check the file integrity. It should return OK.
sha256sum -c binlogctl-new-linux-amd64.sha256
```
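
As a sketch of typical usage (subcommand and flag names assumed from the TiDB-Binlog docs; the PD address comes from the node table above):

```bash
# List registered Pump instances and their states.
./binlogctl -pd-urls=http://192.168.0.16:2379 -cmd pumps

# Take a Pump offline gracefully (the node-id here is hypothetical).
./binlogctl -pd-urls=http://192.168.0.16:2379 -cmd offline-pump -node-id 192.168.0.11:8250
```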
