From 84e0e8b494f152dec69ea36b83c2510d0b09a003 Mon Sep 17 00:00:00 2001
From: Keke Yi <40977455+yikeke@users.noreply.github.com>
Date: Wed, 10 Jun 2020 11:16:32 +0800
Subject: [PATCH 1/3] cherry pick #2705 to release-2.1

Signed-off-by: sre-bot
---
 TOC.md                             |  12 +++
 export-or-backup-using-dumpling.md | 122 +++++++++++++++++++++++++++++
 mydumper-overview.md               |   2 +-
 3 files changed, 135 insertions(+), 1 deletion(-)
 create mode 100644 export-or-backup-using-dumpling.md

diff --git a/TOC.md b/TOC.md
index 3ebe0e00752d2..cf7d7105290e7 100644
--- a/TOC.md
+++ b/TOC.md
@@ -76,8 +76,20 @@
- [Migrate from CSV](/tidb-lightning/migrate-from-csv-using-tidb-lightning.md)
+ Maintain
- [Common Ansible Operations](/maintain-tidb-using-ansible.md)
+<<<<<<< HEAD
- [Backup and Restore](/backup-and-restore.md)
- [Identify Slow Queries](/identify-slow-queries.md)
+=======
+ + Backup and Restore
+ - [Use Mydumper and TiDB Lightning](/backup-and-restore-using-mydumper-lightning.md)
+ - [Use Dumpling for Export or Backup](/export-or-backup-using-dumpling.md)
+ + Use BR
+ - [Use BR](/br/backup-and-restore-tool.md)
+ - [BR Use Cases](/br/backup-and-restore-use-cases.md)
+ + Identify Abnormal Queries
+ - [Identify Slow Queries](/identify-slow-queries.md)
+ - [Identify Expensive Queries](/identify-expensive-queries.md)
+>>>>>>> dcb4bb2... dumpling: add export-or-backup-using-dumpling.md (#2705)
+ Scale
- [Scale using Ansible](/scale-tidb-using-ansible.md)
- [Scale a TiDB Cluster](/horizontal-scale.md)

diff --git a/export-or-backup-using-dumpling.md b/export-or-backup-using-dumpling.md
new file mode 100644
index 0000000000000..68c8450ff3a67
--- /dev/null
+++ b/export-or-backup-using-dumpling.md
@@ -0,0 +1,122 @@

---
title: Export or Backup Data Using Dumpling
summary: Use the Dumpling tool to export or backup data in TiDB.
category: how-to
---

# Export or Backup Data Using Dumpling

This document introduces how to use the [Dumpling](https://github.com/pingcap/dumpling) tool to export or backup data in TiDB. Dumpling exports data stored in TiDB as SQL or CSV data files and can be used to make a logical full backup or export.

For backups of SST files (KV pairs) or backups of incremental data that are not sensitive to latency, refer to [BR](/br/backup-and-restore-tool.md). For real-time backups of incremental data, refer to [TiCDC](/ticdc/ticdc-overview.md).

When using Dumpling, you need to execute the export command on a running cluster. This document assumes that there is a TiDB instance on the `127.0.0.1:4000` host and that this TiDB instance has a root user without a password.

## Export data from TiDB

Export data using the following command:

{{< copyable "shell-regular" >}}

```shell
dumpling \
  -u root \
  -P 4000 \
  -H 127.0.0.1 \
  --filetype sql \
  --threads 32 \
  -o /tmp/test \
  -F $(( 1024 * 1024 * 256 ))
```

In the above command, `-H`, `-P`, and `-u` specify the host address, the port, and the user, respectively. If password authentication is required, you can pass the password to Dumpling with `-p $YOUR_SECRET_PASSWORD`.

Dumpling exports all tables (except for system tables) in the entire database by default. You can use `--where <SQL where expression>` to select the records to be exported. If the exported data is in CSV format (CSV files can be exported using `--filetype csv`), you can also use `--sql <SQL statement>` to export records selected by the specified SQL statement.

For example, you can export all records that match `id < 100` in `test.sbtest1` using the following command:

{{< copyable "shell-regular" >}}

```shell
./dumpling \
  -u root \
  -P 4000 \
  -H 127.0.0.1 \
  -o /tmp/test \
  --filetype csv \
  --sql "select * from `test`.`sbtest1` where id < 100"
```

Note that the `--sql` option can be used only for exporting CSV files for now. However, you can use `--where` to filter the rows to be exported, and use the following command to export all rows with `id < 100`:

> **Note:**
>
> The `select * from <table-name> where id < 100` statement is executed on all tables to be exported. If any table does not have the specified field, the export fails.

{{< copyable "shell-regular" >}}

```shell
./dumpling \
  -u root \
  -P 4000 \
  -H 127.0.0.1 \
  -o /tmp/test \
  --where "id < 100"
```

> **Note:**
>
> Currently, Dumpling does not support exporting only certain tables specified by users (i.e. `-T` flag, see [this issue](https://github.com/pingcap/dumpling/issues/76)). If you do need this feature, you can use [MyDumper](/backup-and-restore-using-mydumper-lightning.md) instead.

The exported file is stored in the `./export-<current local time>` directory by default. Commonly used parameters are as follows:

- The `-o` option is used to specify the directory where the exported files are stored.
- The `-F` option is used to specify the maximum size of a single file (the unit is bytes, which is different from MyDumper).
- The `-r` option is used to specify the maximum number of records (that is, the number of rows) in a single file.

You can use the above parameters to provide Dumpling with a higher degree of parallelism.

Another flag that is not mentioned above is `--consistency <consistency level>`, which controls the way in which data is exported with "consistency assurance". For TiDB, consistency is ensured by getting a snapshot of a certain timestamp by default (i.e. `--consistency snapshot`). When using snapshot for consistency, you can use the `--snapshot` parameter to specify the timestamp to be backed up. You can also use the following levels of consistency:

- `flush`: Use [`FLUSH TABLES WITH READ LOCK`](https://dev.mysql.com/doc/refman/8.0/en/flush.html#flush-tables-with-read-lock) to ensure consistency.
- `snapshot`: Get a consistent snapshot of the specified timestamp and export it.
- `lock`: Add read locks on all tables to be exported.
- `none`: No guarantee for consistency.
- `auto`: Use `flush` for MySQL and `snapshot` for TiDB.

After everything is done, you can see the exported files in `/tmp/test`:

{{< copyable "shell-regular" >}}

```shell
ls -lh /tmp/test | awk '{print $5 "\t" $9}'
```

```
140B  metadata
66B   test-schema-create.sql
300B  test.sbtest1-schema.sql
190K  test.sbtest1.0.sql
300B  test.sbtest2-schema.sql
190K  test.sbtest2.0.sql
300B  test.sbtest3-schema.sql
190K  test.sbtest3.0.sql
```

In addition, if the data volume is very large, you can extend the GC time in advance to avoid export failure caused by GC during the export process:

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE = '720h' where VARIABLE_NAME = 'tikv_gc_life_time';
```

After your operation is completed, set the GC time back (the default value is `10m`):

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time';
```

Finally, all the exported data can be imported back to TiDB using [Lightning](/tidb-lightning/tidb-lightning-tidb-backend.md).

diff --git a/mydumper-overview.md b/mydumper-overview.md
index aae32ce159d02..fa584e8ae471b 100644
--- a/mydumper-overview.md
+++ b/mydumper-overview.md
@@ -9,7 +9,7 @@ aliases: ['/docs/v2.1/reference/tools/mydumper/']

## What is Mydumper?

-[Mydumper](https://github.com/pingcap/mydumper) is a fork project optimized for TiDB. It is recommended to use this tool for logical backups of TiDB.
+[Mydumper](https://github.com/pingcap/mydumper) is a fork project optimized for TiDB. You can use this tool for logical backups of TiDB.

It can be [downloaded](/download-ecosystem-tools.md) as part of the Enterprise Tools package.
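An aside on the export command added in the patch above: the `-F` flag takes a plain byte count, which is why the example spells 256 MiB as `$(( 1024 * 1024 * 256 ))` rather than hard-coding the number. A minimal sketch of the same shell arithmetic (the variable name is illustrative, and `dumpling` itself is only shown in a comment, not invoked):

```shell
# 256 MiB expressed in bytes, mirroring the $(( 1024 * 1024 * 256 )) form
# used in the example export command.
FILE_SIZE_BYTES=$(( 256 * 1024 * 1024 ))
echo "$FILE_SIZE_BYTES"   # prints 268435456

# The computed value would then be passed to the -F flag, e.g.:
#   dumpling -u root -P 4000 -H 127.0.0.1 -o /tmp/test -F "$FILE_SIZE_BYTES"
```

Keeping the arithmetic explicit helps avoid unit mistakes, since the page notes that Dumpling's `-F` unit differs from Mydumper's.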
From 4b9bb280b179aab296badbedcbf4161e228b4202 Mon Sep 17 00:00:00 2001
From: yikeke
Date: Wed, 10 Jun 2020 11:35:45 +0800
Subject: [PATCH 2/3] Update TOC.md

---
 TOC.md | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/TOC.md b/TOC.md
index cf7d7105290e7..e006bb5eab728 100644
--- a/TOC.md
+++ b/TOC.md
@@ -76,20 +76,10 @@
- [Migrate from CSV](/tidb-lightning/migrate-from-csv-using-tidb-lightning.md)
+ Maintain
- [Common Ansible Operations](/maintain-tidb-using-ansible.md)
-<<<<<<< HEAD
- - [Backup and Restore](/backup-and-restore.md)
- - [Identify Slow Queries](/identify-slow-queries.md)
-=======
+ Backup and Restore
- - [Use Mydumper and TiDB Lightning](/backup-and-restore-using-mydumper-lightning.md)
+ - [Use Mydumper and TiDB Lightning](/backup-and-restore.md)
- [Use Dumpling for Export or Backup](/export-or-backup-using-dumpling.md)
- + Use BR
- - [Use BR](/br/backup-and-restore-tool.md)
- - [BR Use Cases](/br/backup-and-restore-use-cases.md)
- + Identify Abnormal Queries
- - [Identify Slow Queries](/identify-slow-queries.md)
- - [Identify Expensive Queries](/identify-expensive-queries.md)
->>>>>>> dcb4bb2... dumpling: add export-or-backup-using-dumpling.md (#2705)
+ - [Identify Slow Queries](/identify-slow-queries.md)
+ Scale
- [Scale using Ansible](/scale-tidb-using-ansible.md)
- [Scale a TiDB Cluster](/horizontal-scale.md)

From ab4e42e2ab7abb11def3b6d39505397333e1a8f0 Mon Sep 17 00:00:00 2001
From: yikeke
Date: Wed, 10 Jun 2020 14:01:20 +0800
Subject: [PATCH 3/3] fix 4 dead links

---
 export-or-backup-using-dumpling.md | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/export-or-backup-using-dumpling.md b/export-or-backup-using-dumpling.md
index 68c8450ff3a67..2763aecfcf03a 100644
--- a/export-or-backup-using-dumpling.md
+++ b/export-or-backup-using-dumpling.md
@@ -8,8 +8,6 @@ category: how-to

# Export or Backup Data Using Dumpling

This document introduces how to use the [Dumpling](https://github.com/pingcap/dumpling) tool to export or backup data in TiDB. Dumpling exports data stored in TiDB as SQL or CSV data files and can be used to make a logical full backup or export.

-For backups of SST files (KV pairs) or backups of incremental data that are not sensitive to latency, refer to [BR](/br/backup-and-restore-tool.md). For real-time backups of incremental data, refer to [TiCDC](/ticdc/ticdc-overview.md).
-
When using Dumpling, you need to execute the export command on a running cluster. This document assumes that there is a TiDB instance on the `127.0.0.1:4000` host and that this TiDB instance has a root user without a password.

## Export data from TiDB

@@ -66,7 +64,7 @@ Note that the `--sql` option can be used only for exporting CSV files for now. H

> **Note:**
>
-> Currently, Dumpling does not support exporting only certain tables specified by users (i.e. `-T` flag, see [this issue](https://github.com/pingcap/dumpling/issues/76)). If you do need this feature, you can use [MyDumper](/backup-and-restore-using-mydumper-lightning.md) instead.
+> Currently, Dumpling does not support exporting only certain tables specified by users (i.e. `-T` flag, see [this issue](https://github.com/pingcap/dumpling/issues/76)). If you do need this feature, you can use [MyDumper](/mydumper-overview.md) instead.

The exported file is stored in the `./export-<current local time>` directory by default. Commonly used parameters are as follows:

- `-o` is used to select the directory where the exported files are stored.

@@ -119,4 +117,4 @@ After your operation is completed, set the GC time back (the default value is `1

```sql
update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time';
```

-Finally, all the exported data can be imported back to TiDB using [Lightning](/tidb-lightning/tidb-lightning-tidb-backend.md).
+Finally, all the exported data can be imported back to TiDB using [Lightning](/tidb-lightning/tidb-lightning-overview.md).
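A closing aside: the directory listing in patch 1 shows the layout Dumpling produces (a `metadata` file, one `<db>.<table>-schema.sql` file per table, plus numbered data files). Before handing such a directory to Lightning, a quick completeness check can catch a truncated export. The sketch below is hypothetical: the sample files stand in for a real dump, and the naming convention is assumed from that listing rather than taken from Dumpling's documentation:

```shell
# Hypothetical post-export sanity check: every table that has a schema file
# should also have at least one numbered data file. The touched files below
# stand in for a real Dumpling output directory.
OUT_DIR=$(mktemp -d)
touch "$OUT_DIR/test.sbtest1-schema.sql" "$OUT_DIR/test.sbtest1.0.sql"
touch "$OUT_DIR/test.sbtest2-schema.sql" "$OUT_DIR/test.sbtest2.0.sql"

status=0
for schema in "$OUT_DIR"/*-schema.sql; do
  table=${schema%-schema.sql}        # e.g. .../test.sbtest1
  if ! ls "$table".*.sql >/dev/null 2>&1; then
    echo "missing data files for ${table##*/}"
    status=1
  fi
done
[ "$status" -eq 0 ] && echo "export looks complete"
rm -r "$OUT_DIR"
```

The same loop could be pointed at a real `-o` directory such as `/tmp/test`; it only relies on the schema/data file-name pairing visible in the listing above.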