
Commit f283af1

yikeke and sre-bot authored and committed

cherry pick pingcap#2705 to release-3.0

Signed-off-by: sre-bot <sre-bot@pingcap.com>

1 parent d26f66e, commit f283af1

File tree

3 files changed (+135, -1 lines)


TOC.md

Lines changed: 12 additions & 0 deletions
```diff
@@ -78,10 +78,22 @@
   - [Migrate from CSV](/tidb-lightning/migrate-from-csv-using-tidb-lightning.md)
 + Maintain
   - [Common Ansible Operations](/maintain-tidb-using-ansible.md)
+<<<<<<< HEAD
   - [Backup and Restore](/backup-and-restore-using-mydumper-lightning.md)
 + Identify Abnormal Queries
   - [Identify Slow Queries](/identify-slow-queries.md)
   - [Identify Expensive Queries](/identify-expensive-queries.md)
+=======
++ Backup and Restore
+  - [Use Mydumper and TiDB Lightning](/backup-and-restore-using-mydumper-lightning.md)
+  - [Use Dumpling for Export or Backup](/export-or-backup-using-dumpling.md)
++ Use BR
+  - [Use BR](/br/backup-and-restore-tool.md)
+  - [BR Use Cases](/br/backup-and-restore-use-cases.md)
++ Identify Abnormal Queries
+  - [Identify Slow Queries](/identify-slow-queries.md)
+  - [Identify Expensive Queries](/identify-expensive-queries.md)
+>>>>>>> dcb4bb2... dumpling: add export-or-backup-using-dumpling.md (#2705)
 + Scale
   - [Scale using Ansible](/scale-tidb-using-ansible.md)
   - [Scale a TiDB Cluster](/horizontal-scale.md)
```

export-or-backup-using-dumpling.md

Lines changed: 122 additions & 0 deletions
---
title: Export or Backup Data Using Dumpling
summary: Use the Dumpling tool to export or back up data in TiDB.
category: how-to
---

# Export or Backup Data Using Dumpling

This document introduces how to use the [Dumpling](https://github.com/pingcap/dumpling) tool to export or back up data in TiDB. Dumpling exports the data stored in TiDB as SQL or CSV files and can be used to make a logical full backup or export.

For backups of SST files (KV pairs), or for backups of incremental data that are not sensitive to latency, refer to [BR](/br/backup-and-restore-tool.md). For real-time backups of incremental data, refer to [TiCDC](/ticdc/ticdc-overview.md).

To use Dumpling, you need to execute the export command on a running cluster. This document assumes that a TiDB instance is available at `127.0.0.1:4000` and that this instance has a `root` user without a password.

## Export data from TiDB

Export data using the following command:

{{< copyable "shell-regular" >}}

```shell
dumpling \
  -u root \
  -P 4000 \
  -H 127.0.0.1 \
  --filetype sql \
  --threads 32 \
  -o /tmp/test \
  -F $(( 1024 * 1024 * 256 ))
```

In the above command, `-H`, `-P`, and `-u` specify the host address, the port, and the user, respectively. If password authentication is required, pass the password to Dumpling with `-p $YOUR_SECRET_PASSWORD`.
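The `-F` value is given in bytes; `$(( 1024 * 1024 * 256 ))` is plain shell arithmetic expansion that expresses a 256 MiB per-file size cap without writing out the raw number. You can check the expansion directly:

```shell
# Shell arithmetic expansion: 256 MiB expressed in bytes.
# This is the literal value that the -F flag in the command above receives.
echo $(( 1024 * 1024 * 256 ))
```

This prints `268435456`, the byte count Dumpling receives as the maximum size of a single exported file.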
Dumpling exports all tables (except system tables) in the entire database by default. You can use `--where <SQL where expression>` to select the records to be exported. If the exported data is in the CSV format (CSV files can be exported using `--filetype csv`), you can also use `--sql <SQL>` to export the records selected by the specified SQL statement.

For example, you can export all records that match `id < 100` in `test.sbtest1` using the following command:

{{< copyable "shell-regular" >}}

```shell
./dumpling \
  -u root \
  -P 4000 \
  -H 127.0.0.1 \
  -o /tmp/test \
  --filetype csv \
  --sql 'select * from `test`.`sbtest1` where id < 100'
```

Note that the statement is wrapped in single quotes so that the shell does not treat the backquoted identifiers as command substitution.

For now, the `--sql` option can be used only when exporting CSV files. If you export in the SQL format, use `--where` instead to filter the rows to be exported. For example, the following command exports all rows with `id < 100`:

> **Note:**
>
> The filter is applied by executing `select * from <table-name> where id < 100` on every table to be exported. If any table does not have the specified field, the export fails.

{{< copyable "shell-regular" >}}

```shell
./dumpling \
  -u root \
  -P 4000 \
  -H 127.0.0.1 \
  -o /tmp/test \
  --where "id < 100"
```

> **Note:**
>
> Currently, Dumpling does not support exporting only certain tables specified by the user (that is, the `-T` flag; see [this issue](https://github.com/pingcap/dumpling/issues/76)). If you do need this feature, you can use [Mydumper](/backup-and-restore-using-mydumper-lightning.md) instead.
The exported files are stored in the `./export-<current local time>` directory by default. Commonly used parameters are as follows:

- `-o`: the directory where the exported files are stored.
- `-F`: the maximum size of a single file, in bytes (note that the unit differs from Mydumper's).
- `-r`: the maximum number of records (that is, rows in the database) in a single file.

You can use the above parameters to give Dumpling a higher degree of parallelism.
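As a sketch of how these flags combine (the row cap of `200000` below is an illustrative assumption, not a recommended value), the following assembles a tuned command and echoes it for review instead of running Dumpling directly:

```shell
# Sketch: combine --threads, -F, and -r for a more parallel export.
# Splitting output into more, smaller files lets both the export and a
# later import parallelize better.
THREADS=32
FILE_SIZE=$(( 1024 * 1024 * 256 ))  # 256 MiB cap per file
ROWS=200000                         # illustrative row cap per file

# echo prints the assembled command so it can be reviewed before running.
echo dumpling -u root -P 4000 -H 127.0.0.1 -o /tmp/test \
  --threads "$THREADS" -F "$FILE_SIZE" -r "$ROWS"
```

Drop the leading `echo` to actually execute the export.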
Another flag not mentioned above is `--consistency <consistency level>`, which controls the way in which consistency is guaranteed while the data is exported. For TiDB, consistency is ensured by default by taking a snapshot at a certain timestamp (that is, `--consistency snapshot`). When using a snapshot for consistency, you can use the `--snapshot` parameter to specify the timestamp to back up. The following consistency levels are available:

- `flush`: use [`FLUSH TABLES WITH READ LOCK`](https://dev.mysql.com/doc/refman/8.0/en/flush.html#flush-tables-with-read-lock) to guarantee consistency.
- `snapshot`: get a consistent snapshot at the specified timestamp and export it.
- `lock`: add read locks on all tables to be exported.
- `none`: no guarantee of consistency.
- `auto`: use `flush` for MySQL and `snapshot` for TiDB.
After everything is done, you can see the exported files in `/tmp/test`:

{{< copyable "shell-regular" >}}

```shell
ls -lh /tmp/test | awk '{print $5 "\t" $9}'
```

```
140B    metadata
66B     test-schema-create.sql
300B    test.sbtest1-schema.sql
190K    test.sbtest1.0.sql
300B    test.sbtest2-schema.sql
190K    test.sbtest2.0.sql
300B    test.sbtest3-schema.sql
190K    test.sbtest3.0.sql
```
In addition, if the data volume is very large, to avoid export failure due to GC during the export process, you can extend the GC time in advance:

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE = '720h' where VARIABLE_NAME = 'tikv_gc_life_time';
```
After your operation is completed, set the GC time back (the default value is `10m`):

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time';
```
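At any point, you can confirm the current setting with a read-only query against the same table that the statements above modify:

```sql
select VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME = 'tikv_gc_life_time';
```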
Finally, all the exported data can be imported back to TiDB using [Lightning](/tidb-lightning/tidb-lightning-tidb-backend.md).
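As a minimal sketch of that import (the configuration keys below are assumptions to verify against the TiDB Lightning documentation for your version), a Lightning configuration pointing the TiDB backend at the directory exported above might look like:

```toml
# tidb-lightning.toml -- minimal sketch; key names assumed, verify against
# the TiDB Lightning documentation for your version.
[tikv-importer]
# Use the TiDB backend, which imports by executing SQL against TiDB.
backend = "tidb"

[mydumper]
# Directory containing the files exported by Dumpling.
data-source-dir = "/tmp/test"

[tidb]
host = "127.0.0.1"
port = 4000
user = "root"
status-port = 10080
```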

mydumper-overview.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -9,7 +9,7 @@ aliases: ['/docs/v3.0/reference/tools/mydumper/','/docs/tools/mydumper/']
 
 ## What is Mydumper?
 
-[Mydumper](https://github.com/pingcap/mydumper) is a fork project optimized for TiDB. It is recommended to use this tool for logical backups of TiDB.
+[Mydumper](https://github.com/pingcap/mydumper) is a fork project optimized for TiDB. You can use this tool for logical backups of TiDB.
 
 It can be [downloaded](/download-ecosystem-tools.md) as part of the Enterprise Tools package.
```
