---
title: Export or Back Up Data Using Dumpling
summary: Use the Dumpling tool to export or back up data in TiDB.
category: how-to
---

# Export or Back Up Data Using Dumpling

This document introduces how to use the [Dumpling](https://github.com/pingcap/dumpling) tool to export or back up data in TiDB. Dumpling exports data stored in TiDB as SQL or CSV data files and can be used to make a logical full backup or export.

For backups of SST files (KV pairs), or backups of incremental data that are not sensitive to latency, refer to [BR](/br/backup-and-restore-tool.md). For real-time backups of incremental data, refer to [TiCDC](/ticdc/ticdc-overview.md).

When using Dumpling, you need to execute the export command on a running cluster. This document assumes that a TiDB instance is running at `127.0.0.1:4000` and that this instance has a root user without a password.

## Export data from TiDB

Export data using the following command:

{{< copyable "shell-regular" >}}

```shell
dumpling \
  -u root \
  -P 4000 \
  -H 127.0.0.1 \
  --filetype sql \
  --threads 32 \
  -o /tmp/test \
  -F $(( 1024 * 1024 * 256 ))
```

In the command above, `-H`, `-P`, and `-u` specify the host address, the port, and the user, respectively. If password authentication is required, you can pass the password to Dumpling with `-p $YOUR_SECRET_PASSWORD`.

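If the instance does require a password, one way to keep it out of your shell history is to read it into a variable first. This is only a sketch; `MY_PASSWORD` is a placeholder variable name, not a Dumpling convention, and the connection parameters match the example above:

{{< copyable "shell-regular" >}}

```shell
# Read the password without echoing it, then pass it to Dumpling via -p.
# MY_PASSWORD is a placeholder name chosen for this example.
read -r -s -p "TiDB password: " MY_PASSWORD
dumpling \
  -u root \
  -P 4000 \
  -H 127.0.0.1 \
  -p "$MY_PASSWORD" \
  -o /tmp/test
```
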
By default, Dumpling exports all tables (except system tables) in the entire database. You can use `--where <SQL where expression>` to select the records to be exported. If the exported data is in CSV format (CSV files can be exported using `--filetype csv`), you can also use `--sql <SQL>` to export the records selected by the specified SQL statement.

For example, you can export all records in `test.sbtest1` that match `id < 100` using the following command:

{{< copyable "shell-regular" >}}

```shell
./dumpling \
  -u root \
  -P 4000 \
  -H 127.0.0.1 \
  -o /tmp/test \
  --filetype csv \
  --sql 'select * from `test`.`sbtest1` where id < 100'
```

Note that the `--sql` option can currently be used only for exporting CSV files. Alternatively, you can use `--where` to filter the rows to be exported, and use the following command to export all rows with `id < 100`:

> **Note:**
>
> Dumpling executes the `select * from <table-name> where id < 100` statement on every table to be exported. If any table does not have the specified column, the export fails.

{{< copyable "shell-regular" >}}

```shell
./dumpling \
  -u root \
  -P 4000 \
  -H 127.0.0.1 \
  -o /tmp/test \
  --where "id < 100"
```

> **Note:**
>
> Currently, Dumpling does not support exporting only certain tables specified by the user (that is, the `-T` flag; see [this issue](https://github.com/pingcap/dumpling/issues/76)). If you do need this feature, you can use [MyDumper](/backup-and-restore-using-mydumper-lightning.md) instead.

The exported files are stored in the `./export-<current local time>` directory by default. Commonly used parameters are as follows:

- `-o` is used to select the directory where the exported files are stored.
- The `-F` option is used to specify the maximum size of a single file (the unit here is byte, different from MyDumper).
- The `-r` option is used to specify the maximum number of records (that is, the number of rows in the database) in a single file.

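As an aside, the `-F $(( 1024 * 1024 * 256 ))` argument in the first command relies on shell arithmetic expansion to express 256 MiB in bytes:

{{< copyable "shell-regular" >}}

```shell
# Shell arithmetic expansion computes the -F value (in bytes) at invocation time:
# 1024 * 1024 * 256 bytes = 256 MiB.
echo $(( 1024 * 1024 * 256 ))
# Prints: 268435456
```
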
You can use the above parameters to provide Dumpling with a higher degree of parallelism.

Another flag not mentioned above is `--consistency <consistency level>`, which controls the way data is exported for "consistency assurance". For TiDB, consistency is ensured by taking a snapshot at a certain timestamp by default (that is, `--consistency snapshot`). When using a snapshot for consistency, you can use the `--snapshot` parameter to specify the timestamp to be backed up. You can also use the following levels of consistency:

- `flush`: Use [`FLUSH TABLES WITH READ LOCK`](https://dev.mysql.com/doc/refman/8.0/en/flush.html#flush-tables-with-read-lock) to ensure consistency.
- `snapshot`: Get a consistent snapshot of the specified timestamp and export it.
- `lock`: Add read locks on all tables to be exported.
- `none`: No guarantee for consistency.
- `auto`: Use `flush` for MySQL and `snapshot` for TiDB.

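A value for `--snapshot` can be given as a TSO. Assuming the standard TiDB TSO layout (the physical time in Unix milliseconds in the high bits, shifted left by 18, with a logical counter in the low 18 bits), the sketch below derives a TSO from a hypothetical timestamp; the time value is an example, not taken from this document:

{{< copyable "shell-regular" >}}

```shell
# Derive a TSO from a physical Unix time in milliseconds, assuming the
# standard TiDB TSO layout: (physical_ms << 18) plus a logical counter.
physical_ms=1593681165000   # hypothetical example time; replace with your own
tso=$(( physical_ms << 18 ))
echo "$tso"
```
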
After everything is done, you can see the exported files in `/tmp/test`:

{{< copyable "shell-regular" >}}

```shell
ls -lh /tmp/test | awk '{print $5 "\t" $9}'
```

```
140B metadata
66B test-schema-create.sql
300B test.sbtest1-schema.sql
190K test.sbtest1.0.sql
300B test.sbtest2-schema.sql
190K test.sbtest2.0.sql
300B test.sbtest3-schema.sql
190K test.sbtest3.0.sql
```

In addition, if the data volume is very large, you can extend the GC lifetime in advance to avoid export failures caused by GC during the export process:

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE = '720h' where VARIABLE_NAME = 'tikv_gc_life_time';
```

After your operation is completed, set the GC time back (the default value is `10m`):

{{< copyable "sql" >}}

```sql
update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time';
```

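To double-check the current value before and after changing it, you can query the same row with the `mysql` client. This is a sketch that reuses the no-password root user and address assumed in this document:

{{< copyable "shell-regular" >}}

```shell
# Query the current GC lifetime; connection parameters match the
# 127.0.0.1:4000 root user assumed throughout this document.
mysql -h 127.0.0.1 -P 4000 -u root -e \
  "select VARIABLE_VALUE from mysql.tidb where VARIABLE_NAME = 'tikv_gc_life_time'"
```
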
Finally, all the exported data can be imported back into TiDB using [TiDB Lightning](/tidb-lightning/tidb-lightning-tidb-backend.md).
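
For example, an import with TiDB Lightning's TiDB backend might look like the following sketch. The flag values are assumptions based on this document's setup; in practice, TiDB Lightning is usually configured through its `tidb-lightning.toml` file, so check the Lightning documentation for the exact options of your version:

{{< copyable "shell-regular" >}}

```shell
# A sketch of importing the exported files back with TiDB Lightning's
# TiDB backend; host, port, user, and directory match this document's examples.
tidb-lightning \
  --backend tidb \
  -d /tmp/test \
  --tidb-host 127.0.0.1 \
  --tidb-port 4000 \
  --tidb-user root
```
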