diff --git a/TOC.md b/TOC.md index 791acafb4fd02..0ff644a96239e 100644 --- a/TOC.md +++ b/TOC.md @@ -87,6 +87,7 @@ - [Use Mydumper and TiDB Lightning](/backup-and-restore-using-mydumper-lightning.md) - [Use BR](/br/backup-and-restore-tool.md) - [BR Usage Scenarios](/br/backup-and-restore-use-cases.md) + - [BR storages](/br/backup-and-restore-storages.md) + Identify Abnormal Queries - [Identify Slow Queries](/identify-slow-queries.md) - [Identify Expensive Queries](/identify-expensive-queries.md) @@ -175,6 +176,7 @@ - [`ALTER TABLE`](/sql-statements/sql-statement-alter-table.md) - [`ALTER USER`](/sql-statements/sql-statement-alter-user.md) - [`ANALYZE TABLE`](/sql-statements/sql-statement-analyze-table.md) + - [`BACKUP`](/sql-statements/sql-statement-backup.md) - [`BEGIN`](/sql-statements/sql-statement-begin.md) - [`COMMIT`](/sql-statements/sql-statement-commit.md) - [`CREATE DATABASE`](/sql-statements/sql-statement-create-database.md) @@ -213,6 +215,7 @@ - [`RENAME INDEX`](/sql-statements/sql-statement-rename-index.md) - [`RENAME TABLE`](/sql-statements/sql-statement-rename-table.md) - [`REPLACE`](/sql-statements/sql-statement-replace.md) + - [`RESTORE`](/sql-statements/sql-statement-restore.md) - [`REVOKE `](/sql-statements/sql-statement-revoke-privileges.md) - [`ROLLBACK`](/sql-statements/sql-statement-rollback.md) - [`SELECT`](/sql-statements/sql-statement-select.md) @@ -220,6 +223,7 @@ - [`SET PASSWORD`](/sql-statements/sql-statement-set-password.md) - [`SET TRANSACTION`](/sql-statements/sql-statement-set-transaction.md) - [`SET [GLOBAL|SESSION] `](/sql-statements/sql-statement-set-variable.md) + - [`SHOW [BACKUPS|RESTORES]`](/sql-statements/sql-statement-show-backups.md) - [`SHOW CHARACTER SET`](/sql-statements/sql-statement-show-character-set.md) - [`SHOW COLLATION`](/sql-statements/sql-statement-show-collation.md) - [`SHOW [FULL] COLUMNS FROM`](/sql-statements/sql-statement-show-columns-from.md) @@ -332,10 +336,10 @@ - [Maintain a TiFlash Cluster](/tiflash/maintain-tiflash.md) - [Monitor TiFlash](/tiflash/monitor-tiflash.md) - [Scale TiFlash](/scale-tidb-using-tiup.md#scale-out-a-tiflash-node) - - [Upgrade TiFlash Nodes](/tiflash/upgrade-tiflash.md) - [Configure TiFlash](/tiflash/tiflash-configuration.md) - [TiFlash Alert Rules](/tiflash/tiflash-alert-rules.md) - [Tune TiFlash Performance](/tiflash/tune-tiflash-performance.md) + - [Troubleshoot a TiFlash Cluster](/tiflash/troubleshoot-tiflash.md) - [FAQ](/tiflash/tiflash-faq.md) + TiDB Binlog - [Overview](/tidb-binlog/tidb-binlog-overview.md) diff --git a/best-practices/massive-regions-best-practices.md b/best-practices/massive-regions-best-practices.md index 1bc87a4ac2455..77a62e6b5d26f 100644 --- a/best-practices/massive-regions-best-practices.md +++ b/best-practices/massive-regions-best-practices.md @@ -40,7 +40,7 @@ You can check the following monitoring metrics in Grafana's **TiKV Dashboard**: + `Raft store CPU` in the **Thread-CPU** panel - Reference value: lower than `raftstore.store-pool-size * 85%`. TiDB v2.1 does not have the `raftstore.store-pool-size` configuration item, so you can take this item's value as `1` in v2.1 versions. + Reference value: lower than `raftstore.store-pool-size * 85%`. 
![Check Raftstore CPU](/media/best-practices/raft-store-cpu.png)

@@ -61,7 +61,7 @@ After finding out the cause of a performance problem, try to solve it from the f

### Method 1: Increase Raftstore concurrency

-Raftstore in TiDB v3.0 has been upgraded to a multi-threaded module, which greatly reduces the possibility that a Raftstore thread becomes the bottleneck.
+Raftstore has been upgraded to a multi-threaded module since TiDB v3.0, which greatly reduces the possibility that a Raftstore thread becomes the bottleneck.

By default, `raftstore.store-pool-size` is configured to `2` in TiKV. If a bottleneck occurs in Raftstore, you can increase the value of this configuration item as appropriate. However, to avoid introducing unnecessary thread switching overhead, it is recommended that you do not set this value too high.

@@ -69,13 +69,13 @@ By default, `raftstore.store-pool-size` is configured to `2` in TiKV. If a bottl

In practice, read and write requests are not evenly distributed across Regions. Instead, they are concentrated on a few Regions. The Hibernate Region feature minimizes the number of messages exchanged between the Raft leader and the followers of temporarily idle Regions. With this feature, Raftstore does not send tick messages to the Raft state machines of idle Regions unless necessary. These Raft state machines are then not triggered to generate heartbeat messages, which greatly reduces the workload of Raftstore.

-Up to TiDB v3.0.9 or v3.1.0-beta.1, Hibernate Region is still an experimental feature, which is enabled by default in [TiKV master](https://github.com/tikv/tikv/tree/master). You can enable this feature according to your needs. For the configuration of Hibernate Region, refer to [Configure Hibernate Region](https://github.com/tikv/tikv/blob/master/docs/reference/configuration/raftstore-config.md#hibernate-region).
+Hibernate Region is enabled by default in [TiKV master](https://github.com/tikv/tikv/tree/master). You can enable this feature according to your needs. For the configuration of Hibernate Region, refer to [Configure Hibernate Region](https://github.com/tikv/tikv/blob/master/docs/reference/configuration/raftstore-config.md#hibernate-region).

### Method 3: Enable `Region Merge`

> **Note:**
>
-> `Region Merge` is enabled in TiDB v3.0 by default.
+> `Region Merge` is enabled by default since TiDB v3.0.

You can also reduce the number of Regions by enabling `Region Merge`. Contrary to `Region Split`, `Region Merge` is the process of merging adjacent small Regions through scheduling. After dropping data or executing the `Drop Table` or `Truncate Table` statement, you can merge small Regions or even empty Regions to reduce resource consumption.

@@ -133,7 +133,7 @@ This section describes some other problems and solutions.

PD needs to persist Region Meta information on etcd to ensure that PD can quickly resume providing Region routing services after the PD Leader node is switched. As the number of Regions increases, etcd performance becomes a problem, making it slower for PD to get Region Meta information from etcd during a Leader switch. With millions of Regions, it might take more than ten seconds or even tens of seconds to get the meta information from etcd.

-To address this problem, `use-region-storage` is enabled by default in PD in TiDB v3.0. With this feature enabled, PD stores Region Meta information on local LevelDB and synchronizes the information among PD nodes through other mechanisms.
+To address this problem, `use-region-storage` is enabled by default in PD since TiDB v3.0. With this feature enabled, PD stores Region Meta information on local LevelDB and synchronizes the information among PD nodes through other mechanisms. ### PD routing information is not updated in time @@ -143,8 +143,8 @@ You can check **Worker pending tasks** under **Task** in the **TiKV Grafana** pa ![Check pd-worker](/media/best-practices/pd-worker-metrics.png) -Currently, pd-worker is optimized for better efficiency in [#5620](https://github.com/tikv/tikv/pull/5620) on [TiKV master](https://github.com/tikv/tikv/tree/master), which is applied since [v3.0.5](/releases/release-3.0.5.md#tikv). If you encounter a similar problem, it is recommended to upgrade to v3.0.5 or later versions. +pd-worker has been optimized for better performance since [v3.0.5](/releases/release-3.0.5.md#tikv). If you encounter a similar problem, it is recommended to upgrade to the latest version. ### Prometheus is slow to query metrics -In a large-scale cluster, as the number of TiKV instances increases, Prometheus has greater pressure to query metrics, making it slower for Grafana to display these metrics. To ease this problem, metrics pre-calculation is configured in v3.0. +In a large-scale cluster, as the number of TiKV instances increases, Prometheus has greater pressure to query metrics, making it slower for Grafana to display these metrics. To ease this problem, metrics pre-calculation is configured since v3.0. diff --git a/br/backup-and-restore-storages.md b/br/backup-and-restore-storages.md new file mode 100644 index 0000000000000..973f06a876715 --- /dev/null +++ b/br/backup-and-restore-storages.md @@ -0,0 +1,82 @@ +--- +title: BR storages +summary: Describes the storage URL format used in BR. +category: reference +--- + +# BR storages + +BR supports reading and writing data on the local filesystem, as well as on Amazon S3 and Google Cloud Storage. These are distinguished by the URL scheme in the `--storage` parameter passed into BR. + +## Schemes + +The following services are supported: + +| Service | Schemes | Example URL | +|---------|---------|-------------| +| Local filesystem, distributed on every node | local | `local:///path/to/dest/` | +| Amazon S3 and compatible services | s3 | `s3://bucket-name/prefix/of/dest/` | +| Google Cloud Storage (GCS) | gcs, gs | `gcs://bucket-name/prefix/of/dest/` | +| Write to nowhere (for benchmarking only) | noop | `noop://` | + +## Parameters + +Cloud storages such as S3 and GCS sometimes require additional configuration for connection. You can specify parameters for such configuration. 
For example: + +{{< copyable "shell-regular" >}} + +```shell +./br backup full -u 127.0.0.1:2379 -s 's3://bucket-name/prefix?region=us-west-2' +``` + +### S3 parameters + +| Parameter | Description | +|----------:|---------| +| `access-key` | The access key | +| `secret-access-key` | The secret access key | +| `region` | Service Region for Amazon S3 (default to `us-east-1`) | +| `use-accelerate-endpoint` | Whether to use the accelerate endpoint on Amazon S3 (default to `false`) | +| `endpoint` | URL of custom endpoint for S3-compatible services (for example, `https://s3.example.com/`) | +| `force-path-style` | Use path style access rather than virtual hosted style access (default to `false`) | +| `storage-class` | Storage class of the uploaded objects (for example, `STANDARD`, `STANDARD_IA`) | +| `sse` | Server-side encryption algorithm used to encrypt the upload (empty, `AES256` or `aws:kms`) | +| `sse-kms-key-id` | If `sse` is set to `aws:kms`, specifies the KMS ID | +| `acl` | Canned ACL of the uploaded objects (for example, `private`, `authenticated-read`) | + +> **Note:** +> +> It is not recommended to pass in the access key and secret access key directly in the storage URL, because these keys are logged in plain text. BR tries to infer these keys from the environment in the following order: + +1. `$AWS_ACCESS_KEY_ID` and `$AWS_SECRET_ACCESS_KEY` environment variables +2. `$AWS_ACCESS_KEY` and `$AWS_SECRET_KEY` environment variables +3. Shared credentials file on the BR node at the path specified by the `$AWS_SHARED_CREDENTIALS_FILE` environment variable +4. Shared credentials file on the BR node at `~/.aws/credentials` +5. Current IAM role of the Amazon EC2 container +6. Current IAM role of the Amazon ECS task + +### GCS parameters + +| Parameter | Description | +|----------:|---------| +| `credentials-file` | The path to the credentials JSON file on the TiDB node | +| `storage-class` | Storage class of the uploaded objects (for example, `STANDARD`, `COLDLINE`) | +| `predefined-acl` | Predefined ACL of the uploaded objects (for example, `private`, `project-private`) | + +When `credentials-file` is not specified, BR will try to infer the credentials from the environment, in the following order: + +1. Content of the file on the BR node at the path specified by the `$GOOGLE_APPLICATION_CREDENTIALS` environment variable +2. Content of the file on the BR node at `~/.config/gcloud/application_default_credentials.json` +3. When running in GCE or GAE, the credentials fetched from the metadata server. + +## Sending credentials to TiKV + +By default, when using S3 and GCS destinations, BR will send the credentials to every TiKV nodes to reduce setup complexity. + +However, this is unsuitable on cloud environment, where every node has their own role and permission. In such cases, you need to disable credentials sending with `--send-credentials-to-tikv=false` (or the short form `-c=0`): + +{{< copyable "shell-regular" >}} + +```shell +./br backup full -c=0 -u pd-service:2379 -s 's3://bucket-name/prefix' +``` diff --git a/br/backup-and-restore-tool.md b/br/backup-and-restore-tool.md index 6f59017f0b856..dd5a9512b897b 100644 --- a/br/backup-and-restore-tool.md +++ b/br/backup-and-restore-tool.md @@ -263,7 +263,7 @@ To restore the cluster data, use the `br restore` command. You can add the `full > - Data are replicated into multiple peers. When ingesting SSTs, these files have to be present on *all* peers. This is unlike back up where reading from a single node is enough. 
> - Where each peer is scattered to during restore is random. We don't know in advance which node will read which file. > -> These can be avoided using shared storage, e.g. mounting an NFS on the local path, or using S3. With network storage, every node can automatically read every SST file, so these caveats no longer apply. +> These can be avoided using shared storage, for example mounting an NFS on the local path, or using S3. With network storage, every node can automatically read every SST file, so these caveats no longer apply. ### Restore all the backup data diff --git a/command-line-flags-for-pd-configuration.md b/command-line-flags-for-pd-configuration.md index 0fd6f8919f535..6d387dff758fa 100644 --- a/command-line-flags-for-pd-configuration.md +++ b/command-line-flags-for-pd-configuration.md @@ -102,12 +102,6 @@ PD is configurable using command-line flags and environment variables. - The path of the PEM file including the X509 key, used to enable TLS - Default: "" -## `--namespace-classifier` - -- To specify the namespace classifier used by PD -- Default: "table" -- If you use TiKV separately, not in the entire TiDB cluster, it is recommended to configure the value to 'default'. - ## `--metrics-addr` - The address of Prometheus Pushgateway, which does not push data to Promethus by default. diff --git a/media/sqlgram/BRIETables.png b/media/sqlgram/BRIETables.png new file mode 100644 index 0000000000000..f737423edcb11 Binary files /dev/null and b/media/sqlgram/BRIETables.png differ diff --git a/media/sqlgram/BackupOption.png b/media/sqlgram/BackupOption.png new file mode 100644 index 0000000000000..4425d81c996e0 Binary files /dev/null and b/media/sqlgram/BackupOption.png differ diff --git a/media/sqlgram/BackupStmt.png b/media/sqlgram/BackupStmt.png new file mode 100644 index 0000000000000..5ffa691835a7d Binary files /dev/null and b/media/sqlgram/BackupStmt.png differ diff --git a/media/sqlgram/BackupTSO.png b/media/sqlgram/BackupTSO.png new file mode 100644 index 0000000000000..29ea0bf28a549 Binary files /dev/null and b/media/sqlgram/BackupTSO.png differ diff --git a/media/sqlgram/Boolean.png b/media/sqlgram/Boolean.png new file mode 100644 index 0000000000000..cedd61a775344 Binary files /dev/null and b/media/sqlgram/Boolean.png differ diff --git a/media/sqlgram/RestoreOption.png b/media/sqlgram/RestoreOption.png new file mode 100644 index 0000000000000..8ff807478df15 Binary files /dev/null and b/media/sqlgram/RestoreOption.png differ diff --git a/media/sqlgram/RestoreStmt.png b/media/sqlgram/RestoreStmt.png new file mode 100644 index 0000000000000..7fae5bc1152b7 Binary files /dev/null and b/media/sqlgram/RestoreStmt.png differ diff --git a/media/sqlgram/ShowBRIEStmt.png b/media/sqlgram/ShowBRIEStmt.png new file mode 100644 index 0000000000000..651e03a67802b Binary files /dev/null and b/media/sqlgram/ShowBRIEStmt.png differ diff --git a/media/sqlgram/ShowLikeOrWhereOpt.png b/media/sqlgram/ShowLikeOrWhereOpt.png index c2e3cd0807518..2f3506732ac39 100644 Binary files a/media/sqlgram/ShowLikeOrWhereOpt.png and b/media/sqlgram/ShowLikeOrWhereOpt.png differ diff --git a/sql-statements/sql-statement-backup.md b/sql-statements/sql-statement-backup.md new file mode 100644 index 0000000000000..284b9be0db186 --- /dev/null +++ b/sql-statements/sql-statement-backup.md @@ -0,0 +1,190 @@ +--- +title: BACKUP | TiDB SQL Statement Reference +summary: An overview of the usage of BACKUP for the TiDB database. 
+category: reference +--- + +# BACKUP + +This statement is used to perform a distributed backup of the TiDB cluster. + +The `BACKUP` statement uses the same engine as the [BR tool](/br/backup-and-restore-use-cases.md) does, except that the backup process is driven by TiDB itself rather than a separate BR tool. All benefits and warnings of BR also apply in this statement. + +Executing `BACKUP` requires `SUPER` privilege. Additionally, both the TiDB node executing the backup and all TiKV nodes in the cluster must have read or write permission to the destination. + +The `BACKUP` statement is blocked until the entire backup task is finished, failed, or canceled. A long-lasting connection should be prepared for executing `BACKUP`. The task can be canceled using the [`KILL TIDB QUERY`](/sql-statements/sql-statement-kill.md) statement. + +Only one `BACKUP` and [`RESTORE`](/sql-statements/sql-statement-restore.md) task can be executed at a time. If a `BACKUP` or `RESTORE` statement is already being executed on the same TiDB server, the new `BACKUP` execution will wait until all previous tasks are finished. + +`BACKUP` can only be used with "tikv" storage engine. Using `BACKUP` with the "mocktikv" engine will fail. + +## Synopsis + +**BackupStmt:** + +![BackupStmt](/media/sqlgram/BackupStmt.png) + +**BRIETables:** + +![BRIETables](/media/sqlgram/BRIETables.png) + +**BackupOption:** + +![BackupOption](/media/sqlgram/BackupOption.png) + +**Boolean:** + +![Boolean](/media/sqlgram/Boolean.png) + +**BackupTSO:** + +![BackupTSO](/media/sqlgram/BackupTSO.png) + +## Examples + +### Back up databases + +{{< copyable "sql" >}} + +```sql +BACKUP DATABASE `test` TO 'local:///mnt/backup/2020/04/'; +``` + +```sql ++------------------------------+-----------+-----------------+---------------------+---------------------+ +| Destination | Size | BackupTS | Queue Time | Execution Time | ++------------------------------+-----------+-----------------+---------------------+---------------------+ +| local:///mnt/backup/2020/04/ | 248665063 | 416099531454472 | 2020-04-12 23:09:48 | 2020-04-12 23:09:48 | ++------------------------------+-----------+-----------------+---------------------+---------------------+ +1 row in set (58.453 sec) +``` + +In the example above, the `test` database is backed up into the local filesystem. The data is saved as SST files in the `/mnt/backup/2020/04/` directories distributed among all TiDB and TiKV nodes. + +The first row of the result above is described as follows: + +| Column | Description | +| :-------- | :--------- | +| `Destination` | The destination URL | +| `Size` | The total size of the backup archive, in bytes | +| `BackupTS` | The TSO of the snapshot when the backup is created (useful for [incremental backup](#incremental-backup)) | +| `Queue Time` | The timestamp (in current time zone) when the `BACKUP` task is queued. | +| `Execution Time` | The timestamp (in current time zone) when the `BACKUP` task starts to run. | + +### Back up tables + +{{< copyable "sql" >}} + +```sql +BACKUP TABLE `test`.`sbtest01` TO 'local:///mnt/backup/sbtest01/'; +``` + +{{< copyable "sql" >}} + +```sql +BACKUP TABLE sbtest02, sbtest03, sbtest04 TO 'local:///mnt/backup/sbtest/'; +``` + +### Back up the entire cluster + +{{< copyable "sql" >}} + +```sql +BACKUP DATABASE * TO 'local:///mnt/backup/full/'; +``` + +Note that the system tables (`mysql.*`, `INFORMATION_SCHEMA.*`, `PERFORMANCE_SCHEMA.*`, …) will not be included into the backup. 
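+
+If you plan to follow a full cluster backup with incremental backups, record the `BackupTS` value from the result. The following sketch (the incremental destination path and the TSO value are illustrative, not output from a real run) shows how that TSO can later be passed to the `LAST_BACKUP` option described in [Incremental backup](#incremental-backup):
+
+{{< copyable "sql" >}}
+
+```sql
+-- A full cluster backup returns a BackupTS, for example 416099531454472.
+BACKUP DATABASE * TO 'local:///mnt/backup/full/';
+
+-- Later, pass that TSO as LAST_BACKUP so that only the changes made
+-- after the full backup are included.
+BACKUP DATABASE * TO 'local:///mnt/backup/full-inc-1/'
+    LAST_BACKUP = 416099531454472;
+```
+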
+ +### Remote destinations + +BR supports backing up data to S3 or GCS: + +{{< copyable "sql" >}} + +```sql +BACKUP DATABASE `test` TO 's3://example-bucket-2020/backup-05/?region=us-west-2'; +``` + +The URL syntax is further explained in [BR storages](/br/backup-and-restore-storages.md). + +When running on cloud environment where credentials should not be distributed, set the `SEND_CREDENTIALS_TO_TIKV` option to `FALSE`: + +{{< copyable "sql" >}} + +```sql +BACKUP DATABASE `test` TO 's3://example-bucket-2020/backup-05/?region=us-west-2' + SEND_CREDENTIALS_TO_TIKV = FALSE; +``` + +### Performance fine-tuning + +Use `RATE_LIMIT` to limit the average upload speed per TiKV node to reduce network bandwidth. + +By default, every TiKV node would run 4 backup threads. This value can be adjusted with the `CONCURRENCY` option. + +Before backup is completed, `BACKUP` would perform a checksum against the data on the cluster to verify correctness. This step can be disabled with the `CHECKSUM` option if you are confident that this is unnecessary. + +{{< copyable "sql" >}} + +```sql +BACKUP DATABASE `test` TO 's3://example-bucket-2020/backup-06/' + RATE_LIMIT = 120 MB/SECOND + CONCURRENCY = 8 + CHECKSUM = FALSE; +``` + +### Snapshot + +Specify a timestamp, TSO or relative time to backup historical data. + +{{< copyable "sql" >}} + +```sql +-- relative time +BACKUP DATABASE `test` TO 'local:///mnt/backup/hist01' + SNAPSHOT = 36 HOUR AGO; + +-- timestamp (in current time zone) +BACKUP DATABASE `test` TO 'local:///mnt/backup/hist02' + SNAPSHOT = '2020-04-01 12:00:00'; + +-- timestamp oracle +BACKUP DATABASE `test` TO 'local:///mnt/backup/hist03' + SNAPSHOT = 415685305958400; +``` + +The supported units for relative time are: + +* MICROSECOND +* SECOND +* MINUTE +* HOUR +* DAY +* WEEK + +Note that, following SQL standard, the units are always singular. + +### Incremental backup + +Supply the `LAST_BACKUP` option to only backup the changes between the last backup to the current snapshot. + +{{< copyable "sql" >}} + +```sql +-- timestamp (in current time zone) +BACKUP DATABASE `test` TO 'local:///mnt/backup/hist02' + LAST_BACKUP = '2020-04-01 12:00:00'; + +-- timestamp oracle +BACKUP DATABASE `test` TO 'local:///mnt/backup/hist03' + LAST_BACKUP = 415685305958400; +``` + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. + +## See also + +* [RESTORE](/sql-statements/sql-statement-restore.md) +* [SHOW BACKUPS](/sql-statements/sql-statement-show-backups.md) diff --git a/sql-statements/sql-statement-restore.md b/sql-statements/sql-statement-restore.md new file mode 100644 index 0000000000000..d47aabb676dfd --- /dev/null +++ b/sql-statements/sql-statement-restore.md @@ -0,0 +1,159 @@ +--- +title: RESTORE | TiDB SQL Statement Reference +summary: An overview of the usage of RESTORE for the TiDB database. +category: reference +--- + +# RESTORE + +This statement performs a distributed restore from a backup archive previously produced by a [`BACKUP` statement](/sql-statements/sql-statement-backup.md). + +The `RESTORE` statement uses the same engine as the [BR tool](/br/backup-and-restore-use-cases.md), except that the restore process is driven by TiDB itself rather than a separate BR tool. All benefits and caveats of BR also apply here. In particular, **`RESTORE` is currently not ACID-compliant**. 
Before running `RESTORE`, ensure that the following requirements are met: + +* The cluster is "offline", and the current TiDB session is the only active SQL connection to access all tables being restored. +* When a full restore is being performed, the tables being restored should not already exist, because existing data might be overridden and causes inconsistency between the data and indices. +* When an incremental restore is being performed, the tables should be at the exact same state as the `LAST_BACKUP` timestamp when the backup is created. + +Running `RESTORE` requires `SUPER` privilege. Additionally, both the TiDB node executing the backup and all TiKV nodes in the cluster must have read permission from the destination. + +The `RESTORE` statement is blocking, and will finish only after the entire backup task is finished, failed, or canceled. A long-lasting connection should be prepared for running `RESTORE`. The task can be canceled using the [`KILL TIDB QUERY`](/sql-statements/sql-statement-kill.md) statement. + +Only one `BACKUP` and `RESTORE` task can be executed at a time. If a `BACKUP` or `RESTORE` task is already running on the same TiDB server, the new `RESTORE` execution will wait until all previous tasks are done. + +`RESTORE` can only be used with "tikv" storage engine. Using `RESTORE` with the "mocktikv" engine will fail. + +## Synopsis + +**RestoreStmt:** + +![RestoreStmt](/media/sqlgram/RestoreStmt.png) + +**BRIETables:** + +![BRIETables](/media/sqlgram/BRIETables.png) + +**RestoreOption:** + +![RestoreOption](/media/sqlgram/RestoreOption.png) + +**Boolean:** + +![Boolean](/media/sqlgram/Boolean.png) + +## Examples + +### Restore from backup archive + +{{< copyable "sql" >}} + +```sql +RESTORE DATABASE * FROM 'local:///mnt/backup/2020/04/'; +``` + +```sql ++------------------------------+-----------+----------+---------------------+---------------------+ +| Destination | Size | BackupTS | Queue Time | Execution Time | ++------------------------------+-----------+----------+---------------------+---------------------+ +| local:///mnt/backup/2020/04/ | 248665063 | 0 | 2020-04-21 17:16:55 | 2020-04-21 17:16:55 | ++------------------------------+-----------+----------+---------------------+---------------------+ +1 row in set (28.961 sec) +``` + +In the example above, all data is restored from a backup archive at the local filesystem. The data is read as SST files from the `/mnt/backup/2020/04/` directories distributed among all TiDB and TiKV nodes. + +The first row of the result above is described as follows: + +| Column | Description | +| :-------- | :--------- | +| `Destination` | The destination URL to read from | +| `Size` | The total size of the backup archive, in bytes | +| `BackupTS` | (not used) | +| `Queue Time` | The timestamp (in current time zone) when the `RESTORE` task was queued. | +| `Execution Time` | The timestamp (in current time zone) when the `RESTORE` task starts to run. | + +### Partial restore + +You can specify which databases or tables to restore. If some databases or tables are missing from the backup archive, they will be ignored, and thus `RESTORE` would complete without doing anything. 
+ +{{< copyable "sql" >}} + +```sql +RESTORE DATABASE `test` FROM 'local:///mnt/backup/2020/04/'; +``` + +{{< copyable "sql" >}} + +```sql +RESTORE TABLE `test`.`sbtest01`, `test`.`sbtest02` FROM 'local:///mnt/backup/2020/04/'; +``` + +### Remote destinations + +BR supports restoring data from S3 or GCS: + +{{< copyable "sql" >}} + +```sql +RESTORE DATABASE * FROM 's3://example-bucket-2020/backup-05/?region=us-west-2'; +``` + +The URL syntax is further explained in [BR storages](/br/backup-and-restore-storages.md). + +When running on cloud environment where credentials should not be distributed, set the `SEND_CREDENTIALS_TO_TIKV` option to `FALSE`: + +{{< copyable "sql" >}} + +```sql +RESTORE DATABASE * FROM 's3://example-bucket-2020/backup-05/?region=us-west-2' + SEND_CREDENTIALS_TO_TIKV = FALSE; +``` + +### Performance fine-tuning + +Use `RATE_LIMIT` to limit the average download speed per TiKV node to reduce network bandwidth. + +By default, TiDB node would run 128 restore threads. This value can be adjusted with the `CONCURRENCY` option. + +Before restore is completed, `RESTORE` would perform a checksum against the data from the archive to verify correctness. This step can be disabled with the `CHECKSUM` option if you are confident that this is unnecessary. + +{{< copyable "sql" >}} + +```sql +RESTORE DATABASE * FROM 's3://example-bucket-2020/backup-06/' + RATE_LIMIT = 120 MB/SECOND + CONCURRENCY = 64 + CHECKSUM = FALSE; +``` + +### Incremental restore + +There is no special syntax to perform incremental restore. TiDB will recognize whether the backup archive is full or incremental and take appropriate action. You only need to apply each incremental restore in correct order. + +For instance, if a backup task is created as follows: + +{{< copyable "sql" >}} + +```sql +BACKUP DATABASE `test` TO 's3://example-bucket/full-backup' SNAPSHOT = 413612900352000; +BACKUP DATABASE `test` TO 's3://example-bucket/inc-backup-1' SNAPSHOT = 414971854848000 LAST_BACKUP = 413612900352000; +BACKUP DATABASE `test` TO 's3://example-bucket/inc-backup-2' SNAPSHOT = 416353458585600 LAST_BACKUP = 414971854848000; +``` + +then the same order should be applied in the restore: + +{{< copyable "sql" >}} + +```sql +RESTORE DATABASE * FROM 's3://example-bucket/full-backup'; +RESTORE DATABASE * FROM 's3://example-bucket/inc-backup-1'; +RESTORE DATABASE * FROM 's3://example-bucket/inc-backup-2'; +``` + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. + +## See also + +* [BACKUP](/sql-statements/sql-statement-backup.md) +* [SHOW RESTORES](/sql-statements/sql-statement-show-backups.md) diff --git a/sql-statements/sql-statement-show-backups.md b/sql-statements/sql-statement-show-backups.md new file mode 100644 index 0000000000000..9ae112ee44335 --- /dev/null +++ b/sql-statements/sql-statement-show-backups.md @@ -0,0 +1,99 @@ +--- +title: SHOW [BACKUPS|RESTORES] | TiDB SQL Statement Reference +summary: An overview of the usage of SHOW [BACKUPS|RESTORES] for the TiDB database. +category: reference +--- + +# SHOW [BACKUPS|RESTORES] + +This statement shows a list of all queued and running [`BACKUP`](/sql-statements/sql-statement-backup.md) and [`RESTORE`](/sql-statements/sql-statement-restore.md) tasks. + +Use `SHOW BACKUPS` to query `BACKUP` tasks and use `SHOW RESTORES` to query `RESTORE` tasks. Both statements require `SUPER` privilege to run. 
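+
+Because only one `BACKUP` or `RESTORE` task can run on a TiDB server at a time, these statements are one quick way to check whether the current server already has a task queued or running before you submit a new one. A minimal sketch:
+
+{{< copyable "sql" >}}
+
+```sql
+-- List backup tasks that are queued or running on this TiDB server.
+SHOW BACKUPS;
+
+-- List restore tasks that are queued or running on this TiDB server.
+SHOW RESTORES;
+```
+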
+ +## Synopsis + +**ShowBRIEStmt:** + +![ShowBRIEStmt](/media/sqlgram/ShowBRIEStmt.png) + +**ShowLikeOrWhereOpt:** + +![ShowLikeOrWhereOpt](/media/sqlgram/ShowLikeOrWhereOpt.png) + +## Examples + +In one connection, execute the following statement: + +{{< copyable "sql" >}} + +```sql +BACKUP DATABASE `test` TO 's3://example-bucket/backup-01/?region=us-west-1'; +``` + +Before the backup completes, run `SHOW BACKUPS` in a new connection: + +{{< copyable "sql" >}} + +```sql +SHOW BACKUPS; +``` + +```sql ++--------------------------------+---------+----------+---------------------+---------------------+-------------+------------+ +| Destination | State | Progress | Queue_Time | Execution_Time | Finish_Time | Connection | ++--------------------------------+---------+----------+---------------------+---------------------+-------------+------------+ +| s3://example-bucket/backup-01/ | Backup | 98.38 | 2020-04-12 23:09:03 | 2020-04-12 23:09:25 | NULL | 4 | ++--------------------------------+---------+----------+---------------------+---------------------+-------------+------------+ +1 row in set (0.00 sec) +``` + +The first row of the result above is described as follows: + +| Column | Description | +| :-------- | :--------- | +| `Destination` | The destination URL (with all parameters stripped to avoid leaking secret keys) | +| `State` | State of the task | +| `Progress` | Estimated progress in the current state as a percentage | +| `Queue_Time` | When the task was queued | +| `Execution_Time` | When the task was started; the value is `0000-00-00 00:00:00` for queueing tasks | +| `Finish_Time` | (not used for now) | +| `Connection` | Connection ID running this task | + +The connection ID can be used to cancel a backup/restore task via the [`KILL TIDB QUERY`](/sql-statements/sql-statement-kill.md) statement. + +{{< copyable "sql" >}} + +```sql +KILL TIDB QUERY 4; +``` + +```sql +Query OK, 0 rows affected (0.00 sec) +``` + +### Filtering + +Use the `LIKE` clause to filter out tasks by matching the destination URL against a wildcard expression. + +{{< copyable "sql" >}} + +```sql +SHOW BACKUPS LIKE 's3://%'; +``` + +Use the `WHERE` clause to filter by columns. + +{{< copyable "sql" >}} + +```sql +SHOW BACKUPS WHERE `Progress` < 25.0; +``` + +## MySQL compatibility + +This statement is a TiDB extension to MySQL syntax. + +## See also + +* [BACKUP](/sql-statements/sql-statement-backup.md) +* [RESTORE](/sql-statements/sql-statement-restore.md) diff --git a/tiflash/maintain-tiflash.md b/tiflash/maintain-tiflash.md index 7a3434fcbf08b..521d689f5579c 100644 --- a/tiflash/maintain-tiflash.md +++ b/tiflash/maintain-tiflash.md @@ -7,7 +7,7 @@ aliases: ['/docs/stable/reference/tiflash/maintain/'] # Maintain a TiFlash Cluster -This document describes how to perform common operations when you maintain a TiFlash cluster, including checking the TiFlash version, taking TiFlash nodes down, and troubleshooting TiFlash. This document also introduces critical logs and a system table of TiFlash. +This document describes how to perform common operations when you maintain a TiFlash cluster, including checking the TiFlash version, and taking TiFlash nodes down. This document also introduces critical logs and a system table of TiFlash. 
## Check the TiFlash version @@ -102,70 +102,6 @@ To manually delete the replication rules in PD, take the following steps: curl -v -X DELETE http://:/pd/api/v1/config/rule/tiflash/table-45-r ``` -## TiFlash troubleshooting - -This section describes some commonly encountered issues when using TiFlash, the reasons, and the solutions. - -### TiFlash replica is always unavailable - -This is because TiFlash is in an abnormal state caused by configuration errors or environment issues. Take the following steps to identify the faulty component: - -1. Check whether PD enables the `Placement Rules` feature (to enable the feature, see the step 2 of [Add TiFlash component to an existing TiDB cluster](/tiflash/deploy-tiflash.md#add-tiflash-component-to-an-existing-tidb-cluster): - - {{< copyable "shell-regular" >}} - - ```shell - echo 'config show replication' | /path/to/pd-ctl -u http://: - ``` - - The expected result is `"enable-placement-rules": "true"`. - -2. Check whether the TiFlash process is working correctly by viewing `UpTime` on the TiFlash-Summary monitoring panel. - -3. Check whether the TiFlash proxy status is normal through `pd-ctl`. - - {{< copyable "shell-regular" >}} - - ```shell - echo "store" | /path/to/pd-ctl -u http://: - ``` - - The TiFlash proxy's `store.labels` includes information such as `{"key": "engine", "value": "tiflash"}`. You can check this information to confirm a TiFlash proxy. - -4. Check whether `pd buddy` can correctly print the logs (the log path is the value of `log` in the [flash.flash_cluster] configuration item; the default log path is under the `tmp` directory configured in the TiFlash configuration file). - -5. Check whether the value of `max-replicas` in PD is less than or equal to the number of TiKV nodes in the cluster. If not, PD cannot replicate data to TiFlash: - - {{< copyable "shell-regular" >}} - - ```shell - echo 'config show replication' | /path/to/pd-ctl -u http://: - ``` - - Reconfirm the value of `max-replicas`. - -6. Check whether the remaining disk space of the machine (where `store` of the TiFlash node is) is sufficient. By default, when the remaining disk space is less than 20% of the `store` capacity (which is controlled by the `low-space-ratio` parameter), PD cannot schedule data to this TiFlash node. - -### TiFlash query time is unstable, and the error log prints many `Lock Exception` messages - -This is because large amounts of data are written to the cluster, which causes that the TiFlash query encounters a lock and requires query retry. - -You can set the query timestamp to one second earlier in TiDB. For example, if the current time is '2020-04-08 20:15:01', you can execute `set @@tidb_snapshot='2020-04-08 20:15:00';` before you execute the query. This makes less TiFlash queries encounter a lock and mitigates the risk of unstable query time. - -### Some queries return the `Region Unavailable` error - -If the load pressure on TiFlash is too heavy and it causes that TiFlash data replication falls behind, some queries might return the `Region Unavailable` error. - -In this case, you can balance the load pressure by adding more TiFlash nodes. - -### Data file corruption - -Take the following steps to handle the data file corruption: - -1. Refer to [Take a TiFlash node down](#take-a-tiflash-node-down) to take the corresponding TiFlash node down. -2. Delete the related data of the TiFlash node. -3. Redeploy the TiFlash node in the cluster. 
## TiFlash critical logs

| Log Information | Log Description |
diff --git a/tiflash/troubleshoot-tiflash.md b/tiflash/troubleshoot-tiflash.md
new file mode 100644
index 0000000000000..c246d5d09c747
--- /dev/null
+++ b/tiflash/troubleshoot-tiflash.md
@@ -0,0 +1,69 @@
+---
+title: Troubleshoot a TiFlash Cluster
+summary: Learn common operations when you troubleshoot a TiFlash cluster.
+category: reference
+---
+
+# Troubleshoot a TiFlash Cluster
+
+This document describes some commonly encountered issues when using TiFlash, their causes, and the solutions.
+
+## TiFlash replica is always unavailable
+
+This is because TiFlash is in an abnormal state caused by configuration errors or environment issues. Take the following steps to identify the faulty component:
+
+1. Check whether PD has enabled the `Placement Rules` feature (to enable the feature, see step 2 of [Add TiFlash component to an existing TiDB cluster](/tiflash/deploy-tiflash.md#add-tiflash-component-to-an-existing-tidb-cluster)):
+
+    {{< copyable "shell-regular" >}}
+
+    ```shell
+    echo 'config show replication' | /path/to/pd-ctl -u http://:
+    ```
+
+    The expected result is `"enable-placement-rules": "true"`.
+
+2. Check whether the TiFlash process is working correctly by viewing `UpTime` on the TiFlash-Summary monitoring panel.
+
+3. Check whether the TiFlash proxy status is normal through `pd-ctl`:
+
+    {{< copyable "shell-regular" >}}
+
+    ```shell
+    echo "store" | /path/to/pd-ctl -u http://:
+    ```
+
+    The TiFlash proxy's `store.labels` includes information such as `{"key": "engine", "value": "tiflash"}`. You can check this information to confirm a TiFlash proxy.
+
+4. Check whether `pd buddy` can correctly print the logs (the log path is the value of `log` in the `[flash.flash_cluster]` configuration item; the default log path is under the `tmp` directory configured in the TiFlash configuration file).
+
+5. Check whether the value of `max-replicas` in PD is less than or equal to the number of TiKV nodes in the cluster. If not, PD cannot replicate data to TiFlash:
+
+    {{< copyable "shell-regular" >}}
+
+    ```shell
+    echo 'config show replication' | /path/to/pd-ctl -u http://:
+    ```
+
+    Reconfirm the value of `max-replicas`.
+
+6. Check whether the remaining disk space of the machine (where the `store` of the TiFlash node is located) is sufficient. By default, when the remaining disk space is less than 20% of the `store` capacity (which is controlled by the `low-space-ratio` parameter), PD cannot schedule data to this TiFlash node.
+
+## TiFlash query time is unstable, and the error log prints many `Lock Exception` messages
+
+This is because large amounts of data are written to the cluster, which causes TiFlash queries to encounter locks and to be retried.
+
+You can set the query timestamp to one second earlier in TiDB. For example, if the current time is '2020-04-08 20:15:01', you can execute `set @@tidb_snapshot='2020-04-08 20:15:00';` before you execute the query. This makes fewer TiFlash queries encounter locks and mitigates the risk of unstable query time.
+
+## Some queries return the `Region Unavailable` error
+
+If the load pressure on TiFlash is too heavy and causes TiFlash data replication to fall behind, some queries might return the `Region Unavailable` error.
+
+In this case, you can balance the load pressure by adding more TiFlash nodes.
+
+## Data file corruption
+
+Take the following steps to handle data file corruption:
+
+1. Refer to [Take a TiFlash node down](/tiflash/maintain-tiflash.md#take-a-tiflash-node-down) to take the corresponding TiFlash node down.
+2. Delete the related data of the TiFlash node.
+3. Redeploy the TiFlash node in the cluster.