2 changes: 1 addition & 1 deletion TOC.md
@@ -23,7 +23,7 @@
- Develop
- [Overview](/develop/dev-guide-overview.md)
- Quick Start
- [Build a TiDB Cluster in TiDB Cloud (Serverless Tier)](/develop/dev-guide-build-cluster-in-cloud.md)
- [Build a TiDB Serverless Cluster](/develop/dev-guide-build-cluster-in-cloud.md)
- [CRUD SQL in TiDB](/develop/dev-guide-tidb-crud-sql.md)
- Example Applications
- [Golang](/develop/dev-guide-sample-application-golang.md)
158 changes: 158 additions & 0 deletions _docHome.md
@@ -0,0 +1,158 @@
---
title: PingCAP Documentation
hide_sidebar: true
hide_commit: true
hide_leftNav: true
---

<DocHomeContainer title="PingCAP Documentation" subTitle="Explore the how-to guides and references you need to use TiDB Cloud and TiDB, migrate data, and build your applications on the database.">

<DocHomeSection label="TiDB Cloud" anchor="tidb-cloud" id="tidb-cloud">

TiDB Cloud is a fully-managed Database-as-a-Service (DBaaS) that brings everything great about TiDB to your cloud, and lets you focus on your applications, not the complexities of your database.

<DocHomeCardContainer>

<DocHomeCard href="https://docs.pingcap.com/tidbcloud" icon="doc2" label="TiDB Cloud Docs">

See the documentation for TiDB Cloud

</DocHomeCard>

<DocHomeCard href="https://docs.pingcap.com/tidbcloud/tidb-cloud-quickstart" icon="cloud5" label="Get Started with TiDB Cloud">

Guides you through an easy way to get started with TiDB Cloud

</DocHomeCard>

<DocHomeCard href="https://docs.pingcap.com/tidbcloud/tidb-cloud-poc" icon="cloud3" label="Perform a PoC with TiDB Cloud">

Helps you quickly complete a Proof of Concept (PoC) of TiDB Cloud

</DocHomeCard>

</DocHomeCardContainer>

Get the power of a cloud-native, distributed SQL database built for real-time analytics in a fully-managed service.

<a href="https://tidbcloud.com/free-trial" class="button" target="_blank" referrerpolicy="no-referrer-when-downgrade">Try Free</a>

</DocHomeSection>

<DocHomeSection label="TiDB" anchor="tidb" id="tidb">

<!-- Localization note for TiDB:

- English: use distributed SQL, and start to emphasize HTAP
- Chinese: can keep "NewSQL" and emphasize one-stop real-time HTAP ("一栈式实时 HTAP")
- Japanese: use NewSQL because it is well-recognized

-->

TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability. You can deploy TiDB in a self-hosted environment or in the cloud.

<DocHomeCardContainer>

<DocHomeCard href="https://docs.pingcap.com/tidb/stable" icon="doc1" label="TiDB Docs">

See the documentation for TiDB

</DocHomeCard>

<DocHomeCard href="https://docs.pingcap.com/tidb/stable/quick-start-with-tidb" icon="doc5" label="Get Started with TiDB">

Walks you through the quickest way to get started with TiDB

</DocHomeCard>

<DocHomeCard href="https://docs.pingcap.com/tidb/stable/production-deployment-using-tiup" icon="cloud7" label="Deploy a Local TiDB Cluster">

Learn how to deploy TiDB locally for production

</DocHomeCard>

</DocHomeCardContainer>

The open-source TiDB platform is released under the Apache 2.0 license and is supported by the community.

<a href="https://en.pingcap.com/download/" class="button" target="_blank" referrerpolicy="no-referrer-when-downgrade">Download</a>

</DocHomeSection>

<DocHomeSection label="Developers" anchor="developers" id="developers">

<DocHomeCardContainer>

<DocHomeCard href="https://docs.pingcap.com/tidb/stable/dev-guide-overview" icon="doc8" label="Developer Guide">

Documentation for TiDB application developers

</DocHomeCard>

<DocHomeCard href="https://docs.pingcap.com/tidbcloud/dev-guide-overview" icon="cloud-dev" label="Developer Guide">

Documentation for TiDB Cloud application developers

</DocHomeCard>

</DocHomeCardContainer>

</DocHomeSection>

<DocHomeSection label="More resources" anchor="resources" id="resources">

<DocHomeCardContainer>

<DocHomeCard href="https://en.pingcap.com/education/" icon="cloud1" label="PingCAP Education">

Learn TiDB and TiDB Cloud through well-designed online courses and instructor-led training

</DocHomeCard>

<DocHomeCard href="https://en.pingcap.com/community/" icon="doc9" label="Community">

Join us on Slack or become a contributor

</DocHomeCard>

<DocHomeCard href="https://en.pingcap.com/blog/" icon="doc10" label="Blog Posts">

Read great articles about TiDB and TiDB Cloud

</DocHomeCard>

<DocHomeCard href="https://en.pingcap.com/videos/" icon="doc11" label="Videos">

See a compilation of short videos describing TiDB and a variety of use cases

</DocHomeCard>

<DocHomeCard href="https://en.pingcap.com/event/" icon="events" label="Events">

Learn about events hosted by PingCAP and the community

</DocHomeCard>

<DocHomeCard href="https://en.pingcap.com/ebook-whitepaper/" icon="papers" label="eBooks & Papers">

Download eBooks and papers

</DocHomeCard>

<DocHomeCard href="https://ossinsight.io/" icon="ossinsight" label="OSS Insight">

A powerful insight tool that analyzes any GitHub repository in depth, powered by TiDB Cloud

</DocHomeCard>

<DocHomeCard href="https://github.com/pingcap/docs/blob/master/CONTRIBUTING.md" icon="contributor" label="Contribute">

Let’s work together to make the documentation better!

</DocHomeCard>

</DocHomeCardContainer>

</DocHomeSection>

</DocHomeContainer>
146 changes: 146 additions & 0 deletions br/br-pitr-guide.md
@@ -0,0 +1,146 @@
---
title: TiDB Log Backup and PITR Guide
summary: Learn how to perform log backup and PITR in TiDB.
aliases: ['/tidb/dev/pitr-usage']
---

# TiDB Log Backup and PITR Guide

A full backup (snapshot backup) contains the full cluster data at a certain point in time, while TiDB log backup continuously backs up data written by applications to a specified storage. If you want to restore to an arbitrary point in time, that is, to perform point-in-time recovery (PITR), you can [start log backup](#start-log-backup) and [run full backup regularly](#run-full-backup-regularly).

Before you back up or restore data using the br command-line tool (hereinafter referred to as `br`), you need to [install `br`](/br/br-use-overview.md#deploy-and-use-br) first.
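
If you manage your cluster with TiUP, a minimal installation sketch (assuming TiUP itself is already installed) looks like this:

```shell
# Install the br component via TiUP (assumes TiUP is already installed).
tiup install br

# Verify that br is available.
tiup br --version
```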

## Back up TiDB cluster

### Start log backup

> **Note:**
>
> - The following examples assume that Amazon S3 access keys and secret keys are used to authorize permissions. If IAM roles are used to authorize permissions, you need to set `--send-credentials-to-tikv` to `false`.
> - If you use other storage systems or authorization methods, adjust the parameter settings according to [Backup Storages](/br/backup-and-restore-storages.md).

To start a log backup, run `br log start`. Each cluster can run only one log backup task at a time.

```shell
tiup br log start --task-name=pitr --pd "${PD_IP}:2379" \
--storage 's3://backup-101/logbackup?access-key=${access-key}&secret-access-key=${secret-access-key}'
```

After the log backup task starts, it runs in the background of the TiDB cluster until you stop it manually. During this process, the TiDB change logs are regularly backed up to the specified storage in small batches. To query the status of the log backup task, run the following command:

```shell
tiup br log status --task-name=pitr --pd "${PD_IP}:2379"
```

Expected output:

```
● Total 1 Tasks.
> #1 <
name: pitr
status: ● NORMAL
start: 2022-05-13 11:09:40.7 +0800
end: 2035-01-01 00:00:00 +0800
storage: s3://backup-101/logbackup
speed(est.): 0.00 ops/s
checkpoint[global]: 2022-05-13 11:31:47.2 +0800; gap=4m53s
```
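
If you want to watch the task from a script, one minimal approach is to grep the human-readable output shown above. This is a hedged sketch, not a stable interface; the output format may change between versions:

```shell
# Alert if the log backup task is not reported as NORMAL.
# Assumes the human-readable status format shown above.
if ! tiup br log status --task-name=pitr --pd "${PD_IP}:2379" | grep -q "NORMAL"; then
    echo "WARN: log backup task 'pitr' is not in NORMAL state" >&2
fi
```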

### Run full backup regularly

The snapshot backup can be used as a method of full backup. You can run `br backup full` to back up the cluster snapshot to the backup storage according to a fixed schedule (for example, every 2 days).

```shell
tiup br backup full --pd "${PD_IP}:2379" \
--storage 's3://backup-101/snapshot-${date}?access-key=${access-key}&secret-access-key=${secret-access-key}'
```
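
For example, a minimal scheduling sketch using cron (assumptions: `tiup` is on the cron user's `PATH`, and a date-stamped prefix stands in for `${date}` above; in cron, `%` must be escaped as `\%`):

```shell
# Hypothetical crontab entry: take a full backup at 01:00 every second day.
0 1 */2 * * tiup br backup full --pd "${PD_IP}:2379" --storage "s3://backup-101/snapshot-$(date +\%F)?access-key=${access-key}&secret-access-key=${secret-access-key}"
```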

## Run PITR

To restore the cluster to any point in time within the backup retention period, you can use `br restore point`. When you run this command, you need to specify the **time point you want to restore to**, **the latest snapshot backup before that time point**, and the **log backup data**. BR automatically determines and reads the data needed for the restore, and then restores it to the specified cluster in order.

```shell
tiup br restore point --pd "${PD_IP}:2379" \
--storage='s3://backup-101/logbackup?access-key=${access-key}&secret-access-key=${secret-access-key}' \
--full-backup-storage='s3://backup-101/snapshot-${date}?access-key=${access-key}&secret-access-key=${secret-access-key}' \
--restored-ts '2022-05-15 18:00:00+0800'
```

During data restore, you can view the progress through the progress bar in the terminal. The restore is divided into two phases: full restore and log restore (restoring meta files and KV files). After each phase is completed, `br` outputs information such as the restore time and data size.

```shell
Full Restore <--------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
*** ["Full Restore success summary"] ****** [total-take=xxx.xxxs] [restore-data-size(after-compressed)=xxx.xxx] [Size=xxxx] [BackupTS={TS}] [total-kv=xxx] [total-kv-size=xxx] [average-speed=xxx]
Restore Meta Files <--------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Restore KV Files <----------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
*** ["restore log success summary"] [total-take=xxx.xx] [restore-from={TS}] [restore-to={TS}] [total-kv-count=xxx] [total-size=xxx]
```
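
After the restore finishes, a quick sanity check that the data is back at the expected state can be worthwhile. The following is a minimal sketch; the `mysql` client, the default port 4000, and the `app_db.orders` table are all assumptions for illustration:

```shell
# Hypothetical post-restore check: count rows in a table you expect to exist.
mysql --host "${TIDB_IP}" --port 4000 -u root -p \
    -e "SELECT COUNT(*) FROM app_db.orders;"
```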

## Clean up outdated data

As described in the [Usage Overview of TiDB Backup and Restore](/br/br-use-overview.md):

To perform PITR, you need to restore the full backup before the restore point and the log backup between the full backup point and the restore point. Therefore, for log backups that exceed the backup retention period, you can use `br log truncate` to delete log backups written before a specified time point. **It is recommended to delete only the log backup written before the full snapshot.**

The following steps describe how to clean up backup data that exceeds the backup retention period:

1. Get the **last full backup** outside the backup retention period.
2. Use the `validate` command to get the time point corresponding to that backup. For example, if the backup data before 2022/09/01 needs to be cleaned up, look for the last full backup before this time point and ensure that it will not be cleaned up.

```shell
FULL_BACKUP_TS=`tiup br validate decode --field="end-version" --storage "s3://backup-101/snapshot-${date}?access-key=${access-key}&secret-access-key=${secret-access-key}" | tail -n1`
```

3. Delete log backup data earlier than the snapshot backup `FULL_BACKUP_TS`:

```shell
tiup br log truncate --until=${FULL_BACKUP_TS} --storage='s3://backup-101/logbackup?access-key=${access-key}&secret-access-key=${secret-access-key}'
```

4. Delete snapshot data earlier than the snapshot backup `FULL_BACKUP_TS`:

```shell
# `rm` cannot operate on S3 paths directly; this assumes the AWS CLI is installed and configured.
aws s3 rm --recursive "s3://backup-101/snapshot-${date}"
```
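
Steps 2 through 4 can also be combined into a single periodic cleanup job. The following is a hedged sketch under the same placeholders as above; the AWS CLI call is an assumption:

```shell
#!/bin/bash
# Cleanup sketch: fill in the same placeholders as in the steps above.

# 1. Get the timestamp of the last full backup to keep.
FULL_BACKUP_TS=$(tiup br validate decode --field="end-version" \
    --storage "s3://backup-101/snapshot-${date}?access-key=${access-key}&secret-access-key=${secret-access-key}" | tail -n1)

# 2. Delete log backup data earlier than that snapshot.
tiup br log truncate --until=${FULL_BACKUP_TS} \
    --storage='s3://backup-101/logbackup?access-key=${access-key}&secret-access-key=${secret-access-key}'

# 3. Delete the outdated snapshot data itself (assumes the AWS CLI is configured).
aws s3 rm --recursive "s3://backup-101/snapshot-${date}"
```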

## Performance and impact of PITR

### Capabilities

- On each TiKV node, PITR can restore snapshot data at a speed of 280 GB/h and log data at a speed of 30 GB/h.
- BR deletes outdated log backup data at a speed of 600 GB/h.

> **Note:**
>
> The preceding specifications are based on test results from the following two testing scenarios. The actual restore speed might differ.
>
> - Snapshot data restore speed = Snapshot data size / (duration * the number of TiKV nodes)
> - Log data restore speed = Restored log data size / (duration * the number of TiKV nodes)
>
> The snapshot data size refers to the logical size of all KVs in a single replica, not the actual amount of restored data. BR restores all replicas according to the number of replicas configured for the cluster. The more replicas there are, the more data is actually restored.
> The default replica number for all clusters in the test is 3.
> To improve the overall restore performance, you can modify the [`import.num-threads`](/tikv-configuration-file.md#import) item in the TiKV configuration file and the [`concurrency`](/br/use-br-command-line-tool.md#common-options) option in the BR command.
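
As a rough worked example of the formulas above: with 10 TiKV nodes restoring snapshot data at 280 GB/h each, a snapshot with a single-replica logical size of 2,048 GB would take about 2,048 / (280 × 10) ≈ 0.73 hours, or roughly 44 minutes. The following tuning sketch mirrors the values used in the test scenarios below; treat it as a starting point rather than a recommendation:

```shell
# Tuning sketch. In the TiKV configuration file, raise the import thread count:
#   [import]
#   num-threads = 8
# Then pass a higher concurrency to BR at restore time:
tiup br restore point --pd "${PD_IP}:2379" \
    --storage='s3://backup-101/logbackup?access-key=${access-key}&secret-access-key=${secret-access-key}' \
    --full-backup-storage='s3://backup-101/snapshot-${date}?access-key=${access-key}&secret-access-key=${secret-access-key}' \
    --restored-ts '2022-05-15 18:00:00+0800' \
    --concurrency 128
```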

Testing scenario 1 (on [TiDB Cloud](https://tidbcloud.com)):

- The number of TiKV nodes (8 core, 16 GB memory): 21
- TiKV configuration item `import.num-threads`: 8
- BR command option `concurrency`: 128
- The number of Regions: 183,000
- New log data created in the cluster: 10 GB/h
- Write (INSERT/UPDATE/DELETE) QPS: 10,000

Testing scenario 2 (on TiDB Self-Hosted):

- The number of TiKV nodes (8 core, 64 GB memory): 6
- TiKV configuration item `import.num-threads`: 8
- BR command option `concurrency`: 128
- The number of Regions: 50,000
- New log data created in the cluster: 10 GB/h
- Write (INSERT/UPDATE/DELETE) QPS: 10,000

## See also

* [TiDB Backup and Restore Use Cases](/br/backup-and-restore-use-cases.md)
* [br Command-line Manual](/br/use-br-command-line-tool.md)
* [Log Backup and PITR Architecture](/br/br-log-architecture.md)
2 changes: 1 addition & 1 deletion clinic/clinic-user-guide-for-tiup.md
@@ -9,7 +9,7 @@ For TiDB clusters and DM clusters deployed using TiUP, you can use PingCAP Clini

> **Note:**
>
> - This document **only** applies to clusters deployed using TiUP in an on-premises environment. For clusters deployed using TiDB Operator on Kubernetes, see [PingCAP Clinic for TiDB Operator environments](https://docs.pingcap.com/tidb-in-kubernetes/stable/clinic-user-guide).
> - This document **only** applies to clusters deployed using TiUP in a self-hosted environment. For clusters deployed using TiDB Operator on Kubernetes, see [PingCAP Clinic for TiDB Operator environments](https://docs.pingcap.com/tidb-in-kubernetes/stable/clinic-user-guide).
>
> - PingCAP Clinic **does not support** collecting data from clusters deployed using TiDB Ansible.

10 changes: 5 additions & 5 deletions develop/dev-guide-aws-appflow-integration.md
@@ -7,9 +7,9 @@ summary: Introduce how to integrate TiDB with Amazon AppFlow step by step.

[Amazon AppFlow](https://aws.amazon.com/appflow/) is a fully managed API integration service that you use to connect your software as a service (SaaS) applications to AWS services, and securely transfer data. With Amazon AppFlow, you can import data into TiDB from, and export data from TiDB to, many types of data providers, such as Salesforce, Amazon S3, LinkedIn, and GitHub. For more information, see [Supported source and destination applications](https://docs.aws.amazon.com/appflow/latest/userguide/app-specific.html) in AWS documentation.

This document describes how to integrate TiDB with Amazon AppFlow and takes integrating a TiDB Cloud Serverless Tier cluster as an example.
This document describes how to integrate TiDB with Amazon AppFlow and takes integrating a TiDB Serverless cluster as an example.

If you do not have a TiDB cluster, you can create a [Serverless Tier](https://tidbcloud.com/console/clusters) cluster, which is free and can be created in approximately 30 seconds.
If you do not have a TiDB cluster, you can create a [TiDB Serverless](https://tidbcloud.com/console/clusters) cluster, which is free and can be created in approximately 30 seconds.

## Prerequisites

@@ -66,7 +66,7 @@ git clone https://github.com/pingcap-inc/tidb-appflow-integration
>
> - The `--guided` option uses prompts to guide you through the deployment. Your input will be stored in a configuration file, which is `samconfig.toml` by default.
> - `stack_name` specifies the name of the AWS Lambda stack that you are deploying.
> - This prompted guide uses AWS as the cloud provider of TiDB Cloud Serverless Tier. To use Amazon S3 as the source or destination, you need to set the `region` of AWS Lambda as the same as that of Amazon S3.
> - This prompted guide uses AWS as the cloud provider of TiDB Serverless. To use Amazon S3 as the source or destination, you need to set the `region` of AWS Lambda as the same as that of Amazon S3.
> - If you have already run `sam deploy --guided` before, you can just run `sam deploy` instead, and SAM CLI will use the configuration file `samconfig.toml` to simplify the interaction.
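
For reference, a sketch of the two deployment commands that the notes above describe:

```shell
# First deployment: guided prompts, with answers saved to samconfig.toml.
sam deploy --guided

# Later deployments: reuse the saved configuration.
sam deploy
```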

If you see output similar to the following, this Lambda is successfully deployed.
@@ -148,7 +148,7 @@ Choose the **Source details** and **Destination details**. TiDB connector can be
```

5. After the `sf_account` table is created, click **Connect**. A connection dialog is displayed.
6. In the **Connect to TiDB-Connector** dialog, enter the connection properties of the TiDB cluster. If you use a TiDB Cloud Serverless Tier cluster, you need to set the **TLS** option to `Yes`, which lets the TiDB connector use the TLS connection. Then, click **Connect**.
6. In the **Connect to TiDB-Connector** dialog, enter the connection properties of the TiDB cluster. If you use a TiDB Serverless cluster, you need to set the **TLS** option to `Yes`, which lets the TiDB connector use the TLS connection. Then, click **Connect**.

![tidb connection message](/media/develop/aws-appflow-step-tidb-connection-message.png)

@@ -244,5 +244,5 @@ test> SELECT * FROM sf_account;

- If anything goes wrong, you can navigate to the [CloudWatch](https://console.aws.amazon.com/cloudwatch/home) page on the AWS Management Console to get logs.
- The steps in this document are based on [Building custom connectors using the Amazon AppFlow Custom Connector SDK](https://aws.amazon.com/blogs/compute/building-custom-connectors-using-the-amazon-appflow-custom-connector-sdk/).
- [TiDB Cloud Serverless Tier](https://docs.pingcap.com/tidbcloud/select-cluster-tier#serverless-tier-beta) is **NOT** a production environment.
- [TiDB Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-serverless-beta) is **NOT** a production environment.
- To keep this document concise, the examples only show the `Insert` strategy, but the `Update` and `Upsert` strategies are also tested and can be used.