diff --git a/configure-placement-rules.md b/configure-placement-rules.md
index 0f5e56cf92891..f746f8d968a87 100644
--- a/configure-placement-rules.md
+++ b/configure-placement-rules.md
@@ -38,6 +38,7 @@ The following table shows the meaning of each field in a rule:
 | `Count` | `int`, positive integer | The number of replicas. |
 | `LabelConstraint` | `[]Constraint` | Filers nodes based on the label. |
 | `LocationLabels` | `[]string` | Used for physical isolation. |
+| `IsolationLevel` | `string` | Used to set the minimum physical isolation level. |

 `LabelConstraint` is similar to the function in Kubernetes that filters labels based on these four primitives: `in`, `notIn`, `exists`, and `notExists`. The meanings of these four primitives are as follows:

@@ -48,6 +49,8 @@ The following table shows the meaning of each field in a rule:

 The meaning and function of `LocationLabels` are the same with those earlier than v4.0. For example, if you have deployed `[zone,rack,host]` that defines a three-layer topology: the cluster has multiple zones (Availability Zones), each zone has multiple racks, and each rack has multiple hosts. When performing schedule, PD first tries to place the Region's peers in different zones. If this try fails (such as there are three replicas but only two zones in total), PD guarantees to place these replicas in different racks. If the number of racks is not enough to guarantee isolation, then PD tries the host-level isolation.

+The meaning and function of `IsolationLevel` are elaborated in [Cluster topology configuration](/location-awareness.md). For example, if you have deployed `[zone,rack,host]` that defines a three-layer topology with `LocationLabels` and set `IsolationLevel` to `zone`, then PD ensures that all peers of each Region are placed in different zones during the scheduling. If the minimum isolation level restriction on `IsolationLevel` cannot be met (for example, 3 replicas are configured but there are only 2 data zones in total), PD will not try to make up to meet this restriction. The default value of `IsolationLevel` is an empty string, which means that it is disabled.
+
 ## Configure rules

 The operations in this section are based on [pd-ctl](/pd-control.md), and the commands involved in the operations also support calls via HTTP API.
@@ -75,7 +78,8 @@ In this way, PD enables this feature after the cluster is successfully bootstrap
     "end_key": "",
     "role": "voter",
     "count": 3,
-    "location_labels": ["zone", "rack", "host"]
+    "location_labels": ["zone", "rack", "host"],
+    "isolation_level": ""
 }
 ```
diff --git a/location-awareness.md b/location-awareness.md
index 132ab854f7666..46881778d6f2e 100644
--- a/location-awareness.md
+++ b/location-awareness.md
@@ -8,7 +8,7 @@ aliases: ['/docs/dev/location-awareness/','/docs/dev/how-to/deploy/geographic-re

 ## Overview

-PD schedules according to the topology of the TiKV cluster to maximize the TiKV's capability for disaster recovery.
+PD schedules according to the topology of the TiKV cluster to maximize the TiKV's capability for disaster recovery. It is recommended that TiKV nodes be physically distributed as much as possible. For example, TiKV nodes can be distributed on different racks or even in different data zones. According to the topology information of TiKV, the PD scheduler automatically performs scheduling in the background to isolate the replicas of Regions as much as possible, thereby maximizing the capability for disaster recovery.
 
 Before you begin, see [Deploy TiDB Using TiDB Ansible (Recommended)](/online-deployment-using-ansible.md) and [Deploy TiDB Using Docker](/test-deployment-using-docker.md).

@@ -16,7 +16,7 @@ Before you begin, see [Deploy TiDB Using TiDB Ansible (Recommended)](/online-dep

 TiKV reports the topological information to PD according to the startup parameter or configuration of TiKV.

-Assuming that the topology has three structures: zone > rack > host, use lables to specify the following information:
+Assuming that the topology has three structures: zone > rack > host, use labels to specify the following information:

 Startup parameter:

@@ -41,12 +41,45 @@ max-replicas = 3
 location-labels = ["zone", "rack", "host"]
 ```

+After the PD cluster is initialized, you need to use the pd-ctl tool to make online changes:
+
+{{< copyable "shell-regular" >}}
+
+```bash
+pd-ctl config set location-labels zone,rack,host
+```
+
 `location-labels` needs to correspond to the TiKV `labels` name so that PD can understand that the `labels` represents the TiKV topology.

 > **Note:**
 >
 > You must configure `location-labels` for PD and `labels` for TiKV at the same time for `labels` to take effect.

+## PD restricts the TiKV topology
+
+Having configured `location-labels`, you can further enhance the topological isolation requirements on TiKV clusters through the `isolation-level` parameter. Assume that you have made a three-layer cluster topology by configuring `location-labels` according to the instructions above: zone -> rack -> host, and have configured `isolation-level` as follows:
+
+{{< copyable "" >}}
+
+```toml
+[replication]
+isolation-level = "zone"
+```
+
+After the PD cluster is initialized, you need to use the pd-ctl tool to make online changes:
+
+{{< copyable "shell-regular" >}}
+
+```bash
+pd-ctl config set isolation-level zone
+```
+
+`isolation-level` needs to correspond to one of the `location-labels` names so that PD knows at which level of the TiKV topology to enforce the minimum isolation.
+
+> **Note:**
+>
+> `isolation-level` is empty by default, that is, there is no mandatory isolation level restriction. To set it, you must first configure PD's `location-labels` parameter and ensure that the value of `isolation-level` is one of the `location-labels` names.
+
 ## PD schedules based on the TiKV topology

 PD makes optimal scheduling according to the topological information. You just need to care about what kind of topology can achieve the desired effect.
@@ -88,4 +121,8 @@ In this case, PD will schedule different replicas of each datum to different dat
 - If one of the data zones goes down, the high availability of the TiDB cluster is not affected.
 - If the data zone cannot recover within a period of time, PD will remove the replica from this data zone.

-To sum up, PD maximizes the disaster recovery of the cluster according to the current topology. Therefore, if you want to reach a certain level of disaster recovery, deploy many machines in different sites according to the topology. The number of machines must be more than the number of `max-replicas`.
+However, if `isolation-level` is set to `zone`, PD will ensure that different replicas of a Region are isolated from each other at the zone level, even if guaranteeing this restriction does not meet the requirement of `max-replicas`. For example, a TiKV cluster is distributed across three data zones z1/z2/z3. Each Region has three replicas as required, and PD distributes the three replicas of the same Region to these three data zones respectively. If a power outage occurs in z1 and cannot be recovered after a period of time, PD determines that the Region replicas on z1 are no longer available. However, because `isolation-level` is set to `zone`, PD needs to strictly guarantee that different replicas of the same Region will not be scheduled on the same data zone. Because both z2 and z3 already have replicas, PD will not perform any scheduling under the minimum isolation level restriction of `isolation-level`, even if there are only two replicas at this moment.
+
+Similarly, when `isolation-level` is set to `rack`, the minimum isolation level applies to different racks in the same data zone. With this configuration, the isolation at the zone level is guaranteed first if possible. When the isolation at the zone level cannot be guaranteed, PD tries to avoid scheduling different replicas to the same rack in the same zone, and so on.
+
+In summary, PD maximizes the disaster recovery capability of the cluster according to the current topology. Therefore, if you want to achieve a certain level of disaster recovery, deploy more machines in different sites according to the topology than the number of `max-replicas`. TiDB also provides the `isolation-level` configuration item to enforce a mandatory topological isolation level for the data according to different use cases.
diff --git a/pd-configuration-file.md b/pd-configuration-file.md
index 6bb367dc40491..1ca6b96bbc5ea 100644
--- a/pd-configuration-file.md
+++ b/pd-configuration-file.md
@@ -258,6 +258,12 @@ Configuration items related to replicas
 + Default value: `[]`
 + [Cluster topology configuration](/location-awareness.md)

+### `isolation-level`
+
++ The minimum topological isolation level of a TiKV cluster
++ Default value: `""`
++ [Cluster topology configuration](/location-awareness.md)
+
 ### `strictly-match-label`

 + Enables the strict check for whether the TiKV label matches PD's `location-labels`.
diff --git a/pd-control.md b/pd-control.md
index 6066783943c8f..08eb9a2907168 100644
--- a/pd-control.md
+++ b/pd-control.md
@@ -128,6 +128,7 @@ Usage:
 {
   "replication": {
     "enable-placement-rules": "false",
+    "isolation-level": "",
     "location-labels": "",
     "max-replicas": 3,
     "strictly-match-label": "false"
   },
@@ -169,6 +170,7 @@ Usage:
 {
   "max-replicas": 3,
   "location-labels": "",
+  "isolation-level": "",
   "strictly-match-label": "false",
   "enable-placement-rules": "false"
 }
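
For reference, the settings documented by this change work together as a single `[replication]` section in the PD configuration file. The snippet below is a minimal sketch, not part of the diff above: it reuses the `zone`/`rack`/`host` example and the values (`max-replicas = 3`, `isolation-level = "zone"`) that already appear in the documents, and the label names are placeholders that must match the labels your TiKV nodes actually report.

```toml
# Sketch of a PD configuration combining the settings documented in this change.
[replication]
# Number of replicas for each Region.
max-replicas = 3
# Topology levels from the outermost to the innermost, matching the TiKV labels.
location-labels = ["zone", "rack", "host"]
# Minimum mandatory isolation level; must be one of the location-labels names.
# The default "" means no mandatory isolation level restriction.
isolation-level = "zone"
```

After the cluster is bootstrapped, the same values can be changed online with the `pd-ctl config set location-labels zone,rack,host` and `pd-ctl config set isolation-level zone` commands shown in the diff.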