diff --git a/TOC.md b/TOC.md
index c26732954e5b..c1aa740b5aa3 100644
--- a/TOC.md
+++ b/TOC.md
@@ -220,6 +220,7 @@
     - [Partitioned Tables](/reference/sql/partitioning.md)
     - [Character Sets](/reference/sql/character-set.md)
     - [SQL Mode](/reference/sql/sql-mode.md)
+    - [SQL Diagnosis](/reference/system-databases/sql-diagnosis.md)
     - [Views](/reference/sql/view.md)
   + Configuration
     + tidb-server
@@ -247,6 +248,18 @@
   + System Databases
     - [`mysql`](/reference/system-databases/mysql.md)
     - [`information_schema`](/reference/system-databases/information-schema.md)
+    + `sql-diagnosis`
+      - [`cluster_info`](/reference/system-databases/cluster-info.md)
+      - [`cluster_hardware`](/reference/system-databases/cluster-hardware.md)
+      - [`cluster_config`](/reference/system-databases/cluster-config.md)
+      - [`cluster_load`](/reference/system-databases/cluster-load.md)
+      - [`cluster_systeminfo`](/reference/system-databases/cluster-systeminfo.md)
+      - [`cluster_log`](/reference/system-databases/cluster-log.md)
+      - [`metrics_schema`](/reference/system-databases/metrics-schema.md)
+      - [`metrics_tables`](/reference/system-databases/metrics-tables.md)
+      - [`metrics_summary`](/reference/system-databases/metrics-summary.md)
+      - [`inspection_result`](/reference/system-databases/inspection-result.md)
+      - [`inspection_summary`](/reference/system-databases/inspection-summary.md)
   - [Error Codes](/reference/error-codes.md)
   - [Supported Connectors and APIs](/reference/supported-clients.md)
   + Garbage Collection (GC)
diff --git a/reference/system-databases/cluster-config.md b/reference/system-databases/cluster-config.md
new file mode 100644
index 000000000000..286645d068b5
--- /dev/null
+++ b/reference/system-databases/cluster-config.md
@@ -0,0 +1,54 @@
+---
+title: CLUSTER_CONFIG
+category: reference
+summary: Learn about the TiDB cluster configuration table `CLUSTER_CONFIG`.
+---
+
+# CLUSTER_CONFIG
+
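+The cluster configuration table `CLUSTER_CONFIG` returns the current configuration of every TiDB, PD, and TiKV node in the cluster. In versions earlier than TiDB 4.0, users had to query the HTTP API of each node one by one to collect the configuration of all components.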
+
+{{< copyable "sql" >}}
+
+```sql
+desc cluster_config;
+```
+
+```
++----------+--------------+------+------+---------+-------+
+| Field    | Type         | Null | Key  | Default | Extra |
++----------+--------------+------+------+---------+-------+
+| TYPE     | varchar(64)  | YES  |      | NULL    |       |
+| INSTANCE | varchar(64)  | YES  |      | NULL    |       |
+| KEY      | varchar(256) | YES  |      | NULL    |       |
+| VALUE    | varchar(128) | YES  |      | NULL    |       |
++----------+--------------+------+------+---------+-------+
+```
+
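+Field descriptions:
+
+* `TYPE`: The node type. Possible values are `tidb`, `pd`, and `tikv`.
+* `INSTANCE`: The service address of the node.
+* `KEY`: The name of the configuration item.
+* `VALUE`: The value of the configuration item.
+
+The following example queries the `coprocessor` configuration of the TiKV nodes: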
+
+{{< copyable "sql" >}}
+
+```sql
+select * from cluster_config where type='tikv' and `key` like 'coprocessor%';
+```
+
+```
++------+-----------------+-----------------------------------+----------+
+| TYPE | INSTANCE        | KEY                               | VALUE    |
++------+-----------------+-----------------------------------+----------+
+| tikv | 127.0.0.1:20160 | coprocessor.batch-split-limit     | 10       |
+| tikv | 127.0.0.1:20160 | coprocessor.region-max-keys       | 1.44e+06 |
+| tikv | 127.0.0.1:20160 | coprocessor.region-max-size       | 144MiB   |
+| tikv | 127.0.0.1:20160 | coprocessor.region-split-keys     | 960000   |
+| tikv | 127.0.0.1:20160 | coprocessor.region-split-size     | 96MiB    |
+| tikv | 127.0.0.1:20160 | coprocessor.split-region-on-table | false    |
++------+-----------------+-----------------------------------+----------+
+```
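+
+As a further sketch, the following query lists the configuration items whose values differ between nodes of the same type. It uses only the columns described above; the alias `distinct_values` is chosen here for illustration:
+
+{{< copyable "sql" >}}
+
+```sql
+-- Count the distinct values of each configuration item per node type
+-- and keep only the items that are not identical on all nodes.
+select type, `key`, count(distinct value) as distinct_values
+from information_schema.cluster_config
+group by type, `key`
+having count(distinct value) > 1;
+```
+
+Items such as ports or data directories are expected to appear in this result, because they legitimately differ from node to node.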
diff --git a/reference/system-databases/cluster-hardware.md b/reference/system-databases/cluster-hardware.md
new file mode 100644
index 000000000000..aefe09b6200b
--- /dev/null
+++ b/reference/system-databases/cluster-hardware.md
@@ -0,0 +1,62 @@
+---
+title: CLUSTER_HARDWARE
+summary: Learn about the TiDB cluster hardware table `CLUSTER_HARDWARE`.
+category: reference
+---
+
+# CLUSTER_HARDWARE
+
+The cluster hardware table `CLUSTER_HARDWARE` provides hardware information about the servers on which the cluster nodes run.
+
+{{< copyable "sql" >}}
+
+```sql
+desc cluster_hardware;
+```
+
+```
++-------------+--------------+------+------+---------+-------+
+| Field       | Type         | Null | Key  | Default | Extra |
++-------------+--------------+------+------+---------+-------+
+| TYPE        | varchar(64)  | YES  |      | NULL    |       |
+| INSTANCE    | varchar(64)  | YES  |      | NULL    |       |
+| DEVICE_TYPE | varchar(64)  | YES  |      | NULL    |       |
+| DEVICE_NAME | varchar(64)  | YES  |      | NULL    |       |
+| NAME        | varchar(256) | YES  |      | NULL    |       |
+| VALUE       | varchar(128) | YES  |      | NULL    |       |
++-------------+--------------+------+------+---------+-------+
+```
+
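+Field descriptions:
+
+* `TYPE`: Corresponds to the `TYPE` field in the node information table `information_schema.cluster_info`. Possible values are `tidb`, `pd`, and `tikv`.
+* `INSTANCE`: Corresponds to the `STATUS_ADDRESS` field in the node information table `information_schema.cluster_info`.
+* `DEVICE_TYPE`: The hardware type. Currently, the queryable hardware types are `cpu`, `memory`, `disk`, and `net`.
+* `DEVICE_NAME`: The hardware name. Its value depends on `DEVICE_TYPE`:
+    * `cpu`: The hardware name is cpu.
+    * `memory`: The hardware name is memory.
+    * `disk`: The disk name.
+    * `net`: The network interface card name.
+* `NAME`: The name of a piece of hardware information. For example, cpu has the two names `cpu-logical-cores` and `cpu-physical-cores`, which are the numbers of logical and physical cores.
+* `VALUE`: The value of the corresponding hardware information, such as the disk capacity or the number of CPU cores.
+
+The following example queries the CPU information of the cluster: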
+
+{{< copyable "sql" >}}
+
+```sql
+select * from cluster_hardware where device_type='cpu' and device_name='cpu' and name like '%cores';
+```
+
+```
++------+-----------------+-------------+-------------+--------------------+-------+
+| TYPE | INSTANCE        | DEVICE_TYPE | DEVICE_NAME | NAME               | VALUE |
++------+-----------------+-------------+-------------+--------------------+-------+
+| tidb | 127.0.0.1:10080 | cpu         | cpu         | cpu-logical-cores  | 8     |
+| tidb | 127.0.0.1:10080 | cpu         | cpu         | cpu-physical-cores | 4     |
+| pd   | 127.0.0.1:2379  | cpu         | cpu         | cpu-logical-cores  | 8     |
+| pd   | 127.0.0.1:2379  | cpu         | cpu         | cpu-physical-cores | 4     |
+| tikv | 127.0.0.1:20160 | cpu         | cpu         | cpu-logical-cores  | 8     |
+| tikv | 127.0.0.1:20160 | cpu         | cpu         | cpu-physical-cores | 4     |
++------+-----------------+-------------+-------------+--------------------+-------+
+```
diff --git a/reference/system-databases/cluster-info.md b/reference/system-databases/cluster-info.md
new file mode 100644
index 000000000000..ba74408f07bd
--- /dev/null
+++ b/reference/system-databases/cluster-info.md
@@ -0,0 +1,56 @@
+---
+title: CLUSTER_INFO
+summary: Learn about the TiDB cluster topology table `CLUSTER_INFO`.
+category: reference
+---
+
+# CLUSTER_INFO
+
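+The cluster topology table `CLUSTER_INFO` provides the current topology of the cluster, together with the version of each node, the Git hash of that version, and the start time and uptime of each node.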
+
+{{< copyable "sql" >}}
+
+```sql
+desc cluster_info;
+```
+
+```
++----------------+-------------+------+------+---------+-------+
+| Field          | Type        | Null | Key  | Default | Extra |
++----------------+-------------+------+------+---------+-------+
+| TYPE           | varchar(64) | YES  |      | NULL    |       |
+| INSTANCE       | varchar(64) | YES  |      | NULL    |       |
+| STATUS_ADDRESS | varchar(64) | YES  |      | NULL    |       |
+| VERSION        | varchar(64) | YES  |      | NULL    |       |
+| GIT_HASH       | varchar(64) | YES  |      | NULL    |       |
+| START_TIME     | varchar(32) | YES  |      | NULL    |       |
+| UPTIME         | varchar(32) | YES  |      | NULL    |       |
++----------------+-------------+------+------+---------+-------+
+7 rows in set (0.00 sec)
+```
+
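+Field descriptions:
+
+* `TYPE`: The node type. Currently, the possible values are `tidb`, `pd`, and `tikv`.
+* `INSTANCE`: The instance address, a string in the `IP:PORT` format.
+* `STATUS_ADDRESS`: The service address of the HTTP API. Some `tikv-ctl`, `pd-ctl`, and `tidb-ctl` commands use the HTTP API and therefore this address. Users can also obtain additional cluster information through this address; see the official HTTP API documentation for details.
+* `VERSION`: The semantic version number of the node. To stay compatible with MySQL version numbers, TiDB displays its version in the `${mysql-version}-${tidb-version}` format.
+* `GIT_HASH`: The Git commit hash from which the node binary was built. It can be used to check whether two nodes run exactly the same version.
+* `START_TIME`: The start time of the node.
+* `UPTIME`: The time for which the node has been running.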
+
+{{< copyable "sql" >}}
+
+```sql
+select * from cluster_info;
+```
+
+```
++------+-----------------+-----------------+----------------------------------------+------------------------------------------+---------------------------+--------------+
+| TYPE | INSTANCE        | STATUS_ADDRESS  | VERSION                                | GIT_HASH                                 | START_TIME                | UPTIME       |
++------+-----------------+-----------------+----------------------------------------+------------------------------------------+---------------------------+--------------+
+| tidb | 127.0.0.1:4000  | 127.0.0.1:10080 | 5.7.25-TiDB-v4.0.0-beta-195-gb5ea3232a | b5ea3232afa970f00db7a0fb13ed10857db1912e | 2020-03-02T16:27:28+08:00 | 4m18.845924s |
+| pd   | 127.0.0.1:2379  | 127.0.0.1:2379  | 4.1.0-alpha                            | 4b9bcbc1425c96848042b6d700eb63f84e72b338 | 2020-03-02T16:27:17+08:00 | 4m29.845928s |
+| tikv | 127.0.0.1:20160 | 127.0.0.1:20180 | 4.1.0-alpha                            | 7c4202a1c8faf60eda659dfe0e64e31972488e78 | 2020-03-02T16:27:28+08:00 | 4m18.845929s |
++------+-----------------+-----------------+----------------------------------------+------------------------------------------+---------------------------+--------------+
+```
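+
+Because `GIT_HASH` identifies the exact build of a node, a query along the following lines can reveal node types that run more than one build. This is a sketch that uses only the columns described above; the alias `builds` is chosen here for illustration:
+
+{{< copyable "sql" >}}
+
+```sql
+-- Report every node type for which the cluster contains
+-- more than one distinct Git commit hash.
+select type, count(distinct git_hash) as builds
+from information_schema.cluster_info
+group by type
+having count(distinct git_hash) > 1;
+```
+
+An empty result means that all nodes of each type were built from the same commit.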
diff --git a/reference/system-databases/cluster-load.md b/reference/system-databases/cluster-load.md
new file mode 100644
index 000000000000..57c2e0e2fc1d
--- /dev/null
+++ b/reference/system-databases/cluster-load.md
@@ -0,0 +1,65 @@
+---
+title: CLUSTER_LOAD
+summary: Learn about the TiDB cluster load table `CLUSTER_LOAD`.
+category: reference
+---
+
+# CLUSTER_LOAD
+
+The cluster load table `CLUSTER_LOAD` provides the current load information of the servers on which the cluster nodes run.
+
+{{< copyable "sql" >}}
+
+```sql
+desc cluster_load;
+```
+
+```
++-------------+--------------+------+------+---------+-------+
+| Field       | Type         | Null | Key  | Default | Extra |
++-------------+--------------+------+------+---------+-------+
+| TYPE        | varchar(64)  | YES  |      | NULL    |       |
+| INSTANCE    | varchar(64)  | YES  |      | NULL    |       |
+| DEVICE_TYPE | varchar(64)  | YES  |      | NULL    |       |
+| DEVICE_NAME | varchar(64)  | YES  |      | NULL    |       |
+| NAME        | varchar(256) | YES  |      | NULL    |       |
+| VALUE       | varchar(128) | YES  |      | NULL    |       |
++-------------+--------------+------+------+---------+-------+
+```
+
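+Field descriptions:
+
+* `TYPE`: Corresponds to the `TYPE` field in the node information table `information_schema.cluster_info`. Possible values are `tidb`, `pd`, and `tikv`.
+* `INSTANCE`: Corresponds to the `STATUS_ADDRESS` field in the node information table `information_schema.cluster_info`.
+* `DEVICE_TYPE`: The hardware type. Currently, the queryable hardware types are `cpu`, `memory`, `disk`, and `net`.
+* `DEVICE_NAME`: The hardware name. Its value depends on `DEVICE_TYPE`:
+    * `cpu`: The hardware name is cpu.
+    * `disk`: The disk name.
+    * `net`: The network interface card name.
+    * `memory`: The hardware name is memory.
+* `NAME`: The load type. For example, cpu has the three load types `load1`, `load5`, and `load15`, which are the average CPU load over the last 1, 5, and 15 minutes.
+* `VALUE`: The value of the hardware load, such as the average CPU load over the last 1, 5, or 15 minutes.
+
+The following example queries the current CPU load of the cluster: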
+
+{{< copyable "sql" >}}
+
+```sql
+select * from cluster_load where device_type='cpu' and device_name='cpu';
+```
+
+```
++------+-----------------+-------------+-------------+--------+---------------+
+| TYPE | INSTANCE        | DEVICE_TYPE | DEVICE_NAME | NAME   | VALUE         |
++------+-----------------+-------------+-------------+--------+---------------+
+| tidb | 127.0.0.1:10080 | cpu         | cpu         | load1  | 1.94          |
+| tidb | 127.0.0.1:10080 | cpu         | cpu         | load5  | 2.16          |
+| tidb | 127.0.0.1:10080 | cpu         | cpu         | load15 | 2.24          |
+| pd   | 127.0.0.1:2379  | cpu         | cpu         | load1  | 1.94          |
+| pd   | 127.0.0.1:2379  | cpu         | cpu         | load5  | 2.16          |
+| pd   | 127.0.0.1:2379  | cpu         | cpu         | load15 | 2.24          |
+| tikv | 127.0.0.1:20160 | cpu         | cpu         | load1  | 1.94287109375 |
+| tikv | 127.0.0.1:20160 | cpu         | cpu         | load5  | 2.15576171875 |
+| tikv | 127.0.0.1:20160 | cpu         | cpu         | load15 | 2.2421875     |
++------+-----------------+-------------+-------------+--------+---------------+
+```
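+
+The load values are easier to interpret next to the number of cores. The following sketch joins `CLUSTER_LOAD` with `CLUSTER_HARDWARE` and returns the nodes whose 1-minute load exceeds the number of logical cores. It assumes that both tables report the same `TYPE` and `INSTANCE` values for a node and that the `VALUE` strings are numeric; the aliases are chosen here for illustration:
+
+{{< copyable "sql" >}}
+
+```sql
+-- VALUE is a varchar column in both tables, so cast it before comparing.
+select l.type, l.instance, l.value as load1, h.value as logical_cores
+from information_schema.cluster_load l
+join information_schema.cluster_hardware h
+  on l.type = h.type and l.instance = h.instance
+where l.device_type = 'cpu' and l.device_name = 'cpu' and l.name = 'load1'
+  and h.device_type = 'cpu' and h.device_name = 'cpu' and h.name = 'cpu-logical-cores'
+  and cast(l.value as decimal(10, 2)) > cast(h.value as signed);
+```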
\ No newline at end of file
diff --git a/reference/system-databases/cluster-log.md b/reference/system-databases/cluster-log.md
new file mode 100644
index 000000000000..8adbad647bd4
--- /dev/null
+++ b/reference/system-databases/cluster-log.md
@@ -0,0 +1,73 @@
+---
+title: CLUSTER_LOG
+summary: Learn about the TiDB cluster log table `CLUSTER_LOG`.
+category: reference
+---
+
+# CLUSTER_LOG
+
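+The cluster log table `CLUSTER_LOG` is used to query the logs of all TiDB, PD, and TiKV nodes currently in the cluster. It pushes the query conditions down to each node, which reduces the impact of log queries on the cluster. Querying this table performs better than running the grep command.
+
+Before TiDB 4.0, to obtain cluster logs, users had to log in to each node and aggregate the logs manually. The cluster log table in TiDB 4.0 provides a global, time-ordered log search result, which makes it convenient to trace events across the full link. For example, searching logs by a `region id` returns all logs in the life cycle of that Region; similarly, searching full-link logs by the `txn id` from the slow log shows information such as the number of keys scanned by the transaction on each node and the traffic it generated.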
+
+{{< copyable "sql" >}}
+
+```sql
+desc cluster_log;
+```
+
+```
++----------+---------------------------+------+------+---------+-------+
+| Field    | Type                      | Null | Key  | Default | Extra |
++----------+---------------------------+------+------+---------+-------+
+| TIME     | varchar(32)               | YES  |      | NULL    |       |
+| TYPE     | varchar(64)               | YES  |      | NULL    |       |
+| INSTANCE | varchar(64)               | YES  |      | NULL    |       |
+| LEVEL    | varchar(8)                | YES  |      | NULL    |       |
+| MESSAGE  | var_string(1024) unsigned | YES  |      | NULL    |       |
++----------+---------------------------+------+------+---------+-------+
+5 rows in set (0.00 sec)
+```
+
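+Field descriptions:
+
+* `TIME`: The time at which the log entry was printed.
+* `TYPE`: The node type. Possible values are `tidb`, `pd`, and `tikv`.
+* `INSTANCE`: The service address of the node.
+* `LEVEL`: The log level.
+* `MESSAGE`: The log content.
+
+> **Note:**
+>
+> + Conditions on all fields of the log table are pushed down to the corresponding nodes for execution. To reduce the overhead of using the cluster log table, specify as many conditions as possible. For example, `select * from cluster_log where instance='tikv-1'` searches logs only on `tikv-1`.
+>
+> + The `message` field supports `like` and `regexp` matching, and the corresponding pattern is compiled to a `regexp`. Specifying several `message` conditions at once is equivalent to a `grep` pipeline. For example, `select * from cluster_log where message like 'coprocessor%' and message regexp '.*slow.*'` is equivalent to running `grep 'coprocessor' xxx.log | grep -E '.*slow.*'` on every node in the cluster.
+
+The following example queries the execution process of a DDL statement: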
+
+{{< copyable "sql" >}}
+
+```sql
+select * from `CLUSTER_LOG` where message like '%ddl%' and message like '%job%58%' and type='tidb' and time > '2020-03-27 15:39:00';
+```
+
+```
++-------------------------+------+------------------+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| TIME                    | TYPE | INSTANCE         | LEVEL | MESSAGE                                                                                                                                                                                                                                                                                                                                     |
++-------------------------+------+------------------+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| 2020/03/27 15:39:36.140 | tidb | 172.16.5.40:4008 | INFO  | [ddl_worker.go:253] ["[ddl] add DDL jobs"] ["batch count"=1] [jobs="ID:58, Type:create table, State:none, SchemaState:none, SchemaID:1, TableID:57, RowCount:0, ArgLen:1, start time: 2020-03-27 15:39:36.129 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0; "]                                                                       |
+| 2020/03/27 15:39:36.140 | tidb | 172.16.5.40:4008 | INFO  | [ddl.go:457] ["[ddl] start DDL job"] [job="ID:58, Type:create table, State:none, SchemaState:none, SchemaID:1, TableID:57, RowCount:0, ArgLen:1, start time: 2020-03-27 15:39:36.129 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"] [query="create table t3 (a int, b int,c int)"]                                                   |
+| 2020/03/27 15:39:36.879 | tidb | 172.16.5.40:4009 | INFO  | [ddl_worker.go:554] ["[ddl] run DDL job"] [worker="worker 1, tp general"] [job="ID:58, Type:create table, State:none, SchemaState:none, SchemaID:1, TableID:57, RowCount:0, ArgLen:0, start time: 2020-03-27 15:39:36.129 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"]                                                             |
+| 2020/03/27 15:39:36.936 | tidb | 172.16.5.40:4009 | INFO  | [ddl_worker.go:739] ["[ddl] wait latest schema version changed"] [worker="worker 1, tp general"] [ver=35] ["take time"=52.165811ms] [job="ID:58, Type:create table, State:done, SchemaState:public, SchemaID:1, TableID:57, RowCount:0, ArgLen:1, start time: 2020-03-27 15:39:36.129 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"] |
+| 2020/03/27 15:39:36.938 | tidb | 172.16.5.40:4009 | INFO  | [ddl_worker.go:359] ["[ddl] finish DDL job"] [worker="worker 1, tp general"] [job="ID:58, Type:create table, State:synced, SchemaState:public, SchemaID:1, TableID:57, RowCount:0, ArgLen:0, start time: 2020-03-27 15:39:36.129 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"]                                                      |
+| 2020/03/27 15:39:36.140 | tidb | 172.16.5.40:4009 | INFO  | [ddl_worker.go:253] ["[ddl] add DDL jobs"] ["batch count"=1] [jobs="ID:58, Type:create table, State:none, SchemaState:none, SchemaID:1, TableID:57, RowCount:0, ArgLen:1, start time: 2020-03-27 15:39:36.129 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0; "]                                                                       |
+| 2020/03/27 15:39:36.140 | tidb | 172.16.5.40:4009 | INFO  | [ddl.go:457] ["[ddl] start DDL job"] [job="ID:58, Type:create table, State:none, SchemaState:none, SchemaID:1, TableID:57, RowCount:0, ArgLen:1, start time: 2020-03-27 15:39:36.129 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"] [query="create table t3 (a int, b int,c int)"]                                                   |
+| 2020/03/27 15:39:37.141 | tidb | 172.16.5.40:4008 | INFO  | [ddl.go:489] ["[ddl] DDL job is finished"] [jobID=58]                                                                                                                                                                                                                                                                                       |
++-------------------------+------+------------------+-------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+```
+
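+The query result above shows the following:
+
+1. The user sent the DDL request with job ID 58 to the TiDB node `172.16.5.40:4008`.
+2. The TiDB node `172.16.5.40:4009` processed this DDL request, which means that `172.16.5.40:4009` was the DDL owner at that time.
+3. The DDL request with job ID 58 was completed.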
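+
+Since conditions are pushed down to the nodes, a narrowly scoped query keeps the search cheap. The following sketch restricts the search by node type, time window, and message pattern. The time window and the pattern `slow` are hypothetical values chosen for illustration:
+
+{{< copyable "sql" >}}
+
+```sql
+-- Search only TiKV logs within a one-minute window
+-- whose message matches the regular expression 'slow'.
+select time, instance, level, message
+from information_schema.cluster_log
+where type = 'tikv'
+  and time > '2020-03-27 15:39:00'
+  and time < '2020-03-27 15:40:00'
+  and message regexp 'slow';
+```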
\ No newline at end of file
diff --git a/reference/system-databases/cluster-systeminfo.md b/reference/system-databases/cluster-systeminfo.md
new file mode 100644
index 000000000000..ef6c6faeea27
--- /dev/null
+++ b/reference/system-databases/cluster-systeminfo.md
@@ -0,0 +1,54 @@
+---
+title: CLUSTER_SYSTEMINFO
+summary: Learn about the TiDB cluster system information table `CLUSTER_SYSTEMINFO`.
+category: reference
+---
+
+# CLUSTER_SYSTEMINFO
+
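+The cluster system information table `CLUSTER_SYSTEMINFO` is used to query the kernel configuration of the servers on which the cluster nodes run. Currently, it supports querying `sysctl` information.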
+
+{{< copyable "sql" >}}
+
+```sql
+desc cluster_systeminfo;
+```
+
+```
++-------------+--------------+------+------+---------+-------+
+| Field       | Type         | Null | Key  | Default | Extra |
++-------------+--------------+------+------+---------+-------+
+| TYPE        | varchar(64)  | YES  |      | NULL    |       |
+| INSTANCE    | varchar(64)  | YES  |      | NULL    |       |
+| SYSTEM_TYPE | varchar(64)  | YES  |      | NULL    |       |
+| SYSTEM_NAME | varchar(64)  | YES  |      | NULL    |       |
+| NAME        | varchar(256) | YES  |      | NULL    |       |
+| VALUE       | varchar(128) | YES  |      | NULL    |       |
++-------------+--------------+------+------+---------+-------+
+6 rows in set (0.00 sec)
+```
+
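+Field descriptions:
+
+* `TYPE`: Corresponds to the `TYPE` field in the node information table `information_schema.cluster_info`. Possible values are `tidb`, `pd`, and `tikv`.
+* `INSTANCE`: Corresponds to the `INSTANCE` field in the node information table `information_schema.cluster_info`.
+* `SYSTEM_TYPE`: The system type. Currently, the only queryable system type is `system`.
+* `SYSTEM_NAME`: The system name. Currently, the only queryable `SYSTEM_NAME` is `sysctl`.
+* `NAME`: The name of the `sysctl` configuration item.
+* `VALUE`: The value of the `sysctl` configuration item.
+
+The following example queries the kernel version of all servers in the cluster: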
+
+{{< copyable "sql" >}}
+
+```sql
+select * from CLUSTER_SYSTEMINFO where name like '%kernel.osrelease%';
+```
+
+```
++------+-------------------+-------------+-------------+------------------+----------------------------+
+| TYPE | INSTANCE          | SYSTEM_TYPE | SYSTEM_NAME | NAME             | VALUE                      |
++------+-------------------+-------------+-------------+------------------+----------------------------+
+| tidb | 172.16.5.40:4008  | system      | sysctl      | kernel.osrelease | 3.10.0-862.14.4.el7.x86_64 |
+| pd   | 172.16.5.40:20379 | system      | sysctl      | kernel.osrelease | 3.10.0-862.14.4.el7.x86_64 |
+| tikv | 172.16.5.40:21150 | system      | sysctl      | kernel.osrelease | 3.10.0-862.14.4.el7.x86_64 |
++------+-------------------+-------------+-------------+------------------+----------------------------+
+```
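+
+To check quickly whether all servers run the same kernel, the rows can be grouped by value. This is a sketch that uses only the columns described above; the alias `nodes` is chosen here for illustration:
+
+{{< copyable "sql" >}}
+
+```sql
+-- One row per distinct kernel release, with the number of nodes reporting it.
+select value, count(*) as nodes
+from information_schema.cluster_systeminfo
+where system_name = 'sysctl' and name = 'kernel.osrelease'
+group by value;
+```
+
+More than one row in the result means that the servers run different kernel releases.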
\ No newline at end of file
diff --git a/reference/system-databases/information-schema.md b/reference/system-databases/information-schema.md
index 05b9c178efcc..930812bd6c74 100644
--- a/reference/system-databases/information-schema.md
+++ b/reference/system-databases/information-schema.md
@@ -628,7 +628,6 @@ CONSTRAINT_CATALOG: def
       TABLE_SCHEMA: mysql
         TABLE_NAME: gc_delete_range_done
    CONSTRAINT_TYPE: UNIQUE
-6 rows in set (0.00 sec)
 ```
 
 Where:
@@ -831,6 +830,20 @@ COLLATION_CONNECTION: utf8_general_ci
 1 row in set (0.00 sec)
 ```
 
+## Tables related to SQL diagnosis
+
+* [`information_schema.cluster_info`](/reference/system-databases/cluster-info.md)
+* [`information_schema.cluster_config`](/reference/system-databases/cluster-config.md)
+* [`information_schema.cluster_hardware`](/reference/system-databases/cluster-hardware.md)
+* [`information_schema.cluster_load`](/reference/system-databases/cluster-load.md)
+* [`information_schema.cluster_systeminfo`](/reference/system-databases/cluster-systeminfo.md)
+* [`information_schema.cluster_log`](/reference/system-databases/cluster-log.md)
+* [`information_schema.metrics_tables`](/reference/system-databases/metrics-tables.md)
+* [`information_schema.metrics_summary`](/reference/system-databases/metrics-summary.md)
+* [`information_schema.metrics_summary_by_label`](/reference/system-databases/metrics-summary.md)
+* [`information_schema.inspection_result`](/reference/system-databases/inspection-result.md)
+* [`information_schema.inspection_summary`](/reference/system-databases/inspection-summary.md)
+
 ## Unsupported Information Schema tables
 
 TiDB includes the following `INFORMATION_SCHEMA` tables, but they only return empty rows:
diff --git a/reference/system-databases/inspection-result.md b/reference/system-databases/inspection-result.md
new file mode 100644
index 000000000000..90ac3f5aec2c
--- /dev/null
+++ b/reference/system-databases/inspection-result.md
@@ -0,0 +1,308 @@
+---
+title: INSPECTION_RESULT
+summary: Learn about the TiDB diagnostic result table `INSPECTION_RESULT`.
+category: reference
+---
+
+# INSPECTION_RESULT
+
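+TiDB has built-in diagnostic rules for detecting faults and hidden risks in the system.
+
+This diagnostic feature helps users find problems quickly and reduces repetitive manual work. Use the `select * from information_schema.inspection_result` statement to trigger the internal diagnosis.
+
+The schema of the diagnostic result table `information_schema.inspection_result` is as follows: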
+
+{{< copyable "sql" >}}
+
+```sql
+desc inspection_result;
+```
+
+```
++-----------+--------------+------+------+---------+-------+
+| Field     | Type         | Null | Key  | Default | Extra |
++-----------+--------------+------+------+---------+-------+
+| RULE      | varchar(64)  | YES  |      | NULL    |       |
+| ITEM      | varchar(64)  | YES  |      | NULL    |       |
+| TYPE      | varchar(64)  | YES  |      | NULL    |       |
+| INSTANCE  | varchar(64)  | YES  |      | NULL    |       |
+| VALUE     | varchar(64)  | YES  |      | NULL    |       |
+| REFERENCE | varchar(64)  | YES  |      | NULL    |       |
+| SEVERITY  | varchar(64)  | YES  |      | NULL    |       |
+| DETAILS   | varchar(256) | YES  |      | NULL    |       |
++-----------+--------------+------+------+---------+-------+
+8 rows in set (0.00 sec)
+```
+
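+Field descriptions:
+
+* `RULE`: The name of the diagnostic rule. Currently, the following rules are implemented:
+    * `config`: Configuration consistency check. If the same configuration item differs between nodes, a `warning` diagnostic result is generated.
+    * `version`: Version consistency check. If nodes of the same type run different versions, a `critical` diagnostic result is generated.
+    * `current-load`: If the current system load is too high, a corresponding `warning` diagnostic result is generated.
+    * `critical-error`: Each module of the system defines critical errors. If a critical error exceeds its threshold within the corresponding time range, a `warning` diagnostic result is generated.
+    * `threshold-check`: The diagnostic system checks a large number of metrics against thresholds. If a threshold is exceeded, the corresponding diagnostic information is generated.
+* `ITEM`: Each rule diagnoses different items. This field indicates the specific diagnostic item under the rule.
+* `TYPE`: The type of the diagnosed instance. Possible values are `tidb`, `pd`, and `tikv`.
+* `INSTANCE`: The address of the diagnosed instance.
+* `VALUE`: The value obtained for this diagnostic item.
+* `REFERENCE`: The reference value (threshold) for this diagnostic item. If `VALUE` differs greatly from the threshold, the corresponding diagnostic information is generated.
+* `SEVERITY`: The severity level. Possible values are `warning` and `critical`.
+* `DETAILS`: The details of the diagnosis, which might contain SQL statements or document links for further investigation.
+
+## Diagnostic examples
+
+Diagnose the problems that currently exist in the cluster.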
+
+{{< copyable "sql" >}}
+
+```sql
+select * from inspection_result\G
+```
+
+```
+***************************[ 1. row ]***************************
+RULE      | config
+ITEM      | log.slow-threshold
+TYPE      | tidb
+INSTANCE  | 172.16.5.40:4000
+VALUE     | 0
+REFERENCE | not 0
+SEVERITY  | warning
+DETAILS   | slow-threshold = 0 will record every query to slow log, it may affect performance
+***************************[ 2. row ]***************************
+RULE      | version
+ITEM      | git_hash
+TYPE      | tidb
+INSTANCE  |
+VALUE     | inconsistent
+REFERENCE | consistent
+SEVERITY  | critical
+DETAILS   | the cluster has 2 different tidb version, execute the sql to see more detail: select * from information_schema.cluster_info where type='tidb'
+***************************[ 3. row ]***************************
+RULE      | threshold-check
+ITEM      | storage-write-duration
+TYPE      | tikv
+INSTANCE  | 172.16.5.40:23151
+VALUE     | 130.417
+REFERENCE | < 0.100
+SEVERITY  | warning
+DETAILS   | max duration of 172.16.5.40:23151 tikv storage-write-duration was too slow
+***************************[ 4. row ]***************************
+RULE      | threshold-check
+ITEM      | rocksdb-write-duration
+TYPE      | tikv
+INSTANCE  | 172.16.5.40:20151
+VALUE     | 108.105
+REFERENCE | < 0.100
+SEVERITY  | warning
+DETAILS   | max duration of 172.16.5.40:20151 tikv rocksdb-write-duration was too slow
+```
+
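+The diagnostic result above reveals the following problems:
+
+* Row 1: the `log.slow-threshold` configuration of TiDB is `0`, which might affect performance.
+* Row 2: the cluster contains two different TiDB versions.
+* Rows 3 and 4: the TiKV write latency is too high. The expected duration is no more than 0.1 s, but the actual values far exceed it.
+
+To diagnose the problems of the cluster in the time range from "2020-03-26 00:03:00" to "2020-03-26 00:08:00", specify the range with the `/*+ time_range() */` SQL hint, as in the following query: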
+
+{{< copyable "sql" >}}
+
+```sql
+select /*+ time_range("2020-03-26 00:03:00", "2020-03-26 00:08:00") */ * from inspection_result\G
+```
+
+```
+***************************[ 1. row ]***************************
+RULE      | critical-error
+ITEM      | server-down
+TYPE      | tidb
+INSTANCE  | 172.16.5.40:4009
+VALUE     |
+REFERENCE |
+SEVERITY  | critical
+DETAILS   | tidb 172.16.5.40:4009 restarted at time '2020/03/26 00:05:45.670'
+***************************[ 2. row ]***************************
+RULE      | threshold-check
+ITEM      | get-token-duration
+TYPE      | tidb
+INSTANCE  | 172.16.5.40:10089
+VALUE     | 0.234
+REFERENCE | < 0.001
+SEVERITY  | warning
+DETAILS   | max duration of 172.16.5.40:10089 tidb get-token-duration is too slow
+```
+
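+The diagnostic result above reveals the following problems:
+
+* Row 1: the TiDB node `172.16.5.40:4009` restarted at `2020/03/26 00:05:45.670`.
+* Row 2: the maximum `get-token-duration` of the TiDB node `172.16.5.40:10089` is 0.234 s, while the expected duration is less than 0.001 s.
+
+You can also specify conditions. For example, query only the diagnostic results with the `critical` severity level: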
+
+{{< copyable "sql" >}}
+
+```sql
+select * from inspection_result where severity='critical';
+```
+
+Query only the diagnostic results of the `critical-error` rule:
+
+{{< copyable "sql" >}}
+
+```sql
+select * from inspection_result where rule='critical-error';
+```
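+
+The time range hint and a filter condition can also be combined, and the results can be aggregated for an overview. The following sketch counts the diagnostic results per rule and severity within the time range used earlier; the alias `results` is chosen here for illustration:
+
+{{< copyable "sql" >}}
+
+```sql
+-- Summarize the diagnosis of a fixed time range by rule and severity.
+select /*+ time_range("2020-03-26 00:03:00", "2020-03-26 00:08:00") */
+       rule, severity, count(*) as results
+from information_schema.inspection_result
+group by rule, severity;
+```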
+
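+## Diagnostic rules
+
+The diagnostic module contains a series of rules. These rules query the existing monitoring tables and cluster information tables and compare the results with preset thresholds. If a result is above or below its threshold, a `warning` or `critical` result is generated, and the `details` column provides the corresponding information.
+
+You can query the existing diagnostic rules in the `inspection_rules` system table: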
+
+{{< copyable "sql" >}}
+
+```sql
+select * from inspection_rules where type='inspection';
+```
+
+```
++-----------------+------------+---------+
+| NAME            | TYPE       | COMMENT |
++-----------------+------------+---------+
+| config          | inspection |         |
+| version         | inspection |         |
+| current-load    | inspection |         |
+| critical-error  | inspection |         |
+| threshold-check | inspection |         |
++-----------------+------------+---------+
+```
+
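+### config diagnostic rule
+
+The config diagnostic rule queries the `CLUSTER_CONFIG` system table and performs the following two checks:
+
+* Check whether the configuration values of the same component are consistent. Not all configuration items are checked for consistency. The items on the following whitelist are exempt from the check, because values such as ports, addresses, and paths are expected to differ between nodes: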
+
+    ```
+    // Whitelist of the TiDB configuration consistency check
+    port
+    status.status-port
+    host
+    path
+    advertise-address
+    log.file.filename
+    log.slow-query-file
+
+    // Whitelist of the PD configuration consistency check
+    advertise-client-urls
+    advertise-peer-urls
+    client-urls
+    data-dir
+    log-file
+    log.file.filename
+    metric.job
+    name
+    peer-urls
+
+    // Whitelist of the TiKV configuration consistency check
+    server.addr
+    server.advertise-addr
+    server.status-addr
+    log-file
+    raftstore.raftdb-path
+    storage.data-dir
+    ```
+
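+* Check whether the values of the following configuration items are as expected.
+
+| Component | Configuration item | Expected value |
+| ---- | ---- | ---- |
+| TiDB | log.slow-threshold | Greater than 0 |
+| TiKV | raftstore.sync-log | true |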
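+
+The expected-value check above can be approximated by hand against `CLUSTER_CONFIG`. The following query is only a sketch of the idea, not the internal implementation of the rule. It assumes that the unexpected values are stored as the strings `0` and `false` in the `VALUE` column:
+
+{{< copyable "sql" >}}
+
+```sql
+-- Return the nodes whose configuration violates the expected values in the table above.
+select type, instance, `key`, value
+from information_schema.cluster_config
+where (type = 'tidb' and `key` = 'log.slow-threshold' and value = '0')
+   or (type = 'tikv' and `key` = 'raftstore.sync-log' and value = 'false');
+```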
+
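+### version diagnostic rule
+
+The version diagnostic rule queries the `CLUSTER_INFO` system table and checks whether the version hash of the same component is consistent. For example: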
+
+{{< copyable "sql" >}}
+
+```sql
+select * from inspection_result where rule='version'\G
+```
+
+```
+***************************[ 1. row ]***************************
+RULE      | version
+ITEM      | git_hash
+TYPE      | tidb
+INSTANCE  |
+VALUE     | inconsistent
+REFERENCE | consistent
+SEVERITY  | critical
+DETAILS   | the cluster has 2 different tidb versions, execute the sql to see more detail: select * from information_schema.cluster_info where type='tidb'
+```
+
+### critical-error diagnostic rule
+
+The critical-error diagnostic rule performs the following two checks:
+
+* Query the related monitoring tables in the metrics_schema database to detect whether any of the following critical errors has occurred in the cluster:
+
+| Component | Error name | Related monitoring table | Description |
+| ---- | ---- | ---- | ---- |
+| TiDB | panic-count | tidb_panic_count_total_count | TiDB encounters a panic error |
+| TiDB | binlog-error | tidb_binlog_error_total_count | An error occurs when TiDB writes binlog |
+| TiKV | critical-error | tikv_critical_error_total_count | A critical error of TiKV |
+| TiKV | scheduler-is-busy | tikv_scheduler_is_busy_total_count | The TiKV scheduler is too busy, which makes TiKV temporarily unavailable |
+| TiKV | coprocessor-is-busy | tikv_coprocessor_is_busy_total_count | The TiKV coprocessor is too busy |
+| TiKV | channel-is-full | tikv_channel_full_total_count | TiKV encounters a channel full error |
+| TiKV | tikv_engine_write_stall | tikv_engine_write_stall | TiKV encounters a write stall error |
+
+* Query the metrics_schema.up monitoring table and the `CLUSTER_LOG` system table to check whether any component has restarted.
+
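+### threshold-check diagnostic rule
+
+The threshold-check diagnostic rule queries the related monitoring tables in the metrics_schema database to detect whether the following metrics exceed their thresholds:
+
+| Component | Metric | Related monitoring table | Expected value | Description |
+| ---- | ---- | ---- | ---- | ---- |
+| TiDB | tso-duration | pd_tso_wait_duration | Less than 50 ms | The time taken to obtain the TSO timestamp of a transaction |
+| TiDB | get-token-duration | tidb_get_token_duration | Less than 1 ms | The time taken by a query to obtain a token. The related TiDB configuration parameter is token-limit |
+| TiDB | load-schema-duration | tidb_load_schema_duration | Less than 1 s | The time taken by TiDB to update the table schema information |
+| TiKV | scheduler-cmd-duration | tikv_scheduler_command_duration | Less than 0.1 s | The time taken by TiKV to execute a KV cmd request |
+| TiKV | handle-snapshot-duration | tikv_handle_snapshot_duration | Less than 30 s | The time taken by TiKV to handle a snapshot |
+| TiKV | storage-write-duration | tikv_storage_async_request_duration | Less than 0.1 s | The write latency of TiKV |
+| TiKV | storage-snapshot-duration | tikv_storage_async_request_duration | Less than 50 ms | The time taken by TiKV to obtain a snapshot |
+| TiKV | rocksdb-write-duration | tikv_engine_write_duration | Less than 100 ms | The write latency of TiKV RocksDB |
+| TiKV | rocksdb-get-duration | tikv_engine_max_get_duration | Less than 50 ms | The read latency of TiKV RocksDB |
+| TiKV | rocksdb-seek-duration | tikv_engine_max_seek_duration | Less than 50 ms | The latency of seek operations in TiKV RocksDB |
+| TiKV | scheduler-pending-cmd-count | tikv_scheduler_pending_commands | Less than 1000 | The number of blocked commands in TiKV |
+| TiKV | index-block-cache-hit | tikv_block_index_cache_hit | Greater than 0.95 | The hit rate of the index block cache in TiKV |
+| TiKV | filter-block-cache-hit | tikv_block_filter_cache_hit | Greater than 0.95 | The hit rate of the filter block cache in TiKV |
+| TiKV | data-block-cache-hit | tikv_block_data_cache_hit | Greater than 0.80 | The hit rate of the data block cache in TiKV |
+| TiKV | leader-score-balance | pd_scheduler_store_status | Less than 0.05 | Checks whether the leader scores of the TiKV nodes are balanced. The expected difference between nodes is less than 5% |
+| TiKV | region-score-balance | pd_scheduler_store_status | Less than 0.05 | Checks whether the region scores of the TiKV nodes are balanced. The expected difference between nodes is less than 5% |
+| TiKV | store-available-balance | pd_scheduler_store_status | Less than 0.2 | Checks whether the available storage space of the TiKV nodes is balanced. The expected difference between nodes is less than 20% |
+| TiKV | region-count | pd_scheduler_store_status | Less than 20000 | Checks the number of regions on each TiKV node. The expected number of regions on a single node is less than 20000 |
+| PD | region-health | pd_region_health | Less than 100 | Checks the number of regions in an intermediate scheduling state in the cluster. The expected total is less than 100 |
+
+In addition, the rule checks whether the CPU usage of the following TiKV threads is too high: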
+
+* scheduler-worker-cpu
+* coprocessor-normal-cpu
+* coprocessor-high-cpu
+* coprocessor-low-cpu
+* grpc-cpu
+* raftstore-cpu
+* apply-cpu
+* storage-readpool-normal-cpu
+* storage-readpool-high-cpu
+* storage-readpool-low-cpu
+* split-check-cpu
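+
+To look at one of the metrics above by hand, a query along the following lines might be used. It is only a sketch: it assumes that `pd_tso_wait_duration` exposes `time` and `value` columns like other tables in `metrics_schema`, that `value` is in seconds (so 50 ms is `0.05`), and the time range is a placeholder.
+
+{{< copyable "sql" >}}
+
+```sql
+-- List the sample points at which the TSO wait duration exceeded the 50 ms expectation.
+-- Assumption: value is reported in seconds.
+select *
+from metrics_schema.pd_tso_wait_duration
+where time >= '2020-03-26 00:00:00'
+  and time <= '2020-03-26 00:10:00'
+  and value > 0.05;
+```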
+
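+## Conclusion
+
+The built-in diagnostic rules of TiDB are still being refined. If you have ideas for additional diagnostic rules, you are welcome to open a pull request or an issue for TiDB.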
\ No newline at end of file
diff --git a/reference/system-databases/inspection-summary.md b/reference/system-databases/inspection-summary.md
new file mode 100644
index 000000000000..be0df9a86016
--- /dev/null
+++ b/reference/system-databases/inspection-summary.md
@@ -0,0 +1,75 @@
+---
+title: INSPECTION_SUMMARY
+summary: Learn about the `INSPECTION_SUMMARY` diagnostic summary table.
+category: reference
+---
+
+# INSPECTION_SUMMARY
+
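+In some scenarios, users care only about the monitoring summary of a specific link or module. For example, if the Coprocessor thread pool is configured with 8 threads and the CPU usage of the Coprocessor reaches 750%, a risk clearly exists, or the Coprocessor might become a bottleneck early. However, some monitoring metrics vary greatly with the user's workload, so it is difficult to define fixed thresholds for them. Troubleshooting problems in these scenarios is also important, so TiDB provides the `inspection_summary` table to summarize monitoring data by link.
+
+The structure of the diagnostic summary table `information_schema.inspection_summary` is as follows: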
+
+{{< copyable "sql" >}}
+
+```sql
+desc information_schema.inspection_summary;
+```
+
+```
++--------------+-----------------------+------+------+---------+-------+
+| Field        | Type                  | Null | Key  | Default | Extra |
++--------------+-----------------------+------+------+---------+-------+
+| RULE         | varchar(64)           | YES  |      | NULL    |       |
+| INSTANCE     | varchar(64)           | YES  |      | NULL    |       |
+| METRICS_NAME | varchar(64)           | YES  |      | NULL    |       |
+| LABEL        | varchar(64)           | YES  |      | NULL    |       |
+| QUANTILE     | double unsigned       | YES  |      | NULL    |       |
+| AVG_VALUE    | double(22,6) unsigned | YES  |      | NULL    |       |
+| MIN_VALUE    | double(22,6) unsigned | YES  |      | NULL    |       |
+| MAX_VALUE    | double(22,6) unsigned | YES  |      | NULL    |       |
++--------------+-----------------------+------+------+---------+-------+
+```
+
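+Field description:
+
+* `RULE`: The summary rule. Because new rules are continually being added, you can get the latest list of rules by executing `select * from inspection_rules where type='summary'`.
+* `INSTANCE`: The monitored instance.
+* `METRICS_NAME`: The name of the monitoring table.
+* `QUANTILE`: Takes effect for monitoring tables that contain `QUANTILE`. You can specify multiple percentiles through predicate pushdown. For example, `select * from inspection_summary where rule='ddl' and quantile in (0.80, 0.90, 0.99, 0.999)` summarizes the DDL-related monitoring metrics and returns the P80, P90, P99, and P999 results.
+* `AVG_VALUE`, `MIN_VALUE`, and `MAX_VALUE`: The average, minimum, and maximum of the aggregated values, respectively.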
+
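+> **Note:**
+>
+> Because summarizing all results has some overhead, the rules in `inspection_summary` are triggered lazily: a rule runs only when it is explicitly specified in the `rule` predicate of the SQL statement. For example, `select * from inspection_summary` returns an empty result set, whereas `select * from inspection_summary where rule in ('read-link', 'ddl')` summarizes the monitoring metrics related to the read link and DDL.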
+
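+Usage example:
+
+Both the diagnostic result table and the diagnostic summary table accept a `hint` that specifies the time range of the diagnosis. For example, `select /*+ time_range('2020-03-07 12:00:00','2020-03-07 13:00:00') */ * from inspection_summary` summarizes the monitoring data for the period from 2020-03-07 12:00:00 to 2020-03-07 13:00:00. Like the monitoring summary table, the diagnostic summary table can be used to compare the data of two different time periods and quickly find the monitoring items that differ the most. The following is an example.
+
+To diagnose a fault in the cluster during the period `"2020-01-16 16:00:54.933", "2020-01-16 16:10:54.933"`, compare the read-link summary of that period with the summary of the following ten minutes: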
+
+{{< copyable "sql" >}}
+
+```sql
+SELECT
+  t1.avg_value / t2.avg_value AS ratio,
+  t1.*,
+  t2.*
+FROM
+  (
+    SELECT
+      /*+ time_range("2020-01-16 16:00:54.933", "2020-01-16 16:10:54.933")*/ *
+    FROM information_schema.inspection_summary WHERE rule='read-link'
+  ) t1
+  JOIN
+  (
+    SELECT
+      /*+ time_range("2020-01-16 16:10:54.933","2020-01-16 16:20:54.933")*/ *
+    FROM information_schema.inspection_summary WHERE rule='read-link'
+  ) t2
+  ON t1.metrics_name = t2.metrics_name
+  AND t1.instance = t2.instance
+  AND t1.label = t2.label
+ORDER BY
+  ratio DESC;
+```
diff --git a/reference/system-databases/metrics-schema.md b/reference/system-databases/metrics-schema.md
new file mode 100644
index 000000000000..8adad90a67d0
--- /dev/null
+++ b/reference/system-databases/metrics-schema.md
@@ -0,0 +1,177 @@
+---
+title: Metrics Schema
+summary: Learn about the `METRICS_SCHEMA` monitoring database.
+category: reference
+---
+
+# Metrics Schema
+
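+To dynamically observe and compare the cluster status across different time periods, the TiDB 4.0 diagnosis system adds cluster monitoring system tables. All of these tables are in the `metrics_schema` database, and you can query the monitoring data using SQL. In fact, SQL diagnosis and the three monitoring-related summary tables (`metrics_summary`, `metrics_summary_by_label`, and `inspection_result`) all obtain their data by querying the monitoring tables in `metrics_schema`.
+Because a large number of system tables have been added, you can query `information_schema.metrics_tables` for information about them.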
+
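+## Overview
+
+The following uses the `tidb_query_duration` table as an example to show how monitoring tables are used and how they work. The other monitoring tables work in a similar way.
+
+First, query `information_schema.metrics_tables` for information about the `tidb_query_duration` table: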
+
+{{< copyable "sql" >}}
+
+```sql
+select * from information_schema.metrics_tables where table_name='tidb_query_duration';
+```
+
+```
++---------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------+----------+----------------------------------------------+
+| TABLE_NAME          | PROMQL                                                                                                                                                   | LABELS            | QUANTILE | COMMENT                                      |
++---------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------+----------+----------------------------------------------+
+| tidb_query_duration | histogram_quantile($QUANTILE, sum(rate(tidb_server_handle_query_duration_seconds_bucket{$LABEL_CONDITIONS}[$RANGE_DURATION])) by (le,sql_type,instance)) | instance,sql_type | 0.9      | The quantile of TiDB query durations(second) |
++---------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------+----------+----------------------------------------------+
+```
+
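+* `TABLE_NAME`: The name of the table in `metrics_schema`. Here the table name is `tidb_query_duration`.
+* `PROMQL`: A monitoring table works by mapping the SQL statement to `PromQL` and converting the Prometheus result into a SQL query result. This field is the `PromQL` expression template. When the data of a monitoring table is fetched, the variables in the template are rewritten with the query conditions to generate the final query expression.
+* `LABELS`: The labels defined for the monitoring item. `tidb_query_duration` has two labels: `instance` and `sql_type`.
+* `QUANTILE`: The percentile. For histogram-type monitoring data, a default percentile is specified. A value of `0` means that the monitoring item corresponding to the table is not a histogram. `tidb_query_duration` queries 0.9 by default, that is, the P90 value.
+* `COMMENT`: The description of the monitoring table. As the comment shows, the `tidb_query_duration` table is used to query the percentile durations of TiDB query execution, such as the P999, P99, and P90 query durations, in seconds.
+
+Next, look at the table structure of `tidb_query_duration`: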
+
+{{< copyable "sql" >}}
+
+```sql
+show create table metrics_schema.tidb_query_duration;
+```
+
+```
++---------------------+--------------------------------------------------------------------------------------------------------------------+
+| Table               | Create Table                                                                                                       |
++---------------------+--------------------------------------------------------------------------------------------------------------------+
+| tidb_query_duration | CREATE TABLE `tidb_query_duration` (                                                                               |
+|                     |   `time` datetime unsigned DEFAULT CURRENT_TIMESTAMP,                                                              |
+|                     |   `instance` varchar(512) DEFAULT NULL,                                                                            |
+|                     |   `sql_type` varchar(512) DEFAULT NULL,                                                                            |
+|                     |   `quantile` double unsigned DEFAULT '0.9',                                                                        |
+|                     |   `value` double unsigned DEFAULT NULL                                                                             |
+|                     | ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin COMMENT='The quantile of TiDB query durations(second)' |
++---------------------+--------------------------------------------------------------------------------------------------------------------+
+```
+
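+* `time`: The time of the monitoring item.
+* `instance` and `sql_type`: The labels of the `tidb_query_duration` monitoring item. `instance` is the address of the monitored instance, and `sql_type` is the type of the executed SQL statement.
+* `quantile`: The percentile. All histogram-type monitoring tables have this column, which specifies the percentile to query. For example, `quantile=0.9` queries the P90 duration.
+* `value`: The value of the monitoring item.
+
+The following statement queries the P99 duration of TiDB queries within the time range [`2020-03-25 23:40:00`, `2020-03-25 23:42:00`]: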
+
+{{< copyable "sql" >}}
+
+```sql
+select * from metrics_schema.tidb_query_duration where value is not null and time>='2020-03-25 23:40:00' and time <= '2020-03-25 23:42:00' and quantile=0.99;
+```
+
+```
++---------------------+-------------------+----------+----------+----------------+
+| time                | instance          | sql_type | quantile | value          |
++---------------------+-------------------+----------+----------+----------------+
+| 2020-03-25 23:40:00 | 172.16.5.40:10089 | Insert   | 0.99     | 0.509929485256 |
+| 2020-03-25 23:41:00 | 172.16.5.40:10089 | Insert   | 0.99     | 0.494690793986 |
+| 2020-03-25 23:42:00 | 172.16.5.40:10089 | Insert   | 0.99     | 0.493460506934 |
+| 2020-03-25 23:40:00 | 172.16.5.40:10089 | Select   | 0.99     | 0.152058493415 |
+| 2020-03-25 23:41:00 | 172.16.5.40:10089 | Select   | 0.99     | 0.152193879678 |
+| 2020-03-25 23:42:00 | 172.16.5.40:10089 | Select   | 0.99     | 0.140498483232 |
+| 2020-03-25 23:40:00 | 172.16.5.40:10089 | internal | 0.99     | 0.47104        |
+| 2020-03-25 23:41:00 | 172.16.5.40:10089 | internal | 0.99     | 0.11776        |
+| 2020-03-25 23:42:00 | 172.16.5.40:10089 | internal | 0.99     | 0.11776        |
++---------------------+-------------------+----------+----------+----------------+
+```
+
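+The first row of the result above means that at 2020-03-25 23:40:00, on the TiDB instance 172.16.5.40:10089, the P99 execution time of `Insert` statements was 0.509929485256 seconds. The other rows are interpreted in the same way. The other values of the `sql_type` column have the following meanings:
+
+* `Select`: `select` statements.
+* `internal`: Internal SQL statements of TiDB, which are generally used for updating statistics and getting global variables.
+
+Next, view the execution plan of the statement above: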
+
+{{< copyable "sql" >}}
+
+```sql
+desc select * from metrics_schema.tidb_query_duration where value is not null and time>='2020-03-25 23:40:00' and time <= '2020-03-25 23:42:00' and quantile=0.99;
+```
+
+```
++------------------+----------+------+---------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| id               | estRows  | task | access object             | operator info                                                                                                                                                                                          |
++------------------+----------+------+---------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| Selection_5      | 8000.00  | root |                           | not(isnull(Column#5))                                                                                                                                                                                  |
+| └─MemTableScan_6 | 10000.00 | root | table:tidb_query_duration | PromQL:histogram_quantile(0.99, sum(rate(tidb_server_handle_query_duration_seconds_bucket{}[60s])) by (le,sql_type,instance)), start_time:2020-03-25 23:40:00, end_time:2020-03-25 23:42:00, step:1m0s |
++------------------+----------+------+---------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+```
+
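+The execution plan contains a `PromQL` expression, the `start_time` and `end_time` of the monitoring query, and the `step` value. When the statement is actually executed, TiDB calls the `query_range` HTTP API of Prometheus to query the monitoring data.
+
+You might have noticed that within the time range [`2020-03-25 23:40:00`, `2020-03-25 23:42:00`], each label has only three time points, and the `step` value in the execution plan is 1 minute. This is determined by the following two session variables:
+
+* `tidb_metric_query_step`: The resolution step of the query. Querying data through the Prometheus `query_range` API requires `start`, `end`, and `step`, and `step` uses the value of this variable.
+* `tidb_metric_query_range_duration`: When monitoring data is queried, `$RANGE_DURATION` in `PROMQL` is replaced with the value of this variable. The default value is 60 seconds.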
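+
+As a quick check, the current values of both variables can be listed with a standard `SHOW VARIABLES` statement. This is a minimal sketch; the `LIKE` pattern simply matches the two variable names above.
+
+{{< copyable "sql" >}}
+
+```sql
+-- List the current session values of the two metrics query variables.
+show session variables like 'tidb_metric_query%';
+```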
+
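+To view the values of monitoring items at a different time granularity, modify the two session variables above before querying the monitoring table. For example:
+
+First, modify the values of the two session variables to set the time granularity to 30 seconds.
+
+> **Note:**
+>
+> The minimum query granularity supported by Prometheus is 30 seconds.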
+
+{{< copyable "sql" >}}
+
+```sql
+set @@tidb_metric_query_step=30;
+set @@tidb_metric_query_range_duration=30;
+```
+
+Then query the `tidb_query_duration` monitoring table again. Within the two-minute time range, each label now has five time points, with an interval of 30 seconds between them.
+
+{{< copyable "sql" >}}
+
+```sql
+select * from metrics_schema.tidb_query_duration where value is not null and time>='2020-03-25 23:40:00' and time <= '2020-03-25 23:42:00' and quantile=0.99;
+```
+
+```
++---------------------+-------------------+----------+----------+-----------------+
+| time                | instance          | sql_type | quantile | value           |
++---------------------+-------------------+----------+----------+-----------------+
+| 2020-03-25 23:40:00 | 172.16.5.40:10089 | Insert   | 0.99     | 0.483285651924  |
+| 2020-03-25 23:40:30 | 172.16.5.40:10089 | Insert   | 0.99     | 0.484151462113  |
+| 2020-03-25 23:41:00 | 172.16.5.40:10089 | Insert   | 0.99     | 0.504576        |
+| 2020-03-25 23:41:30 | 172.16.5.40:10089 | Insert   | 0.99     | 0.493577384561  |
+| 2020-03-25 23:42:00 | 172.16.5.40:10089 | Insert   | 0.99     | 0.49482474311   |
+| 2020-03-25 23:40:00 | 172.16.5.40:10089 | Select   | 0.99     | 0.189253402185  |
+| 2020-03-25 23:40:30 | 172.16.5.40:10089 | Select   | 0.99     | 0.184224951851  |
+| 2020-03-25 23:41:00 | 172.16.5.40:10089 | Select   | 0.99     | 0.151673410553  |
+| 2020-03-25 23:41:30 | 172.16.5.40:10089 | Select   | 0.99     | 0.127953838989  |
+| 2020-03-25 23:42:00 | 172.16.5.40:10089 | Select   | 0.99     | 0.127455434547  |
+| 2020-03-25 23:40:00 | 172.16.5.40:10089 | internal | 0.99     | 0.0624          |
+| 2020-03-25 23:40:30 | 172.16.5.40:10089 | internal | 0.99     | 0.12416         |
+| 2020-03-25 23:41:00 | 172.16.5.40:10089 | internal | 0.99     | 0.0304          |
+| 2020-03-25 23:41:30 | 172.16.5.40:10089 | internal | 0.99     | 0.06272         |
+| 2020-03-25 23:42:00 | 172.16.5.40:10089 | internal | 0.99     | 0.0629333333333 |
++---------------------+-------------------+----------+----------+-----------------+
+```
+
+Finally, view the execution plan. The range duration in `PromQL` and the value of `step` in the execution plan have both changed to 30 seconds.
+
+{{< copyable "sql" >}}
+
+```sql
+desc select * from metrics_schema.tidb_query_duration where value is not null and time>='2020-03-25 23:40:00' and time <= '2020-03-25 23:42:00' and quantile=0.99;
+```
+
+```
++------------------+----------+------+---------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| id               | estRows  | task | access object             | operator info                                                                                                                                                                                         |
++------------------+----------+------+---------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| Selection_5      | 8000.00  | root |                           | not(isnull(Column#5))                                                                                                                                                                                 |
+| └─MemTableScan_6 | 10000.00 | root | table:tidb_query_duration | PromQL:histogram_quantile(0.99, sum(rate(tidb_server_handle_query_duration_seconds_bucket{}[30s])) by (le,sql_type,instance)), start_time:2020-03-25 23:40:00, end_time:2020-03-25 23:42:00, step:30s |
++------------------+----------+------+---------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+```
diff --git a/reference/system-databases/metrics-summary.md b/reference/system-databases/metrics-summary.md
new file mode 100644
index 000000000000..f7bd3c35ab33
--- /dev/null
+++ b/reference/system-databases/metrics-summary.md
@@ -0,0 +1,190 @@
+---
+title: METRICS_SUMMARY
+summary: Learn about the `METRICS_SUMMARY` monitoring summary table.
+category: reference
+---
+
+# METRICS_SUMMARY
+
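+Because a TiDB cluster has a large number of monitoring metrics, TiDB 4.0 provides the following monitoring summary tables to help users find abnormal items among them:
+
+* `information_schema.metrics_summary`
+* `information_schema.metrics_summary_by_label`
+
+Both tables summarize all monitoring data so that users can check each monitoring metric more efficiently. `information_schema.metrics_summary_by_label` additionally groups the statistics by label.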
+
+{{< copyable "sql" >}}
+
+```sql
+desc information_schema.metrics_summary;
+```
+
+```
++--------------+-----------------------+------+------+---------+-------+
+| Field        | Type                  | Null | Key  | Default | Extra |
++--------------+-----------------------+------+------+---------+-------+
+| METRICS_NAME | varchar(64)           | YES  |      | NULL    |       |
+| QUANTILE     | double unsigned       | YES  |      | NULL    |       |
+| SUM_VALUE    | double(22,6) unsigned | YES  |      | NULL    |       |
+| AVG_VALUE    | double(22,6) unsigned | YES  |      | NULL    |       |
+| MIN_VALUE    | double(22,6) unsigned | YES  |      | NULL    |       |
+| MAX_VALUE    | double(22,6) unsigned | YES  |      | NULL    |       |
+| COMMENT      | varchar(256)          | YES  |      | NULL    |       |
++--------------+-----------------------+------+------+---------+-------+
+```
+
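+Field description:
+
+* `METRICS_NAME`: The name of the monitoring table.
+* `QUANTILE`: The percentile. You can specify `QUANTILE` in the SQL statement. For example:
+    * `select * from metrics_summary where quantile=0.99` views the data at the 0.99 percentile.
+    * `select * from metrics_summary where quantile in (0.80, 0.90, 0.99, 0.999)` views the data at the 0.80, 0.90, 0.99, and 0.999 percentiles at the same time.
+* `SUM_VALUE`, `AVG_VALUE`, `MIN_VALUE`, and `MAX_VALUE`: The sum, average, minimum, and maximum values, respectively.
+* `COMMENT`: The description of the corresponding monitoring item.
+
+Query example:
+
+To find the three groups of monitoring items with the highest average duration in the TiDB cluster within the time range `'2020-03-08 13:23:00', '2020-03-08 13:33:00'`, query the `information_schema.metrics_summary` table directly and use the `/*+ time_range() */` hint to specify the time range. The SQL statement is as follows: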
+
+{{< copyable "sql" >}}
+
+```sql
+select /*+ time_range('2020-03-08 13:23:00','2020-03-08 13:33:00') */ *
+from information_schema.`METRICS_SUMMARY`
+where metrics_name like 'tidb%duration'
+ and avg_value > 0
+ and quantile = 0.99
+order by avg_value desc
+limit 3\G
+```
+
+```
+***************************[ 1. row ]***************************
+METRICS_NAME | tidb_get_token_duration
+QUANTILE     | 0.99
+SUM_VALUE    | 8.972509
+AVG_VALUE    | 0.996945
+MIN_VALUE    | 0.996515
+MAX_VALUE    | 0.997458
+COMMENT      |  The quantile of Duration (us) for getting token, it should be small until concurrency limit is reached(second)
+***************************[ 2. row ]***************************
+METRICS_NAME | tidb_query_duration
+QUANTILE     | 0.99
+SUM_VALUE    | 0.269079
+AVG_VALUE    | 0.007272
+MIN_VALUE    | 0.000667
+MAX_VALUE    | 0.01554
+COMMENT      | The quantile of TiDB query durations(second)
+***************************[ 3. row ]***************************
+METRICS_NAME | tidb_kv_request_duration
+QUANTILE     | 0.99
+SUM_VALUE    | 0.170232
+AVG_VALUE    | 0.004601
+MIN_VALUE    | 0.000975
+MAX_VALUE    | 0.013
+COMMENT      | The quantile of kv requests durations by store
+```
+
+Similarly, the following example queries the `metrics_summary_by_label` monitoring summary table:
+
+{{< copyable "sql" >}}
+
+```sql
+select /*+ time_range('2020-03-08 13:23:00','2020-03-08 13:33:00') */ *
+from information_schema.`METRICS_SUMMARY_BY_LABEL`
+where metrics_name like 'tidb%duration'
+ and avg_value > 0
+ and quantile = 0.99
+order by avg_value desc
+limit 10\G
+```
+
+```
+***************************[ 1. row ]***************************
+INSTANCE     | 172.16.5.40:10089
+METRICS_NAME | tidb_get_token_duration
+LABEL        |
+QUANTILE     | 0.99
+SUM_VALUE    | 8.972509
+AVG_VALUE    | 0.996945
+MIN_VALUE    | 0.996515
+MAX_VALUE    | 0.997458
+COMMENT      |  The quantile of Duration (us) for getting token, it should be small until concurrency limit is reached(second)
+***************************[ 2. row ]***************************
+INSTANCE     | 172.16.5.40:10089
+METRICS_NAME | tidb_query_duration
+LABEL        | Select
+QUANTILE     | 0.99
+SUM_VALUE    | 0.072083
+AVG_VALUE    | 0.008009
+MIN_VALUE    | 0.007905
+MAX_VALUE    | 0.008241
+COMMENT      | The quantile of TiDB query durations(second)
+***************************[ 3. row ]***************************
+INSTANCE     | 172.16.5.40:10089
+METRICS_NAME | tidb_query_duration
+LABEL        | Rollback
+QUANTILE     | 0.99
+SUM_VALUE    | 0.072083
+AVG_VALUE    | 0.008009
+MIN_VALUE    | 0.007905
+MAX_VALUE    | 0.008241
+COMMENT      | The quantile of TiDB query durations(second)
+```
+
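+Compared with `metrics_summary`, the `metrics_summary_by_label` table has the additional `INSTANCE` and `LABEL` columns, as the result above shows. Rows 2 and 3 of the result indicate that the `Select` and `Rollback` statements of `tidb_query_duration` have a high average duration.
+
+In addition to the examples above, you can use the monitoring summary tables to compare the full-link monitoring data of two time periods, quickly find the module whose monitoring data changes the most, and locate the bottleneck. The following example compares all monitoring items of two time periods (where t1 is the baseline) and sorts them by the size of the difference:
+
+* Period t1: `("2020-03-03 17:08:00", "2020-03-03 17:11:00")`
+* Period t2: `("2020-03-03 17:18:00", "2020-03-03 17:21:00")`
+
+The monitoring items of the two periods are joined on `METRICS_NAME` and sorted by the size of the difference. `TIME_RANGE` is the hint that specifies the query time.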
+
+{{< copyable "sql" >}}
+
+```sql
+SELECT GREATEST(t1.avg_value,t2.avg_value)/LEAST(t1.avg_value,
+         t2.avg_value) AS ratio,
+         t1.metrics_name,
+         t1.avg_value as t1_avg_value,
+         t2.avg_value as t2_avg_value,
+         t2.comment
+FROM 
+    (SELECT /*+ time_range("2020-03-03 17:08:00", "2020-03-03 17:11:00")*/ *
+    FROM information_schema.metrics_summary ) t1
+JOIN 
+    (SELECT /*+ time_range("2020-03-03 17:18:00", "2020-03-03 17:21:00")*/ *
+    FROM information_schema.metrics_summary ) t2
+    ON t1.metrics_name = t2.metrics_name
+ORDER BY  ratio DESC limit 10;
+```
+
+```
++----------------+------------------------------------------+----------------+------------------+---------------------------------------------------------------------------------------------+
+| ratio          | metrics_name                             | t1_avg_value   | t2_avg_value     | comment                                                                                     |
++----------------+------------------------------------------+----------------+------------------+---------------------------------------------------------------------------------------------+
+| 5865.59537065  | tidb_slow_query_cop_process_total_time   |       0.016333 |        95.804724 | The total time of TiDB slow query statistics with slow query total cop process time(second) |
+| 3648.74109023  | tidb_distsql_partial_scan_key_total_num  |   10865.666667 |  39646004.4394   | The total num of distsql partial scan key numbers                                           |
+|  267.002351165 | tidb_slow_query_cop_wait_total_time      |       0.003333 |         0.890008 | The total time of TiDB slow query statistics with slow query total cop wait time(second)    |
+|  192.43267836  | tikv_cop_total_response_total_size       | 2515333.66667  | 484032394.445    |                                                                                             |
+|  192.43267836  | tikv_cop_total_response_size_per_seconds |   41922.227778 |   8067206.57408  |                                                                                             |
+|  152.780296296 | tidb_distsql_scan_key_total_num          |    5304.333333 |    810397.618317 | The total num of distsql scan numbers                                                       |
+|  126.042290167 | tidb_distsql_execution_total_time        |       0.421622 |        53.142143 | The total time of distsql execution(second)                                                 |
+|  105.164020657 | tikv_cop_scan_details                    |     134.450733 |     14139.379665 |                                                                                             |
+|  105.164020657 | tikv_cop_scan_details_total              |    8067.043981 |    848362.77991  |                                                                                             |
+|  101.635495394 | tikv_cop_scan_keys_num                   |    1070.875    |    108838.91113  |                                                                                             |
++----------------+------------------------------------------+----------------+------------------+---------------------------------------------------------------------------------------------+
+```
+
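+The query result above shows that:
+
+* `tidb_slow_query_cop_process_total_time` (the `cop process` time in TiDB slow queries) in period t2 is about 5865 times the value in period t1.
+* `tidb_distsql_partial_scan_key_total_num` (the number of keys scanned by TiDB `distsql` requests) in period t2 is about 3648 times the value in period t1.
+* `tidb_slow_query_cop_wait_total_time` (the time that cop requests in TiDB slow queries spend waiting in the queue) in period t2 is about 267 times the value in period t1.
+* `tikv_cop_total_response_total_size` (the size of the results of TiKV cop requests) in period t2 is about 192 times the value in period t1.
+* `tikv_cop_scan_details` (the scans performed by TiKV cop requests) in period t2 is about 105 times the value in period t1.
+
+From the above, you can immediately see that the cop requests in period t2 are much heavier than in period t1, which overloads the TiKV Coprocessor and causes `cop task` waits. A reasonable guess is that some large queries, or a query-heavy workload, appeared in period t2.
+
+In fact, the `go-ycsb` stress test was running throughout the whole period from t1 to t2, and 20 `tpch` queries were run during period t2. So the large `tpch` queries caused the many cop requests.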
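+
+To narrow down which instances contribute to such a jump, one possible follow-up is to query `metrics_summary_by_label` for a single metric in period t2. This is a sketch that uses only the columns shown in the earlier `metrics_summary_by_label` output.
+
+{{< copyable "sql" >}}
+
+```sql
+-- Break down the top-ranked metric by instance and label for period t2.
+select /*+ time_range("2020-03-03 17:18:00", "2020-03-03 17:21:00") */
+       instance, label, avg_value, max_value
+from information_schema.metrics_summary_by_label
+where metrics_name = 'tidb_slow_query_cop_process_total_time'
+order by avg_value desc;
+```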
diff --git a/reference/system-databases/metrics-tables.md b/reference/system-databases/metrics-tables.md
new file mode 100644
index 000000000000..b157be3ccf9c
--- /dev/null
+++ b/reference/system-databases/metrics-tables.md
@@ -0,0 +1,35 @@
+---
+title: METRICS_TABLES
+summary: Learn about the `METRICS_TABLES` system table.
+category: reference
+---
+
+# METRICS_TABLES
+
+The `METRICS_TABLES` table provides information about all monitoring tables in `metrics_schema`.
+
+{{< copyable "sql" >}}
+
+```sql
+desc metrics_tables;
+```
+
+```
++------------+-----------------+------+------+---------+-------+
+| Field      | Type            | Null | Key  | Default | Extra |
++------------+-----------------+------+------+---------+-------+
+| TABLE_NAME | varchar(64)     | YES  |      | NULL    |       |
+| PROMQL     | varchar(64)     | YES  |      | NULL    |       |
+| LABELS     | varchar(64)     | YES  |      | NULL    |       |
+| QUANTILE   | double unsigned | YES  |      | NULL    |       |
+| COMMENT    | varchar(256)    | YES  |      | NULL    |       |
++------------+-----------------+------+------+---------+-------+
+```
+
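+Field description of the `metrics_tables` table:
+
+* `TABLE_NAME`: The name of the table in `metrics_schema`.
+* `PROMQL`: A monitoring table works mainly by mapping the SQL statement to `PromQL` and converting the Prometheus result into a SQL query result. This field is the `PromQL` expression template. When the data of a monitoring table is fetched, the variables in the template are rewritten with the query conditions to generate the final query expression.
+* `LABELS`: The labels defined for the monitoring item. Each label corresponds to a column in the monitoring table. If the SQL statement contains a filter on such a column, the corresponding `PromQL` changes accordingly.
+* `QUANTILE`: The percentile. For histogram-type monitoring data, a default percentile is specified. A value of `0` means that the monitoring item corresponding to the table is not a histogram.
+* `COMMENT`: The description of the monitoring table.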
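+
+For example, the effect of a label filter can be observed by adding a condition on a label column and checking the execution plan. The sketch below uses the `tidb_query_duration` table and its `sql_type` label. The expectation that the `sql_type` condition appears inside the generated `PromQL` follows from the `LABELS` description above, and the exact plan text is not reproduced here.
+
+{{< copyable "sql" >}}
+
+```sql
+-- Filter on the sql_type label and inspect the PromQL in the resulting plan.
+desc select * from metrics_schema.tidb_query_duration
+where sql_type = 'Select'
+  and time >= '2020-03-25 23:40:00' and time <= '2020-03-25 23:42:00'
+  and quantile = 0.99;
+```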
diff --git a/reference/system-databases/sql-diagnosis.md b/reference/system-databases/sql-diagnosis.md
new file mode 100644
index 000000000000..bcfdc6e8f2a1
--- /dev/null
+++ b/reference/system-databases/sql-diagnosis.md
@@ -0,0 +1,48 @@
+---
+title: SQL Diagnosis
+summary: Learn about the SQL diagnosis feature.
+category: reference
+---
+
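+# SQL Diagnosis
+
+SQL diagnosis is a feature introduced in TiDB 4.0 to make it more efficient to locate problems in TiDB. Before TiDB 4.0, users had to use different tools to obtain different information in heterogeneous ways.
+The new SQL diagnosis system is designed to treat this scattered information as a whole. It integrates information from every dimension of the system, exposes a consistent interface to the upper layer through system tables, and provides monitoring summaries and automatic diagnosis, which makes it easier to query cluster information.
+
+SQL diagnosis consists of three major parts:
+
+* **Cluster information tables**: The TiDB 4.0 diagnosis system adds cluster information tables, which provide a unified way to obtain the instance information that was previously scattered across nodes. These tables bring together the cluster topology, hardware information, software information, kernel parameters, monitoring, system information, slow queries, statements, and logs of the whole cluster, so that users can query all of them with SQL.
+* **Cluster monitoring tables**: The TiDB 4.0 diagnosis system adds cluster monitoring system tables. All of these tables are in `metrics_schema`, and you can query the monitoring information with SQL statements. Compared with the previous visualized monitoring, querying the monitoring data with SQL lets users run correlated queries across all monitoring data of the whole cluster and compare the results of different time periods to quickly find performance bottlenecks. Because a TiDB cluster has a large number of monitoring metrics, SQL diagnosis also provides monitoring summary tables that make it easier to find abnormal items among them.
+* **Automatic diagnosis**: Although users can manually execute SQL statements to query the cluster information tables, cluster monitoring tables, and summary tables, automatic diagnosis is more convenient. So, based on the existing cluster information tables and monitoring tables, SQL diagnosis provides the related diagnostic result table and diagnostic summary table to perform automatic diagnosis.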
+
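+## Cluster information tables
+
+The cluster information tables gather the information of all node instances in a cluster, so that users can query information about the whole cluster with a single SQL statement.
+The cluster information tables are as follows:
+
+* The cluster topology table [`information_schema.cluster_info`](/reference/system-databases/cluster-info.md) provides the current topology of the cluster, together with the version of each node, the Git hash corresponding to the version, and the start time and uptime of each node.
+* The cluster configuration table [`information_schema.cluster_config`](/reference/system-databases/cluster-config.md) provides the current configuration of all nodes in the cluster. In versions earlier than TiDB 4.0, users must access the HTTP API of each node one by one to get this configuration.
+* The cluster hardware table [`information_schema.cluster_hardware`](/reference/system-databases/cluster-hardware.md) is used to quickly query the hardware information of the cluster.
+* The cluster load table [`information_schema.cluster_load`](/reference/system-databases/cluster-load.md) is used to query the load information of different nodes and different hardware types in the cluster.
+* The kernel parameter table [`information_schema.cluster_systeminfo`](/reference/system-databases/cluster-systeminfo.md) is used to query the kernel configuration of different nodes in the cluster. Currently, `sysctl` information can be queried.
+* The cluster log table [`information_schema.cluster_log`](/reference/system-databases/cluster-log.md) is used to query cluster logs. Query conditions are pushed down to each node, which reduces the impact of log queries on the cluster. The performance impact is smaller than that of commands such as `grep`.
+
+System tables before TiDB 4.0 can only show the current node. TiDB 4.0 implements the corresponding cluster tables, which provide a global view of the whole cluster on a single TiDB node. These tables are currently located in [`information_schema`](/reference/system-databases/information-schema.md), and they are queried in the same way as the other `information_schema` system tables.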
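+
+For instance, a single statement can list the nodes of one component across the cluster. The query below is a minimal sketch that filters `cluster_info` by component type:
+
+{{< copyable "sql" >}}
+
+```sql
+-- List all TiKV nodes in the cluster.
+select * from information_schema.cluster_info where type = 'tikv';
+```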
+
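+## Cluster monitoring tables
+
+To dynamically observe and compare the cluster status across different time periods, the TiDB 4.0 diagnosis system adds cluster monitoring system tables. All monitoring tables are in `metrics_schema`, and you can query the monitoring information using SQL. Querying the monitoring data with SQL lets users run correlated queries across all monitoring data of the whole cluster and compare the results of different time periods to quickly find performance bottlenecks.
+
+* [`information_schema.metrics_tables`](/reference/system-databases/metrics-tables.md): Because a large number of system tables have been added, you can query this table for the meta-information of these monitoring tables.
+
+Because a TiDB cluster has a large number of monitoring metrics, TiDB 4.0 provides the following monitoring summary tables:
+
+* The monitoring summary table [`information_schema.metrics_summary`](/reference/system-databases/metrics-summary.md) summarizes all monitoring data so that users can check each monitoring metric more efficiently.
+* The monitoring summary table [`information_schema.metrics_summary_by_label`](/reference/system-databases/metrics-summary.md) also summarizes all monitoring data, but it groups the statistics by label.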
+
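+## Automatic diagnosis
+
+The cluster information tables and cluster monitoring tables above require users to manually execute SQL statements of fixed patterns to troubleshoot the cluster. To further improve the user experience, TiDB provides diagnosis-related system tables built on the existing basic information tables, so that diagnosis runs automatically. The system tables related to automatic diagnosis are as follows:
+
+* The diagnostic result table [`information_schema.inspection_result`](/reference/system-databases/inspection-result.md) shows the diagnostic results for the system. Diagnosis is triggered lazily: executing `select * from inspection_result` triggers all diagnostic rules to diagnose the system, and the result shows the faults or risks in the system.
+* The diagnostic summary table [`information_schema.inspection_summary`](/reference/system-databases/inspection-summary.md) summarizes the monitoring data of a specific link or module, so that users can troubleshoot and locate problems based on the context of the whole module or link.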