how-to/monitor: add monitor item for tidb binlog (#1539)
* how-to/monitor: add monitor item for tidb binlog

* revise table format

* address the comments
anotherrachel authored and yikeke committed Sep 23, 2019
1 parent 07f43b9 commit 146877c
Showing 3 changed files with 14 additions and 8 deletions.
6 changes: 4 additions & 2 deletions dev/how-to/monitor/tidb-binlog.md
@@ -38,10 +38,12 @@ To understand the Drainer monitoring metrics, check the following table:
| Pull Binlog QPS by Pump NodeID | Shows the QPS when Drainer obtains binlog from each Pump node |
| 95% Binlog Reach Duration By Pump | Records the delay from the time when binlog is written into Pump to the time when the binlog is obtained by Drainer |
| Error By Type | Shows the number of errors encountered by Drainer, counted based on the type of error |
| SQL Query Time | Records the time it takes Drainer to execute the SQL statement in the downstream |
| Drainer Event | Shows the number of various types of events, including "ddl", "insert", "delete", "update", "flush", and "savepoint" |
| Execute Time | Records the time it takes to execute the SQL statement in the downstream, or the time it takes to write data into downstream |
| Execute Time | Records the time it takes to write binlog into the downstream syncing module |
| 95% Binlog Size | Shows the size of the binlog data that Drainer obtains from each Pump node |
| DDL Job Count | Records the number of DDL statements handled by Drainer|
| DDL Job Count | Records the number of DDL statements handled by Drainer |
| Queue Size | Records the work queue size in Drainer |

## Alert rules

8 changes: 5 additions & 3 deletions v2.1/how-to/monitor/tidb-binlog.md
@@ -38,10 +38,12 @@ To understand the Drainer monitoring metrics, check the following table:
| Pull Binlog QPS by Pump NodeID | Shows the QPS when Drainer obtains binlog from each Pump node |
| 95% Binlog Reach Duration By Pump | Records the delay from the time when binlog is written into Pump to the time when the binlog is obtained by Drainer |
| Error By Type | Shows the number of errors encountered by Drainer, counted based on the type of error |
| SQL Query Time | Records the time it takes Drainer to execute the SQL statement in the downstream |
| Drainer Event | Shows the number of various types of events, including "ddl", "insert", "delete", "update", "flush", and "savepoint" |
| Execute Time | Records the time it takes to execute the SQL statement in the downstream, or the time it takes to write data into downstream |
| Execute Time | Records the time it takes to write binlog into the downstream syncing module |
| 95% Binlog Size | Shows the size of the binlog data that Drainer obtains from each Pump node |
| DDL Job Count | Records the number of DDL statements handled by Drainer|
| DDL Job Count | Records the number of DDL statements handled by Drainer |
| Queue Size | Records the work queue size in Drainer |

## Alert rules

@@ -98,7 +100,7 @@ Currently, TiDB Binlog monitoring metrics are divided into the following three t
#### binlog_pump_storage_available_size_less_than_20G

- Description: The available disk space of Pump is less than 20G
- Monitoring rule: `binlog_pump_storage_storage_size_bytes{type="available"}` < 20 * 1024 * 1024 * 1024
- Monitoring rule: `binlog_pump_storage_storage_size_bytes{type="available"}` < 20 \* 1024 \* 1024 \* 1024
- Solution: Check whether Pump `gc_tso` is normal. If not, adjust the GC time configuration of Pump or get the corresponding Pump offline

#### binlog_drainer_checkpoint_tso_no_change_for_1m
8 changes: 5 additions & 3 deletions v3.0/how-to/monitor/tidb-binlog.md
@@ -39,10 +39,12 @@ To understand the Drainer monitoring metrics, check the following table:
| Pull Binlog QPS by Pump NodeID | Shows the QPS when Drainer obtains binlog from each Pump node |
| 95% Binlog Reach Duration By Pump | Records the delay from the time when binlog is written into Pump to the time when the binlog is obtained by Drainer |
| Error By Type | Shows the number of errors encountered by Drainer, counted based on the type of error |
| SQL Query Time | Records the time it takes Drainer to execute the SQL statement in the downstream |
| Drainer Event | Shows the number of various types of events, including "ddl", "insert", "delete", "update", "flush", and "savepoint" |
| Execute Time | Records the time it takes to execute the SQL statement in the downstream, or the time it takes to write data into downstream |
| Execute Time | Records the time it takes to write binlog into the downstream syncing module |
| 95% Binlog Size | Shows the size of the binlog data that Drainer obtains from each Pump node |
| DDL Job Count | Records the number of DDL statements handled by Drainer|
| DDL Job Count | Records the number of DDL statements handled by Drainer |
| Queue Size | Records the work queue size in Drainer |

## Alert rules

@@ -99,7 +101,7 @@ Currently, TiDB Binlog monitoring metrics are divided into the following three t
#### binlog_pump_storage_available_size_less_than_20G

- Description: The available disk space of Pump is less than 20G
- Monitoring rule: `binlog_pump_storage_storage_size_bytes{type="available"}` < 20 * 1024 * 1024 * 1024
- Monitoring rule: `binlog_pump_storage_storage_size_bytes{type="available"}` < 20 \* 1024 \* 1024 \* 1024
- Solution: Check whether Pump `gc_tso` is normal. If not, adjust the GC time configuration of Pump or get the corresponding Pump offline

#### binlog_drainer_checkpoint_tso_no_change_for_1m
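
Note on the `binlog_pump_storage_available_size_less_than_20G` rule touched in this commit: the right-hand side `20 * 1024 * 1024 * 1024` is 20 GiB expressed in bytes. The following is a minimal Python sketch of that arithmetic only. The names `THRESHOLD_BYTES` and `is_below_threshold` and the sample value are hypothetical and are not part of TiDB Binlog or Prometheus.

```python
# Threshold from the monitoring rule: 20 * 1024 * 1024 * 1024 bytes (20 GiB).
THRESHOLD_BYTES = 20 * 1024 * 1024 * 1024


def is_below_threshold(available_bytes: int) -> bool:
    """Hypothetical helper: True when the value would satisfy the `<` comparison in the rule."""
    return available_bytes < THRESHOLD_BYTES


if __name__ == "__main__":
    print(THRESHOLD_BYTES)                      # 21474836480
    print(is_below_threshold(15 * 1024 ** 3))   # True: 15 GiB is under the threshold
```

The comparison is strict, so a reading of exactly 21474836480 bytes would not match the condition.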
