Connect your TiDB cluster to Datadog in order to:
- Collect key TiDB metrics of your cluster.
- Collect logs of your cluster, such as TiDB/TiKV/TiFlash logs and slow query logs.
- Visualize cluster performance on the provided dashboard.
Note:
- TiDB 4.0+ is required for this integration.
- For TiDB Cloud, see the TiDB Cloud Integration.
First, download and launch the Datadog Agent.
Then, manually install the TiDB check. Instructions vary depending on the environment.
Run the following command to install the check:

    datadog-agent integration install -t datadog-tidb==<INTEGRATION_VERSION>
- Edit the tidb.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory to start collecting your TiDB performance data. See the sample tidb.d/conf.yaml for all available configuration options.
The sample tidb.d/conf.yaml only configures the PD instance. You need to configure the other instances in the TiDB cluster manually. For example:
    init_config:

    instances:
      - pd_metric_url: http://localhost:2379/metrics
        send_distribution_buckets: true
        tags:
          - cluster_name:cluster01

      - tidb_metric_url: http://localhost:10080/metrics
        send_distribution_buckets: true
        tags:
          - cluster_name:cluster01

      - tikv_metric_url: http://localhost:20180/metrics
        send_distribution_buckets: true
        tags:
          - cluster_name:cluster01

      - tiflash_metric_url: http://localhost:8234/metrics
        send_distribution_buckets: true
        tags:
          - cluster_name:cluster01

      - tiflash_proxy_metric_url: http://localhost:20292/metrics
        send_distribution_buckets: true
        tags:
          - cluster_name:cluster01
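Before pointing the Agent at these URLs, it can help to confirm that each metrics endpoint answers. The following is a minimal sketch, not part of the integration: it assumes `curl` is installed and that the components listen on the ports used in the example configuration above. `check_endpoint` is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: probe one metrics URL and report whether it responds.
check_endpoint() {
  url="$1"
  # -s: silent, -f: treat HTTP errors as failures, --max-time: stop after 5 seconds
  if curl -sf --max-time 5 "$url" >/dev/null 2>&1; then
    echo "OK $url"
  else
    echo "FAIL $url"
  fi
}

# URLs copied from the example configuration; adjust hosts and ports to your cluster.
for url in \
  http://localhost:2379/metrics \
  http://localhost:10080/metrics \
  http://localhost:20180/metrics \
  http://localhost:8234/metrics \
  http://localhost:20292/metrics
do
  check_endpoint "$url"
done
```

An endpoint reported as FAIL usually means the component is not running on that host or uses a different port, in which case the corresponding instance entry needs adjusting.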
Available for Agent versions >6.0
- Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

      logs_enabled: true
- Add this configuration block to your tidb.d/conf.yaml file to start collecting your TiDB logs:

      logs:
        # pd log
        - type: file
          path: "/tidb-deploy/pd-2379/log/pd*.log"
          service: "tidb-cluster"
          source: "pd"

        # tikv log
        - type: file
          path: "/tidb-deploy/tikv-20160/log/tikv*.log"
          service: "tidb-cluster"
          source: "tikv"

        # tidb log
        - type: file
          path: "/tidb-deploy/tidb-4000/log/tidb*.log"
          service: "tidb-cluster"
          source: "tidb"
          exclude_paths:
            - /tidb-deploy/tidb-4000/log/tidb_slow_query.log
        - type: file
          path: "/tidb-deploy/tidb-4000/log/tidb_slow_query*.log"
          service: "tidb-cluster"
          source: "tidb"
          log_processing_rules:
            - type: multi_line
              name: new_log_start_with_datetime
              pattern: '#\sTime:'
          tags:
            - "custom_format:tidb_slow_query"

        # tiflash log
        - type: file
          path: "/tidb-deploy/tiflash-9000/log/tiflash*.log"
          service: "tidb-cluster"
          source: "tiflash"
  Change the path and service values according to your cluster's configuration. Use these commands to show all log paths:

      # show deploying directories
      tiup cluster display <YOUR_CLUSTER_NAME>
      # find specific logging file path by command arguments
      ps -fwwp <TIDB_PROCESS_PID/PD_PROCESS_PID/etc.>
Run the Agent's status subcommand and look for tidb under the Checks section.
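The validation step can be sketched as a small script. This is an illustration under assumptions: it presumes a Linux host where the Agent CLI is invoked as `sudo datadog-agent status`, and `status_mentions_tidb` is a hypothetical helper name, not an Agent command.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the text on stdin mentions the tidb check.
status_mentions_tidb() {
  grep -q 'tidb'
}

# Run only when the Agent CLI is on PATH; using sudo is an assumption for Linux hosts.
if command -v datadog-agent >/dev/null 2>&1; then
  if sudo datadog-agent status 2>/dev/null | status_mentions_tidb; then
    echo "tidb check is listed"
  else
    echo "tidb check not found in status output"
  fi
fi
```

A plain text match only shows that the check is loaded. Reading the full status output is still the way to see per-instance errors.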
See metadata.csv for a list of metrics provided by this check.
You can use the metrics configuration option to collect additional metrics from a TiDB cluster.
The TiDB check does not include any events.
Service checks are based on tidb_cluster.prometheus.health metrics. This check is controlled by the health_service_check configuration option, which defaults to true. You can modify this behavior in the tidb.yml file.
See service_checks.json for a list of service checks provided by this integration.
CPU and Memory metrics are not provided for TiKV and TiFlash instances in the following cases:
- Running TiKV or TiFlash instances with tiup playground on macOS.
- Running TiKV or TiFlash instances with docker-compose up on a new Apple M1 machine.
The TiDB check enables Datadog's distribution metric type by default. This data can be quite large and may consume significant resources. You can modify this behavior in the tidb.yml file:

    send_distribution_buckets: false
Because a TiDB cluster exposes many important metrics, the TiDB check sets max_returned_metrics to 10000 by default. You can decrease max_returned_metrics in the tidb.yml file if necessary:

    max_returned_metrics: 1000
Need help? Contact Datadog support.