collector/tidb/examples/compose/README.md (45 additions, 0 deletions)
# Monitor TiDB with the OpenTelemetry Collector for Cloud Observability

## Overview

TiDB is a distributed, scalable, and highly available NewSQL database. Monitoring TiDB is crucial to ensuring its
reliability, performance, and scalability. Using the OpenTelemetry Collector, you can efficiently collect TiDB metrics
and transmit them to Cloud Observability, enabling comprehensive analysis and visualization of your TiDB deployment.
This README provides step-by-step instructions for integrating TiDB metrics with Cloud Observability using the
OpenTelemetry Collector.

## Prerequisites

* Docker
* Docker Compose
* A Cloud Observability account
* Cloud Observability [access token][ls-docs-access-token]

## How to set it up

1. **Export your Cloud Observability access token**:
```bash
export LS_ACCESS_TOKEN=<YOUR_LIGHTSTEP_TOKEN>
```
2. **Run the Docker Compose example to spin up TiDB and the OpenTelemetry Collector** (a minimal sketch of the
   Compose file appears after this list):
```bash
docker-compose up -d
```
3. **Access [tidb-vision](https://github.com/pingcap/tidb-vision)** at http://localhost:8010.
4. **Access the Spark Web UI** at http://localhost:8080 and connect to [TiSpark](https://github.com/pingcap/tispark)
   via spark://127.0.0.1:7077.
5. **Monitor TiDB Metrics in Cloud Observability**: After setup, TiDB metrics will begin appearing in your Cloud
   Observability dashboards.
6. **Shutting down the monitoring setup**:
```bash
docker-compose down -v
```
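
For orientation, here is a minimal sketch of what the Compose file driving these steps might contain. The actual
`docker-compose.yml` in this directory is authoritative (it is not shown in this diff); the image tags, service names,
and mount paths below are assumptions, and the TiKV, TiDB, TiSpark, and tidb-vision services are omitted for brevity.

```yaml
# Hypothetical sketch only -- defer to the real docker-compose.yml in this directory.
version: "3"
services:
  pd0:
    image: pingcap/pd:latest                 # assumed tag
    volumes:
      - ./config/pd.toml:/pd.toml:ro
    command: --name=pd --config=/pd.toml
  pushgateway:
    image: prom/pushgateway:latest           # PD pushes metrics here (see config/pd.toml)
    ports:
      - "9091:9091"
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest   # assumed image
    volumes:
      - ./collector.yaml:/etc/otelcol-contrib/config.yaml:ro
    environment:
      - LS_ACCESS_TOKEN                      # forwarded from the host shell (step 1)
```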

## Configuring TiDB for Advanced Monitoring

This guide assumes a standard TiDB setup. TiDB offers various configuration options that can impact the metrics it
provides. For detailed configurations and best practices, always refer to [the official TiDB documentation][tidb-docs].
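
One concrete knob from this example: the `[metric]` section of `config/pd.toml` (included in this change) controls how
often PD pushes metrics to the Pushgateway. A shorter interval gives finer-grained data at the cost of more push
traffic; the values below mirror the file in this PR, and the `"5s"` alternative is only a suggestion.

```toml
[metric]
# How often PD pushes metrics; "0s" disables pushing. A shorter value such as "5s"
# yields finer-grained data.
interval = "15s"
# Pushgateway address; leaving it empty disables pushing.
address = "pushgateway:9091"
```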

[ls-docs-access-token]: https://docs.lightstep.com/docs/create-and-manage-access-tokens

[tidb-docs]: https://docs.pingcap.com/tidb/stable/grafana-tidb-dashboard#tidb-monitoring-metrics
collector/tidb/examples/compose/collector.yaml (27 additions, 0 deletions)
receivers:
  prometheus:
    config:
      scrape_configs:
        # Scrape the Pushgateway that the TiDB components push their metrics to.
        - job_name: 'tidb-cluster'
          scrape_interval: 5s
          honor_labels: true
          static_configs:
            - targets: [ 'pushgateway:9091' ]

processors:
  batch:

exporters:
  # Logs exported metrics at debug level; useful for local troubleshooting.
  logging:
    loglevel: debug
  otlp:
    endpoint: ingest.lightstep.com:443
    # headers is a map, not a list.
    headers:
      lightstep-access-token: ${LS_ACCESS_TOKEN}

service:
  pipelines:
    metrics:
      receivers: [ prometheus ]
      processors: [ batch ]
      exporters: [ logging, otlp ]
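
To exercise this configuration outside of Compose (for debugging, say), a sketch of running it with the Collector
contrib image follows. The image name and config path are assumptions, and the `pushgateway:9091` scrape target only
resolves on the Compose network, so attach the container to that network or adjust the target first.

```bash
# Sketch: run the Collector standalone against collector.yaml (assumed image/paths).
docker run --rm \
  -e LS_ACCESS_TOKEN \
  -v "$(pwd)/collector.yaml:/etc/otelcol-contrib/config.yaml" \
  otel/opentelemetry-collector-contrib:latest
```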
collector/tidb/examples/compose/config/pd.toml (86 additions, 0 deletions)
# PD Configuration.

name = "pd"
data-dir = "default.pd"

client-urls = "http://127.0.0.1:2379"
# if not set, use ${client-urls}
advertise-client-urls = ""

peer-urls = "http://127.0.0.1:2380"
# if not set, use ${peer-urls}
advertise-peer-urls = ""

initial-cluster = "pd=http://127.0.0.1:2380"
initial-cluster-state = "new"

lease = 3
tso-save-interval = "3s"

[security]
# Path of the file that contains the list of trusted SSL CAs. If set, the following settings shouldn't be empty.
cacert-path = ""
# Path of file that contains X509 certificate in PEM format.
cert-path = ""
# Path of file that contains X509 key in PEM format.
key-path = ""

[log]
level = "error"

# log format, one of json, text, console
#format = "text"

# disable automatic timestamps in output
#disable-timestamp = false

# file logging
[log.file]
#filename = ""
# max log file size in MB
#max-size = 300
# max number of days to keep log files
#max-days = 28
# maximum number of old log files to retain
#max-backups = 7
# rotate log by day
#log-rotate = true

[metric]
# Prometheus client push interval; set "0s" to disable Prometheus.
interval = "15s"
# Prometheus Pushgateway address; leaving it empty disables Prometheus.
address = "pushgateway:9091"

[schedule]
max-merge-region-size = 0
split-merge-interval = "1h"
max-snapshot-count = 3
max-pending-peer-count = 16
max-store-down-time = "30m"
leader-schedule-limit = 4
region-schedule-limit = 4
replica-schedule-limit = 8
merge-schedule-limit = 8
tolerant-size-ratio = 5.0

# Customized schedulers; the format is shown below.
# If empty, balance-leader, balance-region, and hot-region are used by default.
# [[schedule.schedulers]]
# type = "evict-leader"
# args = ["1"]

[replication]
# The number of replicas for each region.
max-replicas = 3
# The label keys specify the location of a store.
# The placement priority is implied by the order of the label keys.
# For example, ["zone", "rack"] means that we should place replicas to
# different zones first, then to different racks if we don't have enough zones.
location-labels = []

[label-property]
# Do not assign region leaders to stores that have these tags.
# [[label-property.reject-leader]]
# key = "zone"
# value = "cn1"
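
With the `[metric]` section above pointing PD at the Pushgateway, you can sanity-check the pipeline by querying the
gateway directly. This is an illustrative check that assumes port 9091 is published to the host:

```bash
# PD metrics (prefixed pd_) should appear on the Pushgateway within ~15s of startup.
curl -s http://localhost:9091/metrics | grep '^pd_' | head
```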
collector/tidb/examples/compose/config/spark-defaults.conf (2 additions, 0 deletions)
spark.tispark.pd.addresses pd0:2379
spark.sql.extensions org.apache.spark.sql.TiExtensions
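
These two properties point Spark at PD and load TiSpark's `TiExtensions`, letting Spark SQL read TiDB tables directly.
An illustrative smoke test against the master URL from the README above (the query itself is arbitrary):

```bash
# Illustrative: list databases through the TiSpark-enabled Spark master.
spark-sql --master spark://127.0.0.1:7077 -e "SHOW DATABASES;"
```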