Design of the collection plugins #3

Open
UlricQin opened this issue Nov 26, 2023 · 0 comments

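Preface

cprobe needs a plugin mechanism to integrate a wide range of collection capabilities, for example by folding mysqld_exporter, redis_exporter, and categraf's http_response into cprobe. For each plugin, the core concerns are:

  • How to obtain the list of targets to collect from. Following Prometheus scrape configuration, the initial version supports static_configs, http_sd_configs, and file_sd_configs.
  • The configuration parameters used during collection. The configuration must be splittable across files so that different targets can use different collection rules.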

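Layout

All cprobe configuration lives under the conf.d directory, and each subdirectory of conf.d represents one plugin. For example, a mysql directory under conf.d holds the configuration for the mysql collection plugin, and a redis directory holds the configuration for the redis collection plugin.

Inside a plugin directory such as mysql, main.yaml serves as the entry configuration file. There can also be several entry files: cprobe globs for main*.yaml, and every yaml file that matches is treated as an entry configuration file. An entry configuration file specifies the targets to collect from and the rule files to use for collection.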
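The directory convention above can be sketched in a few lines of Python. This is only an illustration of the lookup, not cprobe's implementation: the function names find_entry_files and demo, and the file main_extra.yaml, are hypothetical.

```python
import tempfile
from pathlib import Path


def find_entry_files(conf_dir):
    """Map each plugin (a subdirectory of conf_dir) to its main*.yaml entry files."""
    entries = {}
    for plugin_dir in sorted(Path(conf_dir).iterdir()):
        if not plugin_dir.is_dir():
            continue  # only subdirectories count as plugins
        # Every file matching main*.yaml is treated as an independent entry file.
        entries[plugin_dir.name] = sorted(
            p.name for p in plugin_dir.glob("main*.yaml")
        )
    return entries


def demo():
    """Build a throwaway conf.d tree and run the lookup against it."""
    with tempfile.TemporaryDirectory() as tmp:
        conf_dir = Path(tmp) / "conf.d"
        for rel in (
            "mysql/main.yaml",
            "mysql/main_extra.yaml",  # hypothetical second entry file
            "mysql/rule_head.toml",   # rule file, must not be picked up
            "redis/main.yaml",
        ):
            path = conf_dir / rel
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text("")
        return find_entry_files(conf_dir)
```

Rule files such as rule_head.toml sit in the same directory but are ignored by the glob, so they only take effect when an entry file references them.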

Configuration example

Suppose the mysql plugin directory contains a main.yaml. Its content would look roughly like this:

global:
  scrape_interval: 10s

scrape_configs:
- job_name: 'mysql_prod'
  scrape_interval: 5s
  scrape_rule_files:
  - 'rules.d/common.toml'
  - 'rules.d/schema.toml'
  static_configs:
  - targets:
    - 'a.com:3306'
    - 'b.com:3306'
    labels:
      name: 'ulricqin'
      city: 'beijing'
  - targets:
    - 'c.com:3306'
    - 'd.com:3306'
    labels:
      name: 'ulricqin2'
      city: 'beijing2'
      lang: '%{LANG}'

- job_name: 'mysql_test'
  http_sd_configs:
  - url: http://localhost:8080/get-targets
  scrape_rule_files:
  - 'rules.d/common.toml'

- job_name: 'mysql_abcd'
  file_sd_configs:
  - files:
    - 'inst.yaml'
  scrape_rule_files:
  - 'rule_head.toml'
  - 'rule_coll.toml'
  - 'rule_cust.toml'

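The configuration above is almost identical to a Prometheus scrape configuration; the addition is scrape_rule_files. For each scrape job, the various sd configs yield the targets to collect from, but how to collect from them is specified by the scrape_rule_files.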
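To make the relationship between targets and rule files concrete, here is a minimal sketch that flattens static_configs into one work item per target. It operates on an already-parsed entry file, written as a plain dict and trimmed from the example above. The function name expand_static_targets is hypothetical, and letting a job-level scrape_interval override the global one is an assumption borrowed from Prometheus, not something this issue specifies.

```python
# Trimmed, already-parsed version of the main.yaml example above.
CONFIG = {
    "global": {"scrape_interval": "10s"},
    "scrape_configs": [
        {
            "job_name": "mysql_prod",
            "scrape_interval": "5s",
            "scrape_rule_files": ["rules.d/common.toml", "rules.d/schema.toml"],
            "static_configs": [
                {"targets": ["a.com:3306", "b.com:3306"],
                 "labels": {"name": "ulricqin", "city": "beijing"}},
                {"targets": ["c.com:3306"],
                 "labels": {"name": "ulricqin2", "city": "beijing2"}},
            ],
        },
        {
            "job_name": "mysql_test",
            "http_sd_configs": [{"url": "http://localhost:8080/get-targets"}],
            "scrape_rule_files": ["rules.d/common.toml"],
        },
    ],
}


def expand_static_targets(config):
    """Return one work item per statically configured target."""
    default_interval = config.get("global", {}).get("scrape_interval")
    items = []
    for job in config.get("scrape_configs", []):
        # Assumption: a job-level interval overrides the global default.
        interval = job.get("scrape_interval", default_interval)
        for group in job.get("static_configs", []):
            for target in group.get("targets", []):
                items.append({
                    "job": job["job_name"],
                    "target": target,
                    "labels": dict(group.get("labels", {})),
                    "interval": interval,
                    "rule_files": list(job.get("scrape_rule_files", [])),
                })
    return items
```

Jobs that rely on http_sd_configs or file_sd_configs produce no items here; their targets would come from the corresponding discovery step instead.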

Configuration files should support environment variables, written as %{ENV_VAR}. In the example above, %{LANG} refers to the value of the LANG environment variable.
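One possible way to implement the %{ENV_VAR} substitution is a regular-expression pass over the raw file text before it is parsed. The sketch below is illustrative only: the name expand_env is hypothetical, and replacing an unset variable with an empty string is an assumption, since the issue does not say how missing variables should be handled.

```python
import os
import re

# Matches %{NAME} where NAME looks like a conventional environment variable name.
_ENV_REF = re.compile(r"%\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(text, env=None):
    """Replace every %{NAME} in text with the value of NAME from env."""
    env = os.environ if env is None else env
    # Assumption: an unset variable expands to an empty string.
    return _ENV_REF.sub(lambda match: env.get(match.group(1), ""), text)
```

The %{...} syntax does not collide with shell-style ${...} references, which this pass leaves untouched.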

For file_sd_configs, the initial version only supports referencing yaml files, which should be sufficient.

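scrape_rule_files references a set of toml files. When processing a job, cprobe reads these toml files in order and concatenates their contents into one large configuration. Only the toml format lends itself to this kind of concatenation; yaml and json do not, so rules are configured in toml.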

Example rule_head.toml:

[global]
user = 'root'
password = 'cProbePa55'
# ssl_ca = '/etc/mysql/ssl/ca.pem'
# ssl_cert = '/etc/mysql/ssl/client-cert.pem'
# ssl_key = '/etc/mysql/ssl/client-key.pem'
# ssl_skip_verfication = true
# tls = 'skip-verify'

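In general every plugin needs authentication settings. For mysql these are user and password, plus certificate settings if ssl is enabled. The configuration above stays as close as possible to mysqld_exporter so that it is easy to understand.

Example rule_coll.toml: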

[collect_global_status]
enabled = true

[collect_global_variables]
enabled = true

[collect_slave_status]
enabled = true

[collect_info_schema_innodb_cmp]
enabled = true

[collect_info_schema_innodb_cmpmem]
enabled = true

[collect_info_schema_query_response_time]
enabled = true

[collect_info_schema_processlist]
enabled = true
# Minimum time a thread must be in each state to be counted
min_time = 0
# Enable collecting the number of processes by user
processes_by_user = true
# Enable collecting the number of processes by host
processes_by_host = true

[collect_info_schema_tables]
enabled = false
# The list of databases to collect table stats for, or '*' for all
databases = "*"

[collect_info_schema_innodb_tablespaces]
enabled = false

[collect_info_schema_innodb_metrics]
enabled = false

[collect_info_schema_userstats]
enabled = false

[collect_info_schema_clientstats]
enabled = false

[collect_info_schema_tablestats]
enabled = false

[collect_info_schema_schemastats]
enabled = false

[collect_info_schema_replica_host]
enabled = false

[collect_mysql_user]
enabled = false
# Enable collecting user privileges from mysql.user
collect_user_privileges = false

[collect_auto_increment_columns]
enabled = false

[collect_binlog_size]
enabled = false

[collect_perf_schema_tableiowaits]
enabled = false

[collect_perf_schema_indexiowaits]
enabled = false

[collect_perf_schema_tablelocks]
enabled = false

[collect_perf_schema_eventsstatements]
enabled = false
# Limit the number of events statements digests by response time
limit = 250
# Limit how old the 'last_seen' events statements can be, in seconds
timelimit = 86400
# Maximum length of the normalized statement text
digest_text_limit = 120

[collect_perf_schema_eventsstatementssum]
enabled = false

[collect_perf_schema_eventswaits]
enabled = false

[collect_perf_schema_file_events]
enabled = false

[collect_perf_schema_file_instances]
enabled = false
# RegEx file_name filter for performance_schema.file_summary_by_instance
filter = ".*"
# Remove path prefix in performance_schema.file_summary_by_instance
remove_prefix = "/var/lib/mysql/"

[collect_perf_schema_memory_events]
enabled = false
# Remove instrument prefix in performance_schema.memory_summary_global_by_event_name
remove_prefix = "memory/"

[collect_perf_schema_replication_group_members]
enabled = false

[collect_perf_schema_replication_group_member_stats]
enabled = false

[collect_perf_schema_replication_applier_status_by_worker]
enabled = false

[collect_sys_user_summary]
enabled = false

[collect_engine_tokudb_status]
enabled = false

[collect_engine_innodb_status]
enabled = false

[collect_heartbeat]
enabled = false
# Database from where to collect heartbeat data
database = "heartbeat"
# Table from where to collect heartbeat data
table = "heartbeat"
# Use UTC for timestamps of the current server
utc = true

[collect_slave_hosts]
enabled = false

These collect settings are in fact all of mysqld_exporter's command-line flags; in cprobe they need to become configuration file entries.

Example rule_cust.toml:

[[queries]]
mesurement = "biz_users"
metric_fields = [ "total" ]
label_fields = [ "service" ]
field_to_append = "x"
timeout = "3s"
request = '''
select 'n9e' as service, 'test' as x, count(*) as total from n9e_v6.users
'''

This is cprobe's extension to mysql collection: users can define custom SQL queries to collect.
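The issue does not spell out how a result row turns into metrics, so the following is only one plausible reading, sketched under stated assumptions: each entry in metric_fields becomes one sample, label_fields become labels, and the value of the field_to_append column is inserted into the metric name between the measurement and the field name. The function name row_to_samples and the row value 42 are made up for illustration.

```python
# Parsed form of the [[queries]] entry above (the SQL text is omitted).
QUERY = {
    "mesurement": "biz_users",
    "metric_fields": ["total"],
    "label_fields": ["service"],
    "field_to_append": "x",
}


def row_to_samples(query, row):
    """Turn one SQL result row into (metric_name, labels, value) tuples."""
    labels = {
        name: str(row[name])
        for name in query.get("label_fields", [])
        if name in row
    }
    name_parts = [query["mesurement"]]
    append_field = query.get("field_to_append")
    if append_field and append_field in row:
        # Assumption: the column's value becomes part of the metric name.
        name_parts.append(str(row[append_field]))
    samples = []
    for field in query.get("metric_fields", []):
        metric_name = "_".join(name_parts + [field])
        samples.append((metric_name, labels, float(row[field])))
    return samples
```

Under these assumptions, a row with service 'n9e', x 'test', and total 42 produces a single sample named biz_users_test_total carrying the label service="n9e".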

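rule_head.toml, rule_coll.toml, and rule_cust.toml together make up the final configuration. You can reorganize these three files or split them differently, and different scrape jobs can use different combinations of rule files, which makes this very flexible.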
