move to non-deprecated sql exporter #110

mrjones-plip · 2024-05-02T23:12:10Z

Following the recent work done in medic/cht-watchdog#81, this PR

updates the compose file to replace the deprecated SQL exporter (prometheuscommunity/postgres-exporter) with the supported one (burningalchemist/sql_exporter)
converts queries from the old YAML format to the new one
updates scrape files in prometheus
updates JSON dashboard files in grafana

per #112

mrjones-plip · 2024-05-02T23:55:32Z

test steps

in watchdog repo, check out branch mrjones-sql-exporter-data-ingest-dont-merge
in watchdog, make sure you have a symlink data-ingest -> ../cht-app-monitoring-data-ingest/watchdog-config
in data ingest, check out branch mrjones-migrate-sql-exporter
in data ingest, copy watchdog-config/sql_servers_example.yml to watchdog-config/sql_servers.yml
in data ingest, edit sql_servers.yml to have a valid postgres username, password and IP based on your local RDBMS tunnel
in data ingest, edit scrape.yml so that scrape_interval: 10s and scrape_timeout: 5s - don't commit these values though!
in watchdog, run ./development/kill.start.ips.sh

demo video of test steps

data-ingest-exporter-demo.webm

demo mapping new metrics to old metrics

taking the dwh_replication_by_status metric, we can open the panel in the "Edit" view and compare it to the live metric on Watchdog. We can get metric parity by:

selecting the new metric: replication_by_status_replication_failure_count -> dwh_replication_by_status
adding another label filter: type = replication_failure_count

old: last_over_time(replication_by_status_replication_failure_count{cht_instance="$cht_instance"}[$__interval])
new: last_over_time(dwh_replication_by_status{cht_instance="$cht_instance", type="replication_failure_count"}[$__interval])
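The same old-to-new mapping could be applied mechanically when updating dashboard expressions. The following is a minimal sketch, not part of this PR: the `rewrite_selector` helper and the `OLD_PREFIX_TO_NEW_METRIC` table are hypothetical names, and it assumes every old metric follows the `<prefix>_<type>{labels}` shape seen in the example above. Only the `replication_by_status` pair is taken from this PR.

```python
import re

# Hypothetical lookup table: old metric-name prefix -> new metric name.
# Only this one pair comes from the PR description; extend it as needed.
OLD_PREFIX_TO_NEW_METRIC = {
    "replication_by_status": "dwh_replication_by_status",
}


def rewrite_selector(expr: str) -> str:
    """Rewrite `<prefix>_<type>{labels}` into `<new_metric>{labels, type="<type>"}`."""
    for prefix, new_metric in OLD_PREFIX_TO_NEW_METRIC.items():
        # The lookbehind keeps us from matching inside a longer metric name,
        # including the already-rewritten `dwh_...` form.
        pattern = re.compile(r"(?<![\w:])" + re.escape(prefix) + r"_(\w+)\{([^}]*)\}")

        def repl(match, new_metric=new_metric):
            type_value = match.group(1)
            labels = match.group(2).strip()
            type_label = f'type="{type_value}"'
            joined = f"{labels}, {type_label}" if labels else type_label
            return f"{new_metric}{{{joined}}}"

        expr = pattern.sub(repl, expr)
    return expr


if __name__ == "__main__":
    old = 'last_over_time(replication_by_status_replication_failure_count{cht_instance="$cht_instance"}[$__interval])'
    print(rewrite_selector(old))
```

Because the pattern requires a `_<type>` suffix before the opening brace, running the helper on an expression that already uses the new metric leaves it unchanged.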

mrjones-plip added 4 commits May 2, 2024 16:11

move to non-deprecated sql exporter

7c183d7

fix name of exporter service in scrape config

f17ef76

add another example metric

97524f2

update example file to be more accurate

fcb4cc0

mrjones-plip mentioned this pull request May 3, 2024

Add monitoring via prometheus and cht-watchdog medic/cht-user-management#68

Open

mrjones-plip added 4 commits May 3, 2024 09:38

fix port/db name in example sql config file

e17f4e3

add comment in scrape config about devault vs dev values

425c418

add replication_by_status metric

e993c91

remove commented out code in compose

ddfeaca
