Skip to content
This repository has been archived by the owner on Oct 5, 2022. It is now read-only.

Labeled metrics vs. metricnames #13

Open
pimvanpelt opened this issue Jul 14, 2017 · 1 comment
Open

Labeled metrics vs. metricnames #13

pimvanpelt opened this issue Jul 14, 2017 · 1 comment

Comments

@pimvanpelt
Copy link

Thank you for your work on this little gem. I installed the prometheus nagios exporter and have a specific question about your metric naming scheme: Rather than encoding the nagios command in the metric name, have you considered having it be a label instead?

nagios_command_exec_time{hostname="absynth.ipng.nl", command="paphosting_check_ssh", service="sshd:22"} 0.093268
nagios_command_latency{hostname="absynth.ipng.nl", command="paphosting_check_ssh", service="sshd:22"} 0.133
nagios_command_state{hostname="absynth.ipng.nl", command="paphosting_check_ssh", service="sshd:22"} 0
nagios_command_flapping{hostname="absynth.ipng.nl", command="paphosting_check_ssh", service="sshd:22"} 0
nagios_command_acknowledged{hostname="absynth.ipng.nl", command="paphosting_check_ssh", service="sshd:22"} 0
nagios_command_perf_data_value{hostname="absynth.ipng.nl", command="paphosting_check_ssh", service="sshd:22", key="time"} 0.084531

The reason why this is useful is because of aggregation on the prometheus side -- with the command as a label, the only rulesets we'd need asre ones for 'exec_time', 'latency', 'state', 'flapping', 'acknowledged' and possibly 'perf_data_value'. When new commands are added, no prometheus changes are necessary.

If you keep the command in the metric name, each new command requires rulesets on Prometheus (or its graphing consoles, or Grafana). Would you be willing to accept a PR that makes this configurable behind a boolean flag, say --command_labels, which defaults to false (in which case there is no difference), but flipping the flag to true outputs metrics as per above.

pimvanpelt added a commit to pimvanpelt/prometheus-nagios-exporter that referenced this issue Jul 14, 2017
labels with the nagios command, or if the metrics are named with the
nagios command.

--command_labels=false (the default)
nagios_paphosting_check_ssh_exec_time{hostname="absynth.ipng.nl", service="sshd:22"} 0.093268

--command_labels=true
nagios_command_exec_time{hostname="absynth.ipng.nl", command="paphosting_check_ssh", service="sshd:22"} 0.093268

See discussion in:
m-lab#13
@stephen-soltesz
Copy link
Contributor

@pimvanpelt Hey, sorry for the delay -- I really like this idea. This had not occurred to me.

From my experience so far working with Prometheus, my rule of thumb has become: a single query, rule, or alert in prometheus is worth between 10-100 lines of code in an exporter. So, if your proposal makes defining recording rules easier (or no worse if one wants to be specific), then that sounds like the right trade off.

I would welcome a PR. If you prefer to merge to master, then please ensure that coverage remains 100%; this may require adding new tests. If you don't have time for that, then please target the PR to the dev branch, and we'll sort it out later.

Thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants