Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

linkerd-viz helm: add support for metric_relabel_configs #12248

Merged
merged 2 commits into from Apr 11, 2024

Conversation

hvoigt
Copy link
Contributor

@hvoigt hvoigt commented Mar 12, 2024

It is currently not possible to easily drop metrics with the Prometheus configuration that is included with the linkerd-viz helm chart. To enable this we extend the Prometheus configuration template to add a metric_relabel_config section. This configuration is usually used when dropping metrics and thus allows users to customize their recorded metrics to avoid high cardinality.

E.g. there is a blog post that describes the problem in depth: https://itnext.io/optimizing-linkerd-metrics-in-prometheus-de607ec10f6b

With this change we are able to deploy the helm chart without the need to do custom modifications to the templates.

Fixes #12194

@hvoigt hvoigt requested a review from a team as a code owner March 12, 2024 13:19
@adleong
Copy link
Member

adleong commented Mar 12, 2024

Thanks for the PR, @hvoigt, and thanks for sharing that blog post. That's really cool. I would hesitate a bit to make the metric relabeling configurable like this because the linkerd-viz CLI and dashboard rely on certain labels and the exact set of required labels isn't exactly documented. Linkerd-viz is intended to be something that "just works" and we've been encouraging users who need to tune their prometheus to use an external promethus: https://linkerd.io/2.15/tasks/external-prometheus/#prometheus-scrape-configuration

@hvoigt
Copy link
Contributor Author

hvoigt commented Mar 13, 2024

Thanks for the PR, @hvoigt, and thanks for sharing that blog post. That's really cool. I would hesitate a bit to make the metric relabeling configurable like this because the linkerd-viz CLI and dashboard rely on certain labels and the exact set of required labels isn't exactly documented. Linkerd-viz is intended to be something that "just works" and we've been encouraging users who need to tune their prometheus to use an external promethus: https://linkerd.io/2.15/tasks/external-prometheus/#prometheus-scrape-configuration

Hey @adleong thanks for taking a look at my PR. The thing is we actually have our own separate Prometheus. We intentionally do not directly scrape linkerd metrics from there because we had problems with cardinality in the past. As there are some potentially unbounded label values we kept the linkerd Prometheus and federate the metrics we need to our external Prometheus. That way if we experience problems with cardinality not all metrics are affected. I do not see harm in adding a configuration option with a default that "just works". We could add a warning in a comment that you need to "know what you are doing". If you bring your own Prometheus you would have the same problem that the labels are not documented, right?

@adleong
Copy link
Member

adleong commented Mar 14, 2024

@hvoigt This sounds good. Would you mind adding that warning to the comment? Note that you'll also need to run bin/helm-docs to regenerate the helm docs with your changes.

It is currently not possible to easily drop metrics with the Prometheus
configuration that is included with the linkerd-viz helm chart. To
enable this we extend the Prometheus configuration template to add a
metric_relabel_config section. This configuration is usually used when
dropping metrics and thus allows users to customize their recorded
metrics to avoid high cardinality.

E.g. there is a blog post that describes the problem in depth:
https://itnext.io/optimizing-linkerd-metrics-in-prometheus-de607ec10f6b

With this change we are able to deploy the helm chart without the need
to do custom modifications to the templates.

Fixes linkerd#12194

Signed-off-by: Heiko Voigt <heiko.voigt@jimdo.com>
@hvoigt
Copy link
Contributor Author

hvoigt commented Mar 19, 2024

@hvoigt This sounds good. Would you mind adding that warning to the comment? Note that you'll also need to run bin/helm-docs to regenerate the helm docs with your changes.

@adleong Great! Here is the updated PR. helm-docs also updated.

@adleong adleong merged commit f283fc3 into linkerd:main Apr 11, 2024
35 checks passed
the-wondersmith pushed a commit to the-wondersmith/linkerd2 that referenced this pull request Apr 24, 2024
It is currently not possible to easily drop metrics with the Prometheus
configuration that is included with the linkerd-viz helm chart. To
enable this we extend the Prometheus configuration template to add a
metric_relabel_config section. This configuration is usually used when
dropping metrics and thus allows users to customize their recorded
metrics to avoid high cardinality.

E.g. there is a blog post that describes the problem in depth:
https://itnext.io/optimizing-linkerd-metrics-in-prometheus-de607ec10f6b

With this change we are able to deploy the helm chart without the need
to do custom modifications to the templates.

Fixes linkerd#12194

Signed-off-by: Heiko Voigt <heiko.voigt@jimdo.com>
Signed-off-by: Mark S <the@wondersmith.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow custom metric_relabel_config for Prometheus
2 participants