Expose config-reloader prometheus metrics #147

Merged 1 commit on Oct 29, 2020

tommasopozzetti
Contributor

Currently, when config-reloader fails to validate the fluentd configurations
for a given namespace, an error is logged and is attached to the namespace
via an annotation. This system, however, requires users to actively look
at namespace annotations or logs to ensure no errors are present whenever
a change in a fluentd configuration is deployed.
Users who already run Prometheus in their Kubernetes clusters are most
likely measuring application stability via metrics and routing alerts to
a central location, where they are more visible.

This commit introduces the capability for config-reloader to expose its
own Prometheus metrics, which provide a simple boolean metric per
namespace exposing whether the fluentd configs in that namespace passed
validation or not.
These simple metrics allow users to define Prometheus rules that alert
when the fluentd configurations in a namespace fail validation.
Such a rule can look something like:

alert: FluentdConfigValidationFailure
expr: kube_fluentd_operator_namespace_config_status{} == 0

This commit also adds a flag for configuring the port on which
config-reloader listens to expose these metrics.
Furthermore, it adds the necessary resources to the Helm chart so that,
if the `prometheusEnabled` value is set to `true`, the corresponding
Services and, if enabled, ServiceMonitors are also created.

Signed-off-by: Tommaso Pozzetti <tommypozzetti@hotmail.it>
@OrlinVasilev
Contributor

OrlinVasilev commented Oct 16, 2020

@tommasopozzetti great contribution! We will review and test it in our labs, and we will cut a build as soon as we can. Thanks for all the contributions; your effort on this project is appreciated!

Member

@viveksyngh viveksyngh left a comment

LGTM

@viveksyngh viveksyngh merged commit fa7d2fa into vmware:master Oct 29, 2020
@viveksyngh
Member

@tommasopozzetti Thanks for the contribution!

3 participants