Best practice to configure alarming in cluster #1233
Hello, I'm looking for the best practice for monitoring and handling alerts in a cluster. I found two ways to set up alerting:
I want to know which way of configuring alerting is better, and what the pros and cons of each configuration are. I would be very grateful for any advice or suggestions.
Replies: 1 comment
Hello, OKD Logging is handled by the cluster-logging-operator, which effectively sets up a fluentd deployment that can be used to forward logs to an Elasticsearch instance (either in-cluster with the OpenShift Elasticsearch operator, or external). OKD Monitoring is built into every OKD cluster (and cannot be disabled). Here's a simplified example of what our config looks like:
kind: Secret
apiVersion: v1
metadata:
  name: alertmanager-main
  namespace: openshift-monitoring
type: Opaque
stringData:
  alertmanager.yaml: |-
    # The root route on which each incoming alert enters.
    # The root route with all parameters, which are inherited by the child
    # routes if they are not overwritten.
    route:
      group_by: ['cluster', 'alertname']
      # By default, send all alerts to this receiver
      receiver: 'email_admins'
      # Overwrite the receiver for specific alerts
      routes:
        # The Watchdog alert is intended to fire at all times. It makes sure that the entire alerting pipeline actually works.
        # Basically, it verifies that Prometheus can reach Alertmanager, and Alertmanager can reach the notification provider.
        # For this, we redirect this alert to a black hole email address, in order to not receive dozens of notifications.
        # Because it validates the entire alerting pipeline, a receiver needs to be configured and we cannot use the 'null' receiver.
        - matchers:
            - alertname="Watchdog"
          receiver: 'email_nowhere'
        # Disable some built-in alerts
        - matchers:
            - alertname="UpdateAvailable"
          receiver: 'null'
        - matchers:
            - alertname="CannotRetrieveUpdates"
          receiver: 'null'
        - matchers:
            - alertname="MultipleContainersOOMKilled"
          receiver: 'null'
    receivers:
      - name: 'null'
      - name: 'email_nowhere'
        email_configs:
          - to: nowhere@example.com
            from: noreply@example.com
            smarthost: example.com:25
      - name: 'email_admins'
        email_configs:
          - to: admins@example.com
            from: noreply@example.com
            smarthost: example.com:25
            headers:
              Subject: '[OKD4]{{ template "email.default.subject" . }}'

I hope this helps.
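To make the routing in the config above easier to reason about, here is a rough Python sketch of how the tree resolves an alert to a receiver. It is a simplified model and not Alertmanager's actual code: it assumes equality matchers only, and it ignores `continue`, grouping, and inhibition. The `Route` class and `pick_receiver` function are hypothetical names made up for this illustration, and `KubePodCrashLooping` is just an example of an alert that no child route matches.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    """One node of a simplified Alertmanager routing tree (illustration only)."""
    receiver: str
    matchers: dict[str, str] = field(default_factory=dict)  # label -> required value
    routes: list["Route"] = field(default_factory=list)

    def matches(self, labels: dict[str, str]) -> bool:
        # All equality matchers must hold; a node without matchers matches everything.
        return all(labels.get(key) == value for key, value in self.matchers.items())


def pick_receiver(route: Route, labels: dict[str, str]) -> str:
    """Descend into the first matching child; fall back to this node's receiver."""
    for child in route.routes:
        if child.matches(labels):
            return pick_receiver(child, labels)
    return route.receiver


# Mirror of the routing tree from the Secret above.
root = Route(
    receiver="email_admins",
    routes=[
        Route("email_nowhere", {"alertname": "Watchdog"}),
        Route("null", {"alertname": "UpdateAvailable"}),
        Route("null", {"alertname": "CannotRetrieveUpdates"}),
        Route("null", {"alertname": "MultipleContainersOOMKilled"}),
    ],
)

for name in ("Watchdog", "UpdateAvailable", "KubePodCrashLooping"):
    print(name, "->", pick_receiver(root, {"alertname": name}))
```

In this model, Watchdog goes to the black hole address, the three silenced alerts go to the `null` receiver, and everything else falls through to `email_admins`, which is the behaviour the comments in the config describe.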
Hello,
First of all, I think you should distinguish between logging and metrics.
OKD Logging is handled by the cluster-logging-operator, which effectively sets up a fluentd deployment that can be used to forward logs to an Elasticsearch instance (either in-cluster with the OpenShift Elasticsearch operator, or external).
This component does not provide any alerting out of the box.
It is also not installed by default on an OKD cluster (you need to install the operators separately).
OKD Monitoring is built into every OKD cluster (and cannot be disabled).
The cluster-monitoring-operator effectively deploys a customized version of the kube-prometheus operator: Prometheus along with va…