diff --git a/modules/monitoring-configuring-external-alertmanagers.adoc b/modules/monitoring-configuring-external-alertmanagers.adoc
new file mode 100644
index 000000000000..0600118bb63b
--- /dev/null
+++ b/modules/monitoring-configuring-external-alertmanagers.adoc
@@ -0,0 +1,163 @@
+// Module included in the following assemblies:
+//
+// * monitoring/configuring-the-monitoring-stack.adoc
+
+[id="monitoring-configuring-external-alertmanagers_{context}"]
+= Configuring external Alertmanager instances
+
+The {product-title} monitoring stack includes a local Alertmanager instance that routes alerts from Prometheus.
+You can add external Alertmanager instances by configuring either the `cluster-monitoring-config` config map in the `openshift-monitoring` project or the `user-workload-monitoring-config` config map in the `openshift-user-workload-monitoring` project.
+
+If you add the same external Alertmanager configuration for multiple clusters and disable the local instance for each cluster, you can then manage alert routing for multiple clusters by using a single external Alertmanager instance.
+
+.Prerequisites
+
+* You have installed the OpenShift CLI (`oc`).
+* *If you are configuring core {product-title} monitoring components in the `openshift-monitoring` project*:
+** You have access to the cluster as a user with the `cluster-admin` role.
+** You have created the `cluster-monitoring-config` config map.
+* *If you are configuring components that monitor user-defined projects*:
+** You have access to the cluster as a user with the `cluster-admin` role, or as a user with the `user-workload-monitoring-config-edit` role in the `openshift-user-workload-monitoring` project.
+** You have created the `user-workload-monitoring-config` config map.
+
+.Procedure
+
+. Edit the `ConfigMap` object.
+** *To configure additional Alertmanagers for routing alerts from core {product-title} projects*:
+.. 
Edit the `cluster-monitoring-config` config map in the `openshift-monitoring` project:
++
+[source,terminal]
+----
+$ oc -n openshift-monitoring edit configmap cluster-monitoring-config
+----
+
+.. Add an `additionalAlertmanagerConfigs:` section under `data/config.yaml/prometheusK8s`.
+
+.. Add the configuration details for additional Alertmanagers in this section:
++
+[source,yaml]
+----
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: cluster-monitoring-config
+  namespace: openshift-monitoring
+data:
+  config.yaml: |
+    prometheusK8s:
+      additionalAlertmanagerConfigs:
+      -
+----
++
+For ``, substitute authentication and other configuration details for additional Alertmanager instances.
+Currently supported authentication methods are bearer token (`bearerToken`) and client TLS (`tlsConfig`).
+The following sample config map configures an additional Alertmanager using a bearer token with client TLS authentication:
++
+[source,yaml]
+----
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: cluster-monitoring-config
+  namespace: openshift-monitoring
+data:
+  config.yaml: |
+    prometheusK8s:
+      additionalAlertmanagerConfigs:
+      - scheme: https
+        pathPrefix: /
+        timeout: "30s"
+        apiVersion: v1
+        bearerToken:
+          name: alertmanager-bearer-token
+          key: token
+        tlsConfig:
+          key:
+            name: alertmanager-tls
+            key: tls.key
+          cert:
+            name: alertmanager-tls
+            key: tls.crt
+          ca:
+            name: alertmanager-tls
+            key: tls.ca
+        staticConfigs:
+        - external-alertmanager1-remote.com
+        - external-alertmanager1-remote2.com
+----
+
+** *To configure additional Alertmanager instances for routing alerts from user-defined projects*:
+
+.. Edit the `user-workload-monitoring-config` config map in the `openshift-user-workload-monitoring` project:
++
+[source,terminal]
+----
+$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config
+----
+
+.. Add a `/additionalAlertmanagerConfigs:` section under `data/config.yaml/`.
+
+.. 
Add the configuration details for additional Alertmanagers in this section:
++
+[source,yaml]
+----
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: user-workload-monitoring-config
+  namespace: openshift-user-workload-monitoring
+data:
+  config.yaml: |
+    :
+      additionalAlertmanagerConfigs:
+      -
+----
++
+For ``, substitute one of two supported external Alertmanager components: `prometheus` or `thanosRuler`.
++
+For ``, substitute authentication and other configuration details for additional Alertmanager instances.
+Currently supported authentication methods are bearer token (`bearerToken`) and client TLS (`tlsConfig`).
+The following sample config map configures an additional Alertmanager using Thanos Ruler with a bearer token and client TLS authentication:
++
+[source,yaml]
+----
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: user-workload-monitoring-config
+  namespace: openshift-user-workload-monitoring
+data:
+  config.yaml: |
+    thanosRuler:
+      additionalAlertmanagerConfigs:
+      - scheme: https
+        pathPrefix: /
+        timeout: "30s"
+        apiVersion: v1
+        bearerToken:
+          name: alertmanager-bearer-token
+          key: token
+        tlsConfig:
+          key:
+            name: alertmanager-tls
+            key: tls.key
+          cert:
+            name: alertmanager-tls
+            key: tls.crt
+          ca:
+            name: alertmanager-tls
+            key: tls.ca
+        staticConfigs:
+        - external-alertmanager1-remote.com
+        - external-alertmanager1-remote2.com
+----
++
+[NOTE]
+====
+Configurations applied to the `user-workload-monitoring-config` `ConfigMap` object are not activated unless a cluster administrator has enabled monitoring for user-defined projects.
+====
+
+. Save the file to apply the changes to the `ConfigMap` object.
+The new configuration is applied automatically.
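+
+.Example secret referenced by `bearerToken`
+
+The sample config maps in this procedure reference secrets by name and key, for example `alertmanager-bearer-token` with the key `token`.
+As a minimal sketch, and assuming that the secret must exist in the same project as the config map that references it, a `Secret` object for the bearer token might look like the following.
+The `<bearer_token>` value is a hypothetical placeholder and is not part of the original procedure. For user-defined projects, the namespace would presumably be `openshift-user-workload-monitoring` instead.
+
+[source,yaml]
+----
+# Illustrative sketch only: the namespace and the token value are assumptions.
+apiVersion: v1
+kind: Secret
+metadata:
+  name: alertmanager-bearer-token  # matches bearerToken.name in the sample config map
+  namespace: openshift-monitoring  # assumed: same project as cluster-monitoring-config
+type: Opaque
+stringData:
+  token: <bearer_token>            # matches bearerToken.key; replace with a real token
+----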
+
+
diff --git a/monitoring/configuring-the-monitoring-stack.adoc b/monitoring/configuring-the-monitoring-stack.adoc
index 5970e14edb22..e195bc1bd433 100644
--- a/monitoring/configuring-the-monitoring-stack.adoc
+++ b/monitoring/configuring-the-monitoring-stack.adoc
@@ -125,6 +125,9 @@ include::modules/monitoring-creating-scrape-sample-alerts.adoc[leveloffset=+2]
 * xref:../monitoring/enabling-monitoring-for-user-defined-projects.adoc#enabling-monitoring-for-user-defined-projects[Enabling monitoring for user-defined projects]
 * See xref:../monitoring/troubleshooting-monitoring-issues.html#determining-why-prometheus-is-consuming-disk-space_troubleshooting-monitoring-issues[Determining why Prometheus is consuming a lot of disk space] for steps to query which metrics have the highest number of scrape samples
 
+//Configuring external alertmanagers
+include::modules/monitoring-configuring-external-alertmanagers.adoc[leveloffset=+1]
+
 //Attaching additional labels to your time series and alerts
 include::modules/monitoring-attaching-additional-labels-to-your-time-series-and-alerts.adoc[leveloffset=+1]