Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
163 changes: 163 additions & 0 deletions modules/monitoring-configuring-external-alertmanagers.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,163 @@
// Module included in the following assemblies:
//
// * monitoring/configuring-the-monitoring-stack.adoc

[id="monitoring-configuring-external-alertmanagers_{context}"]
= Configuring external alertmanager instances

The {product-title} monitoring stack includes a local Alertmanager instance that routes alerts from Prometheus.
You can add external Alertmanager instances by configuring the `cluster-monitoring-config` config map in either the `openshift-monitoring` project or the `user-workload-monitoring-config` project.

If you add the same external Alertmanager configuration for multiple clusters and disable the local instance for each cluster, you can then manage alert routing for multiple clusters by using a single external Alertmanager instance.

.Prerequisites

* You have installed the OpenShift CLI (`oc`).
* *If you are configuring core {product-title} monitoring components in the `openshift-monitoring` project*:
** You have access to the cluster as a user with the `cluster-admin` role.
** You have created the `cluster-monitoring-config` config map.
* *If you are configuring components that monitor user-defined projects*:
** You have access to the cluster as a user with the `cluster-admin` role, or as a user with the `user-workload-monitoring-config-edit` role in the `openshift-user-workload-monitoring` project.
** You have created the `user-workload-monitoring-config` config map.

.Procedure

. Edit the `ConfigMap` object.
** *To configure additional Alertmanagers for routing alerts from core {product-title} projects*:
.. Edit the `cluster-monitoring-config` config map in the `openshift-monitoring` project:
+
[source,terminal]
----
$ oc -n openshift-monitoring edit configmap cluster-monitoring-config
----

.. Add an `additionalAlertmanagerConfigs:` section under `data/config.yaml/prometheusK8s`.

.. Add the configuration details for additional Alertmanagers in this section:
+
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-monitoring-config
namespace: openshift-monitoring
data:
config.yaml: |
prometheusK8s:
additionalAlertmanagerConfigs:
- <alertmanager_specification>
----
+
For `<alertmanager_specification>`, substitute authentication and other configuration details for additional Alertmanager instances.
Currently supported authentication methods are bearer token (`bearerToken`) and client TLS (`tlsConfig`).
The following sample config map configures an additional Alertmanager using a bearer token with client TLS authentication:
+
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-monitoring-config
namespace: openshift-monitoring
data:
config.yaml: |
prometheusK8s:
additionalAlertmanagerConfigs:
- scheme: https
pathPrefix: /
timeout: "30s"
apiVersion: v1
bearerToken:
name: alertmanager-bearer-token
key: token
tlsConfig:
key:
name: alertmanager-tls
key: tls.key
cert:
name: alertmanager-tls
key: tls.crt
ca:
name: alertmanager-tls
key: tls.ca
staticConfigs:
- external-alertmanager1-remote.com
- external-alertmanager1-remote2.com
----

** *To configure additional Alertmanager instances for routing alerts from user-defined projects*:

.. Edit the `user-workload-monitoring-config` config map in the `openshift-user-workload-monitoring` project:
+
[source,terminal]
----
$ oc -n openshift-user-workload-monitoring edit configmap user-workload-monitoring-config
----

.. Add a `<component>/additionalAlertmanagerConfigs:` section under `data/config.yaml/`.

.. Add the configuration details for additional Alertmanagers in this section:
+
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
name: user-workload-monitoring-config
namespace: openshift-user-workload-monitoring
data:
config.yaml: |
<component>:
additionalAlertmanagerConfigs:
- <alertmanager_specification>
----
+
For `<component>`, substitute one of two supported external Alertmanager components: `prometheus` or `thanosRuler`.
+
For `<alertmanager_specification>`, substitute authentication and other configuration details for additional Alertmanager instances.
Currently supported authentication methods are bearer token (`bearerToken`) and client TLS (`tlsConfig`).
The following sample config map configures an additional Alertmanager using Thanos Ruler with a bearer token and client TLS authentication:
+
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
name: user-workload-monitoring-config
namespace: openshift-user-workload-monitoring
data:
config.yaml: |
thanosRuler:
additionalAlertmanagerConfigs:
- scheme: https
pathPrefix: /
timeout: "30s"
apiVersion: v1
bearerToken:
name: alertmanager-bearer-token
key: token
tlsConfig:
key:
name: alertmanager-tls
key: tls.key
cert:
name: alertmanager-tls
key: tls.crt
ca:
name: alertmanager-tls
key: tls.ca
staticConfigs:
- external-alertmanager1-remote.com
- external-alertmanager1-remote2.com
----
+
[NOTE]
====
Configurations applied to the `user-workload-monitoring-config` `ConfigMap` object are not activated unless a cluster administrator has enabled monitoring for user-defined projects.
====

. Save the file to apply the changes to the `ConfigMap` object.
The new component placement configuration is applied automatically.


3 changes: 3 additions & 0 deletions monitoring/configuring-the-monitoring-stack.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -125,6 +125,9 @@ include::modules/monitoring-creating-scrape-sample-alerts.adoc[leveloffset=+2]
* xref:../monitoring/enabling-monitoring-for-user-defined-projects.adoc#enabling-monitoring-for-user-defined-projects[Enabling monitoring for user-defined projects]
* See xref:../monitoring/troubleshooting-monitoring-issues.html#determining-why-prometheus-is-consuming-disk-space_troubleshooting-monitoring-issues[Determining why Prometheus is consuming a lot of disk space] for steps to query which metrics have the highest number of scrape samples

//Configuring external alertmanagers
include::modules/monitoring-configuring-external-alertmanagers.adoc[leveloffset=1]

//Attaching additional labels to your time series and alerts
include::modules/monitoring-attaching-additional-labels-to-your-time-series-and-alerts.adoc[leveloffset=+1]

Expand Down