diff --git a/modules/monitoring-configurable-monitoring-components.adoc b/modules/monitoring-configurable-monitoring-components.adoc index 173a0ad2e257..0d4a95d7f1fa 100644 --- a/modules/monitoring-configurable-monitoring-components.adoc +++ b/modules/monitoring-configurable-monitoring-components.adoc @@ -2,53 +2,89 @@ // // * observability/monitoring/configuring-the-monitoring-stack.adoc -[id="configurable-monitoring-components_{context}"] -= Configurable monitoring components +:_mod-docs-content-type: REFERENCE -This table shows the monitoring components you can configure and the keys used to specify the components in the -ifndef::openshift-dedicated,openshift-rosa[] -`cluster-monitoring-config` and -endif::openshift-dedicated,openshift-rosa[] -`user-workload-monitoring-config` `ConfigMap` objects. +// The ultimate solution DOES NOT NEED separate IDs, it is just needed for now so that the tests will not break + +// tag::CPM[] +[id="configurable-monitoring-components-cpm_{context}"] += Configurable monitoring components for core platform monitoring +// end::CPM[] + +// tag::UWM[] +[id="configurable-monitoring-components-uwm_{context}"] += Configurable monitoring components for monitoring for user-defined projects +// end::UWM[] + +// Set attributes to distinguish between cluster monitoring example (core platform monitoring - CPM) and user workload monitoring (UWM) examples. +// tag::CPM[] +:configmap-name: cluster-monitoring-config +:alertmanager: alertmanagerMain +:prometheus: prometheusK8s +:thanosname: Thanos Querier +:thanos: thanosQuerier +// end::CPM[] +// tag::UWM[] +:configmap-name: user-workload-monitoring-config +:alertmanager: alertmanager +:prometheus: prometheus +:thanosname: Thanos Ruler +:thanos: thanosRuler +// end::UWM[] + +This table shows the monitoring components you can configure and the keys used to specify the components in the `{configmap-name}` config map. 
+// tag::UWM[] ifdef::openshift-dedicated,openshift-rosa[] [WARNING] ==== -Do not modify the monitoring components in the `cluster-monitoring-config` `ConfigMap` object. Red Hat Site Reliability Engineers (SRE) use these components to monitor the core cluster components and Kubernetes services. +Do not modify the monitoring components in the `cluster-monitoring-config` `ConfigMap` object. Red{nbsp}Hat Site Reliability Engineers (SRE) use these components to monitor the core cluster components and Kubernetes services. ==== endif::openshift-dedicated,openshift-rosa[] +// end::UWM[] -ifndef::openshift-dedicated,openshift-rosa[] -.Configurable monitoring components +// tag::CPM[] +.Configurable core platform monitoring components +// end::CPM[] +// tag::UWM[] +.Configurable monitoring components for user-defined projects +// end::UWM[] [options="header"] |==== -|Component |cluster-monitoring-config config map key |user-workload-monitoring-config config map key -|Prometheus Operator |`prometheusOperator` |`prometheusOperator` -|Prometheus |`prometheusK8s` |`prometheus` -|Alertmanager |`alertmanagerMain` | `alertmanager` -|kube-state-metrics |`kubeStateMetrics` | -|monitoring-plugin | `monitoringPlugin` | -|openshift-state-metrics |`openshiftStateMetrics` | -|Telemeter Client |`telemeterClient` | -|Metrics Server |`metricsServer` | -|Thanos Querier |`thanosQuerier` | -|Thanos Ruler | |`thanosRuler` +|Component |{configmap-name} config map key +|Prometheus Operator |`prometheusOperator` +|Prometheus |`{prometheus}` +|Alertmanager |`{alertmanager}` +|{thanosname} | `{thanos}` +// tag::CPM[] +|kube-state-metrics |`kubeStateMetrics` +|monitoring-plugin | `monitoringPlugin` +|openshift-state-metrics |`openshiftStateMetrics` +|Telemeter Client |`telemeterClient` +|Metrics Server |`metricsServer` +// end::CPM[] |==== -[NOTE] +[WARNING] ==== -The Prometheus key is called `prometheusK8s` in the `cluster-monitoring-config` `ConfigMap` object and `prometheus` in the 
`user-workload-monitoring-config` `ConfigMap` object. +Different configuration changes to the `ConfigMap` object result in different outcomes: + +* The pods are not redeployed. Therefore, there is no service outage. + +* The affected pods are redeployed: + +** For single-node clusters, this results in a temporary service outage. + +** For multi-node clusters, because of high availability, the affected pods are gradually rolled out and the monitoring stack remains available. + +** Configuring and resizing a persistent volume always results in a service outage, regardless of high availability. + +Each procedure that requires a change in the config map includes its expected outcome. ==== -endif::openshift-dedicated,openshift-rosa[] -ifdef::openshift-dedicated,openshift-rosa[] -.Configurable monitoring components -[options="header"] -|=== -|Component |user-workload-monitoring-config config map key -|Alertmanager |`alertmanager` -|Prometheus Operator |`prometheusOperator` -|Prometheus |`prometheus` -|Thanos Ruler |`thanosRuler` -|=== -endif::openshift-dedicated,openshift-rosa[] +// Unset the source code block attributes just to be safe. +:!configmap-name: +:!alertmanager: +:!prometheus: +:!thanosname: +:!thanos: diff --git a/modules/monitoring-configuring-metrics-collection-profiles.adoc b/modules/monitoring-configuring-metrics-collection-profiles.adoc index 63a30051f1c2..c210db5b871b 100644 --- a/modules/monitoring-configuring-metrics-collection-profiles.adoc +++ b/modules/monitoring-configuring-metrics-collection-profiles.adoc @@ -6,15 +6,8 @@ [id="configuring-metrics-collection-profiles_{context}"] = Configuring metrics collection profiles -[IMPORTANT] -==== -[subs="attributes+"] -Using a metrics collection profile is a Technology Preview feature only. Technology Preview features are not supported with Red Hat production service level agreements (SLAs) and might not be functionally complete. -Red Hat does not recommend using them in production. 
-These features provide early access to upcoming product features, enabling customers to test functionality and provide feedback during the development process. - -For more information about the support scope of Red Hat Technology Preview features, see link:https://access.redhat.com/support/offerings/techpreview[https://access.redhat.com/support/offerings/techpreview]. -==== +:FeatureName: Metrics collection profile +include::snippets/technology-preview.adoc[] By default, Prometheus collects metrics exposed by all default metrics targets in {product-title} components. However, you might want Prometheus to collect fewer metrics from a cluster in certain scenarios: diff --git a/modules/monitoring-granting-users-permission-to-monitor-user-defined-projects.adoc b/modules/monitoring-granting-users-permission-to-monitor-user-defined-projects.adoc index ec9f13cdec3c..674cc2ac979e 100644 --- a/modules/monitoring-granting-users-permission-to-monitor-user-defined-projects.adoc +++ b/modules/monitoring-granting-users-permission-to-monitor-user-defined-projects.adoc @@ -4,7 +4,7 @@ :_mod-docs-content-type: CONCEPT [id="granting-users-permission-to-monitor-user-defined-projects_{context}"] -= Granting users permission to monitor user-defined projects += Granting users permissions for monitoring for user-defined projects As a cluster administrator, you can monitor all core {product-title} and user-defined projects. diff --git a/modules/monitoring-moving-monitoring-components-to-different-nodes.adoc b/modules/monitoring-moving-monitoring-components-to-different-nodes.adoc index d5609e59065b..c0844c4b22ba 100644 --- a/modules/monitoring-moving-monitoring-components-to-different-nodes.adoc +++ b/modules/monitoring-moving-monitoring-components-to-different-nodes.adoc @@ -111,3 +111,7 @@ If monitoring components remain in a `Pending` state after configuring the `node ==== . Save the file to apply the changes. 
The components specified in the new configuration are automatically moved to the new nodes, and the pods affected by the new configuration are redeployed. + +// Unset the source code block attributes just to be safe. +:!configmap-name: +:!namespace-name: diff --git a/observability/monitoring/configuring-core-platform-monitoring/before-you-begin.adoc b/observability/monitoring/configuring-core-platform-monitoring/before-you-begin.adoc index 640c5042982f..99b3809c9a28 100644 --- a/observability/monitoring/configuring-core-platform-monitoring/before-you-begin.adoc +++ b/observability/monitoring/configuring-core-platform-monitoring/before-you-begin.adoc @@ -6,7 +6,36 @@ include::_attributes/common-attributes.adoc[] toc::[] -TBD +The {product-title} installation program provides only a small number of configuration options before installation. Configuring most {product-title} framework components, including the cluster monitoring stack, happens after the installation. +This section explains which monitoring components can be configured and how to prepare to configure the monitoring stack. +[IMPORTANT] +==== +* Not all configuration parameters for the monitoring stack are exposed. +Only the parameters and fields listed in the xref:../../../observability/monitoring/config-map-reference-for-the-cluster-monitoring-operator.adoc#cluster-monitoring-operator-configuration-reference[Config map reference for the {cmo-full}] are supported for configuration. + +* The monitoring stack imposes additional resource requirements. Consult the computing resources recommendations in xref:../../../scalability_and_performance/recommended-performance-scale-practices/recommended-infrastructure-practices.adoc#scaling-cluster-monitoring-operator[Scaling the {cmo-full}] and verify that you have sufficient resources. 
+==== + +// Configurable monitoring components +include::modules/monitoring-configurable-monitoring-components.adoc[leveloffset=+1,tags=**;CPM;!UWM] + +// Preparing to configure the monitoring stack +[id="preparing-to-configure-the-monitoring-stack_{context}"] +== Preparing to configure the monitoring stack + +You can configure the core platform monitoring by creating and updating the `cluster-monitoring-config` config map. This config map configures the {cmo-first}, which in turn configures the components of the default monitoring stack. + +include::modules/monitoring-creating-cluster-monitoring-configmap.adoc[leveloffset=+2] + +// Granting users permissions for core platform monitoring +include::modules/monitoring-granting-users-permissions-for-core-platform-monitoring.adoc[leveloffset=+1] + +[role="_additional-resources"] +.Additional resources +* TBD + +include::modules/monitoring-granting-user-permissions-using-the-web-console.adoc[leveloffset=+2] +include::modules/monitoring-granting-user-permissions-using-the-cli.adoc[leveloffset=+2] diff --git a/observability/monitoring/configuring-core-platform-monitoring/configuring-alerts-and-notifications.adoc b/observability/monitoring/configuring-core-platform-monitoring/configuring-alerts-and-notifications.adoc index 6e7b89561dce..0ff636af431a 100644 --- a/observability/monitoring/configuring-core-platform-monitoring/configuring-alerts-and-notifications.adoc +++ b/observability/monitoring/configuring-core-platform-monitoring/configuring-alerts-and-notifications.adoc @@ -6,7 +6,28 @@ include::_attributes/common-attributes.adoc[] toc::[] -TBD +You can configure a local or external Alertmanager instance to route alerts from Prometheus to endpoint receivers. You can also attach custom labels to all time series and alerts to add useful metadata information. 
+//Configuring external Alertmanager instances +include::modules/monitoring-configuring-external-alertmanagers.adoc[leveloffset=1,tags=**;CPM;!UWM] +//Configuring secrets for Alertmanager +include::modules/monitoring-configuring-secrets-for-alertmanager.adoc[leveloffset=1] +include::modules/monitoring-adding-a-secret-to-the-alertmanager-configuration.adoc[leveloffset=2,tags=**;CPM;!UWM] + +//Attaching additional labels to your time series and alerts +include::modules/monitoring-attaching-additional-labels-to-your-time-series-and-alerts.adoc[leveloffset=+1,tags=**;CPM;!UWM] + +[role="_additional-resources"] +.Additional resources + +* TBD + +// Disabling the local Alertmanager +include::modules/monitoring-disabling-the-local-alertmanager.adoc[leveloffset=+1] + +[role="_additional-resources"] +.Additional resources + +* TBD \ No newline at end of file diff --git a/observability/monitoring/configuring-core-platform-monitoring/configuring-metrics.adoc b/observability/monitoring/configuring-core-platform-monitoring/configuring-metrics.adoc index a8fea0bdaa80..a3ecf3ca723c 100644 --- a/observability/monitoring/configuring-core-platform-monitoring/configuring-metrics.adoc +++ b/observability/monitoring/configuring-core-platform-monitoring/configuring-metrics.adoc @@ -6,7 +6,30 @@ include::_attributes/common-attributes.adoc[] toc::[] -TBD +Configure the collection of metrics to monitor how cluster components and your own workloads are performing. +You can send ingested metrics to remote systems for long-term storage and add cluster ID labels to the metrics to identify the data coming from different clusters. 
+// Configuring remote write storage +include::modules/monitoring-configuring-remote-write-storage.adoc[leveloffset=+1,tags=**;CPM;!UWM] +include::modules/monitoring-supported-remote-write-authentication-settings.adoc[leveloffset=+2] + +include::modules/monitoring-example-remote-write-authentication-settings.adoc[leveloffset=+2,tags=**;CPM;!UWM] + +include::modules/monitoring-example-remote-write-queue-configuration.adoc[leveloffset=+2,tags=**;CPM;!UWM] + +[role="_additional-resources"] +.Additional resources + +* TBD + +// Adding cluster ID labels to metrics +include::modules/monitoring-adding-cluster-id-labels-to-metrics.adoc[leveloffset=+1] + +include::modules/monitoring-creating-cluster-id-labels-for-metrics.adoc[leveloffset=+2,tags=**;CPM;!UWM] + +[role="_additional-resources"] +.Additional resources + +* TBD \ No newline at end of file diff --git a/observability/monitoring/configuring-core-platform-monitoring/configuring-performance-and-scalability.adoc b/observability/monitoring/configuring-core-platform-monitoring/configuring-performance-and-scalability.adoc index 23f5ec77fb52..46a27ea5f992 100644 --- a/observability/monitoring/configuring-core-platform-monitoring/configuring-performance-and-scalability.adoc +++ b/observability/monitoring/configuring-core-platform-monitoring/configuring-performance-and-scalability.adoc @@ -6,7 +6,67 @@ include::_attributes/common-attributes.adoc[] toc::[] -TBD +You can configure the monitoring stack to optimize the performance and scale of your clusters. The following documentation provides information about how to distribute the monitoring components and control the impact of the monitoring stack on CPU and memory resources. 
+// Using node selectors to move monitoring components + +include::modules/monitoring-using-node-selectors-to-move-monitoring-components.adoc[leveloffset=+1] + +[role="_additional-resources"] +.Additional resources + +* TBD + +include::modules/monitoring-moving-monitoring-components-to-different-nodes.adoc[leveloffset=+2,tags=**;CPM;!UWM] + +[role="_additional-resources"] +.Additional resources + +* TBD + +include::modules/monitoring-assigning-tolerations-to-monitoring-components.adoc[leveloffset=+2,tags=**;CPM;!UWM] + +[role="_additional-resources"] +.Additional resources + +* TBD + +// Setting the body size limit for metrics scraping +include::modules/monitoring-setting-the-body-size-limit-for-metrics-scraping.adoc[leveloffset=+1] + +[role="_additional-resources"] +.Additional resources + +* TBD + +[id="managing-cpu-and-memory-resources-for-monitoring-components_{context}"] +== Managing CPU and memory resources for monitoring components + +You can ensure that the containers that run monitoring components have enough CPU and memory resources by specifying values for resource limits and requests for those components. + +You can configure these limits and requests for core platform monitoring components in the `openshift-monitoring` namespace. 
+ +include::modules/monitoring-about-specifying-limits-and-requests-for-monitoring-components.adoc[leveloffset=+2,tags=**;CPM;!UWM] + +include::modules/monitoring-specifying-limits-and-requests-for-monitoring-components.adoc[leveloffset=+2,tags=**;CPM;!UWM] + +// Configuring metrics collection profiles +include::modules/monitoring-configuring-metrics-collection-profiles.adoc[leveloffset=+1] +include::modules/monitoring-choosing-a-metrics-collection-profile.adoc[leveloffset=+2] + +[role="_additional-resources"] +.Additional resources + +* TBD + +// Using pod topology spread constraints for monitoring components +include::modules/monitoring-using-pod-topology-spread-constraints-for-monitoring.adoc[leveloffset=1] + +[role="_additional-resources"] +.Additional resources + +* TBD + +include::modules/monitoring-configuring-pod-topology-spread-constraints.adoc[leveloffset=2,tags=**;CPM;!UWM] diff --git a/observability/monitoring/configuring-core-platform-monitoring/storing-and-recording-data.adoc b/observability/monitoring/configuring-core-platform-monitoring/storing-and-recording-data.adoc index 1860e49c5f5e..26fbcd036896 100644 --- a/observability/monitoring/configuring-core-platform-monitoring/storing-and-recording-data.adoc +++ b/observability/monitoring/configuring-core-platform-monitoring/storing-and-recording-data.adoc @@ -6,7 +6,50 @@ include::_attributes/common-attributes.adoc[] toc::[] -TBD +Store and record your metrics and alerting data, configure logs to specify which activities are recorded, control how long Prometheus retains stored data, and set the maximum amount of disk space for the data. These actions help you protect your data and use them for troubleshooting. 
+// Configuring persistent storage +include::modules/monitoring-configuring-persistent-storage.adoc[leveloffset=+1] +include::modules/monitoring-configuring-a-persistent-volume-claim.adoc[leveloffset=+2,tags=**;CPM;!UWM] + +[role="_additional-resources"] +.Additional resources + +* TBD + +include::modules/monitoring-resizing-a-persistent-volume.adoc[leveloffset=+2,tags=**;CPM;!UWM] + +[role="_additional-resources"] +.Additional resources + +* TBD + +// Modifying the retention time and size for Prometheus metrics data + +include::modules/monitoring-modifying-retention-time-and-size-for-prometheus-metrics-data.adoc[leveloffset=+1,tags=**;CPM;!UWM] + +include::modules/monitoring-modifying-the-retention-time-for-thanos-ruler-metrics-data.adoc[leveloffset=+2] + +// Configuring audit logs for Metrics Server +include::modules/monitoring-configuring-audit-logs-for-metrics-server.adoc[leveloffset=+1] + +// Setting log levels for monitoring components +include::modules/monitoring-setting-log-levels-for-monitoring-components.adoc[leveloffset=+1,tags=**;CPM;!UWM] + +// Enabling the query log file for Prometheus +include::modules/monitoring-setting-query-log-file-for-prometheus.adoc[leveloffset=+1,tags=**;CPM;!UWM] + +[role="_additional-resources"] +.Additional resources + +* TBD + +// Enabling query logging for Thanos Querier +include::modules/monitoring-enabling-query-logging-for-thanos-querier.adoc[leveloffset=+1] + +[role="_additional-resources"] +.Additional resources + +* TBD diff --git a/observability/monitoring/configuring-the-monitoring-stack.adoc b/observability/monitoring/configuring-the-monitoring-stack.adoc index a3c40f66cffe..82ec5f40c1d5 100644 --- a/observability/monitoring/configuring-the-monitoring-stack.adoc +++ b/observability/monitoring/configuring-the-monitoring-stack.adoc @@ -2,7 +2,7 @@ [id="configuring-the-monitoring-stack"] = Configuring the monitoring stack include::_attributes/common-attributes.adoc[] -:context: configuring-the-monitoring-stack 
+:context: configuring-the-monitoring-stack toc::[] @@ -103,7 +103,10 @@ ifndef::openshift-dedicated,openshift-rosa[] endif::openshift-dedicated,openshift-rosa[] // Configurable monitoring components -include::modules/monitoring-configurable-monitoring-components.adoc[leveloffset=+1] +// The following module should only include core platform monitoring (CPM tags) +include::modules/monitoring-configurable-monitoring-components.adoc[leveloffset=+1,tags=**;CPM;!UWM] +// The following module should only include monitoring for user-defined projects (UWM tags) +include::modules/monitoring-configurable-monitoring-components.adoc[leveloffset=+1,tags=**;!CPM;UWM] // Moving monitoring components to different nodes include::modules/monitoring-using-node-selectors-to-move-monitoring-components.adoc[leveloffset=+1] @@ -376,7 +379,7 @@ include::modules/monitoring-setting-log-levels-for-monitoring-components.adoc[le // The following module should only include monitoring for user-defined projects (UWM tags) include::modules/monitoring-setting-log-levels-for-monitoring-components.adoc[leveloffset=+1,tags=**;!CPM;UWM] -// Setting query log for Prometheus +// Enabling the query log file for Prometheus // The following module should only include core platform monitoring (CPM tags) include::modules/monitoring-setting-query-log-file-for-prometheus.adoc[leveloffset=+1,tags=**;CPM;!UWM] // The following module should only include monitoring for user-defined projects (UWM tags)