diff --git a/_topic_maps/_topic_map.yml b/_topic_maps/_topic_map.yml index 1706b3de024a..4f2bd693a88c 100644 --- a/_topic_maps/_topic_map.yml +++ b/_topic_maps/_topic_map.yml @@ -2849,8 +2849,6 @@ Topics: File: core-platform-monitoring-first-steps - Name: User workload monitoring first steps File: user-workload-monitoring-first-steps - - Name: Common monitoring scenarios - File: common-monitoring-scenarios - Name: Developer and non-administrator steps File: developer-and-non-administrator-steps - Name: Configuring core platform monitoring diff --git a/modules/monitoring-querying-metrics-for-user-defined-projects-with-mon-dashboard.adoc b/modules/monitoring-querying-metrics-for-user-defined-projects-with-mon-dashboard.adoc index 6fba3d2a9b25..dd2c7cbd5d4e 100644 --- a/modules/monitoring-querying-metrics-for-user-defined-projects-with-mon-dashboard.adoc +++ b/modules/monitoring-querying-metrics-for-user-defined-projects-with-mon-dashboard.adoc @@ -7,8 +7,6 @@ [id="querying-metrics-for-user-defined-projects-with-mon-dashboard_{context}"] = Querying metrics for user-defined projects with the {product-title} web console -// The following section will be included in the developer section, hence there is no need to include "developer" in the title - You can use the {product-title} metrics query browser to run Prometheus Query Language (PromQL) queries to examine metrics visualized on a plot. This functionality provides information about any user-defined workloads that you are monitoring. As a developer, you must specify a project name when querying metrics. You must have the required privileges to view metrics for the selected project. 
diff --git a/observability/monitoring/common-monitoring-configuration-scenarios.adoc b/observability/monitoring/common-monitoring-configuration-scenarios.adoc index 57d72d370a49..484ac585e927 100644 --- a/observability/monitoring/common-monitoring-configuration-scenarios.adoc +++ b/observability/monitoring/common-monitoring-configuration-scenarios.adoc @@ -69,7 +69,7 @@ Cluster administrators typically complete the following activities to configure * xref:../../observability/monitoring/enabling-monitoring-for-user-defined-projects.adoc#enabling-monitoring-for-user-defined-projects_enabling-monitoring-for-user-defined-projects[Enable user-defined projects]. * xref:../../observability/monitoring/enabling-monitoring-for-user-defined-projects.adoc#granting-users-permission-to-monitor-user-defined-projects_enabling-monitoring-for-user-defined-projects[Assign the `monitoring-rules-view`, `monitoring-rules-edit`, or `monitoring-edit` cluster roles] to grant non-administrator users permissions to monitor user-defined projects. -* xref:../../observability/monitoring/enabling-monitoring-for-user-defined-projects.adoc#granting-users-permission-to-configure-monitoring-for-user-defined-projects_enabling-monitoring-for-user-defined-projects[Assign the `user-workload-monitoring-config-edit` role] to grant non-administrator users permission to configure user-defined projects. +* xref:../../observability/monitoring/enabling-monitoring-for-user-defined-projects.adoc#granting-users-permission-to-configure-monitoring-for-user-defined-projects_enabling-monitoring-for-user-defined-projects[Grant non-administrator users permission to configure user-defined projects] by assigning the `user-workload-monitoring-config-edit` role. 
* xref:../../observability/monitoring/enabling-alert-routing-for-user-defined-projects.adoc#enabling-alert-routing-for-user-defined-projects[Enable alert routing for user-defined projects] so that developers and other users can configure custom alerts and alert routing for their projects. * If needed, configure alert routing for user-defined projects to xref:../../observability/monitoring/enabling-alert-routing-for-user-defined-projects.adoc#enabling-a-separate-alertmanager-instance-for-user-defined-alert-routing_enabling-alert-routing-for-user-defined-projects[use an optional Alertmanager instance dedicated for use only by user-defined projects]. * xref:../../observability/monitoring/managing-alerts.adoc#configuring-different-alert-receivers-for-default-platform-alerts-and-user-defined-alerts_managing-alerts[Configure alert receivers] for user-defined projects. diff --git a/observability/monitoring/getting-started/common-monitoring-scenarios.adoc b/observability/monitoring/getting-started/common-monitoring-scenarios.adoc deleted file mode 100644 index c0eb701bfeea..000000000000 --- a/observability/monitoring/getting-started/common-monitoring-scenarios.adoc +++ /dev/null @@ -1,11 +0,0 @@ -:_mod-docs-content-type: ASSEMBLY -include::_attributes/common-attributes.adoc[] -[id="common-monitoring-scenarios"] -= Common monitoring scenarios -:context: common-monitoring-scenarios - -toc::[] - -TBD - - diff --git a/observability/monitoring/getting-started/core-platform-monitoring-first-steps.adoc b/observability/monitoring/getting-started/core-platform-monitoring-first-steps.adoc index 2831fde2b011..38caff2d9c9c 100644 --- a/observability/monitoring/getting-started/core-platform-monitoring-first-steps.adoc +++ b/observability/monitoring/getting-started/core-platform-monitoring-first-steps.adoc @@ -6,8 +6,53 @@ include::_attributes/common-attributes.adoc[] toc::[] -TBD +After {product-title} is installed, core platform monitoring components immediately begin collecting 
metrics, which you can query and view. +The default in-cluster monitoring stack includes the core platform Prometheus instance that collects metrics from your cluster and the core Alertmanager instance that routes alerts, among other components. +As a cluster administrator, you can further configure these monitoring components to suit the needs of different users, depending on who uses the monitoring stack and for what purposes. +[id="configuring-core-platform-monitoring-postinstallation-steps_{context}"] +== Configuring core platform monitoring: Postinstallation steps +After {product-title} is installed, cluster administrators typically configure core platform monitoring to suit their needs. +These activities include setting up storage and configuring options for Prometheus, Alertmanager, and other monitoring components. + +[NOTE] +==== +By default, in a newly installed {product-title} system, users can query and view collected metrics. +You need to configure an alert receiver only if you want users to receive alert notifications. +All other configuration options listed here are optional. +==== + +* xref:../../../observability/monitoring/configuring-core-platform-monitoring/before-you-begin.adoc#creating-cluster-monitoring-configmap_before-you-begin[Create the `cluster-monitoring-config` `ConfigMap` object] if it does not exist. +* xref:../../../observability/monitoring/configuring-core-platform-monitoring/configuring-alerts-and-notifications.adoc#configuring-alert-notifications_configuring-alerts-and-notifications[Configure notifications for default platform alerts] so that Alertmanager can send alerts to an external notification system such as email, Slack, or PagerDuty. 
+* For shorter-term data retention, xref:../../../observability/monitoring/configuring-core-platform-monitoring/storing-and-recording-data.adoc#configuring-persistent-storage_storing-and-recording-data[configure persistent storage] for Prometheus and Alertmanager to store metrics and alert data. +Specify the metrics data retention parameters for Prometheus and Thanos Ruler. ++ +[IMPORTANT] +==== +* In multi-node clusters, you must configure persistent storage for Prometheus, Alertmanager, and Thanos Ruler to ensure high availability. + +* By default, in a newly installed {product-title} system, the monitoring `ClusterOperator` resource reports a `PrometheusDataPersistenceNotConfigured` status message to remind you that storage is not configured. +==== ++ +* For longer-term data retention, xref:../../../observability/monitoring/configuring-core-platform-monitoring/configuring-metrics.adoc#configuring-remote-write-storage-cpm_configuring-metrics[configure the remote write feature] to enable Prometheus to send ingested metrics to remote systems for storage. ++ +[IMPORTANT] +==== +Be sure to xref:../../../observability/monitoring/configuring-core-platform-monitoring/configuring-metrics.adoc#adding-cluster-id-labels-to-metrics_configuring-metrics[add cluster ID labels to metrics] for use with your remote write storage configuration. +==== ++ +* xref:../../../observability/monitoring/configuring-core-platform-monitoring/before-you-begin.adoc#granting-users-permissions-for-core-platform-monitoring_before-you-begin[Grant monitoring cluster roles] to any non-administrator users who need to access certain monitoring features. +* xref:../../../observability/monitoring/configuring-core-platform-monitoring/configuring-performance-and-scalability.adoc#assigning-tolerations-to-monitoring-components-cpm_configuring-performance-and-scalability[Assign tolerations] to monitoring stack components so that administrators can move them to tainted nodes. 
+* xref:../../../observability/monitoring/configuring-core-platform-monitoring/configuring-performance-and-scalability.adoc#setting-the-body-size-limit-for-metrics-scraping_configuring-performance-and-scalability[Set the body size limit] for metrics collection to help avoid situations in which Prometheus consumes excessive amounts of memory when scraped targets return a response that contains a large amount of data. +* xref:../../../observability/monitoring/managing-alerts/managing-alerts-as-an-administrator.adoc#managing-alerting-rules-for-core-platform-monitoring_managing-alerts-as-an-administrator[Modify or create alerting rules] for your cluster. +These rules specify the conditions that trigger alerts, such as high CPU usage, high memory usage, or network latency. +* xref:../../../observability/monitoring/configuring-core-platform-monitoring/configuring-performance-and-scalability.adoc#managing-cpu-and-memory-resources-for-monitoring-components_configuring-performance-and-scalability[Specify resource limits and requests for monitoring components] to ensure that the containers that run monitoring components have enough CPU and memory resources. + +With the monitoring stack configured to suit your needs, Prometheus collects metrics from the specified services and stores these metrics according to your settings. +You can go to the *Observe* pages in the {product-title} web console to view and query collected metrics, manage alerts, identify performance bottlenecks, and scale resources as needed: + +* xref:../../../observability/monitoring/accessing-metrics/accessing-metrics-as-an-administrator.adoc#reviewing-monitoring-dashboards-admin_accessing-metrics-as-an-administrator[View dashboards] to visualize collected metrics, troubleshoot alerts, and monitor other information about your cluster. 
+* xref:../../../observability/monitoring/accessing-metrics/accessing-metrics-as-an-administrator.adoc#querying-metrics-for-all-projects-with-mon-dashboard_accessing-metrics-as-an-administrator[Query collected metrics] by creating PromQL queries or using predefined queries. diff --git a/observability/monitoring/getting-started/developer-and-non-administrator-steps.adoc b/observability/monitoring/getting-started/developer-and-non-administrator-steps.adoc index 1de97c7a3048..43aa98574405 100644 --- a/observability/monitoring/getting-started/developer-and-non-administrator-steps.adoc +++ b/observability/monitoring/getting-started/developer-and-non-administrator-steps.adoc @@ -6,8 +6,11 @@ include::_attributes/common-attributes.adoc[] toc::[] -TBD - - - +After monitoring for user-defined projects is enabled and configured, developers and other non-administrator users can then perform the following activities to set up and use monitoring for their own projects: +* xref:../../../observability/monitoring/configuring-user-workload-monitoring/configuring-metrics-uwm.adoc#setting-up-metrics-collection-for-user-defined-projects_configuring-metrics-uwm[Deploy and monitor services]. +* xref:../../../observability/monitoring/managing-alerts/managing-alerts-as-a-developer.adoc#managing-alerting-rules-for-user-defined-projects-uwm_managing-alerts-as-a-developer[Create and manage alerting rules]. +* xref:../../../observability/monitoring/managing-alerts/managing-alerts-as-a-developer.adoc#managing-alerts-as-a-developer[Receive and manage alerts] for your projects. +* If granted the `alert-routing-edit` cluster role, xref:../../../observability/monitoring/configuring-user-workload-monitoring/configuring-alerts-and-notifications-uwm.adoc#configuring-alert-routing-for-user-defined-projects_configuring-alerts-and-notifications-uwm[configure alert routing]. 
+* xref:../../../observability/monitoring/accessing-metrics/accessing-metrics-as-a-developer.adoc#reviewing-monitoring-dashboards-developer_accessing-metrics-as-a-developer[View dashboards] by using the {product-title} web console. +* xref:../../../observability/monitoring/accessing-metrics/accessing-metrics-as-a-developer.adoc#querying-metrics-for-user-defined-projects-with-mon-dashboard_accessing-metrics-as-a-developer[Query the collected metrics] by creating PromQL queries or using predefined queries. diff --git a/observability/monitoring/getting-started/user-workload-monitoring-first-steps.adoc b/observability/monitoring/getting-started/user-workload-monitoring-first-steps.adoc index 11ae6f093367..2c841c1e5ceb 100644 --- a/observability/monitoring/getting-started/user-workload-monitoring-first-steps.adoc +++ b/observability/monitoring/getting-started/user-workload-monitoring-first-steps.adoc @@ -6,7 +6,15 @@ include::_attributes/common-attributes.adoc[] toc::[] -TBD - +As a cluster administrator, you can optionally enable monitoring for user-defined projects in addition to core platform monitoring. +Non-administrator users such as developers can then monitor their own projects outside of core platform monitoring. +Cluster administrators typically complete the following activities to configure user-defined projects so that users can view collected metrics, query these metrics, and receive alerts for their own projects: +* xref:../../../observability/monitoring/configuring-user-workload-monitoring/before-you-begin-uwm.adoc#enabling-monitoring-for-user-defined-projects-uwm_before-you-begin-uwm[Enable user workload monitoring]. 
+* xref:../../../observability/monitoring/configuring-user-workload-monitoring/before-you-begin-uwm.adoc#granting-users-permission-to-monitor-user-defined-projects_before-you-begin-uwm[Grant non-administrator users permissions to monitor user-defined projects] by assigning the `monitoring-rules-view`, `monitoring-rules-edit`, or `monitoring-edit` cluster roles. +* xref:../../../observability/monitoring/configuring-user-workload-monitoring/before-you-begin-uwm.adoc#granting-users-permission-to-configure-alert-routing-for-user-defined-projects_before-you-begin-uwm[Grant non-administrator users permission to configure user-defined projects] by assigning the `user-workload-monitoring-config-edit` role. +* xref:../../../observability/monitoring/configuring-user-workload-monitoring/before-you-begin-uwm.adoc#enabling-alert-routing-for-user-defined-projects_before-you-begin-uwm[Enable alert routing for user-defined projects] so that developers and other users can configure custom alerts and alert routing for their projects. +* If needed, configure alert routing for user-defined projects to xref:../../../observability/monitoring/configuring-user-workload-monitoring/before-you-begin-uwm.adoc#enabling-a-separate-alertmanager-instance-for-user-defined-alert-routing_before-you-begin-uwm[use an optional Alertmanager instance dedicated for use only by user-defined projects]. +* xref:../../../observability/monitoring/configuring-user-workload-monitoring/configuring-alerts-and-notifications-uwm.adoc#configuring-alert-notifications_configuring-alerts-and-notifications-uwm[Configure notifications for user-defined alerts]. 
+* If you use the platform Alertmanager instance for user-defined alert routing, xref:../../../observability/monitoring/configuring-user-workload-monitoring/configuring-alerts-and-notifications-uwm.adoc#configuring-different-alert-receivers-for-default-platform-alerts-and-user-defined-alerts_configuring-alerts-and-notifications-uwm[configure different alert receivers] for default platform alerts and user-defined alerts.