From 4800b724ba51e546861a4c0afcd38d88ab35ba26 Mon Sep 17 00:00:00 2001 From: Eliska Romanova Date: Mon, 24 Nov 2025 09:08:00 +0100 Subject: [PATCH] OBSDOCS-2253: fix config map reference generation --- ...e-for-the-cluster-monitoring-operator.adoc | 779 +----------------- ...ring-operator-configuration-reference.adoc | 738 +++++++++++++++++ 2 files changed, 744 insertions(+), 773 deletions(-) create mode 100644 modules/monitoring-cluster-monitoring-operator-configuration-reference.adoc diff --git a/config-map-reference/config-map-reference-for-the-cluster-monitoring-operator.adoc b/config-map-reference/config-map-reference-for-the-cluster-monitoring-operator.adoc index b779866d91a5..223e1b78f44f 100644 --- a/config-map-reference/config-map-reference-for-the-cluster-monitoring-operator.adoc +++ b/config-map-reference/config-map-reference-for-the-cluster-monitoring-operator.adoc @@ -1,781 +1,14 @@ -// DO NOT EDIT THE CONTENT IN THIS FILE. It is automatically generated from the -// source code for the Cluster Monitoring Operator. Any changes made to this -// file will be overwritten when the content is re-generated. If you wish to -// make edits, read the docgen utility instructions in the source code for the -// CMO. -:_mod-docs-content-type: REFERENCE -include::_attributes/common-attributes.adoc[] +:_mod-docs-content-type: ASSEMBLY [id="config-map-reference-for-the-cluster-monitoring-operator"] = Config map reference for the Cluster Monitoring Operator + +include::_attributes/common-attributes.adoc[] + :context: config-map-reference-for-the-cluster-monitoring-operator toc::[] -[id="cluster-monitoring-operator-configuration-reference"] -== Cluster Monitoring Operator configuration reference - [role="_abstract"] -Parts of {ocp} cluster monitoring are configurable. -The API is accessible by setting parameters defined in various config maps. - -ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[] -* To configure monitoring components, edit the `ConfigMap` object named `cluster-monitoring-config` in the `openshift-monitoring` namespace. -These configurations are defined by link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration]. -endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[] - -* To configure monitoring components that monitor user-defined projects, edit the `ConfigMap` object named `user-workload-monitoring-config` in the `openshift-user-workload-monitoring` namespace. -These configurations are defined by link:#userworkloadconfiguration[UserWorkloadConfiguration]. - -The configuration file is always defined under the `config.yaml` key in the config map data. - -[IMPORTANT] -==== -* Not all configuration parameters for the monitoring stack are exposed. -Only the parameters and fields listed in this reference are supported for configuration. -For more information about supported configurations, see xref:../support-for-monitoring/maintenance-and-support-for-monitoring.adoc#maintenance-and-support-for-monitoring[Maintenance and support for monitoring]. - -* Configuring cluster monitoring is optional. -* If a configuration does not exist or is empty, default values are used. -* If the configuration has invalid YAML data, or if it contains unsupported or duplicated fields that bypassed early validation, the Cluster Monitoring Operator stops reconciling the resources and reports the `Degraded=True` status in the status conditions of the Operator. -==== - -== AdditionalAlertmanagerConfig - -=== Description - -The `AdditionalAlertmanagerConfig` resource defines settings for how a component communicates with additional Alertmanager instances. - -=== Required -* `apiVersion` - -Appears in: link:#prometheusk8sconfig[PrometheusK8sConfig], -link:#prometheusrestrictedconfig[PrometheusRestrictedConfig], -link:#thanosrulerconfig[ThanosRulerConfig] - -[options="header"] -|=== -| Property | Type | Description -|apiVersion|string|Defines the API version of Alertmanager. `v1` is no longer supported, `v2` is set as the default value. - -|bearerToken|*v1.SecretKeySelector|Defines the secret key reference containing the bearer token to use when authenticating to Alertmanager. - -|pathPrefix|string|Defines the path prefix to add in front of the push endpoint path. - -|scheme|string|Defines the URL scheme to use when communicating with Alertmanager instances. Possible values are `http` or `https`. The default value is `http`. - -|staticConfigs|[]string|A list of statically configured Alertmanager endpoints in the form of `:`. - -|timeout|*string|Defines the timeout value used when sending alerts. - -|tlsConfig|link:#tlsconfig[TLSConfig]|Defines the TLS settings to use for Alertmanager connections. - -|=== - -== AlertmanagerMainConfig - -=== Description - -The `AlertmanagerMainConfig` resource defines settings for the Alertmanager component in the `openshift-monitoring` namespace. - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|enabled|*bool|A Boolean flag that enables or disables the main Alertmanager instance in the `openshift-monitoring` namespace. The default value is `true`. - -|enableUserAlertmanagerConfig|bool|A Boolean flag that enables or disables user-defined namespaces to be selected for `AlertmanagerConfig` lookups. This setting only applies if the user workload monitoring instance of Alertmanager is not enabled. The default value is `false`. - -|logLevel|string|Defines the log level setting for Alertmanager. The possible values are: `error`, `warn`, `info`, `debug`. The default value is `info`. - -|nodeSelector|map[string]string|Defines the nodes on which the Pods are scheduled. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Alertmanager container. - -|secrets|[]string|Defines a list of secrets to be mounted into Alertmanager. The secrets must reside within the same namespace as the Alertmanager object. They are added as volumes named `secret-` and mounted at `/etc/alertmanager/secrets/` in the `alertmanager` container of the Alertmanager pods. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. - -|volumeClaimTemplate|*monv1.EmbeddedPersistentVolumeClaim|Defines persistent storage for Alertmanager. Use this setting to configure the persistent volume claim, including storage class, volume size, and name. - -|=== - -== AlertmanagerUserWorkloadConfig - -=== Description - -The `AlertmanagerUserWorkloadConfig` resource defines the settings for the Alertmanager instance used for user-defined projects. - -Appears in: link:#userworkloadconfiguration[UserWorkloadConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|enabled|bool|A Boolean flag that enables or disables a dedicated instance of Alertmanager for user-defined alerts in the `openshift-user-workload-monitoring` namespace. The default value is `false`. - -|enableAlertmanagerConfig|bool|A Boolean flag to enable or disable user-defined namespaces to be selected for `AlertmanagerConfig` lookup. The default value is `false`. - -|logLevel|string|Defines the log level setting for Alertmanager for user workload monitoring. The possible values are `error`, `warn`, `info`, and `debug`. The default value is `info`. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Alertmanager container. - -|secrets|[]string|Defines a list of secrets to be mounted into Alertmanager. The secrets must be located within the same namespace as the Alertmanager object. They are added as volumes named `secret-` and mounted at `/etc/alertmanager/secrets/` in the `alertmanager` container of the Alertmanager pods. - -|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. - -|volumeClaimTemplate|*monv1.EmbeddedPersistentVolumeClaim|Defines persistent storage for Alertmanager. Use this setting to configure the persistent volume claim, including storage class, volume size and name. - -|=== - -== ClusterMonitoringConfiguration - -=== Description - -The `ClusterMonitoringConfiguration` resource defines settings that customize the default platform monitoring stack through the `cluster-monitoring-config` config map in the `openshift-monitoring` namespace. - -[options="header"] -|=== -| Property | Type | Description -|alertmanagerMain|*link:#alertmanagermainconfig[AlertmanagerMainConfig]|`AlertmanagerMainConfig` defines settings for the Alertmanager component in the `openshift-monitoring` namespace. - -|enableUserWorkload|*bool|`UserWorkloadEnabled` is a Boolean flag that enables monitoring for user-defined projects. - -|userWorkload|*link:#userworkloadconfig[UserWorkloadConfig]|`UserWorkload` defines settings for the monitoring of user-defined projects. - -|kubeStateMetrics|*link:#kubestatemetricsconfig[KubeStateMetricsConfig]|`KubeStateMetricsConfig` defines settings for the `kube-state-metrics` agent. - -|metricsServer|*link:#metricsserverconfig[MetricsServerConfig]|`MetricsServer` defines settings for the Metrics Server component. - -|prometheusK8s|*link:#prometheusk8sconfig[PrometheusK8sConfig]|`PrometheusK8sConfig` defines settings for the Prometheus component. - -|prometheusOperator|*link:#prometheusoperatorconfig[PrometheusOperatorConfig]|`PrometheusOperatorConfig` defines settings for the Prometheus Operator component. - -|prometheusOperatorAdmissionWebhook|*link:#prometheusoperatoradmissionwebhookconfig[PrometheusOperatorAdmissionWebhookConfig]|`PrometheusOperatorAdmissionWebhookConfig` defines settings for the admission webhook component of Prometheus Operator. - -|openshiftStateMetrics|*link:#openshiftstatemetricsconfig[OpenShiftStateMetricsConfig]|`OpenShiftMetricsConfig` defines settings for the `openshift-state-metrics` agent. - -|telemeterClient|*link:#telemeterclientconfig[TelemeterClientConfig]|`TelemeterClientConfig` defines settings for the Telemeter Client component. - -|thanosQuerier|*link:#thanosquerierconfig[ThanosQuerierConfig]|`ThanosQuerierConfig` defines settings for the Thanos Querier component. - -|nodeExporter|link:#nodeexporterconfig[NodeExporterConfig]|`NodeExporterConfig` defines settings for the `node-exporter` agent. - -|monitoringPlugin|*link:#monitoringpluginconfig[MonitoringPluginConfig]|`MonitoringPluginConfig` defines settings for the monitoring `console-plugin` component. - -|=== - -== KubeStateMetricsConfig - -=== Description - -The `KubeStateMetricsConfig` resource defines settings for the `kube-state-metrics` agent. - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `KubeStateMetrics` container. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. - -|=== - -== MetricsServerConfig - -=== Description - -The `MetricsServerConfig` resource defines settings for the Metrics Server component. - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|audit|*Audit|Defines the audit configuration used by the Metrics Server instance. Possible profile values are `Metadata`, `Request`, `RequestResponse`, and `None`. The default value is `Metadata`. - -|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -| verbosity | uint8 | Defines the verbosity of log messages for Metrics Server. Valid values are positive integers, values over 10 are usually unnecessary. The default value is `0`. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Metrics Server container. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. - -|=== - -== MonitoringPluginConfig - -=== Description - -The `MonitoringPluginConfig` resource defines settings for the web console plugin component in the `openshift-monitoring` namespace. - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `console-plugin` container. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. - -|=== - -== NodeExporterCollectorBuddyInfoConfig - -=== Description - -The `NodeExporterCollectorBuddyInfoConfig` resource works as an on/off switch for the `buddyinfo` collector of the `node-exporter` agent. By default, the `buddyinfo` collector is disabled. - -Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] - -[options="header"] -|=== -| Property | Type | Description -|enabled|bool|A Boolean flag that enables or disables the `buddyinfo` collector. - -|=== - -== NodeExporterCollectorConfig - -=== Description - -The `NodeExporterCollectorConfig` resource defines settings for individual collectors of the `node-exporter` agent. - -Appears in: link:#nodeexporterconfig[NodeExporterConfig] - -[options="header"] -|=== -| Property | Type | Description -|cpufreq|link:#nodeexportercollectorcpufreqconfig[NodeExporterCollectorCpufreqConfig]|Defines the configuration of the `cpufreq` collector, which collects CPU frequency statistics. Disabled by default. - -|tcpstat|link:#nodeexportercollectortcpstatconfig[NodeExporterCollectorTcpStatConfig]|Defines the configuration of the `tcpstat` collector, which collects TCP connection statistics. Disabled by default. - -|netdev|link:#nodeexportercollectornetdevconfig[NodeExporterCollectorNetDevConfig]|Defines the configuration of the `netdev` collector, which collects network devices statistics. Enabled by default. - -|netclass|link:#nodeexportercollectornetclassconfig[NodeExporterCollectorNetClassConfig]|Defines the configuration of the `netclass` collector, which collects information about network devices. Enabled by default. - -|buddyinfo|link:#nodeexportercollectorbuddyinfoconfig[NodeExporterCollectorBuddyInfoConfig]|Defines the configuration of the `buddyinfo` collector, which collects statistics about memory fragmentation from the `node_buddyinfo_blocks` metric. This metric collects data from `/proc/buddyinfo`. Disabled by default. - -|mountstats|link:#nodeexportercollectormountstatsconfig[NodeExporterCollectorMountStatsConfig]|Defines the configuration of the `mountstats` collector, which collects statistics about NFS volume I/O activities. Disabled by default. - -|ksmd|link:#nodeexportercollectorksmdconfig[NodeExporterCollectorKSMDConfig]|Defines the configuration of the `ksmd` collector, which collects statistics from the kernel same-page merger daemon. Disabled by default. - -|processes|link:#nodeexportercollectorprocessesconfig[NodeExporterCollectorProcessesConfig]|Defines the configuration of the `processes` collector, which collects statistics from processes and threads running in the system. Disabled by default. - -|systemd|link:#nodeexportercollectorsystemdconfig[NodeExporterCollectorSystemdConfig]|Defines the configuration of the `systemd` collector, which collects statistics on the systemd daemon and its managed services. Disabled by default. - -|=== - -== NodeExporterCollectorCpufreqConfig - -=== Description - -Use the `NodeExporterCollectorCpufreqConfig` resource to enable or disable the `cpufreq` collector of the `node-exporter` agent. By default, the `cpufreq` collector is disabled. Under certain circumstances, enabling the `cpufreq` collector increases CPU usage on machines with many cores. If you enable this collector and have machines with many cores, monitor your systems closely for excessive CPU usage. - -Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] - -[options="header"] -|=== -| Property | Type | Description -|enabled|bool|A Boolean flag that enables or disables the `cpufreq` collector. - -|=== - -== NodeExporterCollectorKSMDConfig - -=== Description - -Use the `NodeExporterCollectorKSMDConfig` resource to enable or disable the `ksmd` collector of the `node-exporter` agent. By default, the `ksmd` collector is disabled. - -Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] - -[options="header"] -|=== -| Property | Type | Description -|enabled|bool|A Boolean flag that enables or disables the `ksmd` collector. - -|=== - -== NodeExporterCollectorMountStatsConfig - -=== Description - -Use the `NodeExporterCollectorMountStatsConfig` resource to enable or disable the `mountstats` collector of the `node-exporter` agent. By default, the `mountstats` collector is disabled. If you enable the collector, the following metrics become available: `node_mountstats_nfs_read_bytes_total`, `node_mountstats_nfs_write_bytes_total`, and `node_mountstats_nfs_operations_requests_total`. Be aware that these metrics can have a high cardinality. If you enable this collector, closely monitor any increases in memory usage for the `prometheus-k8s` pods. - -Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] - -[options="header"] -|=== -| Property | Type | Description -|enabled|bool|A Boolean flag that enables or disables the `mountstats` collector. - -|=== - -== NodeExporterCollectorNetClassConfig - -=== Description - -Use the `NodeExporterCollectorNetClassConfig` resource to enable or disable the `netclass` collector of the `node-exporter` agent. By default, the `netclass` collector is enabled. If you disable this collector, these metrics become unavailable: `node_network_info`, `node_network_address_assign_type`, `node_network_carrier`, `node_network_carrier_changes_total`, `node_network_carrier_up_changes_total`, `node_network_carrier_down_changes_total`, `node_network_device_id`, `node_network_dormant`, `node_network_flags`, `node_network_iface_id`, `node_network_iface_link`, `node_network_iface_link_mode`, `node_network_mtu_bytes`, `node_network_name_assign_type`, `node_network_net_dev_group`, `node_network_speed_bytes`, `node_network_transmit_queue_length`, and `node_network_protocol_type`. - -Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] - -[options="header"] -|=== -| Property | Type | Description -|enabled|bool|A Boolean flag that enables or disables the `netclass` collector. - -|useNetlink|bool|A Boolean flag that activates the `netlink` implementation of the `netclass` collector. The default value is `true`, which activates the `netlink` mode. This implementation improves the performance of the `netclass` collector. - -|=== - -== NodeExporterCollectorNetDevConfig - -=== Description - -Use the `NodeExporterCollectorNetDevConfig` resource to enable or disable the `netdev` collector of the `node-exporter` agent. By default, the `netdev` collector is enabled. If disabled, these metrics become unavailable: `node_network_receive_bytes_total`, `node_network_receive_compressed_total`, `node_network_receive_drop_total`, `node_network_receive_errs_total`, `node_network_receive_fifo_total`, `node_network_receive_frame_total`, `node_network_receive_multicast_total`, `node_network_receive_nohandler_total`, `node_network_receive_packets_total`, `node_network_transmit_bytes_total`, `node_network_transmit_carrier_total`, `node_network_transmit_colls_total`, `node_network_transmit_compressed_total`, `node_network_transmit_drop_total`, `node_network_transmit_errs_total`, `node_network_transmit_fifo_total`, and `node_network_transmit_packets_total`. - -Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] - -[options="header"] -|=== -| Property | Type | Description -|enabled|bool|A Boolean flag that enables or disables the `netdev` collector. - -|=== - -== NodeExporterCollectorProcessesConfig - -=== Description - -Use the `NodeExporterCollectorProcessesConfig` resource to enable or disable the `processes` collector of the `node-exporter` agent. If the collector is enabled, the following metrics become available: `node_processes_max_processes`, `node_processes_pids`, `node_processes_state`, `node_processes_threads`, `node_processes_threads_state`. The metric `node_processes_state` and `node_processes_threads_state` can have up to five series each, depending on the state of the processes and threads. The possible states of a process or a thread are: `D` (UNINTERRUPTABLE_SLEEP), `R` (RUNNING & RUNNABLE), `S` (INTERRUPTABLE_SLEEP), `T` (STOPPED), or `Z` (ZOMBIE). By default, the `processes` collector is disabled. - -Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] - -[options="header"] -|=== -| Property | Type | Description -|enabled|bool|A Boolean flag that enables or disables the `processes` collector. - -|=== - -== NodeExporterCollectorSystemdConfig - -=== Description - -Use the `NodeExporterCollectorSystemdConfig` resource to enable or disable the `systemd` collector of the `node-exporter` agent. By default, the `systemd` collector is disabled. If enabled, the following metrics become available: `node_systemd_system_running`, `node_systemd_units`, `node_systemd_version`. If the unit uses a socket, it also generates the following metrics: `node_systemd_socket_accepted_connections_total`, `node_systemd_socket_current_connections`, `node_systemd_socket_refused_connections_total`. You can use the `units` parameter to select the `systemd` units to be included by the `systemd` collector. The selected units are used to generate the `node_systemd_unit_state` metric, which shows the state of each `systemd` unit. However, this metric's cardinality might be high (at least five series per unit per node). If you enable this collector with a long list of selected units, closely monitor the `prometheus-k8s` deployment for excessive memory usage. Note that the `node_systemd_timer_last_trigger_seconds` metric is only shown if you have configured the value of the `units` parameter as `logrotate.timer`. - -Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] - -[options="header"] -|=== -| Property | Type | Description -|enabled|bool|A Boolean flag that enables or disables the `systemd` collector. - -|units|[]string|A list of regular expression (regex) patterns that match systemd units to be included by the `systemd` collector. By default, the list is empty, so the collector exposes no metrics for systemd units. - -|=== - -== NodeExporterCollectorTcpStatConfig - -=== Description - -The `NodeExporterCollectorTcpStatConfig` resource works as an on/off switch for the `tcpstat` collector of the `node-exporter` agent. By default, the `tcpstat` collector is disabled. - -Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] - -[options="header"] -|=== -| Property | Type | Description -|enabled|bool|A Boolean flag that enables or disables the `tcpstat` collector. - -|=== - -== NodeExporterConfig - -=== Description - -The `NodeExporterConfig` resource defines settings for the `node-exporter` agent. - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|collectors|link:#nodeexportercollectorconfig[NodeExporterCollectorConfig]|Defines which collectors are enabled and their additional configuration parameters. - -|maxProcs|uint32|The target number of CPUs on which the node-exporter's process will run. The default value is `0`, which means that node-exporter runs on all CPUs. If a kernel deadlock occurs or if performance degrades when reading from `sysfs` concurrently, you can change this value to `1`, which limits node-exporter to running on one CPU. For nodes with a high CPU count, you can set the limit to a low number, which saves resources by preventing Go routines from being scheduled to run on all CPUs. However, I/O performance degrades if the `maxProcs` value is set too low and there are many metrics to collect. - -|ignoredNetworkDevices|*[]string|A list of network devices, defined as regular expressions, that you want to exclude from the relevant collector configuration such as `netdev` and `netclass`. If no list is specified, the Cluster Monitoring Operator uses a predefined list of devices to be excluded to minimize the impact on memory usage. If the list is empty, no devices are excluded. If you modify this setting, monitor the `prometheus-k8s` deployment closely for excessive memory usage. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `NodeExporter` container. - -|=== - -== OpenShiftStateMetricsConfig - -=== Description - -The `OpenShiftStateMetricsConfig` resource defines settings for the `openshift-state-metrics` agent. - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `OpenShiftStateMetrics` container. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. - -|=== - -== PrometheusK8sConfig - -=== Description - -The `PrometheusK8sConfig` resource defines settings for the Prometheus component. - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|additionalAlertmanagerConfigs|[]link:#additionalalertmanagerconfig[AdditionalAlertmanagerConfig]|Configures additional Alertmanager instances that receive alerts from the Prometheus component. By default, no additional Alertmanager instances are configured. - -|enforcedBodySizeLimit|string|Enforces a body size limit for Prometheus scraped metrics. If a scraped target's body response is larger than the limit, the scrape will fail. The following values are valid: an empty value to specify no limit, a numeric value in Prometheus size format (such as `64MB`), or the string `automatic`, which indicates that the limit will be automatically calculated based on cluster capacity. The default value is empty, which indicates no limit. - -|externalLabels|ExternalLabels|Defines labels to be added to any time series or alerts when communicating with external systems such as federation, remote storage, and Alertmanager. The type is map[string]string. By default, no labels are added. - -|logLevel|string|Defines the log level setting for Prometheus. The possible values are: `error`, `warn`, `info`, and `debug`. The default value is `info`. - -|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. - -|queryLogFile|string|Specifies the file to which PromQL queries are logged. This setting can be either a filename, in which case the queries are saved to an `emptyDir` volume at `/var/log/prometheus`, or a full path to a location where an `emptyDir` volume will be mounted and the queries saved. Writing to `/dev/stderr`, `/dev/stdout` or `/dev/null` is supported, but writing to any other `/dev/` path is not supported. Relative paths are also not supported. By default, PromQL queries are not logged. - -|remoteWrite|[]link:#remotewritespec[RemoteWriteSpec]|Defines the remote write configuration, including URL, authentication, and relabeling settings. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `Prometheus` container. - -|retention|string|Defines the duration for which Prometheus retains data. This definition must be specified using the following regular expression pattern: `[0-9]+(ms\|s\|m\|h\|d\|w\|y)` (ms = milliseconds, s= seconds,m = minutes, h = hours, d = days, w = weeks, y = years). The default value is `15d`. - -|retentionSize|string|Defines the maximum amount of disk space used by data blocks plus the write-ahead log (WAL). Supported values are `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`, `EB`, and `EiB`. By default, no limit is defined. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. - -|collectionProfile|CollectionProfile|Defines the metrics collection profile that Prometheus uses to collect metrics from the platform components. Supported values are `full` or `minimal`. In the `full` profile (default), Prometheus collects all metrics that are exposed by the platform components. In the `minimal` profile, Prometheus only collects metrics necessary for the default platform alerts, recording rules, telemetry, and console dashboards. - -|volumeClaimTemplate|*monv1.EmbeddedPersistentVolumeClaim|Defines persistent storage for Prometheus. Use this setting to configure the persistent volume claim, including storage class, volume size and name. - -|=== - -== PrometheusOperatorConfig - -=== Description - -The `PrometheusOperatorConfig` resource defines settings for the Prometheus Operator component. - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration], -link:#userworkloadconfiguration[UserWorkloadConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|logLevel|string|Defines the log level settings for Prometheus Operator. The possible values are `error`, `warn`, `info`, and `debug`. The default value is `info`. - -|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `PrometheusOperator` container. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. - -|=== - -== PrometheusOperatorAdmissionWebhookConfig - -=== Description - -The `PrometheusOperatorAdmissionWebhookConfig` resource defines settings for the admission webhook workload for Prometheus Operator. - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] - -[options="header"] -|=== -| Property | Type | Description - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `prometheus-operator-admission-webhook` container. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. - -|=== - -== PrometheusRestrictedConfig - -=== Description - -The `PrometheusRestrictedConfig` resource defines the settings for the Prometheus component that monitors user-defined projects. - -Appears in: link:#userworkloadconfiguration[UserWorkloadConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|scrapeInterval|string|Configures the default interval between consecutive scrapes in case the `ServiceMonitor` or `PodMonitor` resource does not specify any value. The interval must be set between 5 seconds and 5 minutes. The value can be expressed in: seconds (for example `30s`), minutes (for example `1m`) or a mix of minutes and seconds (for example `1m30s`). The default value is `30s`. - -|evaluationInterval|string|Configures the default interval between rule evaluations in case the `PrometheusRule` resource does not specify any value. The interval must be set between 5 seconds and 5 minutes. The value can be expressed in: seconds (for example `30s`), minutes (for example `1m`) or a mix of minutes and seconds (for example `1m30s`). It only applies to `PrometheusRule` resources with the `openshift.io/prometheus-rule-evaluation-scope=\"leaf-prometheus\"` label. The default value is `30s`. - -|additionalAlertmanagerConfigs|[]link:#additionalalertmanagerconfig[AdditionalAlertmanagerConfig]|Configures additional Alertmanager instances that receive alerts from the Prometheus component. By default, no additional Alertmanager instances are configured. - -|enforcedLabelLimit|*uint64|Specifies a per-scrape limit on the number of labels accepted for a sample. If the number of labels exceeds this limit after metric relabeling, the entire scrape is treated as failed. The default value is `0`, which means that no limit is set. - -|enforcedLabelNameLengthLimit|*uint64|Specifies a per-scrape limit on the length of a label name for a sample. If the length of a label name exceeds this limit after metric relabeling, the entire scrape is treated as failed. The default value is `0`, which means that no limit is set. - -|enforcedLabelValueLengthLimit|*uint64|Specifies a per-scrape limit on the length of a label value for a sample. If the length of a label value exceeds this limit after metric relabeling, the entire scrape is treated as failed. The default value is `0`, which means that no limit is set. - -|enforcedSampleLimit|*uint64|Specifies a global limit on the number of scraped samples that will be accepted. This setting overrides the `SampleLimit` value set in any user-defined `ServiceMonitor` or `PodMonitor` object if the value is greater than `enforcedTargetLimit`. Administrators can use this setting to keep the overall number of samples under control. The default value is `0`, which means that no limit is set. - -|enforcedTargetLimit|*uint64|Specifies a global limit on the number of scraped targets. This setting overrides the `TargetLimit` value set in any user-defined `ServiceMonitor` or `PodMonitor` object if the value is greater than `enforcedSampleLimit`. Administrators can use this setting to keep the overall number of targets under control. The default value is `0`. - -|externalLabels|ExternalLabels|Defines labels to be added to any time series or alerts when communicating with external systems such as federation, remote storage, and Alertmanager. The type is map[string]string. By default, no labels are added. - -|logLevel|string|Defines the log level setting for Prometheus. The possible values are `error`, `warn`, `info`, and `debug`. The default setting is `info`. - -|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. - -|queryLogFile|string|Specifies the file to which PromQL queries are logged. This setting can be either a filename, in which case the queries are saved to an `emptyDir` volume at `/var/log/prometheus`, or a full path to a location where an `emptyDir` volume will be mounted and the queries saved. Writing to `/dev/stderr`, `/dev/stdout` or `/dev/null` is supported, but writing to any other `/dev/` path is not supported. Relative paths are also not supported. By default, PromQL queries are not logged. - -|remoteWrite|[]link:#remotewritespec[RemoteWriteSpec]|Defines the remote write configuration, including URL, authentication, and relabeling settings. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Prometheus container. - -|retention|string|Defines the duration for which Prometheus retains data. This definition must be specified using the following regular expression pattern: `[0-9]+(ms\|s\|m\|h\|d\|w\|y)` (ms = milliseconds, s= seconds,m = minutes, h = hours, d = days, w = weeks, y = years). The default value is `24h`. - -|retentionSize|string|Defines the maximum amount of disk space used by data blocks plus the write-ahead log (WAL). Supported values are `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`, `EB`, and `EiB`. The default value is `nil`. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. - -|volumeClaimTemplate|*monv1.EmbeddedPersistentVolumeClaim|Defines persistent storage for Prometheus. Use this setting to configure the storage class and size of a volume. - -|=== - -== RemoteWriteSpec - -=== Description - -The `RemoteWriteSpec` resource defines the settings for remote write storage. - -=== Required -* `url` - -Appears in: link:#prometheusk8sconfig[PrometheusK8sConfig], -link:#prometheusrestrictedconfig[PrometheusRestrictedConfig] - -[options="header"] -|=== -| Property | Type | Description -|authorization|*monv1.SafeAuthorization|Defines the authorization settings for remote write storage. - -|basicAuth|*monv1.BasicAuth|Defines Basic authentication settings for the remote write endpoint URL. - -|bearerTokenFile|string|Defines the file that contains the bearer token for the remote write endpoint. However, because you cannot mount secrets in a pod, in practice you can only reference the token of the service account. - -|headers|map[string]string|Specifies the custom HTTP headers to be sent along with each remote write request. Headers set by Prometheus cannot be overwritten. - -|metadataConfig|*monv1.MetadataConfig|Defines settings for sending series metadata to remote write storage. - -|name|string|Defines the name of the remote write queue. This name is used in metrics and logging to differentiate queues. If specified, this name must be unique. - -|oauth2|*monv1.OAuth2|Defines OAuth2 authentication settings for the remote write endpoint. - -|proxyUrl|string|Defines an optional proxy URL. If the cluster-wide proxy is enabled, it replaces the proxyUrl setting. The cluster-wide proxy supports both HTTP and HTTPS proxies, with HTTPS taking precedence. - -|queueConfig|*monv1.QueueConfig|Allows tuning configuration for remote write queue parameters. - -|remoteTimeout|string|Defines the timeout value for requests to the remote write endpoint. - -|sendExemplars|*bool|Enables sending exemplars via remote write. When enabled, this setting configures Prometheus to store a maximum of 100,000 exemplars in memory. -This setting only applies to user-defined monitoring and is not applicable to core platform monitoring. - -|sigv4|*monv1.Sigv4|Defines AWS Signature Version 4 authentication settings. - -|tlsConfig|*monv1.SafeTLSConfig|Defines TLS authentication settings for the remote write endpoint. - -|url|string|Defines the URL of the remote write endpoint to which samples will be sent. - -|writeRelabelConfigs|[]monv1.RelabelConfig|Defines the list of remote write relabel configurations. - -|=== - -== TLSConfig - -=== Description - -The `TLSConfig` resource configures the settings for TLS connections. - -=== Required -* `insecureSkipVerify` - -Appears in: link:#additionalalertmanagerconfig[AdditionalAlertmanagerConfig] - -[options="header"] -|=== -| Property | Type | Description -|ca|*v1.SecretKeySelector|Defines the secret key reference containing the Certificate Authority (CA) to use for the remote host. - -|cert|*v1.SecretKeySelector|Defines the secret key reference containing the public certificate to use for the remote host. - -|key|*v1.SecretKeySelector|Defines the secret key reference containing the private key to use for the remote host. - -|serverName|string|Used to verify the hostname on the returned certificate. - -|insecureSkipVerify|bool|When set to `true`, disables the verification of the remote host's certificate and name. - -|=== - -== TelemeterClientConfig - -=== Description - -`TelemeterClientConfig` defines settings for the Telemeter Client component. - -=== Required -* `nodeSelector` -* `tolerations` - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `TelemeterClient` container. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. - -|=== - -== ThanosQuerierConfig - -=== Description - -The `ThanosQuerierConfig` resource defines settings for the Thanos Querier component. - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|enableRequestLogging|bool|A Boolean flag that enables or disables request logging. The default value is `false`. - -|logLevel|string|Defines the log level setting for Thanos Querier. The possible values are `error`, `warn`, `info`, and `debug`. The default value is `info`. - -|enableCORS|bool|A Boolean flag that enables setting CORS headers. The headers allow access from any origin. The default value is `false`. - -|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Thanos Querier container. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. - -|=== - -== ThanosRulerConfig - -=== Description - -The `ThanosRulerConfig` resource defines configuration for the Thanos Ruler instance for user-defined projects. - -Appears in: link:#userworkloadconfiguration[UserWorkloadConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|additionalAlertmanagerConfigs|[]link:#additionalalertmanagerconfig[AdditionalAlertmanagerConfig]|Configures how the Thanos Ruler component communicates with additional Alertmanager instances. The Cluster Monitoring Operator reads the cluster-wide proxy settings and configures the appropriate proxy URL for the Alertmanager endpoints. All Alertmanager endpoints in this group are expected to use the same proxy URL. Endpoints that bypass the cluster proxy should be placed in a separate group. The default value is `nil`. - -|evaluationInterval|string|Configures the default interval between Prometheus rule evaluations in case the `PrometheusRule` resource does not specify any value. The interval must be set between 5 seconds and 5 minutes. The value can be expressed in: seconds (for example `30s`), minutes (for example `1m`) or a mix of minutes and seconds (for example `1m30s`). It applies to `PrometheusRule` resources without the `openshift.io/prometheus-rule-evaluation-scope=\"leaf-prometheus\"` label. The default value is `15s`. - -|logLevel|string|Defines the log level setting for Thanos Ruler. The possible values are `error`, `warn`, `info`, and `debug`. The default value is `info`. - -|nodeSelector|map[string]string|Defines the nodes on which the Pods are scheduled. - -|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Alertmanager container. - -|retention|string|Defines the duration for which Prometheus retains data. This definition must be specified using the following regular expression pattern: `[0-9]+(ms\|s\|m\|h\|d\|w\|y)` (ms = milliseconds, s= seconds,m = minutes, h = hours, d = days, w = weeks, y = years). The default value is `24h`. - -|tolerations|[]v1.Toleration|Defines tolerations for the pods. - -|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. - -|volumeClaimTemplate|*monv1.EmbeddedPersistentVolumeClaim|Defines persistent storage for Thanos Ruler. Use this setting to configure the storage class and size of a volume. - -|=== - -== UserWorkloadConfig - -=== Description - -The `UserWorkloadConfig` resource defines settings for the monitoring of user-defined projects. - -Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] - -[options="header"] -|=== -| Property | Type | Description -|rulesWithoutLabelEnforcementAllowed|*bool|A Boolean flag that enables or disables the ability to deploy user-defined `PrometheusRules` objects for which the `namespace` label is not enforced to the namespace of the object. Such objects should be created in a namespace configured under the `namespacesWithoutLabelEnforcement` property of the `UserWorkloadConfiguration` resource. The default value is `true`. - -|=== - -== UserWorkloadConfiguration - -=== Description - -The `UserWorkloadConfiguration` resource defines the settings responsible for user-defined projects in the `user-workload-monitoring-config` config map in the `openshift-user-workload-monitoring` namespace. You can only enable `UserWorkloadConfiguration` after you have set `enableUserWorkload` to `true` in the `cluster-monitoring-config` config map under the `openshift-monitoring` namespace. - -[options="header"] -|=== -| Property | Type | Description -|alertmanager|*link:#alertmanageruserworkloadconfig[AlertmanagerUserWorkloadConfig]|Defines the settings for the Alertmanager component in user workload monitoring. - -|prometheus|*link:#prometheusrestrictedconfig[PrometheusRestrictedConfig]|Defines the settings for the Prometheus component in user workload monitoring. - -|prometheusOperator|*link:#prometheusoperatorconfig[PrometheusOperatorConfig]|Defines the settings for the Prometheus Operator component in user workload monitoring. - -|thanosRuler|*link:#thanosrulerconfig[ThanosRulerConfig]|Defines the settings for the Thanos Ruler component in user workload monitoring. - -|namespacesWithoutLabelEnforcement|[]string|Defines the list of namespaces for which Prometheus and Thanos Ruler in user-defined monitoring do not enforce the `namespace` label value in `PrometheusRule` objects. - -The `namespacesWithoutLabelEnforcement` property allows users to define recording and alerting rules that can query across multiple projects (not limited to user-defined projects) instead of deploying identical `PrometheusRule` objects in each user project. - -To make the resulting alerts and metrics visible to project users, the query expressions should return a `namespace` label with a non-empty value. +You can configure parts of {ocp} cluster monitoring. The API is accessible by setting parameters in config maps. -|=== +include::modules/monitoring-cluster-monitoring-operator-configuration-reference.adoc[leveloffset=+1] diff --git a/modules/monitoring-cluster-monitoring-operator-configuration-reference.adoc b/modules/monitoring-cluster-monitoring-operator-configuration-reference.adoc new file mode 100644 index 000000000000..488ed3d3f312 --- /dev/null +++ b/modules/monitoring-cluster-monitoring-operator-configuration-reference.adoc @@ -0,0 +1,738 @@ +:_mod-docs-content-type: CONCEPT +[id="cluster-monitoring-operator-configuration-reference"] += {cmo-full} configuration reference + +[role="_abstract"] +To customize how {ocp} monitors both the platform and user-defined projects, edit the relevant `ConfigMap` objects that define the cluster and user workload monitoring configurations: + +ifndef::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[] +* To configure default monitoring components, edit the `ConfigMap` object named `cluster-monitoring-config` in the `openshift-monitoring` namespace. +These configurations are defined by link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration]. +endif::openshift-dedicated,openshift-rosa,openshift-rosa-hcp[] + +* To configure monitoring components that monitor user-defined projects, edit the `ConfigMap` object named `user-workload-monitoring-config` in the `openshift-user-workload-monitoring` namespace. +These configurations are defined by link:#userworkloadconfiguration[UserWorkloadConfiguration]. + +The configuration file is always defined under the `config.yaml` key in the config map data. + +[IMPORTANT] +==== +* Not all configuration parameters for the monitoring stack are exposed. +Only the parameters and fields listed in this reference are supported for configuration. +For more information about supported configurations, see xref:../support-for-monitoring/maintenance-and-support-for-monitoring.adoc#maintenance-and-support-for-monitoring[Maintenance and support for monitoring]. + +* Configuring cluster monitoring is optional. +* If a configuration does not exist or is empty, default values are used. +* If the configuration has invalid YAML data, or if it contains unsupported or duplicated fields that bypassed early validation, the Cluster Monitoring Operator stops reconciling the resources and reports the `Degraded=True` status in the status conditions of the Operator. +==== + +== AdditionalAlertmanagerConfig + +Description:: +The `AdditionalAlertmanagerConfig` resource defines settings for how a component communicates with additional Alertmanager instances. + +Required:: +* `apiVersion` + +Appears in: link:#prometheusk8sconfig[PrometheusK8sConfig], +link:#prometheusrestrictedconfig[PrometheusRestrictedConfig], +link:#thanosrulerconfig[ThanosRulerConfig] + +[options="header"] +|=== +| Property | Type | Description +|apiVersion|string|Defines the API version of Alertmanager. `v1` is no longer supported, `v2` is set as the default value. + +|bearerToken|*v1.SecretKeySelector|Defines the secret key reference containing the bearer token to use when authenticating to Alertmanager. + +|pathPrefix|string|Defines the path prefix to add in front of the push endpoint path. + +|scheme|string|Defines the URL scheme to use when communicating with Alertmanager instances. Possible values are `http` or `https`. The default value is `http`. + +|staticConfigs|[]string|A list of statically configured Alertmanager endpoints in the form of `:`. + +|timeout|*string|Defines the timeout value used when sending alerts. + +|tlsConfig|link:#tlsconfig[TLSConfig]|Defines the TLS settings to use for Alertmanager connections. + +|=== + +== AlertmanagerMainConfig + +Description:: +The `AlertmanagerMainConfig` resource defines settings for the Alertmanager component in the `openshift-monitoring` namespace. + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|enabled|*bool|A Boolean flag that enables or disables the main Alertmanager instance in the `openshift-monitoring` namespace. The default value is `true`. + +|enableUserAlertmanagerConfig|bool|A Boolean flag that enables or disables user-defined namespaces to be selected for `AlertmanagerConfig` lookups. This setting only applies if the user workload monitoring instance of Alertmanager is not enabled. The default value is `false`. + +|logLevel|string|Defines the log level setting for Alertmanager. The possible values are: `error`, `warn`, `info`, `debug`. The default value is `info`. + +|nodeSelector|map[string]string|Defines the nodes on which the Pods are scheduled. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Alertmanager container. + +|secrets|[]string|Defines a list of secrets to be mounted into Alertmanager. The secrets must reside within the same namespace as the Alertmanager object. They are added as volumes named `secret-` and mounted at `/etc/alertmanager/secrets/` in the `alertmanager` container of the Alertmanager pods. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. + +|volumeClaimTemplate|*monv1.EmbeddedPersistentVolumeClaim|Defines persistent storage for Alertmanager. Use this setting to configure the persistent volume claim, including storage class, volume size, and name. + +|=== + +== AlertmanagerUserWorkloadConfig + +Description:: +The `AlertmanagerUserWorkloadConfig` resource defines the settings for the Alertmanager instance used for user-defined projects. + +Appears in: link:#userworkloadconfiguration[UserWorkloadConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|enabled|bool|A Boolean flag that enables or disables a dedicated instance of Alertmanager for user-defined alerts in the `openshift-user-workload-monitoring` namespace. The default value is `false`. + +|enableAlertmanagerConfig|bool|A Boolean flag to enable or disable user-defined namespaces to be selected for `AlertmanagerConfig` lookup. The default value is `false`. + +|logLevel|string|Defines the log level setting for Alertmanager for user workload monitoring. The possible values are `error`, `warn`, `info`, and `debug`. The default value is `info`. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Alertmanager container. + +|secrets|[]string|Defines a list of secrets to be mounted into Alertmanager. The secrets must be located within the same namespace as the Alertmanager object. They are added as volumes named `secret-` and mounted at `/etc/alertmanager/secrets/` in the `alertmanager` container of the Alertmanager pods. + +|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. + +|volumeClaimTemplate|*monv1.EmbeddedPersistentVolumeClaim|Defines persistent storage for Alertmanager. Use this setting to configure the persistent volume claim, including storage class, volume size and name. + +|=== + +== ClusterMonitoringConfiguration + +Description:: +The `ClusterMonitoringConfiguration` resource defines settings that customize the default platform monitoring stack through the `cluster-monitoring-config` config map in the `openshift-monitoring` namespace. + +[options="header"] +|=== +| Property | Type | Description +|alertmanagerMain|*link:#alertmanagermainconfig[AlertmanagerMainConfig]|`AlertmanagerMainConfig` defines settings for the Alertmanager component in the `openshift-monitoring` namespace. + +|enableUserWorkload|*bool|`UserWorkloadEnabled` is a Boolean flag that enables monitoring for user-defined projects. + +|userWorkload|*link:#userworkloadconfig[UserWorkloadConfig]|`UserWorkload` defines settings for the monitoring of user-defined projects. + +|kubeStateMetrics|*link:#kubestatemetricsconfig[KubeStateMetricsConfig]|`KubeStateMetricsConfig` defines settings for the `kube-state-metrics` agent. + +|metricsServer|*link:#metricsserverconfig[MetricsServerConfig]|`MetricsServer` defines settings for the Metrics Server component. + +|prometheusK8s|*link:#prometheusk8sconfig[PrometheusK8sConfig]|`PrometheusK8sConfig` defines settings for the Prometheus component. + +|prometheusOperator|*link:#prometheusoperatorconfig[PrometheusOperatorConfig]|`PrometheusOperatorConfig` defines settings for the Prometheus Operator component. + +|prometheusOperatorAdmissionWebhook|*link:#prometheusoperatoradmissionwebhookconfig[PrometheusOperatorAdmissionWebhookConfig]|`PrometheusOperatorAdmissionWebhookConfig` defines settings for the admission webhook component of Prometheus Operator. + +|openshiftStateMetrics|*link:#openshiftstatemetricsconfig[OpenShiftStateMetricsConfig]|`OpenShiftMetricsConfig` defines settings for the `openshift-state-metrics` agent. + +|telemeterClient|*link:#telemeterclientconfig[TelemeterClientConfig]|`TelemeterClientConfig` defines settings for the Telemeter Client component. + +|thanosQuerier|*link:#thanosquerierconfig[ThanosQuerierConfig]|`ThanosQuerierConfig` defines settings for the Thanos Querier component. + +|nodeExporter|link:#nodeexporterconfig[NodeExporterConfig]|`NodeExporterConfig` defines settings for the `node-exporter` agent. + +|monitoringPlugin|*link:#monitoringpluginconfig[MonitoringPluginConfig]|`MonitoringPluginConfig` defines settings for the monitoring `console-plugin` component. + +|=== + +== KubeStateMetricsConfig + +Description:: +The `KubeStateMetricsConfig` resource defines settings for the `kube-state-metrics` agent. + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `KubeStateMetrics` container. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. + +|=== + +== MetricsServerConfig + +Description:: +The `MetricsServerConfig` resource defines settings for the Metrics Server component. + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|audit|*Audit|Defines the audit configuration used by the Metrics Server instance. Possible profile values are `Metadata`, `Request`, `RequestResponse`, and `None`. The default value is `Metadata`. + +|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +| verbosity | uint8 | Defines the verbosity of log messages for Metrics Server. Valid values are positive integers, values over 10 are usually unnecessary. The default value is `0`. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Metrics Server container. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. + +|=== + +== MonitoringPluginConfig + +Description:: +The `MonitoringPluginConfig` resource defines settings for the web console plugin component in the `openshift-monitoring` namespace. + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `console-plugin` container. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. + +|=== + +== NodeExporterCollectorBuddyInfoConfig + +Description:: +The `NodeExporterCollectorBuddyInfoConfig` resource works as an on/off switch for the `buddyinfo` collector of the `node-exporter` agent. By default, the `buddyinfo` collector is disabled. + +Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] + +[options="header"] +|=== +| Property | Type | Description +|enabled|bool|A Boolean flag that enables or disables the `buddyinfo` collector. + +|=== + +== NodeExporterCollectorConfig + +Description:: +The `NodeExporterCollectorConfig` resource defines settings for individual collectors of the `node-exporter` agent. + +Appears in: link:#nodeexporterconfig[NodeExporterConfig] + +[options="header"] +|=== +| Property | Type | Description +|cpufreq|link:#nodeexportercollectorcpufreqconfig[NodeExporterCollectorCpufreqConfig]|Defines the configuration of the `cpufreq` collector, which collects CPU frequency statistics. Disabled by default. + +|tcpstat|link:#nodeexportercollectortcpstatconfig[NodeExporterCollectorTcpStatConfig]|Defines the configuration of the `tcpstat` collector, which collects TCP connection statistics. Disabled by default. + +|netdev|link:#nodeexportercollectornetdevconfig[NodeExporterCollectorNetDevConfig]|Defines the configuration of the `netdev` collector, which collects network devices statistics. Enabled by default. + +|netclass|link:#nodeexportercollectornetclassconfig[NodeExporterCollectorNetClassConfig]|Defines the configuration of the `netclass` collector, which collects information about network devices. Enabled by default. + +|buddyinfo|link:#nodeexportercollectorbuddyinfoconfig[NodeExporterCollectorBuddyInfoConfig]|Defines the configuration of the `buddyinfo` collector, which collects statistics about memory fragmentation from the `node_buddyinfo_blocks` metric. This metric collects data from `/proc/buddyinfo`. Disabled by default. + +|mountstats|link:#nodeexportercollectormountstatsconfig[NodeExporterCollectorMountStatsConfig]|Defines the configuration of the `mountstats` collector, which collects statistics about NFS volume I/O activities. Disabled by default. + +|ksmd|link:#nodeexportercollectorksmdconfig[NodeExporterCollectorKSMDConfig]|Defines the configuration of the `ksmd` collector, which collects statistics from the kernel same-page merger daemon. Disabled by default. + +|processes|link:#nodeexportercollectorprocessesconfig[NodeExporterCollectorProcessesConfig]|Defines the configuration of the `processes` collector, which collects statistics from processes and threads running in the system. Disabled by default. + +|systemd|link:#nodeexportercollectorsystemdconfig[NodeExporterCollectorSystemdConfig]|Defines the configuration of the `systemd` collector, which collects statistics on the systemd daemon and its managed services. Disabled by default. + +|=== + +== NodeExporterCollectorCpufreqConfig + +Description:: +Use the `NodeExporterCollectorCpufreqConfig` resource to enable or disable the `cpufreq` collector of the `node-exporter` agent. By default, the `cpufreq` collector is disabled. Under certain circumstances, enabling the `cpufreq` collector increases CPU usage on machines with many cores. If you enable this collector and have machines with many cores, monitor your systems closely for excessive CPU usage. + +Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] + +[options="header"] +|=== +| Property | Type | Description +|enabled|bool|A Boolean flag that enables or disables the `cpufreq` collector. + +|=== + +== NodeExporterCollectorKSMDConfig + +Description:: +Use the `NodeExporterCollectorKSMDConfig` resource to enable or disable the `ksmd` collector of the `node-exporter` agent. By default, the `ksmd` collector is disabled. + +Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] + +[options="header"] +|=== +| Property | Type | Description +|enabled|bool|A Boolean flag that enables or disables the `ksmd` collector. + +|=== + +== NodeExporterCollectorMountStatsConfig + +Description:: +Use the `NodeExporterCollectorMountStatsConfig` resource to enable or disable the `mountstats` collector of the `node-exporter` agent. By default, the `mountstats` collector is disabled. If you enable the collector, the following metrics become available: `node_mountstats_nfs_read_bytes_total`, `node_mountstats_nfs_write_bytes_total`, and `node_mountstats_nfs_operations_requests_total`. Be aware that these metrics can have a high cardinality. If you enable this collector, closely monitor any increases in memory usage for the `prometheus-k8s` pods. + +Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] + +[options="header"] +|=== +| Property | Type | Description +|enabled|bool|A Boolean flag that enables or disables the `mountstats` collector. + +|=== + +== NodeExporterCollectorNetClassConfig + +Description:: +Use the `NodeExporterCollectorNetClassConfig` resource to enable or disable the `netclass` collector of the `node-exporter` agent. By default, the `netclass` collector is enabled. If you disable this collector, these metrics become unavailable: `node_network_info`, `node_network_address_assign_type`, `node_network_carrier`, `node_network_carrier_changes_total`, `node_network_carrier_up_changes_total`, `node_network_carrier_down_changes_total`, `node_network_device_id`, `node_network_dormant`, `node_network_flags`, `node_network_iface_id`, `node_network_iface_link`, `node_network_iface_link_mode`, `node_network_mtu_bytes`, `node_network_name_assign_type`, `node_network_net_dev_group`, `node_network_speed_bytes`, `node_network_transmit_queue_length`, and `node_network_protocol_type`. + +Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] + +[options="header"] +|=== +| Property | Type | Description +|enabled|bool|A Boolean flag that enables or disables the `netclass` collector. + +|useNetlink|bool|A Boolean flag that activates the `netlink` implementation of the `netclass` collector. The default value is `true`, which activates the `netlink` mode. This implementation improves the performance of the `netclass` collector. + +|=== + +== NodeExporterCollectorNetDevConfig + +Description:: +Use the `NodeExporterCollectorNetDevConfig` resource to enable or disable the `netdev` collector of the `node-exporter` agent. By default, the `netdev` collector is enabled. If disabled, these metrics become unavailable: `node_network_receive_bytes_total`, `node_network_receive_compressed_total`, `node_network_receive_drop_total`, `node_network_receive_errs_total`, `node_network_receive_fifo_total`, `node_network_receive_frame_total`, `node_network_receive_multicast_total`, `node_network_receive_nohandler_total`, `node_network_receive_packets_total`, `node_network_transmit_bytes_total`, `node_network_transmit_carrier_total`, `node_network_transmit_colls_total`, `node_network_transmit_compressed_total`, `node_network_transmit_drop_total`, `node_network_transmit_errs_total`, `node_network_transmit_fifo_total`, and `node_network_transmit_packets_total`. + +Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] + +[options="header"] +|=== +| Property | Type | Description +|enabled|bool|A Boolean flag that enables or disables the `netdev` collector. + +|=== + +== NodeExporterCollectorProcessesConfig + +Description:: +Use the `NodeExporterCollectorProcessesConfig` resource to enable or disable the `processes` collector of the `node-exporter` agent. If the collector is enabled, the following metrics become available: `node_processes_max_processes`, `node_processes_pids`, `node_processes_state`, `node_processes_threads`, `node_processes_threads_state`. The metric `node_processes_state` and `node_processes_threads_state` can have up to five series each, depending on the state of the processes and threads. The possible states of a process or a thread are: `D` (UNINTERRUPTABLE_SLEEP), `R` (RUNNING & RUNNABLE), `S` (INTERRUPTABLE_SLEEP), `T` (STOPPED), or `Z` (ZOMBIE). By default, the `processes` collector is disabled. + +Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] + +[options="header"] +|=== +| Property | Type | Description +|enabled|bool|A Boolean flag that enables or disables the `processes` collector. + +|=== + +== NodeExporterCollectorSystemdConfig + +Description:: +Use the `NodeExporterCollectorSystemdConfig` resource to enable or disable the `systemd` collector of the `node-exporter` agent. By default, the `systemd` collector is disabled. If enabled, the following metrics become available: `node_systemd_system_running`, `node_systemd_units`, `node_systemd_version`. If the unit uses a socket, it also generates the following metrics: `node_systemd_socket_accepted_connections_total`, `node_systemd_socket_current_connections`, `node_systemd_socket_refused_connections_total`. You can use the `units` parameter to select the `systemd` units to be included by the `systemd` collector. The selected units are used to generate the `node_systemd_unit_state` metric, which shows the state of each `systemd` unit. However, this metric's cardinality might be high (at least five series per unit per node). If you enable this collector with a long list of selected units, closely monitor the `prometheus-k8s` deployment for excessive memory usage. Note that the `node_systemd_timer_last_trigger_seconds` metric is only shown if you have configured the value of the `units` parameter as `logrotate.timer`. + +Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] + +[options="header"] +|=== +| Property | Type | Description +|enabled|bool|A Boolean flag that enables or disables the `systemd` collector. + +|units|[]string|A list of regular expression (regex) patterns that match systemd units to be included by the `systemd` collector. By default, the list is empty, so the collector exposes no metrics for systemd units. + +|=== + +== NodeExporterCollectorTcpStatConfig + +Description:: +The `NodeExporterCollectorTcpStatConfig` resource works as an on/off switch for the `tcpstat` collector of the `node-exporter` agent. By default, the `tcpstat` collector is disabled. + +Appears in: link:#nodeexportercollectorconfig[NodeExporterCollectorConfig] + +[options="header"] +|=== +| Property | Type | Description +|enabled|bool|A Boolean flag that enables or disables the `tcpstat` collector. + +|=== + +== NodeExporterConfig + +Description:: +The `NodeExporterConfig` resource defines settings for the `node-exporter` agent. + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|collectors|link:#nodeexportercollectorconfig[NodeExporterCollectorConfig]|Defines which collectors are enabled and their additional configuration parameters. + +|maxProcs|uint32|The target number of CPUs on which the node-exporter's process will run. The default value is `0`, which means that node-exporter runs on all CPUs. If a kernel deadlock occurs or if performance degrades when reading from `sysfs` concurrently, you can change this value to `1`, which limits node-exporter to running on one CPU. For nodes with a high CPU count, you can set the limit to a low number, which saves resources by preventing Go routines from being scheduled to run on all CPUs. However, I/O performance degrades if the `maxProcs` value is set too low and there are many metrics to collect. + +|ignoredNetworkDevices|*[]string|A list of network devices, defined as regular expressions, that you want to exclude from the relevant collector configuration such as `netdev` and `netclass`. If no list is specified, the Cluster Monitoring Operator uses a predefined list of devices to be excluded to minimize the impact on memory usage. If the list is empty, no devices are excluded. If you modify this setting, monitor the `prometheus-k8s` deployment closely for excessive memory usage. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `NodeExporter` container. + +|=== + +== OpenShiftStateMetricsConfig + +Description:: +The `OpenShiftStateMetricsConfig` resource defines settings for the `openshift-state-metrics` agent. + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `OpenShiftStateMetrics` container. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. + +|=== + +== PrometheusK8sConfig + +Description:: +The `PrometheusK8sConfig` resource defines settings for the Prometheus component. + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|additionalAlertmanagerConfigs|[]link:#additionalalertmanagerconfig[AdditionalAlertmanagerConfig]|Configures additional Alertmanager instances that receive alerts from the Prometheus component. By default, no additional Alertmanager instances are configured. + +|enforcedBodySizeLimit|string|Enforces a body size limit for Prometheus scraped metrics. If a scraped target's body response is larger than the limit, the scrape will fail. The following values are valid: an empty value to specify no limit, a numeric value in Prometheus size format (such as `64MB`), or the string `automatic`, which indicates that the limit will be automatically calculated based on cluster capacity. The default value is empty, which indicates no limit. + +|externalLabels|ExternalLabels|Defines labels to be added to any time series or alerts when communicating with external systems such as federation, remote storage, and Alertmanager. The type is map[string]string. By default, no labels are added. + +|logLevel|string|Defines the log level setting for Prometheus. The possible values are: `error`, `warn`, `info`, and `debug`. The default value is `info`. + +|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. + +|queryLogFile|string|Specifies the file to which PromQL queries are logged. This setting can be either a filename, in which case the queries are saved to an `emptyDir` volume at `/var/log/prometheus`, or a full path to a location where an `emptyDir` volume will be mounted and the queries saved. Writing to `/dev/stderr`, `/dev/stdout` or `/dev/null` is supported, but writing to any other `/dev/` path is not supported. Relative paths are also not supported. By default, PromQL queries are not logged. + +|remoteWrite|[]link:#remotewritespec[RemoteWriteSpec]|Defines the remote write configuration, including URL, authentication, and relabeling settings. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `Prometheus` container. + +|retention|string|Defines the duration for which Prometheus retains data. This definition must be specified using the following regular expression pattern: `[0-9]+(ms\|s\|m\|h\|d\|w\|y)` (ms = milliseconds, s= seconds,m = minutes, h = hours, d = days, w = weeks, y = years). The default value is `15d`. + +|retentionSize|string|Defines the maximum amount of disk space used by data blocks plus the write-ahead log (WAL). Supported values are `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`, `EB`, and `EiB`. By default, no limit is defined. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. + +|collectionProfile|CollectionProfile|Defines the metrics collection profile that Prometheus uses to collect metrics from the platform components. Supported values are `full` or `minimal`. In the `full` profile (default), Prometheus collects all metrics that are exposed by the platform components. In the `minimal` profile, Prometheus only collects metrics necessary for the default platform alerts, recording rules, telemetry, and console dashboards. + +|volumeClaimTemplate|*monv1.EmbeddedPersistentVolumeClaim|Defines persistent storage for Prometheus. Use this setting to configure the persistent volume claim, including storage class, volume size and name. + +|=== + +== PrometheusOperatorConfig + +Description:: +The `PrometheusOperatorConfig` resource defines settings for the Prometheus Operator component. + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration], +link:#userworkloadconfiguration[UserWorkloadConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|logLevel|string|Defines the log level settings for Prometheus Operator. The possible values are `error`, `warn`, `info`, and `debug`. The default value is `info`. + +|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `PrometheusOperator` container. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. + +|=== + +== PrometheusOperatorAdmissionWebhookConfig + +Description:: +The `PrometheusOperatorAdmissionWebhookConfig` resource defines settings for the admission webhook workload for Prometheus Operator. + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] + +[options="header"] +|=== +| Property | Type | Description + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `prometheus-operator-admission-webhook` container. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines a pod's topology spread constraints. + +|=== + +== PrometheusRestrictedConfig + +Description:: +The `PrometheusRestrictedConfig` resource defines the settings for the Prometheus component that monitors user-defined projects. + +Appears in: link:#userworkloadconfiguration[UserWorkloadConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|scrapeInterval|string|Configures the default interval between consecutive scrapes in case the `ServiceMonitor` or `PodMonitor` resource does not specify any value. The interval must be set between 5 seconds and 5 minutes. The value can be expressed in: seconds (for example `30s`), minutes (for example `1m`) or a mix of minutes and seconds (for example `1m30s`). The default value is `30s`. + +|evaluationInterval|string|Configures the default interval between rule evaluations in case the `PrometheusRule` resource does not specify any value. The interval must be set between 5 seconds and 5 minutes. The value can be expressed in: seconds (for example `30s`), minutes (for example `1m`) or a mix of minutes and seconds (for example `1m30s`). It only applies to `PrometheusRule` resources with the `openshift.io/prometheus-rule-evaluation-scope=\"leaf-prometheus\"` label. The default value is `30s`. + +|additionalAlertmanagerConfigs|[]link:#additionalalertmanagerconfig[AdditionalAlertmanagerConfig]|Configures additional Alertmanager instances that receive alerts from the Prometheus component. By default, no additional Alertmanager instances are configured. + +|enforcedLabelLimit|*uint64|Specifies a per-scrape limit on the number of labels accepted for a sample. If the number of labels exceeds this limit after metric relabeling, the entire scrape is treated as failed. The default value is `0`, which means that no limit is set. + +|enforcedLabelNameLengthLimit|*uint64|Specifies a per-scrape limit on the length of a label name for a sample. If the length of a label name exceeds this limit after metric relabeling, the entire scrape is treated as failed. The default value is `0`, which means that no limit is set. + +|enforcedLabelValueLengthLimit|*uint64|Specifies a per-scrape limit on the length of a label value for a sample. If the length of a label value exceeds this limit after metric relabeling, the entire scrape is treated as failed. The default value is `0`, which means that no limit is set. + +|enforcedSampleLimit|*uint64|Specifies a global limit on the number of scraped samples that will be accepted. This setting overrides the `SampleLimit` value set in any user-defined `ServiceMonitor` or `PodMonitor` object if the value is greater than `enforcedTargetLimit`. Administrators can use this setting to keep the overall number of samples under control. The default value is `0`, which means that no limit is set. + +|enforcedTargetLimit|*uint64|Specifies a global limit on the number of scraped targets. This setting overrides the `TargetLimit` value set in any user-defined `ServiceMonitor` or `PodMonitor` object if the value is greater than `enforcedSampleLimit`. Administrators can use this setting to keep the overall number of targets under control. The default value is `0`. + +|externalLabels|ExternalLabels|Defines labels to be added to any time series or alerts when communicating with external systems such as federation, remote storage, and Alertmanager. The type is map[string]string. By default, no labels are added. + +|logLevel|string|Defines the log level setting for Prometheus. The possible values are `error`, `warn`, `info`, and `debug`. The default setting is `info`. + +|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. + +|queryLogFile|string|Specifies the file to which PromQL queries are logged. This setting can be either a filename, in which case the queries are saved to an `emptyDir` volume at `/var/log/prometheus`, or a full path to a location where an `emptyDir` volume will be mounted and the queries saved. Writing to `/dev/stderr`, `/dev/stdout` or `/dev/null` is supported, but writing to any other `/dev/` path is not supported. Relative paths are also not supported. By default, PromQL queries are not logged. + +|remoteWrite|[]link:#remotewritespec[RemoteWriteSpec]|Defines the remote write configuration, including URL, authentication, and relabeling settings. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Prometheus container. + +|retention|string|Defines the duration for which Prometheus retains data. This definition must be specified using the following regular expression pattern: `[0-9]+(ms\|s\|m\|h\|d\|w\|y)` (ms = milliseconds, s= seconds,m = minutes, h = hours, d = days, w = weeks, y = years). The default value is `24h`. + +|retentionSize|string|Defines the maximum amount of disk space used by data blocks plus the write-ahead log (WAL). Supported values are `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`, `EB`, and `EiB`. The default value is `nil`. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. + +|volumeClaimTemplate|*monv1.EmbeddedPersistentVolumeClaim|Defines persistent storage for Prometheus. Use this setting to configure the storage class and size of a volume. + +|=== + +== RemoteWriteSpec + +Description:: +The `RemoteWriteSpec` resource defines the settings for remote write storage. + +Required:: +* `url` + +Appears in: link:#prometheusk8sconfig[PrometheusK8sConfig], +link:#prometheusrestrictedconfig[PrometheusRestrictedConfig] + +[options="header"] +|=== +| Property | Type | Description +|authorization|*monv1.SafeAuthorization|Defines the authorization settings for remote write storage. + +|basicAuth|*monv1.BasicAuth|Defines Basic authentication settings for the remote write endpoint URL. + +|bearerTokenFile|string|Defines the file that contains the bearer token for the remote write endpoint. However, because you cannot mount secrets in a pod, in practice you can only reference the token of the service account. + +|headers|map[string]string|Specifies the custom HTTP headers to be sent along with each remote write request. Headers set by Prometheus cannot be overwritten. + +|metadataConfig|*monv1.MetadataConfig|Defines settings for sending series metadata to remote write storage. + +|name|string|Defines the name of the remote write queue. This name is used in metrics and logging to differentiate queues. If specified, this name must be unique. + +|oauth2|*monv1.OAuth2|Defines OAuth2 authentication settings for the remote write endpoint. + +|proxyUrl|string|Defines an optional proxy URL. If the cluster-wide proxy is enabled, it replaces the proxyUrl setting. The cluster-wide proxy supports both HTTP and HTTPS proxies, with HTTPS taking precedence. + +|queueConfig|*monv1.QueueConfig|Allows tuning configuration for remote write queue parameters. + +|remoteTimeout|string|Defines the timeout value for requests to the remote write endpoint. + +|sendExemplars|*bool|Enables sending exemplars via remote write. When enabled, this setting configures Prometheus to store a maximum of 100,000 exemplars in memory. +This setting only applies to user-defined monitoring and is not applicable to core platform monitoring. + +|sigv4|*monv1.Sigv4|Defines AWS Signature Version 4 authentication settings. + +|tlsConfig|*monv1.SafeTLSConfig|Defines TLS authentication settings for the remote write endpoint. + +|url|string|Defines the URL of the remote write endpoint to which samples will be sent. + +|writeRelabelConfigs|[]monv1.RelabelConfig|Defines the list of remote write relabel configurations. + +|=== + +== TLSConfig + +Description:: +The `TLSConfig` resource configures the settings for TLS connections. + +Required:: +* `insecureSkipVerify` + +Appears in: link:#additionalalertmanagerconfig[AdditionalAlertmanagerConfig] + +[options="header"] +|=== +| Property | Type | Description +|ca|*v1.SecretKeySelector|Defines the secret key reference containing the Certificate Authority (CA) to use for the remote host. + +|cert|*v1.SecretKeySelector|Defines the secret key reference containing the public certificate to use for the remote host. + +|key|*v1.SecretKeySelector|Defines the secret key reference containing the private key to use for the remote host. + +|serverName|string|Used to verify the hostname on the returned certificate. + +|insecureSkipVerify|bool|When set to `true`, disables the verification of the remote host's certificate and name. + +|=== + +== TelemeterClientConfig + +Description:: +`TelemeterClientConfig` defines settings for the Telemeter Client component. + +Required:: +* `nodeSelector` +* `tolerations` + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the `TelemeterClient` container. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. + +|=== + +== ThanosQuerierConfig + +Description:: +The `ThanosQuerierConfig` resource defines settings for the Thanos Querier component. + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|enableRequestLogging|bool|A Boolean flag that enables or disables request logging. The default value is `false`. + +|logLevel|string|Defines the log level setting for Thanos Querier. The possible values are `error`, `warn`, `info`, and `debug`. The default value is `info`. + +|enableCORS|bool|A Boolean flag that enables setting CORS headers. The headers allow access from any origin. The default value is `false`. + +|nodeSelector|map[string]string|Defines the nodes on which the pods are scheduled. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Thanos Querier container. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. + +|=== + +== ThanosRulerConfig + +Description:: +The `ThanosRulerConfig` resource defines configuration for the Thanos Ruler instance for user-defined projects. + +Appears in: link:#userworkloadconfiguration[UserWorkloadConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|additionalAlertmanagerConfigs|[]link:#additionalalertmanagerconfig[AdditionalAlertmanagerConfig]|Configures how the Thanos Ruler component communicates with additional Alertmanager instances. The Cluster Monitoring Operator reads the cluster-wide proxy settings and configures the appropriate proxy URL for the Alertmanager endpoints. All Alertmanager endpoints in this group are expected to use the same proxy URL. Endpoints that bypass the cluster proxy should be placed in a separate group. The default value is `nil`. + +|evaluationInterval|string|Configures the default interval between Prometheus rule evaluations in case the `PrometheusRule` resource does not specify any value. The interval must be set between 5 seconds and 5 minutes. The value can be expressed in: seconds (for example `30s`), minutes (for example `1m`) or a mix of minutes and seconds (for example `1m30s`). It applies to `PrometheusRule` resources without the `openshift.io/prometheus-rule-evaluation-scope=\"leaf-prometheus\"` label. The default value is `15s`. + +|logLevel|string|Defines the log level setting for Thanos Ruler. The possible values are `error`, `warn`, `info`, and `debug`. The default value is `info`. + +|nodeSelector|map[string]string|Defines the nodes on which the Pods are scheduled. + +|resources|*v1.ResourceRequirements|Defines resource requests and limits for the Alertmanager container. + +|retention|string|Defines the duration for which Prometheus retains data. This definition must be specified using the following regular expression pattern: `[0-9]+(ms\|s\|m\|h\|d\|w\|y)` (ms = milliseconds, s= seconds,m = minutes, h = hours, d = days, w = weeks, y = years). The default value is `24h`. + +|tolerations|[]v1.Toleration|Defines tolerations for the pods. + +|topologySpreadConstraints|[]v1.TopologySpreadConstraint|Defines the pod's topology spread constraints. + +|volumeClaimTemplate|*monv1.EmbeddedPersistentVolumeClaim|Defines persistent storage for Thanos Ruler. Use this setting to configure the storage class and size of a volume. + +|=== + +== UserWorkloadConfig + +Description:: +The `UserWorkloadConfig` resource defines settings for the monitoring of user-defined projects. + +Appears in: link:#clustermonitoringconfiguration[ClusterMonitoringConfiguration] + +[options="header"] +|=== +| Property | Type | Description +|rulesWithoutLabelEnforcementAllowed|*bool|A Boolean flag that enables or disables the ability to deploy user-defined `PrometheusRules` objects for which the `namespace` label is not enforced to the namespace of the object. Such objects should be created in a namespace configured under the `namespacesWithoutLabelEnforcement` property of the `UserWorkloadConfiguration` resource. The default value is `true`. + +|=== + +== UserWorkloadConfiguration + +Description:: +The `UserWorkloadConfiguration` resource defines the settings responsible for user-defined projects in the `user-workload-monitoring-config` config map in the `openshift-user-workload-monitoring` namespace. You can only enable `UserWorkloadConfiguration` after you have set `enableUserWorkload` to `true` in the `cluster-monitoring-config` config map under the `openshift-monitoring` namespace. + +[options="header"] +|=== +| Property | Type | Description +|alertmanager|*link:#alertmanageruserworkloadconfig[AlertmanagerUserWorkloadConfig]|Defines the settings for the Alertmanager component in user workload monitoring. + +|prometheus|*link:#prometheusrestrictedconfig[PrometheusRestrictedConfig]|Defines the settings for the Prometheus component in user workload monitoring. + +|prometheusOperator|*link:#prometheusoperatorconfig[PrometheusOperatorConfig]|Defines the settings for the Prometheus Operator component in user workload monitoring. + +|thanosRuler|*link:#thanosrulerconfig[ThanosRulerConfig]|Defines the settings for the Thanos Ruler component in user workload monitoring. + +|namespacesWithoutLabelEnforcement|[]string|Defines the list of namespaces for which Prometheus and Thanos Ruler in user-defined monitoring do not enforce the `namespace` label value in `PrometheusRule` objects. + +The `namespacesWithoutLabelEnforcement` property allows users to define recording and alerting rules that can query across multiple projects (not limited to user-defined projects) instead of deploying identical `PrometheusRule` objects in each user project. + +To make the resulting alerts and metrics visible to project users, the query expressions should return a `namespace` label with a non-empty value. + +|===