title | description | ms.topic | ms.custom | ms.date | ms.reviewer |
---|---|---|---|---|---|
Custom metrics collected by Container insights |
Describes the custom metrics collected for a Kubernetes cluster by Container insights in Azure Monitor. |
conceptual |
devx-track-azurecli |
02/15/2024 |
viviandiec |
Container insights collects custom metrics from Azure Kubernetes Service (AKS) and Azure Arc-enabled Kubernetes cluster nodes and pods. With custom metrics, you can:
- Present timely aggregate calculations (average, count, maximum, minimum, and sum) in performance charts.
- Pin performance charts in Azure portal dashboards.
- Take advantage of metric alerts.
Important
These metrics will no longer be collected starting May 31, 2024 as described in Container insights recommended alerts (custom metrics) (preview) retirement moving up to 31 May 2024. See Enable Prometheus and Grafana to enable collection of Prometheus metrics.
Custom metrics collected by Container insights can be accessed with the same methods as custom metrics collected from other data sources, including metrics explorer and metrics alerts.
The following sections describe the metric values collected for your cluster.
Namespace: Insights.container/nodes
Dimensions: host
Metric | Description |
---|---|
cpuUsageMillicores | CPU utilization in millicores by host. |
cpuUsagePercentage, cpuUsageAllocatablePercentage (preview) | CPU usage percentage by node and allocatable, respectively. |
memoryRssBytes | Memory RSS utilization in bytes by host. |
memoryRssPercentage, memoryRssAllocatablePercentage (preview) | Memory RSS usage percentage by host and allocatable, respectively. |
memoryWorkingSetBytes | Memory Working Set utilization in bytes by host. |
memoryWorkingSetPercentage, memoryRssAllocatablePercentage (preview) | Memory Working Set usage percentage by host and allocatable, respectively. |
nodesCount | Count of nodes by status. |
diskUsedPercentage | Percentage of disk used on the node by device. |
Namespace: Insights.container/pods
Dimensions: controllerName
, Kubernetes namespace
Metric | Description |
---|---|
podCount | Count of pods by controller, namespace, node, and phase. |
completedJobsCount | Completed jobs count older user configurable threshold (default is six hours) by controller, Kubernetes namespace. |
restartingContainerCount | Count of container restarts by controller and Kubernetes namespace. |
oomKilledContainerCount | Count of OOMkilled containers by controller and Kubernetes namespace. |
podReadyPercentage | Percentage of pods in ready state by controller and Kubernetes namespace. |
Namespace: Insights.container/containers
Dimensions: containerName
, controllerName
, Kubernetes namespace
, podName
Metric | Description |
---|---|
(Old)cpuExceededPercentage | CPU utilization percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, and pod name. Collected |
(New)cpuThresholdViolated | Metric triggered when CPU utilization percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, and pod name. Collected |
(Old)memoryRssExceededPercentage | Memory RSS percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, and pod name. |
(New)memoryRssThresholdViolated | Metric triggered when Memory RSS percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, and pod name. |
(Old)memoryWorkingSetExceededPercentage | Memory Working Set percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, and pod name. |
(New)memoryWorkingSetThresholdViolated | Metric triggered when Memory Working Set percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, and pod name. |
Namespace: Insights.container/persistentvolumes
Dimensions: kubernetesNamespace
, node
, podName
, volumeName
Metric | Description |
---|---|
(Old)pvUsageExceededPercentage | Persistent volume (PV) utilization percentage for persistent volumes exceeding user configurable threshold (default is 60.0) by claim name, Kubernetes namespace, volume name, pod name, and node name. |
(New)pvUsageThresholdViolated | Metric triggered when PV utilization percentage for persistent volumes exceeding user configurable threshold (default is 60.0) by claim name, Kubernetes namespace, volume name, pod name, and node name. |
If your cluster uses managed identity authentication for Container insights, custom metrics will be enabled for you. If not, you need to enable custom metrics by using one of the following methods.
This process assigns the Monitoring Metrics Publisher role to the cluster's service principal. Monitoring Metrics Publisher has permission only to push metrics to the resource. It can't alter any state, update the resource, or read any data. For more information, see Monitoring Metrics Publisher role. The Monitoring Metrics Publisher role requirement doesn't apply to Azure Arc-enabled Kubernetes clusters.
Before you update your cluster, confirm that you're a member of the Owner role on the AKS cluster resource to enable collection of custom performance metrics for nodes and pods. This requirement doesn't apply to Azure Arc-enabled Kubernetes clusters.
Use one of the following methods to enable custom metrics for either a single cluster or all clusters in your subscription.
-
Select the Insights menu for the cluster in the Azure portal.
-
On the banner that appears at the top of the pane, select Enable to start the update.
:::image type="content" source="./media/container-insights-update-metrics/portal-banner-enable-01.png" alt-text="Screenshot that shows the Azure portal with the banner for upgrading an AKS cluster." lightbox="media/container-insights-update-metrics/portal-banner-enable-01.png":::
The process can take several seconds to finish. You can track its progress under Notifications from the menu.
In the following command, edit the values for subscriptionId
, resourceGroupName
, and clusterName
by using the values on the AKS Overview page for the AKS cluster. The value of clientIdOfSPN
is returned when you run the command az aks show
.
az login
az account set --subscription "<subscriptionName>"
az aks show -g <resourceGroupName> -n <clusterName> --query "servicePrincipalProfile"
az aks show -g <resourceGroupName> -n <clusterName> --query "addonProfiles.omsagent.identity"
az role assignment create --assignee <clientIdOfSPN> --scope <clusterResourceId> --role "Monitoring Metrics Publisher"
To get the value for clientIdOfSPNOrMsi
, run the command az aks show
as shown in the following example. If the servicePrincipalProfile
object has a valid objectid
value, you can use that. Otherwise, if it's set to msi
, pass in the object ID from addonProfiles.omsagent.identity.objectId
.
az login
az account set --subscription "<subscriptionName>"
az aks show -g <resourceGroupName> -n <clusterName> --query "servicePrincipalProfile"
az aks show -g <resourceGroupName> -n <clusterName> --query "addonProfiles.omsagent.identity"
az role assignment create --assignee <clientIdOfSPNOrMsi> --scope <clusterResourceId> --role "Monitoring Metrics Publisher"
Note
If you want to perform the role assignment with your user account, use the --assignee
parameter as shown in the example. If you want to perform the role assignment with a service principal name (SPN), use the --assignee-object-id
and --assignee-principal-type
parameters instead of the --assignee
parameter.
Run the following command to update all clusters in your subscription. Edit the value for subscriptionId
by using the value on the AKS Overview page for the AKS cluster.
az login
az account set --subscription "Subscription Name"
curl -sL https://aka.ms/ci-md-onboard-atscale | bash -s subscriptionId
The configuration change can take a few seconds to finish. When it's finished, a message like the following one appears and includes the result:
completed role assignments for all AKS clusters in subscription: <subscriptionId>
To enable custom metrics for a specific cluster:
-
Download the mdm_onboarding.ps1 script from GitHub and save it to a local folder.
-
Run the following command. Edit the values for
subscriptionId
,resourceGroupName
, andclusterName
by using the values on the AKS Overview page for the AKS cluster..\mdm_onboarding.ps1 subscriptionId <subscriptionId> resourceGroupName <resourceGroupName> clusterName <clusterName>
The configuration change can take a few seconds to finish. When it's finished, a message like the following one appears and includes the result:
Successfully added Monitoring Metrics Publisher role assignment to cluster : <clusterName>
To enable custom metrics for all clusters in your subscription:
-
Download the mdm_onboarding_atscale.ps1 script from GitHub and save it to a local folder.
-
Run the following command. Edit the value for
subscriptionId
by using the value on the AKS Overview page for the AKS cluster..\mdm_onboarding_atscale.ps1 subscriptionId
The configuration change can take a few seconds to finish. When it's finished, a message like the following one appears and includes the result:
Completed adding role assignment for the aks clusters in subscriptionId :<subscriptionId>
To verify that custom metrics are enabled, open metrics explorer and verify from Metric namespace that insights is listed.