chore(observability): fix cpu core usage gauge #2285

Merged
nevermarine merged 4 commits into main from chore/observability/fix-cpu-metrics on Apr 28, 2026

@nevermarine (Collaborator) commented Apr 27, 2026

Description

Adjust the VM CPU usage dashboard query to use the hypervisor CPU usage metric and to deduplicate the replicated virtualmachine_cpu_cores series that multiple virtualization-controller pods expose in HA clusters. Also update the panel description to remove the mention of CPU reservation.

Why do we need it, and what problem does it solve?

In multi-master clusters, d8_virtualization_virtualmachine_cpu_cores is exposed by each virtualization-controller replica. Using the raw gauge in the dashboard query can multiply the denominator depending on the number of controller pods and show misleading CPU usage values. This change makes the panel resilient to duplicated HA scrape targets and aligns the calculation with the hypervisor CPU usage metric.
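The effect on the denominator can be illustrated with a small Python sketch. It is not the dashboard query from this PR: the label names (`namespace`, `name`, `pod`), the sample values, and the choice of a per-VM maximum as the deduplication step are all assumptions made for illustration.

```python
from collections import defaultdict


def make_core_samples(replicas, cores=4):
    """Build hypothetical scrape samples: every controller replica
    exports the same cpu_cores gauge for the same VM."""
    return [
        {
            "pod": f"virtualization-controller-{i}",  # hypothetical pod name
            "namespace": "default",                    # hypothetical label
            "name": "vm-a",                            # hypothetical label
            "value": cores,
        }
        for i in range(replicas)
    ]


def naive_cores(samples):
    """Sum the raw gauge: duplicates from each replica inflate the total."""
    return sum(s["value"] for s in samples)


def deduplicated_cores(samples):
    """Collapse replicas to one value per VM (max per namespace/name),
    then sum across VMs."""
    per_vm = defaultdict(float)
    for s in samples:
        key = (s["namespace"], s["name"])
        per_vm[key] = max(per_vm[key], s["value"])
    return sum(per_vm.values())


def usage_percent(used_cores, total_cores):
    """CPU usage as a percentage of the allocated cores."""
    return 100.0 * used_cores / total_cores


if __name__ == "__main__":
    used = 2.0  # assumed hypervisor-reported usage, in cores
    for replicas in (1, 3):
        samples = make_core_samples(replicas)
        naive = usage_percent(used, naive_cores(samples))
        dedup = usage_percent(used, deduplicated_cores(samples))
        print(f"replicas={replicas} naive={naive:.1f}% deduplicated={dedup:.1f}%")
```

With one replica both calculations agree. With three replicas the naive denominator triples and the reported percentage drops to a third, while the deduplicated value stays the same.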

What is the expected result?

  1. Open the VM dashboard for a virtual machine in a multi-master cluster.
  2. Check the CPU usage panel.
  3. Verify that the percentage is no longer affected by the number of virtualization-controller replicas.
  4. Verify that single-master clusters continue to show the expected value.

Checklist

  • The code is covered by unit tests.
  • e2e tests passed.
  • Documentation updated according to the changes.
  • Changes were tested in the Kubernetes cluster manually.

Changelog entries

section: observability
type: fix
summary: "Fix VM CPU usage dashboard calculation in HA clusters by deduplicating replicated controller metrics."
impact_level: low

Signed-off-by: Maksim Fedotov <maksim.fedotov@flant.com>
@nevermarine added this to the v1.9.0 milestone Apr 27, 2026
@nevermarine marked this pull request as ready for review April 27, 2026 15:29
Signed-off-by: Maksim Fedotov <maksim.fedotov@flant.com>
Signed-off-by: Maksim Fedotov <maksim.fedotov@flant.com>
Signed-off-by: Maksim Fedotov <maksim.fedotov@flant.com>
@nevermarine modified the milestones: v1.9.0, v1.8.1 Apr 28, 2026
@nevermarine merged commit 6d67100 into main Apr 28, 2026
43 of 51 checks passed
@nevermarine deleted the chore/observability/fix-cpu-metrics branch April 28, 2026 14:21
deckhouse-BOaTswain added a commit that referenced this pull request Apr 28, 2026
chore(observability): fix cpu core usage gauge (#2285)

Signed-off-by: Maksim Fedotov <maksim.fedotov@flant.com>
Co-authored-by: Maksim Fedotov <maksim.fedotov@flant.com>
@deckhouse-BOaTswain (Contributor)

Cherry pick PR 2295 to the branch release-1.8 successful!
