Collect VMI OS info from the Guest agent #11283

Merged
merged 1 commit into from Apr 16, 2024

Conversation

assafad
Contributor

@assafad assafad commented Feb 18, 2024

What this PR does

Before this PR:
No guest agent info was collected or exposed through VMI metrics.
After this PR:
The kernelRelease, machine, name and versionId fields from vmi.status.guestOsInfo are collected and added as labels to the kubevirt_vmi_phase_count metric.

Fixes # https://issues.redhat.com/browse/CNV-37369
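
As a rough illustration of the change described above, the sketch below maps the four guest OS fields to metric labels and renders one sample line. It uses only the Go standard library rather than KubeVirt's monitoring code. The `GuestOSInfo` struct, both helper functions, and the sample values are hypothetical. Only the `guest_os_kernel_release` label name appears in this PR's test snippet; the other three label names are assumed by analogy. The metric's other labels are omitted.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// GuestOSInfo mirrors the four vmi.status.guestOsInfo fields named in this PR.
// It is a stand-in for illustration, not the KubeVirt API type.
type GuestOSInfo struct {
	KernelRelease string
	Machine       string
	Name          string
	VersionID     string
}

// guestOSLabels maps the guest OS fields to metric label names.
// Only guest_os_kernel_release is confirmed by the PR's test snippet;
// the other three names are assumptions.
func guestOSLabels(info GuestOSInfo) map[string]string {
	return map[string]string{
		"guest_os_kernel_release": info.KernelRelease,
		"guest_os_machine":        info.Machine,
		"guest_os_name":           info.Name,
		"guest_os_version_id":     info.VersionID,
	}
}

// formatSample renders one sample roughly in Prometheus text exposition style,
// with labels sorted by name so the output is deterministic.
func formatSample(metric string, labels map[string]string, value float64) string {
	names := make([]string, 0, len(labels))
	for name := range labels {
		names = append(names, name)
	}
	sort.Strings(names)

	pairs := make([]string, 0, len(names))
	for _, name := range names {
		pairs = append(pairs, fmt.Sprintf("%s=%q", name, labels[name]))
	}
	return fmt.Sprintf("%s{%s} %g", metric, strings.Join(pairs, ","), value)
}

func main() {
	// Made-up sample values, not taken from a real guest agent.
	info := GuestOSInfo{
		KernelRelease: "6.5.6-300.fc39.x86_64",
		Machine:       "x86_64",
		Name:          "Fedora Linux",
		VersionID:     "39",
	}
	fmt.Println(formatSample("kubevirt_vmi_phase_count", guestOSLabels(info), 1))
}
```

Because each distinct combination of label values produces its own time series, adding these labels splits the existing count by guest OS.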

Why we need it and why it was done in this way

We would like more visibility into the guest OS of each VMI, so that we can differentiate between running VMs by OS and alert on OS-specific issues (e.g. https://issues.redhat.com/browse/CNV-38482).

Special notes for your reviewer

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Release note

Collect VMI OS info from the Guest agent as `kubevirt_vmi_phase_count` metric labels

@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. size/M area/monitoring labels Feb 18, 2024
@assafad assafad force-pushed the guest-agent-info branch 2 times, most recently from 5150c9f to 83b9c16 Compare February 18, 2024 11:35
@sradco
Contributor

sradco commented Feb 19, 2024

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Feb 19, 2024
@assafad
Contributor Author

assafad commented Feb 19, 2024

@enp0s3 can you please take a look?

@fabiand
Member

fabiand commented Feb 19, 2024

Hey.

Many of our metrics are limited to infra aspects.
This metric change caught my attention, because with it we start to disclose details about the guest.

This is potentially security-relevant information, thus:

  1. What other metrics do we have that export guest-related information?
  2. Did we consider adding a cluster-level knob to disable all guest-related exposure via metrics?

To me metrics stand out (compared to e.g. CR-level details), because metrics are designed to be forwarded to remote systems, and it is much easier to unintentionally leak guest-related information.

/hold

to give us time to answer these questions.

@kubevirt-bot kubevirt-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 19, 2024
@assafad
Contributor Author

assafad commented Feb 20, 2024

@fabiand Hi, I'm not aware of any metrics exposing guest-related information yet, so we didn't consider implementing such a knob. Are there specific GuestOSInfo fields exposed by this PR that you find problematic, or is it a general concern?

@fabiand
Member

fabiand commented Feb 20, 2024

It's a general concern: by exposing any information about the guest we cross this boundary.

Thus, to me, this PR should be extended with a cluster-level knob to turn off all guest-related metric reporting.

@kubevirt-bot kubevirt-bot removed the lgtm Indicates that a PR is ready to be merged. label Feb 28, 2024
@fabiand
Member

fabiand commented Apr 7, 2024

/unhold

Metric systems usually already require authentication, so arbitrary users are not permitted to view metrics.
Filtering can also be applied when metrics are forwarded to an out-of-cluster metrics collection instance.

This should be good enough.
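
To make the filtering idea above concrete, here is a minimal sketch of dropping guest-related labels from a label set before a sample is forwarded. It is not part of this PR and not how any particular metrics stack implements it. In practice this kind of filtering is normally configured in the metrics pipeline itself rather than written by hand. The `guest_os_` prefix and the `dropGuestLabels` helper are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// guestLabelPrefix is an assumed common prefix for guest-related labels.
const guestLabelPrefix = "guest_os_"

// dropGuestLabels returns a copy of labels without any guest-related entries.
// The input map is left unmodified.
func dropGuestLabels(labels map[string]string) map[string]string {
	filtered := make(map[string]string, len(labels))
	for name, value := range labels {
		if strings.HasPrefix(name, guestLabelPrefix) {
			continue
		}
		filtered[name] = value
	}
	return filtered
}

func main() {
	// Made-up label values for illustration.
	labels := map[string]string{
		"phase":                   "running",
		"guest_os_name":           "Fedora Linux",
		"guest_os_kernel_release": "6.5.6-300.fc39.x86_64",
	}
	fmt.Println(dropGuestLabels(labels))
}
```

Note that series which differed only in the dropped labels become indistinguishable afterwards, so whatever does the forwarding has to aggregate or deduplicate them.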

@kubevirt-bot kubevirt-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 7, 2024
@assafad
Contributor Author

assafad commented Apr 7, 2024

/retest

@assafad
Contributor Author

assafad commented Apr 8, 2024

/retest

@assafad assafad marked this pull request as draft April 9, 2024 10:41
@kubevirt-bot kubevirt-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 9, 2024
@assafad assafad force-pushed the guest-agent-info branch 6 times, most recently from 297c2a6 to 8115c2e Compare April 10, 2024 14:07
@machadovilaca
Member

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 10, 2024
It("should have kubevirt_vmi_phase_count correctly configured with guest OS labels", func() {
	agentVMI := createAgentVMI()
	labels := map[string]string{
		"guest_os_kernel_release": agentVMI.Status.GuestOSInfo.KernelRelease,
Contributor

@assafad Can you please assert that each of the newly added fields is a non-empty string?

Contributor Author

Added.
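
The check requested above could look roughly like the following standard-library sketch. The test in this PR is written with Ginkgo, and the assertion that was actually added is not shown in this thread, so this is only an illustration of the idea. The `emptyLabels` helper and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// emptyLabels returns, in sorted order, the names of labels whose value is empty.
func emptyLabels(labels map[string]string) []string {
	var empty []string
	for name, value := range labels {
		if value == "" {
			empty = append(empty, name)
		}
	}
	sort.Strings(empty)
	return empty
}

func main() {
	// Made-up values; guest_os_machine is an assumed label name,
	// left empty here to show a failing case.
	labels := map[string]string{
		"guest_os_kernel_release": "6.5.6-300.fc39.x86_64",
		"guest_os_machine":        "",
	}
	if missing := emptyLabels(labels); len(missing) > 0 {
		fmt.Println("labels with empty values:", missing)
	}
}
```

Checking for non-empty values guards against a test that passes trivially when the guest agent has not reported anything yet, since an empty expected label would then match an empty actual one.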

Signed-off-by: assafad <aadmi@redhat.com>
@kubevirt-bot kubevirt-bot added sig/observability Denotes an issue or PR that relates to observability. and removed lgtm Indicates that a PR is ready to be merged. labels Apr 15, 2024
@enp0s3
Contributor

enp0s3 commented Apr 16, 2024

/approve

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enp0s3

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 16, 2024
@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 16, 2024
@kubevirt-commenter-bot

Required labels detected, running phase 2 presubmits:
/test pull-kubevirt-e2e-windows2016
/test pull-kubevirt-e2e-kind-1.27-vgpu
/test pull-kubevirt-e2e-kind-sriov
/test pull-kubevirt-e2e-k8s-1.29-ipv6-sig-network
/test pull-kubevirt-e2e-k8s-1.27-sig-network
/test pull-kubevirt-e2e-k8s-1.27-sig-storage
/test pull-kubevirt-e2e-k8s-1.27-sig-compute
/test pull-kubevirt-e2e-k8s-1.27-sig-operator
/test pull-kubevirt-e2e-k8s-1.28-sig-network
/test pull-kubevirt-e2e-k8s-1.28-sig-storage
/test pull-kubevirt-e2e-k8s-1.28-sig-compute
/test pull-kubevirt-e2e-k8s-1.28-sig-operator

@assafad
Contributor Author

assafad commented Apr 16, 2024

/retest

@kubevirt-bot kubevirt-bot merged commit deaeac7 into kubevirt:main Apr 16, 2024
39 checks passed
@assafad
Contributor Author

assafad commented Apr 16, 2024

/cherry-pick release-1.2

@kubevirt-bot
Contributor

@assafad: new pull request created: #11717

In response to this:

/cherry-pick release-1.2

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
approved - Indicates a PR has been approved by an approver from all required OWNERS files.
area/monitoring
dco-signoff: yes - Indicates the PR's author has DCO signed all their commits.
lgtm - Indicates that a PR is ready to be merged.
release-note - Denotes a PR that will be considered when it comes time to generate release notes.
sig/buildsystem - Denotes an issue or PR that relates to changes in the build system.
sig/observability - Denotes an issue or PR that relates to observability.
size/L

7 participants