device manager: externally provided mediated devices should not be removed #9250

Merged: kubevirt-bot merged 3 commits into kubevirt:main from mdev_external on Feb 23, 2023

Member

@vladikr vladikr commented Feb 15, 2023

What this PR does / why we need it:

virt-handler deletes any mediated device that is created on the system, even if it is explicitly
configured as an externally provided resource.

permittedHostDevices:
  mediatedDevices:
  - externalResourceProvider: true
    mdevNameSelector: NVIDIA A10-24Q
    resourceName: nvidia.com/NVIDIA_A10-24Q

This PR prevents the deletion of externally configured mediated devices.
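
A minimal sketch of the intended behavior, not the actual virt-handler code: a type found on the host is removed only if it is neither desired by the KubeVirt configuration nor marked as externally provided. The helper name `mdevTypesToRemove`, the use of the name selector as the map key, and the extra type names `NVIDIA A10-12Q` and `NVIDIA A10-4Q` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// mdevTypesToRemove is a hypothetical helper, not KubeVirt's implementation.
// It returns the mdev types present on the host that are neither desired by
// the configuration nor owned by an external provider.
func mdevTypesToRemove(present []string, desired, external map[string]struct{}) []string {
	var remove []string
	for _, t := range present {
		if _, ok := desired[t]; ok {
			continue // configured through KubeVirt: keep
		}
		if _, ok := external[t]; ok {
			continue // externalResourceProvider: true, keep
		}
		remove = append(remove, t)
	}
	sort.Strings(remove) // stable order for callers and tests
	return remove
}

func main() {
	// Illustrative inputs only; keys are assumed to be mdev name selectors.
	present := []string{"NVIDIA A10-24Q", "NVIDIA A10-12Q", "NVIDIA A10-4Q"}
	desired := map[string]struct{}{"NVIDIA A10-12Q": {}}
	external := map[string]struct{}{"NVIDIA A10-24Q": {}}

	fmt.Println(mdevTypesToRemove(present, desired, external))
}
```

With these inputs only `NVIDIA A10-4Q` is selected for removal; the externally provided `NVIDIA A10-24Q` is left alone.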

Mediated device handling can be disabled altogether by setting the DisableMDEVConfiguration feature gate.
In that case, virt-handler neither creates nor removes mediated devices.

apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  configuration:
    developerConfiguration: 
      featureGates:
        - DisableMDEVConfiguration
        - .....
        - .....
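
As a rough sketch of how such a gate could short-circuit mdev handling (this is not the code from the PR), the check below returns early when the gate is listed. Only the gate name `DisableMDEVConfiguration` comes from the description above; `gateEnabled` and `shouldManageMdevs` are hypothetical names.

```go
package main

import "fmt"

// Gate name as given in the PR description.
const disableMDEVConfiguration = "DisableMDEVConfiguration"

// gateEnabled reports whether name appears in the configured feature gates.
func gateEnabled(gates []string, name string) bool {
	for _, g := range gates {
		if g == name {
			return true
		}
	}
	return false
}

// shouldManageMdevs is a hypothetical helper: when the gate is set, the
// caller would skip both creation and removal of mediated devices.
func shouldManageMdevs(gates []string) bool {
	return !gateEnabled(gates, disableMDEVConfiguration)
}

func main() {
	fmt.Println(shouldManageMdevs([]string{"DisableMDEVConfiguration"}))
	fmt.Println(shouldManageMdevs(nil))
}
```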

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes # rhbz#2169880

Special notes for your reviewer:

Release note:

externally created mediated devices will not be deleted by virt-handler  

@kubevirt-bot kubevirt-bot added the release-note, dco-signoff: yes, and size/S labels Feb 15, 2023
@vladikr vladikr marked this pull request as draft February 15, 2023 04:23
@kubevirt-bot kubevirt-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 15, 2023
@vladikr
Member Author

vladikr commented Feb 15, 2023

/cc @cdesiniotis

@jean-edouard FYI

@kubevirt-bot
Contributor

@vladikr: GitHub didn't allow me to request PR reviews from the following users: cdesiniotis.

Note that only kubevirt members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @cdesiniotis @jean-edouard

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Contributor

@jean-edouard jean-edouard left a comment

Thank you for tackling this, and well done finding the offending code!
Since this is a draft I'm not sure it's ready for reviews, but see comment below if applicable.

}
removeUndesiredMDEVs(desiredTypesMap)

removeUndesiredMDEVs(doNotRemoveTypesMap)
Contributor

This function is hard to read, and this call is particularly confusing (remove(doNotRemove)).
It's probably worth adding to the comment above to describe what's happening.

@cdesiniotis
Contributor

@vladikr I have not closely reviewed this yet; however, does this also address the scenario where mdevs have been created on the system but are not listed under permittedHostDevices.mediatedDevices at all?

@@ -275,15 +275,31 @@ func (c *DeviceController) RefreshMediatedDeviceTypes() {
}()
}

func (c *DeviceController) getExternallyProvidedMdevs() map[string]struct{} {
externalMdevResourcesMap := make(map[string]struct{})
Contributor

You can use sets.NewString here.

Contributor

Good point, @PiotrProkop.
However, the rest of the file uses similar make() calls, so let's keep it like that for consistency.
Maybe we should address that for the whole file/project in a separate PR?

func (c *DeviceController) getExternallyProvidedMdevs() map[string]struct{} {
externalMdevResourcesMap := make(map[string]struct{})
hostDevs := c.virtConfig.GetPermittedHostDevices()
if len(hostDevs.MediatedDevices) != 0 {
Contributor

You can skip this if; the for loop will just iterate over an empty collection and the end result will be the same.

Member Author

Thanks!

Comment on lines 95 to 96
// the following will remove all configured types that have not been
// created by an external provider and are not in the desiredTypesMap
Contributor

Indentation is off.

func (c *DeviceController) getExternallyProvidedMdevs() map[string]struct{} {
externalMdevResourcesMap := make(map[string]struct{})
if hostDevs := c.virtConfig.GetPermittedHostDevices(); hostDevs != nil {
if len(hostDevs.MediatedDevices) != 0 {
Contributor

You can skip this if; the for loop simply won't iterate over hostDevs.MediatedDevices when len == 0.
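
The reviewer's point can be sketched in isolation: in Go, ranging over a nil or empty slice executes zero iterations, so a preceding length check is redundant. The `mediatedDevice` struct below is a simplified stand-in for KubeVirt's configuration type, and keying the result by name selector is an assumption, not something the thread confirms.

```go
package main

import "fmt"

// mediatedDevice is a simplified, hypothetical stand-in for the entries
// under permittedHostDevices.mediatedDevices.
type mediatedDevice struct {
	MDEVNameSelector         string
	ResourceName             string
	ExternalResourceProvider bool
}

// externallyProvided collects the selectors of externally provided devices.
// No len(devs) != 0 guard is needed: the loop body never runs for a nil or
// empty slice, and the empty map is returned either way.
func externallyProvided(devs []mediatedDevice) map[string]struct{} {
	out := make(map[string]struct{})
	for _, d := range devs {
		if d.ExternalResourceProvider {
			out[d.MDEVNameSelector] = struct{}{}
		}
	}
	return out
}

func main() {
	fmt.Println(len(externallyProvided(nil))) // nil slice: loop is a no-op

	devs := []mediatedDevice{
		{MDEVNameSelector: "NVIDIA A10-24Q", ResourceName: "nvidia.com/NVIDIA_A10-24Q", ExternalResourceProvider: true},
	}
	_, ok := externallyProvided(devs)["NVIDIA A10-24Q"]
	fmt.Println(ok)
}
```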

@vladikr vladikr force-pushed the mdev_external branch 2 times, most recently from 6647d89 to 2872ac5 Compare February 17, 2023 23:01
@vladikr
Member Author

vladikr commented Feb 17, 2023

/test all

@vladikr
Member Author

vladikr commented Feb 18, 2023

/test pull-kubevirt-e2e-kind-1.23.vgpu

@kubevirt-bot
Contributor

@vladikr: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test pull-kubevirt-apidocs
  • /test pull-kubevirt-build
  • /test pull-kubevirt-build-arm64
  • /test pull-kubevirt-check-unassigned-tests
  • /test pull-kubevirt-client-python
  • /test pull-kubevirt-e2e-k8s-1.24-operator
  • /test pull-kubevirt-e2e-k8s-1.24-operator-root
  • /test pull-kubevirt-e2e-k8s-1.24-sig-compute
  • /test pull-kubevirt-e2e-k8s-1.24-sig-compute-cgroupsv2
  • /test pull-kubevirt-e2e-k8s-1.24-sig-compute-root
  • /test pull-kubevirt-e2e-k8s-1.24-sig-network
  • /test pull-kubevirt-e2e-k8s-1.24-sig-network-root
  • /test pull-kubevirt-e2e-k8s-1.24-sig-storage
  • /test pull-kubevirt-e2e-k8s-1.24-sig-storage-cgroupsv2
  • /test pull-kubevirt-e2e-k8s-1.24-sig-storage-root
  • /test pull-kubevirt-e2e-k8s-1.25-ipv6-sig-network
  • /test pull-kubevirt-e2e-k8s-1.25-sig-compute
  • /test pull-kubevirt-e2e-k8s-1.25-sig-compute-cgroupsv2
  • /test pull-kubevirt-e2e-k8s-1.25-sig-compute-migrations
  • /test pull-kubevirt-e2e-k8s-1.25-sig-compute-migrations-root
  • /test pull-kubevirt-e2e-k8s-1.25-sig-network
  • /test pull-kubevirt-e2e-k8s-1.25-sig-operator
  • /test pull-kubevirt-e2e-k8s-1.25-sig-performance
  • /test pull-kubevirt-e2e-k8s-1.25-sig-storage
  • /test pull-kubevirt-e2e-k8s-1.25-sig-storage-cgroupsv2
  • /test pull-kubevirt-e2e-k8s-1.26-sig-compute
  • /test pull-kubevirt-e2e-k8s-1.26-sig-compute-cgroupsv2
  • /test pull-kubevirt-e2e-k8s-1.26-sig-monitoring
  • /test pull-kubevirt-e2e-k8s-1.26-sig-network
  • /test pull-kubevirt-e2e-k8s-1.26-sig-operator
  • /test pull-kubevirt-e2e-k8s-1.26-sig-storage
  • /test pull-kubevirt-e2e-k8s-1.26-sig-storage-cgroupsv2
  • /test pull-kubevirt-e2e-kind-1.23-sriov
  • /test pull-kubevirt-e2e-kind-1.23-vgpu
  • /test pull-kubevirt-e2e-windows2016
  • /test pull-kubevirt-generate
  • /test pull-kubevirt-manifests
  • /test pull-kubevirt-prom-rules-verify
  • /test pull-kubevirt-unit-test
  • /test pull-kubevirt-verify-go-mod
  • /test pull-kubevirtci-bump-kubevirt

The following commands are available to trigger optional jobs:

  • /test build-kubevirt-builder
  • /test pull-kubevirt-check-tests-for-flakes
  • /test pull-kubevirt-code-lint
  • /test pull-kubevirt-e2e-arm64
  • /test pull-kubevirt-e2e-k8s-1.24-sig-compute-realtime
  • /test pull-kubevirt-e2e-k8s-1.24-sig-compute-realtime-root
  • /test pull-kubevirt-e2e-k8s-1.25-fips-sig-compute
  • /test pull-kubevirt-e2e-k8s-1.26-ipv6-sig-network
  • /test pull-kubevirt-e2e-k8s-1.26-sev
  • /test pull-kubevirt-e2e-k8s-1.26-single-node
  • /test pull-kubevirt-e2e-k8s-1.26-swap-enabled
  • /test pull-kubevirt-e2e-kind-1.23-vgpu-root
  • /test pull-kubevirt-fossa
  • /test pull-kubevirt-gosec
  • /test pull-kubevirt-goveralls
  • /test pull-kubevirt-unit-test-arm64
  • /test pull-kubevirt-verify-rpms

Use /test all to run the following jobs that were automatically triggered:

  • pull-kubevirt-apidocs
  • pull-kubevirt-build
  • pull-kubevirt-build-arm64
  • pull-kubevirt-check-tests-for-flakes
  • pull-kubevirt-check-unassigned-tests
  • pull-kubevirt-client-python
  • pull-kubevirt-code-lint
  • pull-kubevirt-e2e-arm64
  • pull-kubevirt-e2e-k8s-1.24-operator
  • pull-kubevirt-e2e-k8s-1.24-operator-root
  • pull-kubevirt-e2e-k8s-1.24-sig-compute
  • pull-kubevirt-e2e-k8s-1.24-sig-compute-root
  • pull-kubevirt-e2e-k8s-1.24-sig-network
  • pull-kubevirt-e2e-k8s-1.24-sig-network-root
  • pull-kubevirt-e2e-k8s-1.24-sig-storage
  • pull-kubevirt-e2e-k8s-1.24-sig-storage-root
  • pull-kubevirt-e2e-k8s-1.25-ipv6-sig-network
  • pull-kubevirt-e2e-k8s-1.25-sig-compute
  • pull-kubevirt-e2e-k8s-1.25-sig-compute-migrations
  • pull-kubevirt-e2e-k8s-1.25-sig-compute-migrations-root
  • pull-kubevirt-e2e-k8s-1.25-sig-network
  • pull-kubevirt-e2e-k8s-1.25-sig-operator
  • pull-kubevirt-e2e-k8s-1.25-sig-performance
  • pull-kubevirt-e2e-k8s-1.25-sig-storage
  • pull-kubevirt-e2e-k8s-1.26-sig-compute
  • pull-kubevirt-e2e-k8s-1.26-sig-network
  • pull-kubevirt-e2e-k8s-1.26-sig-operator
  • pull-kubevirt-e2e-k8s-1.26-sig-storage
  • pull-kubevirt-e2e-kind-1.23-sriov
  • pull-kubevirt-e2e-kind-1.23-vgpu
  • pull-kubevirt-e2e-windows2016
  • pull-kubevirt-fossa
  • pull-kubevirt-generate
  • pull-kubevirt-goveralls
  • pull-kubevirt-manifests
  • pull-kubevirt-prom-rules-verify
  • pull-kubevirt-unit-test
  • pull-kubevirt-unit-test-arm64
  • pull-kubevirt-verify-go-mod

In response to this:

/test pull-kubevirt-e2e-kind-1.23.vgpu

@vladikr
Member Author

vladikr commented Feb 18, 2023

/test pull-kubevirt-e2e-kind-1.23-vgpu

@vladikr
Member Author

vladikr commented Feb 18, 2023

/test pull-kubevirt-e2e-kind-1.23-vgpu

1 similar comment
@vladikr
Member Author

vladikr commented Feb 18, 2023

/test pull-kubevirt-e2e-kind-1.23-vgpu

@vladikr vladikr marked this pull request as draft February 21, 2023 02:17
@kubevirt-bot kubevirt-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 21, 2023
@vladikr
Member Author

vladikr commented Feb 21, 2023

/test pull-kubevirt-e2e-kind-1.23-vgpu

Signed-off-by: Vladik Romanovsky <vromanso@redhat.com>
Signed-off-by: Vladik Romanovsky <vromanso@redhat.com>
Signed-off-by: Vladik Romanovsky <vromanso@redhat.com>
@vladikr
Member Author

vladikr commented Feb 21, 2023

/test pull-kubevirt-e2e-kind-1.23-vgpu

@vladikr vladikr marked this pull request as ready for review February 21, 2023 04:00
@kubevirt-bot kubevirt-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 21, 2023
@kubevirt-bot
Contributor

kubevirt-bot commented Feb 21, 2023

@vladikr: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubevirt-e2e-k8s-1.26-sig-operator-upgrade | 8e1d43d | link | true | /test pull-kubevirt-e2e-k8s-1.26-sig-operator-upgrade |
| pull-kubevirt-e2e-k8s-1.26-sig-operator-configuration | 8e1d43d | link | true | /test pull-kubevirt-e2e-k8s-1.26-sig-operator-configuration |
| pull-kubevirt-fossa | 0b9f6e4 | link | false | /test pull-kubevirt-fossa |

Contributor

@jean-edouard jean-edouard left a comment

/approve

@kubevirt-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jean-edouard

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 21, 2023
@acardace
Member

/lgtm

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Feb 22, 2023
@fossedihelm
Contributor

@vladikr can you edit the release note please? Thanks

@kubevirt-commenter-bot

/retest-required
This bot automatically retries required jobs that failed/flaked on approved PRs.
Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@kubevirt-bot kubevirt-bot merged commit fe38373 into kubevirt:main Feb 23, 2023
@acardace
Member

acardace commented Mar 1, 2023

/cherrypick release-0.59

@kubevirt-bot
Contributor

@acardace: new pull request created: #9343

In response to this:

/cherrypick release-0.59

@acardace
Member

acardace commented May 3, 2023

/cherrypick release-0.58

@kubevirt-bot
Contributor

@acardace: #9250 failed to apply on top of branch "release-0.58":

Applying: device manager: do not remove externally provided mdevs
Using index info to reconstruct a base tree...
M	pkg/virt-handler/device-manager/device_controller.go
M	pkg/virt-handler/device-manager/mediated_devices_types.go
M	pkg/virt-handler/device-manager/mediated_devices_types_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/virt-handler/device-manager/mediated_devices_types_test.go
Auto-merging pkg/virt-handler/device-manager/mediated_devices_types.go
Auto-merging pkg/virt-handler/device-manager/device_controller.go
CONFLICT (content): Merge conflict in pkg/virt-handler/device-manager/device_controller.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 device manager: do not remove externally provided mdevs
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherrypick release-0.58

@acardace
Member

acardace commented May 3, 2023

Created manual backport at #9690

Labels
approved, dco-signoff: yes, lgtm, release-note, size/L
8 participants