New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sriov support #45
Sriov support #45
Conversation
We should review this post v1.1.2 |
4e48255
to
a07c314
Compare
I'm reviewing this now, and I'll take care of the merge conflicts and push up the rebase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a
- Rebase and resolve conflicts
- Renaming typos (srio ⟶ sriov) and capitalization (Sriov ⟶ SRIOV)
- Abstract over SRIOV sysfs interface to accomodate
max_vfs
kernel parameter
31eae85
to
26efc70
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
26efc70
to
43b6739
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
a9aac71
to
aad477a
Compare
validation webhook now prevents PCIDeviceClaim from being disabled if it is already in use by a workload related issue: harvester/harvester#3771 |
validation webhook prevents pcideviceclaim to be created if pcidevice has no iommuGroup information. This avoids spamming the pcidevice controller with failed reconciles as binding to vfio-pci driver will not be successful without a iommu group. related issue: harvester/harvester#3833 |
@ibrokethecloud Interesting, let me go find out how IOMMU Groups and SR-IOV VFs interact. |
Added additional check in nichelper to skip non pci nics. This will address: harvester/harvester#3650 |
The new change lgtm! |
cf2e75d
to
e57c88b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, and please help to squash the commits, thanks.
SriovDevice is used to identify and record all sr-iov capable nics across the cluster. End users can control the enable / disable / configuration of VF's of underlying hardware by interfacting with corresponding object. Once enabled, the configured VF's show up as normal PCI devices available for pass-through to workload VM's. Node type is just used to create an object corresponding to each node in the cluster and is used via the controller requeue mechanism to trigger recurring reconciles of the PCI and SRIOV enabled devices. node controller registration renamed sriovdevices crd to sriovnetworkdevices changes to naming of methods and objects handle changes to skip pcibridge devices during node reconcile drop unwanted imports fixed unit tests for pcidevice controller updated dapper to use downloads.opensuse.org, and fixed trivy ci failures minor change to status updates when sriov device is enabled changed test script copy VFList to allow override of default sysPciBus path for tests with umockdev renamed sriov handler based on codefactor report use /host/proc to query host network ns without host networking addition of validating webhooks and changes to sriov device status moved vm lookup to use lists rather than custom index removed duplicates from Dockerfile.dapper label pcidevices with sriov details, skip non pcidevice nics cleanup based on pr feedback moved NODE_NAME to env variable and cleaned up nodename key references added debug logging for pcidevices label lookup and sriov webhook
fd6b334
to
ea57ed3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Thanks
PR introduces new SriovDevice and Node types.
IMPORTANT: Please do not create a Pull Request without creating an issue first.
Problem:
Support for sriov in harvester
Solution:
SriovDevice is used to identify and record all sr-iov capable nics across the cluster.
End users can control the enable / disable / configuration of VF's of underlying hardware by interfacting with corresponding object.
Once enabled, the configured VF's show up as normal PCI devices available for pass-through to workload VM's.
Node type is just used to create an object corresponding to each node in the cluster and is used via the controller requeue mechanism to trigger recurring reconciles of the PCI and SRIOV enabled devices.
Related Issue:
harvester/harvester#2763
Test plan: