New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SRIOV support for non-network device (FEC) #196
SRIOV support for non-network device (FEC) #196
Conversation
…O device that support PCIe SIG SRIOV features. One such device is based on Intel® PAC N3000 FPGA which has Network Functionality of 2x25G and FPGA part that can be programmed to support hardware offload of the Wireless (4G/5G) Forward Error Correction (FEC). This Product has already been released by Intel and is being used/deployed by customers. Intel FPGA Device plugin (https://github.com/intel/intel-device-plugins-for-kubernetes/blob/master/cmd/fpga_plugin/README.md) does not support such devices as it dose not cover SRIOV devices using DPDK and also differ in the FPGA image that gets programmed. As part of this PR we are recommending to extend the SRIOV support for non-networking devices and also rename the device plugin from “sriov-network-device-plugin” to “sriov-device-plugin”. For the Intel® PAC N3000 FPGA supporting vRAN or Cloud-native RAN the FPGA PF that supports FEC is bound to DPDK igb_uio or vfio_pci. The Virtual Function resources are allocated to the PODs running RAN/FlexRAN like workload and uses vfio_pci.
+1 for the functionality :) |
// exclude device in-use in host | ||
if isDefaultRoute, _ := hasDefaultRoute(device.Address); !isDefaultRoute { | ||
// Network devices | ||
if devClass == netClass { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider refactoring the two branches into their own handler functions for increased readability
cmd/sriovdp/manager.go
Outdated
if newDevice, err := resources.NewPciNetDevice(device, rm.rFactory); err == nil { | ||
rm.netDeviceList = append(rm.netDeviceList, newDevice) | ||
//If dev ID = 0b30 it means it is a RSU device and not FEC | ||
if device.Product.ID != "0b30" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a well named constant instead of hard-coding a hex in the middle of the code
BTW is this code manufacturer -i.e. Intel- specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to here it's the PCI device ID for the N3000 FPGA device.
It seems a little messy to need to be worrying about PCI device IDs in the middle of this code. Is there nothing else we can look at to determine the class of a device?
cmd/sriovdp/manager.go
Outdated
@@ -32,6 +32,7 @@ import ( | |||
const ( | |||
socketSuffix = "sock" | |||
netClass = 0x02 // Device class - Network controller. ref: https://pci-ids.ucw.cz/read/PD/02 (for Sub-Classes) | |||
fecClass = 0x12 // Device class - Processing accelerator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename FecClass to a more generic name as this can appy vof devices exposing more generic baseband processing accelerator. "accClass" should be enough to distinguish from "netClass"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very valuable patch to make this plugin seamlessly more generic.
=> Fairly cosmetic comment below to not limit the change to "FEC" devices per se but more generally to lookaside processing accelerators. FEC is only one specific example of non-network devices seeing traction in the immediate short term.
Fixing hardcoded RSU device ID. Placing net dev and acc dev handling into two distinct functions.
Hi, @damiankopyto thanks for adding new device support! I have some basic questions regarding enabling this new type of device.
|
@@ -46,6 +48,7 @@ Network controller subclasses. ref: https://pci-ids.ucw.cz/read/PD/02 | |||
07 Infiniband controller | |||
08 Fabric controller | |||
80 Network controller | |||
12 Processing Accelerator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is wrong. "12" is correct for the PCI device class for processing accelerators as specified at https://pci-ids.ucw.cz/read/PD/ and as I read it, the processing accelerator subclass should be 00 for this device.
However, "12" does not appear to be a valid option for a network controller subclass as described at the URL listed on line 40.
|
||
// Processing accelerators | ||
if devClass == accClass { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a little strange to have two totally separate if statements where only one of them can be true...normally I'd expect to see an "else if" here or even a switch statement.
glog.Warningf("discoverDevices(): unable to parse device ID for device %+v %q", dev, err) | ||
return | ||
} | ||
//Check if device ID = 0b30, it means it is a RSU (Remote System Update) device and not accelerator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems messy to need to be worrying about vendor-specific PCI device IDs in the middle of this code. Is there nothing else we can look at to determine the class of a device, or to make this more generic so that adding other accelerators would be simpler?
netClass = 0x02 // Device class - Network controller. ref: https://pci-ids.ucw.cz/read/PD/02 (for Sub-Classes) | ||
netClass = 0x02 // Device class - Network controller. ref: https://pci-ids.ucw.cz/read/PD/02 (for Sub-Classes) | ||
accClass = 0x12 // Device class - Processing accelerator | ||
rsuDevice = 0x0b30 // RSU device - only used for programming the FPGA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "RSU" mean in this context, and do we need to document that somewhere? (Also at line 334.)
if newDevice.GetDriver() == "igb_uio" || newDevice.GetDriver() == "vfio-pci" { | ||
rm.netDeviceList = append(rm.netDeviceList, newDevice) | ||
} else { | ||
glog.Infof("Excluding FEC device %-12s because it is not bound to userspace driver.", dev.Address) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we were dropping the "FEC" terminology?
} | ||
|
||
if newDevice, err := resources.NewPciNetDevice(dev, rm.rFactory); err == nil { | ||
if newDevice.GetDriver() == "igb_uio" || newDevice.GetDriver() == "vfio-pci" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like more device-specific logic. How hard will it be to support other accelerators in the future?
} else { | ||
glog.Errorf("discoverDevices() error adding new device: %q", err) | ||
} | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to check if devID == rsuDevice { log and return } at line 318. I'm thinking of the (future) case of any other accelerator classed PCI devices that would not be appropriate to discover here. It might be useful to have a list of 'blacklisted' device Ids per class that can be looped through rather than individual checks.
We are currently assessing the impact of supporting this devices with this device plugin. What I see from this changes, this was quick and easy 'fix' to add support for this specific fpga device by adding them into the same global 'pciNetDevice' list. |
Found this link which answers some of my questions above. Paste it here in case anyone is interestedl in understanding detailed configuration needed for N3000 device, https://github.com/open-ness/specs/blob/master/doc/enhanced-platform-awareness/openness-fpga.md |
Please fix that sentence, it is incorrect. FPGA device plugin supports FPGA IP block of N3000 card in pre-programmed mode. Programming FPGA IP block is missing in the intel-fpga kernel driver for N3000, and as soon it will be added, FPGA plugin will be able to handle it in orchestration programmed mode as well. |
Closing this PR since this feature have been implemented by #214 |
SRIOV feature is not just applicable for Network interfaces but any IO device that support PCIe SIG SRIOV features. One such device is based on Intel® PAC N3000 FPGA which has Network Functionality of 2x25G and FPGA part that can be programmed to support hardware offload of the Wireless (4G/5G) Forward Error Correction (FEC). This Product has already been released by Intel and is being used/deployed by customers. Intel FPGA Device plugin (https://github.com/intel/intel-device-plugins-for-kubernetes/blob/master/cmd/fpga_plugin/README.md) does not support such devices as it dose not cover SRIOV devices using DPDK and also differ in the FPGA image that gets programmed. As part of this PR we are recommending to extend the SRIOV support for non-networking devices and also rename the device plugin from “sriov-network-device-plugin” to “sriov-device-plugin”. For the Intel® PAC N3000 FPGA supporting vRAN or Cloud-native RAN the FPGA PF that supports FEC is bound to DPDK igb_uio or vfio_pci. The Virtual Function resources are allocated to the PODs running RAN/FlexRAN like workload and uses vfio_pci.