Cannot config numVfs for mellanox NICs on RHCOS node #43
Comments
SriovNetworkNodeState CR of test node.
Run mstconfig manually from the config daemon
What is the mstconfig version, and what is the card FW version?
@moshe010 How can I check the FW version without mstconfig? It's an RHCOS node, and mstconfig is not available in the host operating system.
So is this a CoreOS node or RHEL 8.0? How can we reproduce this ourselves?
@moshe010 It's a CoreOS node. I did some investigation on that node. I didn't find the root cause, but I found that using PCI configuration cycles instead of the PCI address works. It could be a workaround.
According to https://github.com/Mellanox/mstflint/blob/ddb4350e32c37dcbe8fe0d295eac05f2a23762db/README#L134
@pliurh Can you schedule a debug session? When mstconfig is given a PCI device, it accesses it through sysfs (/sys/bus/pci/devices/<d:b:d:f>/config). Also, how can we reproduce this issue in-house?
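To make the sysfs access path concrete, here is a minimal sketch that builds the config-space path for a PCI address and reports whether the current process can read or write it. The helper names `pci_config_path` and `config_access` are hypothetical and exist only for this illustration; they are not part of mstflint or the operator.

```python
import os
import re

# PCI address in domain:bus:device.function form, e.g. 0000:5e:00.0
PCI_ADDR_RE = re.compile(
    r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$"
)


def pci_config_path(pci_addr, sysfs_root="/sys"):
    """Return the sysfs config-space path for a PCI address (hypothetical helper)."""
    if not PCI_ADDR_RE.match(pci_addr):
        raise ValueError(f"not a full PCI address: {pci_addr!r}")
    return f"{sysfs_root}/bus/pci/devices/{pci_addr}/config"


def config_access(pci_addr, sysfs_root="/sys"):
    """Report whether the config file exists and is readable/writable by this process."""
    path = pci_config_path(pci_addr, sysfs_root)
    return {
        "path": path,
        "exists": os.path.exists(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # Address taken from the daemon logs in this issue.
    print(config_access("0000:5e:00.0"))
```

Running this inside the config daemon container and again with `sysfs_root="/host/sys"` may help show whether the two views of sysfs differ in access rights.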
@adrianchiris There is no mstflint package available on CoreOS, and installing rpm packages is not allowed either. The way I run
That ISO file is just a boot image; CoreOS requires Ignition files to boot. You can find more information at [1]. Currently, all the deployment tools are for downstream images, which are not available without an internal pull-secret file. We're still trying to find a way for partners to install the latest OCP 4.2 build in-house.
The sysfs was mounted as
There is a related upstream issue: containerd/containerd#3221
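One way to inspect how sysfs is mounted from inside a container is to read `/proc/mounts` and look at the options of every `sysfs` entry. The sketch below assumes the standard `/proc/mounts` field order (device, mount point, filesystem type, options); the function names are hypothetical.

```python
def sysfs_mount_options(mounts_text):
    """Map each sysfs mount point to its option list, given /proc/mounts-style text."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type == "sysfs":
            result[mount_point] = options.split(",")
    return result


def is_read_only(options):
    """True if the option list contains the 'ro' flag."""
    return "ro" in options


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for mount_point, options in sysfs_mount_options(f.read()).items():
            state = "read-only" if is_read_only(options) else "read-write"
            print(f"{mount_point}: {state} ({','.join(options)})")
```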
We can try explicitly mounting /sys into the container as a workaround, per our Slack discussion.
In that case, it shouldn't work for other vendors either, right?
@moshe010 sriov_numvfs is set through the container's /host mount point.
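As a rough illustration of that path, the sketch below writes a VF count to the `sriov_numvfs` attribute under a host root mounted at `/host`. This is not the operator's code; `numvfs_path` and `set_num_vfs` are hypothetical names, and the `/host` default is an assumption based on the comment above.

```python
def numvfs_path(pci_addr, host_root="/host"):
    """Path of the sriov_numvfs attribute as seen through the host mount (hypothetical helper)."""
    return f"{host_root}/sys/bus/pci/devices/{pci_addr}/sriov_numvfs"


def set_num_vfs(pci_addr, num_vfs, host_root="/host"):
    """Write the requested VF count to sriov_numvfs and return the path written."""
    if num_vfs < 0:
        raise ValueError("num_vfs must be non-negative")
    path = numvfs_path(pci_addr, host_root)
    with open(path, "w") as f:
        f.write(str(num_vfs))
    return path


if __name__ == "__main__":
    # Address taken from the daemon logs in this issue; requires root on a real node.
    print(set_num_vfs("0000:5e:00.0", 4))
```

Because this write goes through `/host/sys` and not the container's own `/sys`, it can succeed for other vendors even when mstconfig, which opens the device under `/sys`, fails.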
Here is the CRI-O bug: cri-o/cri-o#2625
We tested it with the latest CRI-O release, 1.14.10.
Add Columbiaville E810-CQDA2 and E810-XXVDA2 in NicIdMap
https://bugzilla.redhat.com/show_bug.cgi?id=1733897
Description of problem:
When a SriovNetworkNodePolicy is created with vendor 15b3, the VFs cannot be initialized.
Version-Release number of selected component (if applicable):
How reproducible:
always
Steps to Reproduce:
oc get sriovnetworknodestates.sriovnetwork.openshift.io -o yaml
Actual results:
4. No 'Vfs' are created
5. oc logs sriov-network-config-daemon-pc8gl
daemon logs:
I0729 03:56:58.218766 15417 mellanox_plugin.go:59] mellanox-plugin OnNodeStateAdd()
I0729 03:56:58.218800 15417 mellanox_plugin.go:66] mellanox-Plugin OnNodeStateChange()
I0729 03:56:58.218813 15417 mellanox_plugin.go:267] mellanox-plugin isMlnxNicAndInNode(): device 0000:5e:00.0
I0729 03:56:58.218823 15417 mellanox_plugin.go:181] mellanox-plugin getMlnxNicFwData(): for device 0000:5e:00.0
I0729 03:56:58.218828 15417 mellanox_plugin.go:252] mellanox-plugin isSinglePortNic(): device 0000:5e:00.0
I0729 03:56:58.218831 15417 mellanox_plugin.go:157] mellanox-plugin mstconfigReadData(): try to read [LINK_TYPE] for device 0000:5e:00.0
I0729 03:56:58.218854 15417 mellanox_plugin.go:169] mellanox-plugin runCommand(): mstconfig [-d 0000:5e:00.0 q LINK_TYPE]
I0729 03:56:58.225057 15417 writer.go:107] setNodeStateStatus(): syncStatus: InProgress, lastSyncError:
E0729 03:56:58.235747 15417 mellanox_plugin.go:163] mellanox-plugin mstconfigReadData(): failed : exit status 3 : -E- Failed to open the device
I0729 03:56:58.235796 15417 mellanox_plugin.go:157] mellanox-plugin mstconfigReadData(): try to read [LINK_TYPE_P2] for device 0000:5e:00.0
I0729 03:56:58.235819 15417 mellanox_plugin.go:169] mellanox-plugin runCommand(): mstconfig [-d 0000:5e:00.0 q LINK_TYPE_P2]
E0729 03:56:58.244693 15417 mellanox_plugin.go:163] mellanox-plugin mstconfigReadData(): failed : exit status 3 : -E- Failed to open the device
E0729 03:56:58.244779 15417 daemon.go:147] nodeStateAddHandler(): plugin mellanox_plugin error: exit status 3
I0729 03:56:58.244822 15417 daemon.go:240] nodeStateChangeHandler(): Interface not changed
W0729 03:56:58.244845 15417 daemon.go:115] Got an error: exit status 3
E0729 03:56:58.244916 15417 start.go:105] failed to run daemon: exit status 3
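When triaging logs like these, it can help to filter for error lines only. The sketch below assumes the usual klog header layout (severity letter, mmdd date, time, thread id, file:line, then the message); the function names are hypothetical.

```python
import re

# klog header: Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg
KLOG_RE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread_id>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] ?"
    r"(?P<message>.*)$"
)


def parse_klog_line(line):
    """Return the header fields and message as a dict, or None if the line does not match."""
    match = KLOG_RE.match(line.strip())
    return match.groupdict() if match else None


def error_lines(log_text):
    """Return parsed entries whose severity is E (error) or F (fatal)."""
    entries = (parse_klog_line(line) for line in log_text.splitlines())
    return [e for e in entries if e and e["severity"] in ("E", "F")]


if __name__ == "__main__":
    import sys

    for entry in error_lines(sys.stdin.read()):
        print(f'{entry["file"]}:{entry["line"]}: {entry["message"]}')
```

Piping the output of `oc logs sriov-network-config-daemon-pc8gl` into this script would list only the failing calls, such as the two `mstconfigReadData()` failures above.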
Expected results:
VFs can be configured for Mellanox NICs.