OCPBUGS-3057: Retry setting VF MAC address #53
@cgoncalves: This pull request references Jira Issue OCPBUGS-3057, which is invalid. The bug has been updated to refer to the pull request using the external bug tracker.
/jira refresh
@cgoncalves: This pull request references Jira Issue OCPBUGS-3057, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
/cherry-pick release-4.12
@cgoncalves: once the present PR merges, I will cherry-pick it on top of release-4.12 in a new PR and assign it to you.
/lgtm
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: bn222, cgoncalves, SchSeba
@cgoncalves: all tests passed!
@cgoncalves: All pull requests linked via external trackers have merged. Jira Issue OCPBUGS-3057 has been moved to the MODIFIED state.
@cgoncalves: #53 failed to apply on top of branch "release-4.12".
Cherry-pick from k8snetworkplumbingwg/sriov-cni#232
Some NIC drivers (e.g. i40e/iavf) apply a VF MAC address asynchronously when it is set administratively, that is, through the PF. As a result, the PF may already report the VF with the desired MAC address while the VF netdev still has the original one. If we set the MAC address on the VF netdev during this window, the driver returns an error and pod creation fails.
One fix would be to skip setting the MAC address on the VF netdev and rely on the MAC address already set administratively. However, other NIC drivers (e.g. mlx5_core) do not propagate the MAC address down to the VF netdev, so for those drivers we must keep setting the VF MAC address both ways (via the PF and on the VF netdev).
This commit addresses the issue by retrying the VF netdev MAC address set, waiting up to 1 second (5 retries × 200 ms sleep) in case the driver is still propagating the MAC address change down to the VF.
ResetVFConfig resets a VF administratively. We must run ResetVFConfig before ReleaseVF because some drivers return an error if we try to reset the VF netdev with trust off. So, reset the VF MAC address via the PF first.
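The ordering constraint can be sketched as follows. Only the names ResetVFConfig and ReleaseVF come from this PR; the `vfManager` interface, the argument-free signatures, `teardownVF`, and the `recorder` fake are assumptions made for illustration, and the real functions may take different parameters.

```go
package main

import "fmt"

// vfManager is a hypothetical interface; the real ResetVFConfig and
// ReleaseVF may have different signatures.
type vfManager interface {
	ResetVFConfig() error // administrative reset via the PF
	ReleaseVF() error     // reset and release of the VF netdev
}

// teardownVF resets the VF via the PF first and only then releases the
// VF netdev, which is the order this change requires.
func teardownVF(m vfManager) error {
	if err := m.ResetVFConfig(); err != nil {
		return fmt.Errorf("resetting VF config via PF: %w", err)
	}
	if err := m.ReleaseVF(); err != nil {
		return fmt.Errorf("releasing VF netdev: %w", err)
	}
	return nil
}

// recorder is a fake vfManager that records the order of calls.
type recorder struct{ calls []string }

func (r *recorder) ResetVFConfig() error {
	r.calls = append(r.calls, "ResetVFConfig")
	return nil
}

func (r *recorder) ReleaseVF() error {
	r.calls = append(r.calls, "ReleaseVF")
	return nil
}
```

Passing a `*recorder` to `teardownVF` makes it possible to check the call order in a unit test without touching real hardware.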
Signed-off-by: Carlos Goncalves <cgoncalves@redhat.com>