Bug 1937991: CNI: Lookup offending interface on NetlinkError #482

Merged

Conversation

dulek
Contributor

@dulek dulek commented Mar 16, 2021

The infamous `NetlinkError: (17, 'File exists')` has started to bite us
again because of changes in when cri-o deletes network namespaces. This
patch is a last-resort, brute-force attempt to fix the problem: on
NetlinkError, kuryr-daemon iterates over *all* network namespaces on the
system and deletes any interface that has the conflicting VLAN ID.

Closes-Bug: 1892388

Change-Id: I6672ed0e0db99a91b68cc4d9e74a33d8a9bcf0ca
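
To illustrate the idea in the description, here is a minimal, pure-Python simulation. It is not the code from this patch. Namespaces are modelled as a plain dict, `NetlinkError` is a stand-in class that only mimics an error carrying a numeric `code`, and every function name is hypothetical. Retrying the creation once after the cleanup is an assumption, as is the simplification that a VLAN ID conflicts across the whole system.

```python
import errno


class NetlinkError(Exception):
    """Stand-in for a netlink library's error type (hypothetical)."""

    def __init__(self, code, msg=None):
        super().__init__(code, msg)
        self.code = code


def find_conflicting(namespaces, vlan_id):
    """Return (namespace, ifname) pairs for every VLAN interface using vlan_id."""
    return [
        (ns_name, iface["name"])
        for ns_name, ifaces in namespaces.items()
        for iface in ifaces
        if iface.get("kind") == "vlan" and iface.get("vlan_id") == vlan_id
    ]


def remove_conflicting(namespaces, vlan_id):
    """Delete interfaces with the conflicting VLAN ID from all namespaces."""
    removed = find_conflicting(namespaces, vlan_id)
    for ns_name in namespaces:
        namespaces[ns_name] = [
            iface
            for iface in namespaces[ns_name]
            if not (iface.get("kind") == "vlan"
                    and iface.get("vlan_id") == vlan_id)
        ]
    return removed


def create_vlan(namespaces, ns_name, ifname, vlan_id):
    """Simulated creation: fails with EEXIST if the VLAN ID is already used."""
    if find_conflicting(namespaces, vlan_id):
        raise NetlinkError(errno.EEXIST, "File exists")
    namespaces.setdefault(ns_name, []).append(
        {"name": ifname, "kind": "vlan", "vlan_id": vlan_id}
    )


def create_vlan_with_cleanup(namespaces, ns_name, ifname, vlan_id):
    """On EEXIST, purge the offending interfaces everywhere and retry once."""
    try:
        create_vlan(namespaces, ns_name, ifname, vlan_id)
    except NetlinkError as exc:
        if exc.code != errno.EEXIST:
            raise  # unrelated netlink failure, do not mask it
        remove_conflicting(namespaces, vlan_id)
        create_vlan(namespaces, ns_name, ifname, vlan_id)
```

In a real daemon the dict would be replaced by calls into a netlink library that enumerates namespaces and links; only the control flow (catch error 17, sweep every namespace, retry) is meant to carry over.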

@openshift-ci-robot

@dulek: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

In response to this:

CNI: Lookup offending interface on NetlinkError

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dulek
Contributor Author

dulek commented Mar 16, 2021

/hold

I need to test this.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 16, 2021
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 16, 2021
@dulek dulek changed the title CNI: Lookup offending interface on NetlinkError Bug 1937991: CNI: Lookup offending interface on NetlinkError Mar 16, 2021
@openshift-ci-robot openshift-ci-robot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Mar 16, 2021
@openshift-ci-robot

@dulek: This pull request references Bugzilla bug 1937991, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (3.11.z) matches configured target release for branch (3.11.z)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (gcheresh@redhat.com), skipping review request.

In response to this:

Bug 1937991: CNI: Lookup offending interface on NetlinkError

@dulek dulek force-pushed the fix-netlink-error branch 5 times, most recently from 643671d to b6c193f Compare March 18, 2021 16:39
@dulek
Contributor Author

dulek commented Mar 19, 2021

/test images

@dulek dulek force-pushed the fix-netlink-error branch 3 times, most recently from c199fe9 to b7c7ede Compare March 19, 2021 17:13
@dulek
Contributor Author

dulek commented Mar 19, 2021

Okay, this version seems to work in both cri-o modes. @MaysaMacedo, @luis5tb, @gryf, can you please take a look? It's fragile code, so a more diligent look is welcome.

@dulek
Contributor Author

dulek commented Mar 22, 2021

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 22, 2021
@dulek dulek requested review from MaysaMacedo, gryf and luis5tb and removed request for celebdor and yboaron March 23, 2021 16:19
@luis5tb
Contributor

luis5tb commented Mar 24, 2021

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 24, 2021
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dulek, luis5tb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 8914de4 into openshift:release-3.11 Mar 24, 2021
@openshift-ci-robot

@dulek: All pull requests linked via external trackers have merged:

Bugzilla bug 1937991 has been moved to the MODIFIED state.

In response to this:

Bug 1937991: CNI: Lookup offending interface on NetlinkError
