Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

daemon: Ignore cilium_* interfaces when deriving NodePort device #16104

Merged
merged 1 commit into from
May 19, 2021
Merged

daemon: Ignore cilium_* interfaces when deriving NodePort device #16104

merged 1 commit into from
May 19, 2021

Conversation

eyanulis
Copy link
Contributor

Any Cilium-created interface (cilium_host, etc) will never be a valid
interface for kube-proxy-replacement NodePort (or direct routing). In
certain cases, it is possible for the NodePort auto-derivation code to
select one of these interfaces. This notably happens when the k8s node
IP is an IPv6 address: the node IP is cloned to cilium_host, and the IP
(sans netmask) is used as a map key - so cilium_host may be viewed as
the only interface with an address matching the node IP.

Add a check bypassing any interface whose name is prefixed with
cilium_ during NodePort device detection.

Add a test mimicking the IPv6 cilium_host case: node IP assigned to a
"real" interface and a cilium_foo interface, we should ignore
cilium_foo.

Fixes: #16019

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Thanks for contributing!

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 12, 2021
@maintainer-s-little-helper
Copy link

Commit 102d4515a6b8c84d8aca5f947e568c42ca1b0953 does not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. and removed dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. labels May 12, 2021
@eyanulis eyanulis marked this pull request as ready for review May 12, 2021 03:39
@eyanulis eyanulis requested review from a team and jibi May 12, 2021 03:39
@borkmann borkmann added kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. labels May 12, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 12, 2021
@borkmann borkmann added area/daemon Impacts operation of the Cilium daemon. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels May 12, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 12, 2021
@borkmann borkmann added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. needs-backport/1.10 labels May 12, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.10.0-rc2 May 12, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.9.7 May 12, 2021
Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! Looks good, just two nits.

// ifIndex or if its name begins with cilium_.
if link, err := netlink.LinkByIndex(a.LinkIndex); err != nil {
log.WithError(err).WithField(logfields.LinkIndex, a.LinkIndex).Warn(
"Unable to resolve link from ifIndex, skipping interface")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/skipping interface/skipping interface for device detection/.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks.

continue
} else if strings.HasPrefix(link.Attrs().Name, "cilium_") {
log.WithField(logfields.Device, link.Attrs().Name).Debug(
"Skipping Cilium-generated interface in NodePort device detection")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/NodePort device/device/ (it's not only used for NodePort).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks. Also s/in/for/ to be consistent with various messages around it.

if link, err := netlink.LinkByIndex(a.LinkIndex); err != nil {
log.WithError(err).WithField(logfields.LinkIndex, a.LinkIndex).Warn(
"Unable to resolve link from ifIndex, skipping interface")
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: no need for these continues

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that, I've removed them. I had gone back and forth between having an "implicit" and "explicit" else condition and the continues would have been required for the implicit condition, but I forgot to remove them after settling on an explicit else.

@joestringer joestringer added this to Needs backport from master in 1.9.8 May 13, 2021
@joestringer joestringer removed this from Needs backport from master in 1.9.7 May 13, 2021
Any Cilium-created interface (cilium_host, etc) will never be a valid
interface for kube-proxy-replacement NodePort (or direct routing). In
certain cases, it is possible for the NodePort auto-derivation code to
select one of these interfaces. This notably happens when the k8s node
IP is an IPv6 address: the node IP is cloned to cilium_host, and the IP
(sans netmask) is used as a map key - so cilium_host may be viewed as
the only interface with an address matching the node IP.

Add a check bypassing any interface whose name is prefixed with
"cilium_" during NodePort device detection.

Add a test mimicking the IPv6 cilium_host case: node IP assigned to a
"real" interface and a "cilium_foo" interface, we should ignore
"cilium_foo".

Fixes: #16019

Signed-off-by: Eric M. Yanulis <eric@eyanulis.net>
@eyanulis
Copy link
Contributor Author

@brb @jibi Thank you both for the review. Most recent force-push addresses the comments.

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jibi
Copy link
Member

jibi commented May 17, 2021

test-me-please

@eyanulis
Copy link
Contributor Author

It looks like most(/all?) of the failing tests are Github flakes from the first run (HTTP 500 during checkout, etc).

Is there any way to rerun them? I'm not terribly familiar with how this repo is wired up to Github Actions but it doesn't seem like they re-ran after the test-me-please comment above from @jibi.

@jibi jibi closed this May 19, 2021
@jibi jibi reopened this May 19, 2021
@jibi
Copy link
Member

jibi commented May 19, 2021

It looks like most(/all?) of the failing tests are Github flakes from the first run (HTTP 500 during checkout, etc).

Is there any way to rerun them? I'm not terribly familiar with how this repo is wired up to Github Actions but it doesn't seem like they re-ran after the test-me-please comment above from @jibi.

I think GH was having some issues yesterday with actions. Closing and reopening the PR should have restarted them

@joamaki
Copy link
Contributor

joamaki commented May 19, 2021

test-1.16-netnext

@joamaki
Copy link
Contributor

joamaki commented May 19, 2021

The tests relevant to this PR are passing, so marking this ready for merge.

@joamaki joamaki added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 19, 2021
@borkmann borkmann merged commit 9366190 into cilium:master May 19, 2021
@aanm aanm added this to Needs backport from master in 1.10.1 May 19, 2021
@aanm aanm removed this from Needs backport from master in 1.10.0-rc3 May 19, 2021
This was referenced May 21, 2021
@aanm aanm moved this from Needs backport from master to Backport pending to v1.10 in 1.10.1 May 25, 2021
@aanm aanm moved this from Needs backport from master to Backport done to v1.9 in 1.9.8 May 27, 2021
@aanm aanm moved this from Backport pending to v1.10 to Backport done to v1.10 in 1.10.1 Jun 16, 2021
@eyanulis eyanulis deleted the gh16019-exclude-cilium-interfaces-nodeport branch August 30, 2021 05:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/daemon Impacts operation of the Cilium daemon. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
No open projects
1.10.1
Backport done to v1.10
1.9.8
Backport done to v1.9
Development

Successfully merging this pull request may close these issues.

Cilium kube-proxy-replacement sometimes binds to cilium_host, breaking NodePort
8 participants