cilium, gops: remap to fixed port to avoid collision with nodeport range #14329

Merged
merged 3 commits into from
Dec 10, 2020

Conversation

borkmann
Member

@borkmann borkmann commented Dec 9, 2020

See commit msg.

@borkmann borkmann added pending-review area/daemon Impacts operation of the Cilium daemon. area/operator Impacts the cilium-operator component area/clustermesh Relates to multi-cluster routing functionality in Cilium. area/kube-proxy Issues related to kube-proxy (not the kube-proxy-free mode). labels Dec 9, 2020
@borkmann borkmann requested review from brb, tklauser, aanm and a team December 9, 2020 13:40
@borkmann borkmann requested review from a team as code owners December 9, 2020 13:40
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 9, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.10.0 Dec 9, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.9.2 Dec 9, 2020
@borkmann borkmann added the release-note/misc This PR makes changes that have no direct user impact. label Dec 9, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 9, 2020
@borkmann
Member Author

borkmann commented Dec 9, 2020

test-me-please

Member

@rolinh rolinh left a comment

Nice catch! Could you please also update Hubble Relay as I think it's also affected (nodeport protection is global)? It also enables gops by default with the default listen address (see hubble-relay/cmd/serve/serve.go).

@borkmann
Member Author

borkmann commented Dec 9, 2020

> Nice catch! Could you please also update Hubble Relay as I think it's also affected (nodeport protection is global)? It also enables gops by default with the default listen address (see hubble-relay/cmd/serve/serve.go).

Ah, good point, that one fell through the cracks :/ will add.

@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Backport done to v1.9 in 1.9.2 Jan 5, 2021
@aanm aanm added release-note/bug This PR fixes an issue in a previous release of Cilium. and removed release-note/misc This PR makes changes that have no direct user impact. labels Jan 20, 2021
@tklauser
Member

Marked for backport to v1.8 as this also fixes a port collision between the gops agent and the proxy, see #13400. I'll send a manual backport PR as cherry-picking is not straightforward.

@tklauser
Member

1.8 backport PR: #15634

@tklauser
Member

Apologies, I forgot to mark this as backport-pending/1.8 when opening #15634. Done so now.

aanm pushed a commit to tklauser/cilium that referenced this pull request Apr 17, 2021
[ upstream commit 7757d31 ]

Manually backported from cilium#14329 to address cilium#13400 for v1.8.

Lee reported that the kube-proxy log had a warning that its bind protection
couldn't bind a specific port in the nodeport range. It turns out gops was
already using this particular port through its auto-binding (127.0.0.1:0).
This means that if gops collides with a NodePort service, we might
not be able to pull gops data from that port, since either kube-proxy or
the kube-proxy-free variant will redirect us to the actual service instead.

Given that it is rather unpredictable which port the agent will bind for
gops, remap it to a fixed default port and add a user-configurable knob
that allows using a different one if necessary. Given the agent, operator,
clustermesh-apiserver and hubble-relay all start the gops listener, add
the --gops-port flag to each of them. The CNI does not have gops enabled
by default but only in debug mode, hence no changes there for now, given
it is unlikely to be used this way in production.

Fixes: cilium#14218
Reported-by: Lee Hu via Slack
Co-authored-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Tobias Klauser <tobias@cilium.io>
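
For readers who want to see the idea from the commit message in isolation, here is a minimal Go sketch. It is not the code from this PR. It uses only the standard library, so the gops agent itself is not started. The range 30000-32767 is the Kubernetes default NodePort range; the constant `defaultGopsPort` and its value, as well as the helper names `inNodePortRange`, `gopsListenAddr` and `autoBoundPort`, are assumptions made up for illustration. Only the `--gops-port` flag name comes from the commit message. Whether an auto-bound port actually lands in the NodePort range depends on the host's ephemeral port range and the cluster's configured NodePort range.

```go
package main

import (
	"flag"
	"fmt"
	"net"
	"strconv"
)

// Default Kubernetes NodePort range. Clusters can be configured with a
// different range, so treat these bounds as an assumption.
const (
	nodePortMin = 30000
	nodePortMax = 32767
)

// defaultGopsPort is a hypothetical fixed default for this sketch. It only
// needs to lie outside the NodePort range.
const defaultGopsPort = 9890

// inNodePortRange reports whether port falls inside the default NodePort range.
func inNodePortRange(port int) bool {
	return port >= nodePortMin && port <= nodePortMax
}

// gopsListenAddr builds the loopback listen address for a fixed port.
func gopsListenAddr(port int) string {
	return net.JoinHostPort("127.0.0.1", strconv.Itoa(port))
}

// autoBoundPort binds 127.0.0.1:0, as an auto-binding listener would, and
// returns the port the kernel picked. The listener is closed before returning.
func autoBoundPort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	gopsPort := flag.Int("gops-port", defaultGopsPort, "port for the gops listener")
	flag.Parse()

	// With auto-binding, the port is only known after the bind succeeds.
	if port, err := autoBoundPort(); err == nil {
		fmt.Printf("auto-bound port %d, in NodePort range: %t\n", port, inNodePortRange(port))
	}

	// With a fixed port, a collision can be detected before listening.
	if inNodePortRange(*gopsPort) {
		fmt.Printf("warning: gops port %d is inside the NodePort range\n", *gopsPort)
	}
	fmt.Println("gops would listen on", gopsListenAddr(*gopsPort))
}
```

The point of the fixed port is that a collision becomes something an operator can reason about up front and override with the flag, rather than something that depends on which port the kernel happens to hand out.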
aanm pushed a commit that referenced this pull request Apr 17, 2021
@tklauser tklauser added this to Backport done to v1.8 in 1.8.9 Apr 19, 2021
EricMountain pushed a commit to DataDog/cilium that referenced this pull request Feb 21, 2022
Labels
area/clustermesh Relates to multi-cluster routing functionality in Cilium. area/daemon Impacts operation of the Cilium daemon. area/kube-proxy Issues related to kube-proxy (not the kube-proxy-free mode). area/operator Impacts the cilium-operator component release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.8.9
Backport done to v1.8
1.9.2
Backport done to v1.9
8 participants