Figure out what to do about external IPs #97110
@tallclair: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@tallclair: Thanks for sharing the details. Commenting here, since the original issue is closed. This may be a novice question, but does a default-deny CNI network policy and/or mutual auth (e.g. mTLS) act as a mitigating control? |
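(For context, the kind of default-deny policy the commenter is referring to would look roughly like the manifest below. This is a generic illustration with a made-up namespace, not something proposed in this thread, and it does not by itself answer whether such a policy mitigates the externalIP problem.)

```yaml
# Minimal "deny all ingress and egress" NetworkPolicy for one namespace.
# The namespace name is illustrative.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: tenant-a
spec:
  podSelector: {}        # selects every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
```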
Does this have the same root cause as the issue discussed in #22650 and fixed in OpenShift with an admission controller? (OpenShift issue: openshift/origin#7808 and PR: openshift/origin#7810) There is some existing discussion in the k/k thread if the added context helps form a strategy here.

I first encountered the issue a couple of years ago when a tenant accidentally used an external IP incorrectly in one of our clusters, which I discovered while debugging an unreachable etcd node. The external IP Service was specifying the same IP as an etcd node's host IP, and that prevented all nodes running kube-proxy from reaching the etcd node, since traffic was routed to the Service instead. Luckily we had other etcd nodes in the cluster, and by debugging kube-proxy logs we were able to track it down.

I tried to report the issue via HackerOne several months ago, but I was never able to get the steps to reproduce working consistently in all clusters to meet the HackerOne report requirements, which may be similar to @champtar's experience in this comment.

The OPA Gatekeeper based solution is similar to the workaround I came to as well, but it would be great to fix it upstream. I'm sure there are users that depend on external IP, but for what it's worth I have rarely seen it used correctly. |
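(For readers unfamiliar with the feature, the incident described above boils down to a Service manifest along the following lines. This is a hypothetical reconstruction, with made-up names and a documentation-range address standing in for the etcd node's host IP; once kube-proxy programs rules for it, traffic from the nodes destined to that IP is redirected to the Service instead.)

```yaml
# Hypothetical reconstruction of the problematic Service; names and IP are made up.
apiVersion: v1
kind: Service
metadata:
  name: tenant-app
  namespace: tenant-a
spec:
  selector:
    app: tenant-app
  ports:
    - port: 2379          # same port the etcd node listens on
      targetPort: 8080
  externalIPs:
    - 192.0.2.10          # accidentally set to the etcd node's host IP
```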
/cc @bowei |
@chadswen thanks for digging up those old tickets; now I'm not sure I even searched for related open issues before reporting ... Here is my write-up with 18 tests on 3 clusters: https://blog.champtar.fr/K8S_MITM_LoadBalancer_ExternalIPs/ |
#97076 as a proposal for k/k |
I never understood why the data plane is different between LoadBalancer and ExternalIPs. That being said, you can completely remove ExternalIP and replace it by a small static LoadBalancer controller with a simple mapping as config namespace/servicename -> ip, or have people run the MetalLB controller without the MetalLB speaker. |
> you can completely remove ExternalIP and replace it by a small static LoadBalancer controller
A user could. We can't unilaterally rip out a feature. We don't want to pull the rug out from underneath the (relatively small) number of legitimate users.
|
I mean make it configurable, then at some point disable it by default, then even later remove it. |
I'm proposing to give users a choice - completely off, completely on (default, legacy), or enabled subject to your own policy. I haven't seen enough use-cases with enough commonality to warrant trying to add more standard APIs to control it. Users are adopting more general policy control, and I don't think we should swim against that current.
|
> 1. Accept the current state, and promote the externalip-webhook to a built-in admission controller
> 2. Accept the current state, and support the externalip-webhook as a long-term solution

Note that externalip-webhook's allowed-external-ip-cidrs solves the problem from the CVE but is still insecure if there are mutually-untrusting users, because it doesn't stop one user from creating a Service using the same externalIP+port as another user's existing Service. |
Right. To *really* solve this we need a much more robust control API and I am not (yet) convinced that is justified.
|
As a potential reference on how to fix this: we fixed this in Cilium a while ago [0] by ignoring ExternalIP for any traffic from a pod unless the destination IP is mapped to a node IP (GCE case). It appears that this code [1] is trying to do something similar but is specific to bridge-based network implementations.

[0] cilium/cilium@1d8589a
[1] https://github.com/kubernetes/kubernetes/blob/018942a92a90f91261ba63971827f2ec6b64587e/pkg/proxy/ipvs/proxier.go#L1740-L1752
|
The problem is that the feature is too loosely defined. Something like that feels right but is, strictly, a breaking change *and* doesn't solve the whole problem (i.e. I can still steal your IP address).

I just want to break it completely and let people who *need* it define their own policies in their own ways...
|
I'm not sure I understand how you can still steal the IP outside of what ExternalIP was intended for, unless you are referring to a potential restriction that ExternalIPs must not cover the PodIP and ServiceIP CIDRs, which would potentially be even better. I'm not sure what we can do better at the k8s service implementation layer. I agree with your overall statement though. We often see users try and patch over this with network policies. Maybe it would be cleaner to have separate k8s resources for cluster-internal and external services. |
You can have multiple Services with the same EXTERNAL-IP:PORT
|
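(To make that concrete, here is a hypothetical pair of manifests, with made-up names and a documentation-range address, showing two Services in different namespaces claiming the same externalIP and port. Nothing in the core API rejects the second one; which Service actually receives the traffic then depends on the proxy implementation.)

```yaml
# Hypothetical example: two Services in different namespaces claim the
# same externalIP and port. The API server accepts both.
apiVersion: v1
kind: Service
metadata:
  name: legit-service
  namespace: team-a
spec:
  selector:
    app: legit
  ports:
    - port: 443
      targetPort: 8443
  externalIPs:
    - 198.51.100.7
---
apiVersion: v1
kind: Service
metadata:
  name: hijack-service
  namespace: team-b
spec:
  selector:
    app: hijack
  ports:
    - port: 443
      targetPort: 8443
  externalIPs:
    - 198.51.100.7        # same external IP and port as team-a's Service
```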
As an alternative to using externalip-webhook, one can use a KubeMod reject rule like this one. KubeMod rules apply only to the namespace they are deployed to. |
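(A rough sketch of what such a reject rule might look like is below. The field names follow my reading of KubeMod's ModRule CRD and should be verified against the KubeMod documentation; the namespace is illustrative.)

```yaml
# Hedged sketch of a KubeMod ModRule that rejects Services declaring
# spec.externalIPs in its namespace. Verify field names against KubeMod's docs.
apiVersion: api.kubemod.io/v1beta1
kind: ModRule
metadata:
  name: reject-external-ips
  namespace: tenant-a       # ModRules only act on the namespace they live in
spec:
  type: Reject
  rejectMessage: "Services with spec.externalIPs are not allowed in this namespace"
  match:
    # The object is a Service...
    - select: '$.kind'
      matchValue: 'Service'
    # ...and it declares at least one external IP.
    - select: '$.spec.externalIPs[*]'
```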
I concur with @tgraf. It would be much cleaner to separate k8s resources for internal / external cluster services. |
It's good to have that classification. |
Gateway should be the basis for the distinction between internal and external. |
Proposed impl: #97395 |
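(For readers landing here later: to the best of my knowledge, the in-tree outcome of this line of work was the DenyServiceExternalIPs admission plugin, which cluster admins can opt into on the API server; whether that is exactly what #97395 contains should be checked against the PR itself. A sketch of enabling it, assuming a kubeadm-style static Pod manifest for kube-apiserver:)

```yaml
# Fragment of a kube-apiserver static Pod manifest (kubeadm-style layout assumed).
# Adding DenyServiceExternalIPs to the admission plugins makes the API server
# refuse new use of Service.spec.externalIPs cluster-wide.
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        - --enable-admission-plugins=NodeRestriction,DenyServiceExternalIPs
        # ... other flags unchanged ...
```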
Should we close it now @thockin? |
Is this PR the final solution to CVE-2020-8554? Do we have plans to implement the second path, "Attempt to redesign the external IP feature, deprecate the old behavior, and manage the migration to the new feature", in a future version of Kubernetes? |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community. |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community. |
The question by @zhousam still needs to be addressed. If the PR is merged and already live in the now-released k8s 1.21, should this issue be resolved, or is there additional work needed? cc @thockin @tallclair |
/remove-lifecycle rotten |
The field is part of the Services API, which is GA, and it is not likely to be removed; see comment #97110 (comment).
@thockin refers to https://gateway-api.sigs.k8s.io/ ; I think he means that any attempt at a redesign will come from that new API. Hope that answers the remaining questions. |
@aojea: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
For CVE-2020-8554, #97076, we decided we couldn't patch it in-tree, and instead provided workarounds to disable (or allowlist) the feature through admission controls. Now that the issue is public, I'd like to open the conversation about a long-term fix.
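(One commonly cited form of the allowlisting workaround mentioned above is an OPA Gatekeeper constraint built on the K8sExternalIPs template from the gatekeeper-library. The template and parameter names below are taken from that library as I recall them and should be double-checked against it; the IP is just an example allowlist entry.)

```yaml
# Gatekeeper constraint restricting Service externalIPs to an allowlist.
# Assumes the K8sExternalIPs ConstraintTemplate from the gatekeeper-library
# is already installed; verify template and parameter names there.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sExternalIPs
metadata:
  name: external-ips
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Service"]
  parameters:
    allowedIPs:
      - "203.0.113.10"    # example allowlisted address
```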
Unless we missed something (entirely possible), I see a few possible paths forward:
/sig network architecture
/area security
/priority important-soon