breaking: add ...networkPolicy.egressAllowRules and don't allow singleuser pods to access PrivateIPv4 addresses by default #2508
Conversation
Thanks a lot to @consideRatio for bringing this up and engaging with this intently in 2i2c-org/infrastructure#829, and to @sgibson91 for actually coming up with the network policies. We've been running this at 2i2c and Berkeley for a while. Inspired by the REAL BAD Azure security incident https://www.wiz.io/blog/chaosdb-explained-azures-cosmos-db-vulnerability-walkthrough. jupyterhub/kubespawner#545 is the other more secure default that came out of that. Given that the Kubespawner PR is going to go into 2.0, I'd love to get this into 2.0 as well.
Discuss adding another rule that removes any network restriction to
Thanks @yuvipanda for bringing this onwards!

Agreement clarification

Thanks to practical experience reported by @yuvipanda and the 2i2c.org tech team of using tighter egress rules, combined with some practical ideas on how to implement it, I'm now also in favor of dropping permissions to a bare minimum.

Change requests

While I have a lot of suggestions below, it would be great as a first step to arrive at an agreement on the configuration names we go for, as it can become quite messy to change multiple times given how many places we need to update things in.
At this point, I'd like to reach agreement on the configuration naming and structure. Here is my initial proposal. I'd like to ping @manics about this as well, who has reasoned a lot about these network policies in the past.

Configuration proposal 1

 singleuser:
   networkPolicy:
     enabled: true
     ingress: []
+    egressAllowRules:
+      nonLocalIPs: true
+    egress: []
-    egress:
-      - to:
-          - ipBlock:
-              cidr: 0.0.0.0/0
-              except:
-                - 169.254.169.254/32
     interNamespaceAccessLabels: ignore
     allowedIngressPorts: []

I also suggest we make the dnsPorts configuration configurable alongside:

 singleuser:
   networkPolicy:
     enabled: true
     ingress: []
     egressAllowRules:
+      dnsPorts: true
       nonLocalIPs: true
     egress: []
     interNamespaceAccessLabels: ignore
     allowedIngressPorts: []

Related change - not needed to be part of this PR

With this structure, we can also make it clear that we have a special treatment for the cloud metadata server:

 singleuser:
   networkPolicy:
     enabled: true
     ingress: []
     egressAllowRules:
       dnsPorts: true
+      cloudMetadataServer: false
       nonLocalIPs: true
     egress: []
     interNamespaceAccessLabels: ignore
     allowedIngressPorts: []
👍 to the change. The only thing I think we need to highlight in docs is that this feature must be disabled in order to block egress more strictly, because adding a more restrictive egress policy with this in place will have no effect, since this already grants access to the outside world. We might even want to fail if networkPolicy.egress is defined while this is true, since I can't think of a situation where that would behave as intended.
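To illustrate that point with a minimal sketch (the CIDR values here are illustrative, not actual chart output): NetworkPolicy egress rules are purely additive allow-rules, so a narrower rule can never take away what a broader one already grants.

egress:
  # Broad rule: already allows all egress traffic
  - to:
      - ipBlock:
          cidr: 0.0.0.0/0
  # Narrower rule: redundant alongside the broad one - it can only
  # grant access, never restrict it
  - to:
      - ipBlock:
          cidr: 10.0.0.0/8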
@minrk this is why I suggest a change of language where we speak of allowing instead of blocking or restricting, and a concrete suggestion on how we name our config options in #2508 (comment)
I think this is problematic, as it makes some special configuration transform the behavior of other configuration, and that is a pattern I've seen us revert many times as complexity increases. I suggest we stick to anything that is easy to describe and avoid introducing configuration that influences other configuration's behavior. I think #2508 (comment) accomplishes that as well.
@consideRatio I think that proposal's great! Then it's a sort of 'alias' for assembling one egress rule, rather than two that won't make sense together. That seems like a good way to think of it. If we do ever have mutually exclusive config options, I do think it's reasonable to try to fail informatively rather than have undefined behavior, as long as the failure is manageable. But of course, it's ideal if we can achieve the result without creating mutually exclusive config.
It's probably worth clarifying what you mean by "internal" - do you mean internal to the K8s cluster, or internal to an organisation's network? Blocking all non-public traffic makes sense for public JupyterHubs, but I'd be surprised if it was the case for internal hubs inside an organisation's network - one of the reasons for running an internal hub is to safely give access to other (non-k8s) networked resources in an organisation. I think defaulting to blocking non-public traffic from singleuser pods is fine as it improves security by default, though I think we've discussed this before, so it might be worth going back through past discussions to check if we've missed any points. Ideally there'd be an option to block within-cluster traffic only, but I don't think that's possible - instead we could add an example config showing how to correctly block internal k8s traffic whilst still allowing traffic to private IPs outside the cluster CIDR (a sketch follows below). I prefer the
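A minimal sketch of the kind of example config suggested above, assuming a hypothetical cluster pod CIDR of 10.0.0.0/16 - the CIDR values would need adjusting per cluster:

singleuser:
  networkPolicy:
    egress:
      # Allow egress everywhere, including private IPs outside the
      # cluster, while excluding the cluster's own pod CIDR (an assumed
      # value here) and the cloud metadata server.
      - to:
          - ipBlock:
              cidr: 0.0.0.0/0
              except:
                - 10.0.0.0/16         # assumed cluster pod CIDR
                - 169.254.169.254/32  # cloud metadata server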
@manics I just want to warn about using or even thinking about "blocking". We may at most "not allow" with a network policy. Regarding local etc., my definition is what Wikipedia calls private IPv4 addresses: https://en.m.wikipedia.org/wiki/Private_network. The rule we consider, in my mind, is to allow all non-private IPv4 addresses except the cloud metadata server IP.
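Concretely, the private IPv4 ranges in question are the RFC 1918 blocks, plus the link-local range that contains the cloud metadata server IP. A rough sketch of the rule described above (illustrative only, not the chart's verbatim output):

egress:
  - to:
      - ipBlock:
          cidr: 0.0.0.0/0
          except:
            - 10.0.0.0/8      # RFC 1918 private range
            - 172.16.0.0/12   # RFC 1918 private range
            - 192.168.0.0/16  # RFC 1918 private range
            - 169.254.0.0/16  # link-local, includes the cloud metadata server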
Thank you for the extremely detailed feedback, @consideRatio! I really appreciate that. I'm very much in favor of the example you had in Configuration proposal 1, and would like to get that implemented. |
Nomenclature-wise, I also prefer PrivateIPs, as the IPs we are blocking are from the RFC that defines private IP space and calls them private IPs: https://datatracker.ietf.org/doc/html/rfc1918. Things you consider 'local' can also have public IPs - for example, most Kubernetes masters are accessed via public IPs!
@yuvipanda wiee okay! Btw, 100% on replacing references to "local IPs" with references to "private IPs", as that seems to be their common name. Looking back at configuration proposal 1, I wonder if we may want to include something like
I think it'd be good to get

For docs, does it sound reasonable to have 4 examples based on the use-cases discussed in this issue?
Is there anyone else who can also help push this along? I don't want to be a blocker...
I can take this on in due time, @yuvipanda; if anyone wants to go for it, please do! I'll write something here if I start doing actual work later.
@yuvipanda I'm picking up on this now! |
@yuvipanda and I spoke about this work and agreed on a number of concrete action points that we both find acceptable.

networkPolicy:
  egressAllowRules:
    cloudMetadataServer: false
    dnsPortsPrivateIPs: true
    nonPrivateIPs: true
    privateIPs: false

We discussed the use of rules where the
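For reference, a rough sketch of what such flags might render to in the NetworkPolicy - the CIDRs and ports here are assumptions for illustration, not the chart's verbatim output:

egress:
  # dnsPortsPrivateIPs: true - allow DNS lookups against private IPs
  - ports:
      - port: 53
        protocol: UDP
      - port: 53
        protocol: TCP
    to:
      - ipBlock:
          cidr: 10.0.0.0/8
      - ipBlock:
          cidr: 172.16.0.0/12
      - ipBlock:
          cidr: 192.168.0.0/16
  # nonPrivateIPs: true - allow all non-private IPs
  - to:
      - ipBlock:
          cidr: 0.0.0.0/0
          except:
            - 10.0.0.0/8
            - 172.16.0.0/12
            - 192.168.0.0/16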
@yuvipanda I've now made a second pass on the configuration changes according to what we discussed earlier!
yuvipanda left a comment
Still WIP I think, but I love this! Completely approve. Comments need some updating, but otherwise ✔️
@yuvipanda and I have discussed (#2508 (comment)) how network policies work, and we had a disagreement about how we understood them to work. Due to that, the following test was developed. Here are the reproduction steps and a gif demonstrating the result.

- to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
          - 10.72.2.240/32
# Will uncommenting this rule make the netpol allow the pod to establish
# a connection to 10.72.2.240? The answer was yes! NetworkPolicy
# resource rules are additive like this!
# - to:
#     - ipBlock:
#         cidr: 10.72.0.0/16

# Creation of a server exposing port 1080 to try to connect against from the busybox pod
helm upgrade --install mockserver http://www.mock-server.com/mockserver-5.11.1.tgz
# Get the mockserver Pod IP:
kubectl describe pod -l app=mockserver | grep IP
IP=10.72.2.240
# Creation of a busybox pod (separate terminal)
kubectl run -it --rm busybox --image=busybox --restart=Never -- sh
# Run the following in the busybox pod to test the ability
# to create a connection to the webserver repeatedly
watch -tn1 "timeout 1 nc -zv 10.72.2.240 1080"
# Label the busybox pod
kubectl label pod busybox enforce-test-netpol=true
# Restrict the busybox pod by allowing egress to anything
# besides the mockserver's IP
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: test-netpol
spec:
  podSelector:
    matchLabels:
      enforce-test-netpol: "true"
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.72.2.240/32
    # - to:
    #     - ipBlock:
    #         cidr: 10.72.0.0/16
EOF
# Add another rule that allows communication
# to the mockserver's IP
cat <<EOF | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: test-netpol
spec:
  podSelector:
    matchLabels:
      enforce-test-netpol: "true"
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.72.2.240/32
    - to:
        - ipBlock:
            cidr: 10.72.0.0/16
EOF
manics left a comment
A couple of docs suggestions, but feel free to put them somewhere else!
Wow, that is amazing work @consideRatio! That definitely did not match my understanding of how network policies worked, and TIL. Thank you for putting in the effort to make that happen. I'll de-confuse myself over time :) Given this, I think it's alright to have them be mentioned multiple times - I'll admit it's confusing, but that is just how network policies seem to be.
@yuvipanda great! I'm happy to stop coalescing the rules for chart maintainability and readability, but I have no idea about the performance impact - how concerned are you about that? Ah, you wrote on Slack:
Then I'll restructure this back to be standalone rules again! |
@yuvipanda @manics - I'm opinionated towards having easy logic to maintain: separated rules. Currently we have logic that coalesces rules, which leads to easier parsing of the final result. What are you opinionated towards doing - refactor to have separate rules or not? I refactored to coalesced rules motivated by potential performance, but Yuvi doesn't think there is a notable difference, so I'm now opinionated towards separate rules for chart maintainability and comprehensibility. The final rendered ruleset will be more complicated as a netpol ruleset, but also easier to map to the chart config that generated it: like 4/2*3/6 is 1, and 1 is easier to understand in one way, but the expression is easier to map to the configuration that led to it being 1.
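To make the trade-off concrete, a hypothetical sketch with assumed CIDR values: the separate form emits one rule per egressAllowRules flag, while the coalesced form folds them into a single rule that is harder to trace back to the config.

# Separate rules - one per flag:
egress:
  - to:  # e.g. from cloudMetadataServer: true
      - ipBlock:
          cidr: 169.254.169.254/32
  - to:  # e.g. from nonPrivateIPs: true
      - ipBlock:
          cidr: 0.0.0.0/0
          except:
            - 10.0.0.0/8
            - 172.16.0.0/12
            - 192.168.0.0/16

# Coalesced equivalent - fewer rules, harder to map back:
egress:
  - to:
      - ipBlock:
          cidr: 169.254.169.254/32
      - ipBlock:
          cidr: 0.0.0.0/0
          except:
            - 10.0.0.0/8
            - 172.16.0.0/12
            - 192.168.0.0/16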
@yuvipanda @manics @choldgraf this is now ready for review/merge. I think I've addressed all review comments, my own included, at this point. |
This looks awesome to me, @consideRatio! I don't wanna merge this as I opened the PR, but I think this is good to go! |
Thank you @yuvipanda!! Wieee, getting this PR merged is a big win! :)
Merge pull request #2508 from yuvipanda/no-private
Amazing - thanks for the persistence on this 🙂 |

Updated technical change summary by Erik
<hub|proxy.chp|proxy.traefik|singleuser>.networkPolicy.egressAllowRules is introduced. These rules are meant to help adjust the created NetworkPolicy resource rules with a few convenient true/false values. They are all set to true by default, except for the singleuser NetworkPolicy governing the user pods. Using the default configuration, user pods are no longer granted access to private IPv4 addresses.

Also note that with the introduction of egressAllowRules, the configuration to allow outbound network connections moved from .egress to .egressAllowRules. This means that if users have previously configured .egress to not be as permissive, they may experience a permission escalation when .egressAllowRules is introduced. So, anyone configuring .egress and upgrading to version 2 of this chart can set all .egressAllowRules to false.
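A sketch of what that opt-out could look like in the chart's values, shown for singleuser (per the summary above, the same keys exist under hub, proxy.chp, and proxy.traefik):

singleuser:
  networkPolicy:
    egressAllowRules:
      cloudMetadataServer: false
      dnsPortsPrivateIPs: false
      nonPrivateIPs: false
      privateIPs: false
    # with all rules above disabled, previously configured .egress rules
    # remain the only allow-rules in effect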
Original PR description

This PR enables all traffic to the internet from user pods,
but stops allowing access to the private network where other
kubernetes pods or services may be running. Meant to be a
minimally intrusive defense in depth mechanism that prevents
access to unsecured endpoints running on VMs, in the kubernetes
cluster, etc. For example, users often run Prometheus without
any kind of network protection in the same cluster as the
users, and access to prometheus can leak a lot of sensitive
user data.
https://www.wiz.io/blog/chaosdb-explained-azures-cosmos-db-vulnerability-walkthrough
shows the immense security value of restricting access to internal
network from user written code. Most software projects do not have
their own authz/authn layers, and rely on network segmentation for
protection. When network segmentation fails like it does in the Azure security incident,
there is often no secondary line of defense.
Most JupyterHub users talk to the internet, but by default I think
nobody really talks to the internal network. This also only affects
network connections from user code - stuff like mounting NFS shares
are done by the kubelet, and hence are not affected. Connections
to hub infrastructure (proxy, hub, etc) are also not affected,
as they are separately allowed by other NetworkPolicy rules.
Access to http://proxy-public for user code to talk in a hub-agnostic
way to services, as well as access to the full hub URL when autohttps
is used, are also added in this PR.