
review cross-nodegroup ingress rules #419

Closed
errordeveloper opened this issue Jan 10, 2019 · 9 comments · Fixed by #444

Comments

@errordeveloper
Contributor

At present, we only allow access to the majority of ports below 1025 within a single nodegroup. We need to review this, as a user may wish to run pods that listen on port 80, for example. We probably need to open this up; perhaps we can use a shared SG in the cluster stack, or simply allow access on the basis of the VPC CIDR (which is what we have for DNS - #418).
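For illustration, a VPC-CIDR-based rule in the nodegroup stack could look roughly like this (a sketch only; the logical name, port, and CIDR are placeholders, not what we currently generate):

```yaml
# Sketch of a CloudFormation ingress rule that opens a low port (80 here) to
# the whole VPC, in the same spirit as the DNS rule from #418. The logical
# name, port, and CIDR are illustrative placeholders.
IngressWithinVPCPort80:
  Type: AWS::EC2::SecurityGroupIngress
  Properties:
    GroupId: !Ref SG            # the nodegroup security group
    CidrIp: 192.168.0.0/16      # example VPC CIDR; would come from the cluster config
    Description: Allow port 80 from anywhere within the VPC
    IpProtocol: tcp
    FromPort: 80
    ToPort: 80
```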

@mumoshu
Contributor

mumoshu commented Jan 13, 2019

I was wondering if we could just open up everything across nodegroups, while suggesting that users use k8s network policies w/ calico or cilium or whatever for fine-grained access control.
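For example, fine-grained control would then move to a NetworkPolicy enforced by the CNI plugin rather than SGs; names and labels below are made up:

```yaml
# Example NetworkPolicy (enforced by Calico, Cilium, etc.): only pods labelled
# role=frontend may reach the "web" pods on port 80, regardless of which
# nodegroup either pod is scheduled on. Names and labels are illustrative.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-allow-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: frontend
      ports:
        - protocol: TCP
          port: 80
```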

@ozzieba
Contributor

ozzieba commented Jan 14, 2019

In my experience network segregation in Kubernetes is generally applicable at the Namespace/Pod/Service level, rarely if ever by Nodegroup. The standard expectation is that as a cluster user I don't have to think about nodes at all, and I can talk to another pod wherever it happens to be launched. So +1 to a shared SG for all Node Groups.

Re: using the VPC CIDR, I think it's worth thinking about the "existing VPC" case, and whether that might include other resources not in the cluster, which possibly shouldn't have full access. Alternatively, we could introduce a Cluster CIDR config, which would be a subset of the VPC CIDR used for the security group as well as creation of the original Subnets (i.e. #379).
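Purely as a sketch of the idea (none of these fields exist; the shape is hypothetical):

```yaml
# Hypothetical config sketch for the "Cluster CIDR" idea. The clusterCIDR
# field does not exist; it only illustrates carving a subset out of an
# existing VPC for the cluster's subnets and SG rules. IDs are placeholders.
vpc:
  id: vpc-0123456789abcdef0    # pre-existing VPC
  cidr: 10.0.0.0/16            # the whole VPC CIDR
  clusterCIDR: 10.0.128.0/17   # hypothetical subset used for subnets and ingress rules
```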

@errordeveloper
Contributor Author

errordeveloper commented Jan 14, 2019

I think there should be no difference between node-to-node rules within a nodegroup vs across nodegroups. It's actually a question of what node ports should be open between nodes. At present

However, I do think we should provide a way for users to implement advanced security by blocking all ports that are not used to provide vital functionality. The use-case would be to run a sandboxed application that should not be able to access (or be accessed by) anything else inside the cluster at all, or applications that have limited egress and no ingress access at all. I agree that the Kubernetes network policy API serves many use-cases of that kind well, but I also know that in some use-cases regulatory policies are defined around security groups, so there needs to be an option for that when it's needed. Arguably, one may wish to use network policy as well as SG isolation, which provides additional insurance and is implemented on the underlay network that lives completely outside the Kubernetes cluster.

It's unfortunate that pods cannot be segregated into an SG of their own that would be separate from the host network SG.
Also, it should be noted that with an overlay network we would be able to close most of the node ports, and in the case of Weave Net, overlay traffic can actually be encrypted. So in that case, you would want to ensure that only vital host ports are open.

In any case, the question here should really be about which ports should be kept closed between nodes by default. At the moment we open SSH (only when needed), we always open DNS, and then only high ports are open between nodegroups and to the outside world.

I believe the case of the outside world should remain the same, and we should actually add a mode where none of the ports are open, yet without resorting to private subnets (because NAT doesn't come for free - #392).

But certainly we should allow use of all ports starting with e.g. 21, so that one can run an FTP server if they must, and we can provide a config parameter that lets the user specify a range of ports or something like that. But I don't think these ports have to be open outside the VPC; the VPC CIDR is sufficient. As I said above, I believe we should only open higher ports outside the VPC by default, and we do need a way for users to close those easily.
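To make the config-parameter idea concrete, something along these lines (entirely hypothetical field names):

```yaml
# Entirely hypothetical nodegroup config sketch; none of these fields exist.
# It only illustrates the kind of knob discussed above: which port ranges are
# reachable from within the VPC vs. from the outside world.
nodeGroups:
  - name: ng-1
    ingress:
      allowWithinVPC:
        - { protocol: tcp, fromPort: 21, toPort: 65535 }    # e.g. FTP and everything above it, VPC-only
      allowPublic:
        - { protocol: tcp, fromPort: 1025, toPort: 65535 }  # keep only high ports open to the outside world
```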

@errordeveloper
Contributor Author

errordeveloper commented Jan 14, 2019

I think there should be no difference between node-to-node rules within a nodegroup vs across nodegroups.

I suspect there is a bug at the moment, but I've not checked it. I think right now you can run a pod listening on e.g. port 80 and you will get access to it between nodes in one nodegroup, but not between nodegroups. But maybe I'm wrong - could someone on this thread look into it please?

@errordeveloper
Contributor Author

I think we should have a shared SG, as well as per-nodegroup SGs.

By default:

  • shared SG would allow inter-node traffic on all ports and all protocols, with no CIDR-based rules; it would live inside the cluster stack (sketched at the end of this comment)
  • per-nodegroup SGs would be used to control only the following types of traffic:
    • inter-VPC traffic (e.g. for SSH)
    • access to high ports from outside the cluster
    • bindings for the control-plane SG
    • opening any ports to the outside world

Options for sealing a nodegroup:

  • opt-out from shared SG
  • custom ingress rules

We should add the shared SG in new clusters, or during cluster upgrades. When it's missing, we will provide instructions for users on how to add it, perhaps providing an eksctl utils create-shared-node-sg helper to fix existing clusters.
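Roughly, the shared SG in the cluster stack would amount to this (a sketch; logical names are made up):

```yaml
# Sketch of the shared SG living in the cluster stack. Every nodegroup would
# attach it to its nodes; the self-referencing rule allows all inter-node
# traffic on all ports and protocols. Logical names are made up.
SharedNodeSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: Communication between all nodes in the cluster
    VpcId: !Ref VPC
IngressInterNodeAll:
  Type: AWS::EC2::SecurityGroupIngress
  Properties:
    GroupId: !Ref SharedNodeSecurityGroup
    SourceSecurityGroupId: !Ref SharedNodeSecurityGroup
    Description: Allow nodes to communicate with each other on all ports
    IpProtocol: "-1"
```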

@mumoshu
Contributor

mumoshu commented Jan 15, 2019

It's actually a question of what node ports should be open between nodes. At present
snip
It's unfortunate that pods cannot be segregated into an SG of their own that would be separate from host network SG.

I have the same feelings as yours. And that's why I was excited about aws/amazon-vpc-cni-k8s#165, and I've opened aws/amazon-vpc-cni-k8s#208. How about implementing what we can with the former today? Also, let's +1 the latter to see its progress there.

Also, it should be noted that with an overlay network we would be able to close most of the node ports, and in case of Weave Net, overlay traffic can be actually encrypted. So in that case, you would want to ensure that only vital host ports are open.

An overlay network does work in certain use-cases. I've been using one for a long time. But today I'm trying to figure out how I can get meaningful AWS VPC flow logs from k8s workloads, and I believe I can't use an overlay network in that case.

the question here should be really about which ports should be kept close between nodes by default. At the moment we open SSH (only when needed), we always open DNS and then only high ports are open between nodegroups and to the outside world.

DNS could be closed if we segregated node SGs and pod SGs. aws/amazon-vpc-cni-k8s#208 will allow that in a straightforward way.

Also, ENIConfig introduced by aws/amazon-vpc-cni-k8s#165 will help us achieve it today. That is, create a security group dedicated to pods, then create an ENIConfig per subnet sharing that pod security group. Then, we can associate the ENIConfig with the node on startup.
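For example (a sketch; the name, subnet, and SG IDs are placeholders):

```yaml
# Sketch of an ENIConfig (the CRD added by aws/amazon-vpc-cni-k8s#165): pod
# ENIs get their own security group, separate from the node SG. Typically
# there would be one ENIConfig per subnet/AZ; IDs here are placeholders.
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-west-2a
spec:
  subnet: subnet-0123456789abcdef0   # subnet used for the secondary (pod) ENIs
  securityGroups:
    - sg-0123456789abcdef0           # the pod/shared security group
```

The node is then pointed at the matching ENIConfig on startup (via a node annotation, if I remember correctly).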

@mumoshu
Contributor

mumoshu commented Jan 15, 2019

So my idea seems orthogonal to yours.

I'm fine if we could say the shared SG you've suggested is for pods:

shared SG would allow inter-node traffic on all ports and all protocols, no CIDR-based rules, it would live inside the cluster stack

It will initially be associated with all the nodes (and hence pods, because we don't use ENIConfig yet, which means that aws-vpc-cni-k8s reuses the node SGs as pod SGs).

Introducing aws/amazon-vpc-cni-k8s#165 will allow eksctl to associate the shared SG solely to pods, which provides more security.

@errordeveloper
Contributor Author

@mumoshu thanks a lot for the insight on AWS CNI; I've not had the time to dig in, and to be honest I tend to naturally gravitate towards Weave Net. I'll start working on the shared SG today. Adding a shared SG means that we will need to handle some backwards-compatibility cases and provide a way to make updates to the cluster stack, so some plumbing needs to be done.

@errordeveloper
Contributor Author

@mumoshu I've opened #448 with remaining bits, perhaps we should open another issue with your notes on AWS CNI?
