
[EKS] [request]: Remove requirement of public IPs on EKS managed worker nodes #607

Closed
atheiman opened this issue Nov 27, 2019 · 53 comments
Labels: EKS Amazon Elastic Kubernetes Service · Proposed Community submitted issue


@atheiman

atheiman commented Nov 27, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request
Remove requirement of public IPs on EKS managed worker nodes. If worker nodes have egress access to the apiserver, and the apiserver can reach the worker nodes in the same VPC by private IP, I don't think a public IP should be required. Assigning public IPs to k8s worker nodes (or any EC2 instances) is a security vulnerability some organizations won't accept.

Which service(s) is this request for?
EKS

update 4/22
This change has now been made; see the details here


update 4/17
We're planning to make the change to managed node groups to stop assigning public IPs to nodes on April 22, 2020.

If you are launching nodes into public subnets, you'll need to change your subnet settings to set mapPublicIpOnLaunch to TRUE so that public IPs are assigned and the nodes can connect to the public cluster endpoint. If you are not using public subnets, starting April 20, you can create a new node group and public IPs will no longer be assigned.
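
For public subnets, that attribute can be flipped with the AWS CLI; a minimal sketch (the subnet ID below is a placeholder, not a value from this thread):

$ aws ec2 modify-subnet-attribute \
    --subnet-id subnet-0123456789abcdef0 \
    --map-public-ip-on-launch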

We'll update on this thread when the change is live.

We wrote a blog announcing this change and how to check the public IP assignment settings for your VPC.

We also wrote a deep dive blog on node networking that explains the various options for subnet and cluster endpoint configuration.

@atheiman atheiman added the Proposed Community submitted issue label Nov 27, 2019
@mikestef9 mikestef9 added the EKS Amazon Elastic Kubernetes Service label Nov 27, 2019
@rtripat

rtripat commented Dec 11, 2019

We are evaluating an API opt-in to disable setting a public IP on managed nodes. However, here is some context on why managed node groups get a public IP assigned in all subnets. From our documentation:

"Amazon EKS managed node groups can be launched in both public and private subnets. The only requirement is for the subnets to have outbound internet access. Amazon EKS automatically associates a public IP to the instances started as part of a managed node group to ensure that these instances can successfully join a cluster.

This ensures compatibility with existing VPCs created using eksctl or the Amazon EKS-vended AWS CloudFormation templates. The public subnets in these VPCs do not have MapPublicIpOnLaunch set to true, so by default instances launched into these subnets are not assigned a public IP address. Creating a public IP address on these instances launched by managed node groups ensures that they have outbound internet access and are able to join the cluster."

@atheiman
Author

atheiman commented Dec 11, 2019

Yeah, that's the part of your documentation I was surprised by. It seems like an unnecessary security hole that goes a long way just to ensure nodes have outbound internet access. Public IPs on EC2 instances are a security violation in lots of organizations.

I really think opting out of public IPs on worker nodes will be a requirement for many organizations to even try out managed worker nodes.

@whereisaaron

whereisaaron commented Dec 12, 2019

How about defaulting to what the user asked for 😄

If a user launches instances in a subnet they have configured with MapPublicIpOnLaunch disabled, the nodes should not get a public IP assigned by default - isn't that the whole point of the setting? It seems unhelpful and counter-intuitive to override users' express request not to add public IP addresses.

Could you instead warn/inform the user in the AWS console or eksctl output that the nodes are launching in a subnet with public IPs disabled? And point them to the documentation to fix that if they made a mistake by disabling public IPs?

Please make this subnet override an optional setting.

@Siva-R

Siva-R commented Dec 25, 2019

In which version of EKS can we expect this to be fixed?

@mikestef9 mikestef9 added this to Researching in containers-roadmap Jan 3, 2020
@omerfsen

+1

@davidalger

davidalger commented Jan 15, 2020

We are evaluating an API opt-in to disable setting a public IP on managed nodes. However, here is some context on why managed node groups get a public IP assigned in all subnets. From our documentation…

While the quoted documentation does mention eksctl- and CloudFormation-template-related reasoning, it seems a rather contrived reason. Require those networks to be upgraded in order to use managed node groups, and provide an upgrade path. The requirement is that "subnets have outbound internet access", which can be satisfied by a NAT gateway on the subnet's route table. In other words, fulfilling the stated requirement of giving the worker node access to the control plane doesn't require a public IP to be associated with the worker node at all.

AWS documentation on Cluster VPC considerations states the following:

We recommend a network architecture that uses private subnets for your worker nodes and public subnets for Kubernetes to create internet-facing load balancers within.

To me it seems rather bizarre that EKS best practices would outline creating both public and private subnets (with the public ones being used for things like ELB creation) yet a major managed feature addition to EKS was launched with a Public IP magically added to the EC2 instances created in a "private" subnet.

IMO an "API opt-in to disable setting public IP on managed nodes" is the wrong way to go about it. It should be opt-in to have public IP addressing applied to instances created in subnets where public addressing is turned off.

@bchav

bchav commented Jan 21, 2020

Hey folks, dropping in to provide some clarification around this to help with some of the security concerns here. This is indeed confusing, and we're working on product changes to make this less so, but it doesn't place your managed nodes in private subnets at additional risk.

Due to the way VPC networking works, managed nodes with a public IP in a private subnet do not behave any differently than regular instances in private subnets. All egress to the internet would require traversal of a NAT Gateway or NAT instance, and the public IP address will not be used. Public IPs require a route to the internet via an Internet Gateway (IGW). As such, the public IPs of these instances are not accessible from the internet. The public IP address is effectively a no-op. The only access to nodes in private subnets comes from other resources that share the "Cluster Security Group".

More about the cluster security group: https://docs.aws.amazon.com/eks/latest/userguide/sec-group-reqs.html#cluster-sg
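
One way to verify this is to confirm that the private subnet's route table has no route through an internet gateway, only the NAT gateway. A minimal sketch with a placeholder subnet ID:

$ aws ec2 describe-route-tables \
    --filters "Name=association.subnet-id,Values=subnet-0123456789abcdef0" \
    --query 'RouteTables[].Routes[].[DestinationCidrBlock,GatewayId,NatGatewayId]' \
    --output table
# An empty result means the subnet uses the VPC's main route table.
# The 0.0.0.0/0 route should target a nat-... ID rather than an igw-... ID.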

@adonley

adonley commented Jan 21, 2020

Even though the above reasoning makes absolute sense, my org requires that we document and port scan any public IP for compliance. We could write a tool that watches for nodes coming and going from node groups and listens for node group creation, but it would make life easier not to have to deal with these issues (it's also less confusing for the security team / auditors ;) )

@prashant-wipro

Any ETA on when this fix will be made available?

@olipachi

olipachi commented Feb 5, 2020

Like some folks have mentioned above, my org is one that explicitly blocks public IPs on EC2 instances. While I understand and appreciate the explanation that these public IPs are essentially no-ops, our security scanning tools don't see it the same way. Any ETA on when this feature will be delivered?

@abdennour

Exposing nodes to the internet is very bad. Network policies can deny egress/ingress, but that operates at the pod level, not the node level.
Any deadline to fix this issue?

@zucler

zucler commented Feb 11, 2020

Even though the nodes can be deployed into private subnets, having a public IP assigned would fail security audits at any of the Fortune 500 companies. They simply don't want tools to dictate their security checklist, and right now they are strict about allowing no public IPs.

@tabern
Contributor

tabern commented Feb 12, 2020

Thanks everyone - your feedback is clear and we’ve been listening.

We agree it's critical to respect the subnet settings used for managed node groups. Our current plan is to update the behavior of managed node groups so that public IPs are not automatically assigned by managed node groups. This change will only apply to new node groups. Existing node groups will be unaffected.

Without taking additional steps, your EC2 instances need outbound internet access to join an EKS cluster. Nodes launched into private subnets typically access the internet through a NAT gateway, while nodes in a public subnet need an Elastic IP or static public IP assigned. Some public subnets (e.g., the default ones created by eksctl today) have the setting “Auto-assign Public IP Address” set to false. Without a public IP, these nodes will fail to become ready since they cannot access the internet.

Once this change is made, if your nodes cannot join the cluster, we want to make the problem simple to diagnose. We're thinking of creating a warning and health issue that explains why your nodes may not be able to connect to the cluster in the case where you are launching nodes into a subnet where “Auto-assign Public IP Address” is set to false. We're also planning to update our CloudFormation VPC templates for public-only VPCs and eksctl to set “Map Public IP address” to true for public subnets created using eksctl or CloudFormation.

We are working on this with high priority and welcome your feedback.

-- Nate && Raghav

@zucler

zucler commented Feb 12, 2020

Hi @tabern, thank you for acknowledging the issue and shedding light on the future plans.

Just want to clarify the outbound internet access requirement. Currently, it is possible to launch managed node groups into a completely private subnet (no internet/NAT gateways whatsoever) by enabling the private endpoint option on the EKS cluster and also creating ECR and EC2 PrivateLink endpoints in the subnet and an S3 gateway endpoint in the associated route table.
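
A rough AWS CLI sketch of that endpoint setup (the region and all IDs below are placeholders, not values from this thread):

REGION=us-east-1
for SVC in com.amazonaws.$REGION.ecr.api \
           com.amazonaws.$REGION.ecr.dkr \
           com.amazonaws.$REGION.ec2; do
  # Interface (PrivateLink) endpoints the nodes need in a fully private subnet
  aws ec2 create-vpc-endpoint \
    --vpc-id vpc-0123456789abcdef0 \
    --vpc-endpoint-type Interface \
    --service-name "$SVC" \
    --subnet-ids subnet-0aaa subnet-0bbb \
    --security-group-ids sg-0ccc \
    --private-dns-enabled
done

# S3 gateway endpoint attached to the private subnets' route table
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.$REGION.s3 \
  --route-table-ids rtb-0ddd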

I hope that the new change won't affect this functionality, and also that the warning won't be triggered incorrectly when nodes have no internet access but are able to join the cluster.

@tabern
Contributor

tabern commented Feb 12, 2020

@zucler that is correct. These are the 'additional steps' I was referencing in my last note. The change here directly enables people doing these steps today by stopping the assignment of public IPs to nodes in private subnets. However, it directly impacts customers running public subnets who do not have this same setup, where nodes can fail to join the cluster. It's only for these public subnets that we would show a warning. Our goal is to balance both sides here so that regardless of your network setup, you get the right information to make sure everything works as you intend.

@fcervantes2

@tabern thank you for your explanation. Is there an ETA for when we will have access to this feature?

@cdenneen

@abdennour

Exposing nodes to the internet is very bad. Network policies can deny egress/ingress, but that operates at the pod level, not the node level.
Any deadline to fix this issue?

As stated by @bchav

All egress to the internet would require traversal of a NAT Gateway or NAT instance, and the public IP address will not be used. Public IPs require a route to the internet via an Internet Gateway (IGW). As such, the public IPs of these instances are not accessible from the internet. The public IP address is effectively a no-op.

@priyavartk

So what I understood is that the "auto assign public IP" feature has no meaning for an EKS managed node group's subnet? It's going to assign the nodes a public IP anyway?

@lukegriffith

What is the ETA or release cadence for this product, and when can we expect this to be available? This is pretty fundamental.

@tabern
Contributor

tabern commented Mar 27, 2020

@cdenneen I'm not sure I follow. Essentially we are doing what we discussed above: respecting the IP assignment settings of the subnet. If you have public subnets and are using the cluster public endpoint, you'll need to ensure those subnets are set to assign public IPs to the nodes, because MNG will no longer assign them. If you are using eksctl and starting nodes with private networking, you do not need to take any action - the nodes will start in private subnets and either route to the public endpoint via the subnet's NAT gateway or go through the private endpoint.
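
As a quick way to check which cluster endpoints are enabled (the cluster name below is just a placeholder):

$ aws eks describe-cluster --name my-cluster \
    --query 'cluster.resourcesVpcConfig.[endpointPublicAccess,endpointPrivateAccess]'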

Your comment above is correct. If you are already using private networking, you don't have to take any action except creating a new node group after we launch the change (we won't change behavior of existing node groups).

@cdenneen

@tabern with eksctl, it will not create a managedNodeGroup if there are no publicSubnets specified. They made that a condition since privateNetworking couldn't be set to true. If this is going to change, then it's something that needs to be relayed back to the eksctl team so an update can coincide.

@tabern
Contributor

tabern commented Mar 30, 2020

@cdenneen yep! We work closely with that team, and this will be one of the things that the change to the MNG IP assignment behavior unblocks.

@karlskewes

We also wrote a deep dive blog on node networking that explains the various options for subnet and cluster endpoint configuration.

Can you please update the link to the deep dive blog? 404 for me.

@tabern
Contributor

tabern commented Apr 17, 2020

Hi everyone,

We are going to delay this change until Wednesday 4/22 in order to ensure our customers who are still creating node groups in public subnets have time to complete updating their subnet settings. One of our top priorities for EKS is operational stability, and we are making this change out of an abundance of caution in order to minimize the risk of outages for our customers.

--Nate

@bkbwese

bkbwese commented Apr 20, 2020

From the main post's 4/17 update:

"If you are not using public subnets, starting April 20, you can create a new node group and public IPs will no longer be assigned."

@tabern I just redeployed my node groups in us-east-1 and they are still getting public IPs. Is there a specific time that this change should be live?

Scratch that, the update in the main post is a bit misleading because there were two dates. I got clarity by reading the blog post https://aws.amazon.com/blogs/containers/upcoming-changes-to-ip-assignment-for-eks-managed-node-groups/

@lado-g

lado-g commented Apr 22, 2020

Today I got an AccessDenied error from the node groups on every EKS cluster.
I spent all day trying to find the problem and found this page.
Okay, I deleted every node group and created a new one in a private subnet, but the instances are still getting public IPs. As I understand it, they should have only a private IP; am I wrong?

@deepakpatel82

I tried creating a new node group and I see a public IP still assigned to the worker EC2 instances, even though the subnets are private. Is this change in place?

@tabern
Contributor

tabern commented Apr 22, 2020

Hey Everyone,

As of 1pm PST today, the behavior for MNG in all regions is to respect subnet settings and not allocate public IPs unless mapPublicIpOnLaunch is set to TRUE in the subnet settings.

This behavior will be in effect for all new managed node groups that you create on any EKS cluster. Existing node groups will continue to assign public IP addresses. You will need to create new node groups to get this behavior for existing clusters.

Also, please be aware that in the case where you are attempting to create a new node group using a public subnet where mapPublicIpOnLaunch is set to FALSE, node group creation will fail.

Check the subnet settings for your VPC:

Run:

$ aws ec2 describe-subnets \
--filters "Name=vpc-id,Values=<VPC-ID>" | grep 'MapPublicIpOnLaunch\|SubnetId\|VpcId\|State'

Learn more:

@tabern tabern closed this as completed Apr 22, 2020
containers-roadmap automation moved this from Coming Soon to Just Shipped Apr 22, 2020
@lado-g

lado-g commented Apr 23, 2020

Okay, but as I wrote before, I tried to create a new node group in a private subnet and it still gets a public IP address. Was this functionality released only to certain regions?

@tabern
Contributor

tabern commented Apr 23, 2020

@lado-g this change is deployed in all regions - we did just launch this at 1pm PST, and your comment was at 5:21 AM PST. Have you tried again?

@vedat227

vedat227 commented Apr 23, 2020

Hey,
I'm getting "NodeCreationFailure: Instances failed to join the kubernetes cluster" now (the same Terraform script was working yesterday). Would this have anything to do with this change? Private access for "API server endpoint access" is enabled.

-- PS --
I checked "/var/log/messages" on the node and saw "ec2.us-east-1.amazonaws.com: Name or service not known" in the logs. It turned out the VPC endpoint configuration was wrong for "ec2". Somehow it worked when the node had a public IP. I fixed the VPC endpoints and it worked...
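
If anyone hits the same "Name or service not known" error, one thing worth checking is whether the interface endpoints exist and have private DNS enabled; a minimal sketch (the VPC ID is a placeholder):

$ aws ec2 describe-vpc-endpoints \
    --filters "Name=vpc-id,Values=vpc-0123456789abcdef0" \
    --query 'VpcEndpoints[].[ServiceName,PrivateDnsEnabled,State]' \
    --output table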

@shawnjohnson

FWIW - I created a new "private" managed node group today, and the worker nodes came up with only private IPs - no public IP assigned - yay!

@kvcmarshall6

@tabern like @vedat227, I am also getting "NodeCreationFailure: Instances failed to join the kubernetes cluster" for CloudFormation scripts that were working fine yesterday. Is there a field we can add to CloudFormation templates that will override this change and assign public IPs as before?

@rtripat

rtripat commented Apr 23, 2020

@katemarshall96 You need to set MapPublicIpOnLaunch to true on your public subnets to assign public IPs by default.

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ec2-subnet.html#cfn-ec2-subnet-mappubliciponlaunch

More details in our blog post.

@kvcmarshall6

Thanks @rtripat - I am actually working on private subnets though

@vedat227

vedat227 commented Apr 23, 2020

Thanks @rtripat - I am actually working on private subnets though

I'm working with private subnets as well. Make sure "API server endpoint access" is enabled on the cluster and that your node, in the private subnet, can access the AWS services (EC2/ECR/...) via an internet gateway or VPC endpoints. I'd double-check the routing rules from the private subnet to the NAT gateway. In my case, connectivity was the problem... good luck :)

@kvcmarshall6

Thanks @vedat227! I tried your suggestion with various combinations of enabling API server endpoint access from within and outside the VPC, and checked routing, but I can't spot anything out of sorts 👎 Stumped on what to do with this one...

@jodem

jodem commented Apr 24, 2020

@katemarshall96 check your VPC DNS settings: https://docs.aws.amazon.com/eks/latest/userguide/cluster-endpoint.html

You can also manually start an instance in the subnet and double-check that all the networking is OK. (My route was pointing to the IGW instead of the NAT gateway, a classic mistake.)
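
A quick way to check (and, if needed, enable) those VPC DNS attributes from the CLI; the VPC ID below is a placeholder:

$ aws ec2 describe-vpc-attribute --vpc-id vpc-0123456789abcdef0 --attribute enableDnsSupport
$ aws ec2 describe-vpc-attribute --vpc-id vpc-0123456789abcdef0 --attribute enableDnsHostnames
$ aws ec2 modify-vpc-attribute --vpc-id vpc-0123456789abcdef0 --enable-dns-hostnames "{\"Value\":true}"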

@kvcmarshall6

I worked around it by using a mix of public and private subnets - I couldn't get it working anymore using only private subnets!

@tabern
Contributor

tabern commented Apr 24, 2020

@katemarshall96 have you followed the steps in the 'Private Only Subnets' section of our cluster networking blog? You need to ensure that the ECR and EC2 PrivateLink endpoints are enabled to use only private subnets now, because the nodes need to be able to connect to ECR and download the images that we vend in order to connect to the cluster (previously they had public IPs, so this was not required).

With public + private subnets, my guess is you have a NAT gateway in your public subnets that is allowing the nodes to route out to connect to ECR, etc., and boot properly.

@mzs114

mzs114 commented Jun 24, 2020

@katemarshall96
I am not sure whether this helps you now (perhaps it will help new people who face a similar problem).

I was trying to set up managed node groups in private subnets and it kept failing (multiple times). As suggested by one commenter, I then checked the private subnets: the route tables had both an IGW and a NAT gateway. I removed the IGW and then had to try twice to get it working; it still failed the first time.

The private subnets had been created (manually) and shared by another team, and I had no idea this could be an issue.

Before this, I also tried enabling both private and public access to the API endpoint just in case; however, this did not solve the problem.
I tried it with Terraform; the vpc_config looks like this:

  vpc_config {
    endpoint_private_access = "true"
    subnet_ids              = var.all_subnet_ids
  }

@abdennour

@shawnjohnson same here as well! I was able to assign private subnets for managed node groups. EKS 1.16, terraform-eks module 12.1.0.

@Shahard2

What about existing nodes?

@aktaksh

aktaksh commented Feb 11, 2022

Plenty of critical apps on our workers rely on the public IP feature, so public IPs should not be removed.

@vmpowercli

Don't know if this feature has already been implemented or not, but I created a new EKS cluster (v1.22) with public and private subnets, and the host got created in the private subnets. I created an LB in the public subnets and am able to hit the pod running on worker nodes in the private subnets.
