Latest Ubuntu Cloud Image AMI is packaged with AWS CLI version 1.x which causes /etc/eks/bootstrap.sh to silently misconfigure the cluster DNS when the EKS cluster has a custom Service IP CIDR address #963

Closed
mw-tlhakhan opened this issue Jul 6, 2022 · 8 comments

@mw-tlhakhan
mw-tlhakhan commented Jul 6, 2022

What happened:
When an EKS cluster with a custom Kubernetes Service IP CIDR is created with Ubuntu cloud-image worker nodes, the /etc/eks/bootstrap.sh script silently misconfigures the --cluster-dns argument to kubelet in the /var/snap/kubelet-eks/70/args file.

The Ubuntu cloud-image AMI (us-east-1, EKS 1.21) is ami-04c4f2c4799614025. We found this AMI from the official AWS EKS Ubuntu cloud images catalog https://cloud-images.ubuntu.com/docs/aws/eks/.

# Distro details
root@ip-10-109-4-64:~# lsb_release -a 2> /dev/null
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.4 LTS
Release:        20.04
Codename:       focal

# Kubernetes version
root@ip-10-109-4-64:~# kubectl version --short
Client Version: v1.21.9
Server Version: v1.21.12-eks-a64ea69

What you expected to happen:
The /etc/eks/bootstrap.sh script should configure the --cluster-dns argument to kubelet from the cluster's custom Service IP CIDR and not default to 172.20.0.10.

How to reproduce it (as minimally and precisely as possible):
Here is a link to the /etc/eks/bootstrap.sh script: https://github.com/awslabs/amazon-eks-ami/blob/master/files/bootstrap.sh#L373.

In my test case, I've extracted the command of interest executed by the bootstrap.sh script:

#
# Below is the extracted command from the /etc/eks/bootstrap.sh script, see line 373
#
AWS_DEFAULT_REGION=us-east-1
CLUSTER_NAME=iat-dev-us-east-1-eks
aws eks describe-cluster \
            --region=${AWS_DEFAULT_REGION} \
            --name=${CLUSTER_NAME} \
            --output=text \
            --query 'cluster.{endpoint: endpoint, serviceIpv4Cidr: kubernetesNetworkConfig.serviceIpv4Cidr, serviceIpv6Cidr: kubernetesNetworkConfig.serviceIpv6Cidr, clusterIpFamily: kubernetesNetworkConfig.ipFamily}'

Failure when using AWS CLI version 1.x.x

#
# The Ubuntu cloud-image ami-04c4f2c4799614025 is installed with version 1.18 of the AWS CLI
# AMI image was retrieved for us-east-1 from: https://cloud-images.ubuntu.com/docs/aws/eks/
#
root@ip-10-109-4-64:~# aws --version
aws-cli/1.18.69 Python/3.8.10 Linux/5.13.0-1031-aws botocore/1.16.19

# 
# The aws_describe_cluster script produces None for the Service CIDR address.
#
root@ip-10-109-4-64:~# bash aws_describe_cluster.sh
None    https://6CCB47A35A560104CFDE3CAF89B1A0D6.gr7.us-east-1.eks.amazonaws.com        None    None

Anything else we need to know?:

  • Working when used with AWS CLI version 2.x.x
  • See that the 3rd field contains the custom EKS service IP CIDR address.
#
# This host is installed with version 2.7.4 of the AWS CLI
#
root@test-awscli-1:~# aws --version
aws-cli/2.7.4 Python/3.9.11 Linux/5.10.0-11-amd64 exe/x86_64.debian.11 prompt/off

#
# The same script produces the correct Service CIDR address.
#
root@test-awscli-1:~# bash aws_describe_cluster.sh
ipv4    https://6CCB47A35A560104CFDE3CAF89B1A0D6.gr7.us-east-1.eks.amazonaws.com        10.109.16.0/20  None
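
The two outputs differ in whether the third field is populated, so a caller can guard against the silent failure by checking that field before using it. The following is a minimal sketch, not the logic in bootstrap.sh: the function name service_ipv4_cidr_from_text is hypothetical, and it assumes the field order shown in the outputs above.

```shell
# Hypothetical helper (not part of bootstrap.sh): take one line of
# `aws eks describe-cluster --output=text` output in the field order shown
# above and print the service IPv4 CIDR (third field).
# Returns non-zero when the field is absent or is the literal string "None".
service_ipv4_cidr_from_text() {
  # Split the line on whitespace into $1..$4:
  # ipFamily, endpoint, IPv4 CIDR, IPv6 CIDR.
  # The expansion is deliberately unquoted so the shell does the splitting.
  set -- $1
  if [ -z "${3:-}" ] || [ "$3" = "None" ]; then
    echo "serviceIpv4Cidr missing from describe-cluster output" >&2
    return 1
  fi
  printf '%s\n' "$3"
}

# Usage (the script name is the one used in this report):
#   cidr="$(service_ipv4_cidr_from_text "$(bash aws_describe_cluster.sh)")" || exit 1
```

Failing with a non-zero status makes the missing CIDR visible to the caller instead of letting a default address be written to the kubelet arguments.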

Environment:

  • AWS Region: us-east-1
  • Instance Type(s): r6i.4xlarge
  • EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion):
root@ip-10-109-4-64:~# aws eks describe-cluster --name iat-dev-us-east-1-eks --region us-east-1 --query cluster.platformVersion
"eks.7"
  • Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version):
root@ip-10-109-4-64:~# aws eks describe-cluster --name iat-dev-us-east-1-eks --region us-east-1 --query cluster.version
"1.21"
  • AMI Version:
    • ubuntu-eks/k8s_1.21/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20220623
    • ami-04c4f2c4799614025
  • Kernel (e.g. uname -a):
Linux ip-10-109-4-64 5.13.0-1031-aws #35~20.04.1-Ubuntu SMP Mon Jun 13 22:30:30 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Release information (run cat /etc/eks/release on a node):
root@ip-10-109-4-64:~# cat /etc/eks/release
cat: /etc/eks/release: No such file or directory

Workaround:

  • Update the /var/snap/kubelet-eks/70/args file by hand.
  • OR update the AWS launch template to pass the kube-dns service IP to the /etc/eks/bootstrap.sh script through its --dns-cluster-ip argument.
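
The launch template workaround could look roughly like the user data below. This is an illustration only: dns_ip_from_cidr is a hypothetical helper, and it assumes the kube-dns service address is the .10 host at the start of the service CIDR, which should be confirmed against the running cluster (for example with kubectl -n kube-system get svc kube-dns) before relying on it. The cluster name and CIDR are the values from this report.

```shell
# Hypothetical helper: derive the kube-dns service IP from a service CIDR.
# ASSUMPTION: kube-dns sits at the .10 address of the CIDR's network address.
dns_ip_from_cidr() {
  network="${1%/*}"                  # drop the prefix length: 10.109.16.0
  printf '%s.10\n' "${network%.*}"   # replace the last octet with 10
}

CLUSTER_NAME=iat-dev-us-east-1-eks
SERVICE_IPV4_CIDR=10.109.16.0/20
DNS_CLUSTER_IP="$(dns_ip_from_cidr "$SERVICE_IPV4_CIDR")"

# Pass the address explicitly so bootstrap.sh does not need to look it up
# with the AWS CLI. The script only exists on EKS worker AMIs, hence the guard.
if [ -x /etc/eks/bootstrap.sh ]; then
  /etc/eks/bootstrap.sh "$CLUSTER_NAME" --dns-cluster-ip "$DNS_CLUSTER_IP"
fi
```

Hard-coding the CIDR in the launch template avoids the describe-cluster call for this value, at the cost of keeping the template in sync with the cluster configuration.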

Best fixes:

  • Update the AWS CLI to version 2 in the Ubuntu cloud-image AMI.
  • Update the bootstrap.sh script to require AWS CLI version 2.
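
A version requirement in the script could be enforced with a check along these lines. This is a sketch under stated assumptions, not a patch to bootstrap.sh: both function names are hypothetical, and the parsing assumes the aws-cli/MAJOR.MINOR.PATCH prefix seen in the aws --version outputs in this report.

```shell
# Hypothetical helper: print the major version from an `aws --version`
# string such as "aws-cli/1.18.69 Python/3.8.10 ...".
aws_cli_major_version() {
  version="${1#aws-cli/}"          # strip the leading "aws-cli/"
  printf '%s\n' "${version%%.*}"   # keep everything before the first dot
}

# Hypothetical guard: return non-zero unless the major version is 2 or later.
require_aws_cli_v2() {
  case "$(aws_cli_major_version "$1")" in
    '' | *[!0-9]*)
      echo "cannot parse AWS CLI version: $1" >&2
      return 1
      ;;
    0 | 1)
      echo "AWS CLI version 2 or later is required, found: $1" >&2
      return 1
      ;;
  esac
}

# Usage near the top of a bootstrap script:
#   require_aws_cli_v2 "$(aws --version 2>&1)" || exit 1
```

Exiting early turns the silent misconfiguration into an explicit bootstrap failure that shows up in the instance logs.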
@sbocinec

@mw-tlhakhan I think it would be good to report this issue and request the AWS CLI upgrade in the Ubuntu cloud-images bug tracker https://bugs.launchpad.net/cloud-images that is used to track issues for the Ubuntu EKS AMI

@mw-tlhakhan
Author

@sbocinec, thank you for the pointer. I've created the bug report in that space. Link here: https://bugs.launchpad.net/cloud-images/+bug/1982107.

I would like to request, at minimum, an enhancement to the bootstrap.sh script, or at a higher level, a stated AWS CLI version requirement. It seems clear that the author of bootstrap.sh was expecting version 2 of the AWS CLI.

@lure
lure commented Oct 4, 2022

Another workaround is to specify 172.20.0.0/16 as the cluster service IPv4 CIDR.
Probably the easiest one, if you know it beforehand.

@mw-tlhakhan
Author

@lure, by default AWS EKS assigns service addresses from either the 10.100.0.0/16 or the 172.20.0.0/16 address block. See documentation here: https://docs.aws.amazon.com/eks/latest/APIReference/API_KubernetesNetworkConfigRequest.html.

AWS EKS has supported a configurable cluster service CIDR since around Oct 2020. See post here: https://aws.amazon.com/about-aws/whats-new/2020/10/amazon-eks-supports-configurable-kubernetes-service-ip-address-range/.

The primary reason for using a different Kubernetes service CIDR is the following use case:

...
Previously, Amazon EKS automatically chose a value for this range based on the primary CIDR block of the Amazon VPC used by the cluster. While this worked for most cases, customers with VPCs peered to on-premise networks or other Amazon VPCs found that the EKS chosen Kubernetes service IP address range may conflict with other IP ranges in use across their network.
...
This enables customers with clusters running in a peered or direct connected network environment to ensure that their pods can communicate with external applications on networks outside the cluster.

@lure
lure commented Oct 4, 2022 via email

@cartermckinnon
Member

We now install 2.x CLI instead of relying on the version available in the package manager. Unfortunately I don't have an update on Ubuntu's AMI, we don't track those issues here.

@toabctl
toabctl commented Nov 21, 2022

We now install 2.x CLI instead of relying on the version available in the package manager. Unfortunately I don't have an update on Ubuntu's AMI, we don't track those issues here.

@cartermckinnon you install 2.x in the AWS EKS AMI, not the Ubuntu AMI. Just to clarify.
So this bug is about the Ubuntu EKS Worker images. For the current status, please look at the launchpad bug here: https://bugs.launchpad.net/cloud-images/+bug/1982107

@cartermckinnon
Member

cartermckinnon commented Nov 21, 2022

Correct; my point was the Ubuntu AMI is maintained by Canonical, not EKS. I've relayed this feedback as well.
