
kubelet.yaml set cgroupDriver: systemd instead of cgroupDriver: cgroupfs in AL2-GPU instances #3005

Closed
DanielAmmar opened this issue Dec 30, 2020 · 2 comments · Fixed by #3007
Labels: kind/bug, priority/critical (should be investigated as soon as possible)


DanielAmmar commented Dec 30, 2020

What happened?
Launched an unmanaged node group with a p3.2xlarge GPU instance (ami-0f23f1b20f58cc97f); however, the kubelet failed to start:

systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-eksclt.al2.conf
   Active: activating (auto-restart) (Result: exit-code) since Wed 2020-12-30 14:16:36 UTC; 4s ago
     Docs: https://github.com/kubernetes/kubernetes
  Process: 22376 ExecStart=/usr/bin/kubelet --node-ip=${NODE_IP} --node-labels=${NODE_LABELS},alpha.eksctl.io/instance-id=${INSTANCE_ID} --max-pods=${MAX_PODS} --register-node=true --register-with-taints=${NODE_TAINTS} --cloud-provider=aws --container-runtime=docker --network-plugin=cni --cni-bin-dir=/opt/cni/bin --cni-conf-dir=/etc/cni/net.d --pod-infra-container-image=${AWS_EKS_ECR_ACCOUNT}.dkr.ecr.${AWS_DEFAULT_REGION}.${AWS_SERVICES_DOMAIN}/eks/pause:3.3-eksbuild.1 --kubeconfig=/etc/eksctl/kubeconfig.yaml --config=/etc/eksctl/kubelet.yaml (code=exited, status=255)
  Process: 22365 ExecStartPre=/sbin/iptables -P FORWARD ACCEPT -w 5 (code=exited, status=0/SUCCESS)
 Main PID: 22376 (code=exited, status=255)

Error message:
failed to run Kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

cat /etc/eksctl/kubelet.yaml shows cgroupDriver: systemd; however, I suspect it should be cgroupDriver: cgroupfs.

The Docker cgroup driver in Amazon Linux 2 (GPU) is set to "cgroupfs" (vs. "systemd" in non-GPU versions).
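
A quick way to confirm the mismatch on an affected node (a sketch, assuming SSH access to the instance):

  # reports "cgroupfs" on the GPU AMI
  docker info --format '{{.CgroupDriver}}'

  # reports "cgroupDriver: systemd", written by eksctl
  grep cgroupDriver /etc/eksctl/kubelet.yaml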

How to reproduce it?
Launch a GPU node group via eksctl v0.35.0.


Versions

$ eksctl version
0.35.0

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.9-eks-d1db3c", GitCommit:"d1db3c46e55f95d6a7d3e5578689371318f95ff9", GitTreeState:"clean", BuildDate:"2020-10-20T22:18:07Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Additional info
I also tried an older GPU AMI version ("ami-0969f51a73874a795") and even leaving the AMI unset, with the same disappointing result.
When I manually changed /etc/systemd/system/kubelet.service.d/10-eksclt.al2.conf to include --cgroup-driver=cgroupfs and restarted the service, the node registered successfully with my cluster.
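
A sketch of that manual change, assuming the ExecStart line lives in the drop-in shown above (run as root on the node):

  # append --cgroup-driver=cgroupfs to the kubelet ExecStart in the drop-in
  sed -i 's|^ExecStart=/usr/bin/kubelet |&--cgroup-driver=cgroupfs |' \
      /etc/systemd/system/kubelet.service.d/10-eksclt.al2.conf
  systemctl daemon-reload
  systemctl restart kubelet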

@Callisto13 Callisto13 added the priority/critical Should be investigated as soon as possible label Dec 31, 2020
@Callisto13 Callisto13 self-assigned this Dec 31, 2020

DanielAmmar commented Dec 31, 2020

A temporary solution is to add the following lines to the ClusterConfig YAML (only in GPU node groups):

preBootstrapCommands: 
- "sed -i 's/cgroupDriver:.*/cgroupDriver: cgroupfs/' /etc/eksctl/kubelet.yaml"
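
A minimal sketch of where that sits in a full ClusterConfig (cluster and nodegroup names below are illustrative):

  apiVersion: eksctl.io/v1alpha5
  kind: ClusterConfig
  metadata:
    name: my-cluster        # hypothetical name
    region: us-west-2       # hypothetical region
  nodeGroups:
    - name: gpu-workers     # hypothetical name
      instanceType: p3.2xlarge
      preBootstrapCommands:
        - "sed -i 's/cgroupDriver:.*/cgroupDriver: cgroupfs/' /etc/eksctl/kubelet.yaml"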


Callisto13 commented Dec 31, 2020

After some digging, here is what's going on:

  • The reasons for setting the cgroupDriver to systemd given in "Have kubelet use systemd cgroup driver on al2 and ubuntu" (#2962) are valid, so we do want to do that for all instance types.
  • That PR added config-writing for /etc/eksctl/kubelet.yaml and /etc/docker/daemon.json, which ensured that the kubelet and the Docker daemon both start with the same driver.
  • However, when it comes to GPU instances, after we write that /etc/docker/daemon.json file it gets removed by /etc/eks/accelerated-docker-custom.sh, which means the daemon does not start with it, we get the mismatch, and the kubelet fails to start.
  • This is because "Amazon EKS optimized accelerated Amazon Linux AMIs" (GPU ones) include the NVIDIA drivers and the nvidia-container-runtime and start docker/containerd with a bunch of flags (the base of which comes from /etc/systemd/system/docker.service.d/nvidia-docker-dropin.conf with some vars from /etc/sysconfig/docker and /run/docker/runtimes.env).
  • When I SSH onto the node, edit /etc/sysconfig/docker to include --exec-opt native.cgroupdriver=systemd in OPTIONS, and restart (sudo systemctl daemon-reload && sudo systemctl restart docker), the kubelet starts and the node joins (sketched after this list).
    • Creating a /etc/docker/daemon.json containing just {"exec-opts":["native.cgroupdriver=systemd"]} and restarting the docker process with sudo pkill -SIGHUP dockerd so docker reloads that config file (not systemctl restart, which removes the file) also works, but is of course not robust (also sketched after this list).
  • I am still playing with finding a "nice" (less bad) way to configure the flags that the EKS-configured service starts with as part of cluster creation, but no joy yet. awslabs/amazon-eks-ami does not seem to be the only thing used to build the GPU images; I think whatever else is used is not open yet.
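
A sketch of those two node-level experiments, assuming SSH access and root on the instance:

  # (a) add the exec-opt to the OPTIONS line that the nvidia drop-in passes to dockerd
  sed -i 's/^OPTIONS="/&--exec-opt native.cgroupdriver=systemd /' /etc/sysconfig/docker
  systemctl daemon-reload && systemctl restart docker

  # (b) write a minimal daemon.json and SIGHUP dockerd so the file is not removed first
  echo '{"exec-opts":["native.cgroupdriver=systemd"]}' > /etc/docker/daemon.json
  pkill -SIGHUP dockerd    # fragile: a full service restart removes the file again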

So, what can be done?

  • Workaround for 0.35.0:
    • a) the solution given by the OP above should get users past this
    • b) "sed -i 's/^OPTIONS=\"/&--exec-opt native.cgroupdriver=systemd /' /etc/sysconfig/docker" will also work in preBootstrapCommands if users want the systemd driver (sketched after this list)
  • Short term options:
  • Long term:
    • a) Find out if there is a nice way to configure the cgroup driver (and other things) that are set in these AMIs (talk to Amazon)
    • b) Ask AWS to set the cgroup driver to systemd in those AMIs (if systemd is available)
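
For completeness, a sketch of workaround (b) expressed as preBootstrapCommands (whether the docker restart is needed depends on whether dockerd has already started at that point, so treat it as an assumption to verify):

  preBootstrapCommands:
    - "sed -i 's/^OPTIONS=\"/&--exec-opt native.cgroupdriver=systemd /' /etc/sysconfig/docker"
    - "systemctl daemon-reload && systemctl restart docker"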

(note: Docker 20.10 has the cgroup driver set to systemd by default, so this problem may be solved in future k8s versions)
