What I expected to happen:
Scale-in activities should take roughly the same amount of time whether the admin container is enabled or disabled.
What actually happened:
Scale-in activities are taking more than 5 minutes when the admin container is enabled.
If it is disabled, the scale-in process takes less than 2 minutes.
Apparently, once SIGTERM hits containerd, systemd starts repeatedly trying to deactivate the mount for what seems to be the admin host container, without success.
How to reproduce the problem:
Spin up a Managed Node Group, or Karpenter Nodepool with Bottlerocket family AMI.
Enable admin container.
Scale-out to any amount of replicas.
Scale-in.
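For reference, the admin container in step 2 is enabled through Bottlerocket user data; a minimal TOML fragment (per Bottlerocket's settings API) looks like:

```toml
# Bottlerocket user data: enable the admin host container
[settings.host-containers.admin]
enabled = true
```

With a Managed Node Group this goes in the launch template user data; with Karpenter it goes in the EC2NodeClass `userData` field.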
Feb 15 10:31:03 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.cNhkWt.mount: Deactivated successfully.
Feb 15 10:31:13 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.V9Sqd1.mount: Deactivated successfully.
Feb 15 10:31:33 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.fb2IM1.mount: Deactivated successfully.
Feb 15 10:31:43 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.8yARBs.mount: Deactivated successfully.
Feb 15 10:31:53 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.nNPatX.mount: Deactivated successfully.
Feb 15 10:32:23 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.wNYNZV.mount: Deactivated successfully.
Feb 15 10:32:43 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.adSYp4.mount: Deactivated successfully.
Feb 15 10:32:53 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.KXK3eY.mount: Deactivated successfully.
Feb 15 10:33:03 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.8na3Hj.mount: Deactivated successfully.
Feb 15 10:33:03 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.Q2oofj.mount: Deactivated successfully.
Feb 15 10:33:23 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.rzEq2c.mount: Deactivated successfully.
Feb 15 10:33:33 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.hIHHGm.mount: Deactivated successfully.
Feb 15 10:33:43 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.cl4hiM.mount: Deactivated successfully.
Feb 15 10:34:03 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.V8Ow0G.mount: Deactivated successfully.
Feb 15 10:34:13 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.7Ys1Dd.mount: Deactivated successfully.
Feb 15 10:34:13 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.DEGKUp.mount: Deactivated successfully.
Feb 15 10:34:43 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.PvgYkQ.mount: Deactivated successfully.
Feb 15 10:34:43 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.DWMUA7.mount: Deactivated successfully.
Feb 15 10:34:53 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.2BYjLl.mount: Deactivated successfully.
Feb 15 10:35:43 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.tlckbN.mount: Deactivated successfully.
Feb 15 10:35:53 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.1par7Q.mount: Deactivated successfully.
Feb 15 10:35:53 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.HdZbur.mount: Deactivated successfully.
Feb 15 10:36:03 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.9QlOtj.mount: Deactivated successfully.
Feb 15 10:36:13 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.Eg7exB.mount: Deactivated successfully.
Feb 15 10:36:23 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.ralHiZ.mount: Deactivated successfully.
Feb 15 10:36:26 ip-192-168-66-3.us-west-2.compute.internal apiserver[971]: 10:36:26 [INFO] Received exec request to localhost:/exec
Feb 15 10:36:26 ip-192-168-66-3.us-west-2.compute.internal apiserver[971]: 10:36:26 [INFO] exec process returned 0
Feb 15 10:36:26 ip-192-168-66-3.us-west-2.compute.internal apiserver[971]: 10:36:26 [INFO] Closing exec connection; message: "0"
Feb 15 10:36:26 ip-192-168-66-3.us-west-2.compute.internal apiserver[971]: 10:36:26 [INFO] Received exec request to localhost:/exec
Feb 15 10:36:33 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.hwUmTa.mount: Deactivated successfully.
Feb 15 10:36:33 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.E7VdUM.mount: Deactivated successfully.
Feb 15 10:36:37 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: Configuration file /etc/systemd/system/kubelet.service.d/exec-start.conf is marked world-inaccessible. This has no effect as configuration data is accessible via APIs without restrictions. Proceeding anyway.
Feb 15 10:36:43 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.087Jfc.mount: Deactivated successfully.
Feb 15 10:37:13 ip-192-168-66-3.us-west-2.compute.internal systemd[1]: run-containerd-runc-k8s.io-b3cd8f645b9345f01fb9a5976473d691862beeff1a60207bc28f7d36a0d4a197-runc.zRVIty.mount: Deactivated successfully.
One thing we noticed is that the container that seems to be problematic is in the k8s-io namespace which means it is not the admin container. I don't think I see anything related to the admin container (though we can't rule out some interaction there).
Can you list the containers running on a host that is in this state?
`enter-admin-container` and use `sudo sheltie`, then `ctr --namespace k8s.io images ls`
Image I'm using:
bottlerocket-aws-k8s-1.28-x86_64-v1.19.1-c325a08b