Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow startup of AL2023 AMI #1751

Open
stijndehaes opened this issue Apr 5, 2024 · 6 comments
Open

Slow startup of AL2023 AMI #1751

stijndehaes opened this issue Apr 5, 2024 · 6 comments

Comments

@stijndehaes
Copy link

After trying the AL2023 AMI I noticed startup is way slower compared to the AL2 image.
After some help from @ndbaker1 I made the following analysis using systemd-analyze.

I added a plot using systemd-analyze, the nodeadm component appears to be in the hoth path.
With nodeadm-config.service taking 20s before it allows cloud-init.service (7.2s) to start, and then nodeadm-run.service takes 10.53s, I think it is waiting for containerd-service (8.7s) to start though.

Looking at the logs of nodeadm-config, it appears to take a long time to start before the first log(5s), but also takes 13s between starting to configure kubelet and getting the kubelet version.

Looking at the logs of nodeadm-run, most time apears to be spend looking up the sandbox image. I noticed in code this is done using containerd config dump and a regex, for some reason this takes a long time.

Not sure what can be done to optimize these steps, but I am willing to help :) I can write golang code and make custom AMI's with custom nodeadm if needed to test things out.

plot

nodeadm-config logs, via journalctl -u nodeadm-config

Apr 04 08:56:49 localhost systemd[1]: Starting nodeadm-config.service - EKS Nodeadm Config...
Apr 04 08:56:54 localhost nodeadm[1693]: {"level":"info","ts":1712221014.1338603,"caller":"init/init.go:49","msg":"Checking user is root.."}
Apr 04 08:56:54 localhost nodeadm[1693]: {"level":"info","ts":1712221014.1339903,"caller":"init/init.go:57","msg":"Loading configuration..","configSource":"imds://user-data"}
Apr 04 08:56:54 localhost nodeadm[1693]: {"level":"info","ts":1712221014.2056038,"caller":"init/init.go:66","msg":"Loaded configuration","config":{"metadata":{"creationTimestamp":null},"spec":{"cluster":{"name":"datafy-dp-eks-dev-sdh","apiServerEndpoin>
Apr 04 08:56:54 localhost nodeadm[1693]: {"level":"info","ts":1712221014.2059212,"caller":"init/init.go:68","msg":"Enriching configuration.."}
Apr 04 08:56:54 localhost nodeadm[1693]: {"level":"info","ts":1712221014.2059379,"caller":"init/init.go:148","msg":"Fetching instance details.."}
Apr 04 08:56:54 localhost nodeadm[1693]: SDK 2024/04/04 08:56:54 DEBUG attempting waiter request, attempt count: 1
Apr 04 08:56:55 localhost nodeadm[1693]: {"level":"info","ts":1712221015.7738488,"caller":"init/init.go:161","msg":"Instance details populated","details":{"id":"i-048abcd8a00795a14","region":"eu-west-1","type":"m5.xlarge","availabilityZone":"eu-west-1c>
Apr 04 08:56:55 localhost nodeadm[1693]: {"level":"info","ts":1712221015.7739344,"caller":"init/init.go:162","msg":"Fetching default options..."}
Apr 04 08:56:55 localhost nodeadm[1693]: {"level":"info","ts":1712221015.7748718,"caller":"init/init.go:170","msg":"Default options populated","defaults":{"sandboxImage":"602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/pause:3.5"}}
Apr 04 08:56:55 localhost nodeadm[1693]: {"level":"info","ts":1712221015.7749023,"caller":"init/init.go:73","msg":"Validating configuration.."}
Apr 04 08:56:55 localhost nodeadm[1693]: {"level":"info","ts":1712221015.7749126,"caller":"init/init.go:78","msg":"Creating daemon manager.."}
Apr 04 08:56:55 localhost nodeadm[1693]: {"level":"info","ts":1712221015.7760847,"caller":"init/init.go:96","msg":"Configuring daemons..."}
Apr 04 08:56:55 localhost nodeadm[1693]: {"level":"info","ts":1712221015.7761178,"caller":"init/init.go:103","msg":"Configuring daemon...","name":"containerd"}
Apr 04 08:56:55 localhost nodeadm[1693]: {"level":"info","ts":1712221015.7761524,"caller":"containerd/config.go:38","msg":"Writing containerd config to file..","path":"/etc/containerd/config.toml"}
Apr 04 08:56:55 localhost nodeadm[1693]: {"level":"info","ts":1712221015.894721,"caller":"init/init.go:107","msg":"Configured daemon","name":"containerd"}
Apr 04 08:56:55 localhost nodeadm[1693]: {"level":"info","ts":1712221015.8947728,"caller":"init/init.go:103","msg":"Configuring daemon...","name":"kubelet"}
Apr 04 08:57:08 localhost nodeadm[1693]: {"level":"info","ts":1712221028.7341733,"caller":"kubelet/config.go:300","msg":"Detected kubelet version","version":"v1.28.5"}
Apr 04 08:57:08 localhost nodeadm[1693]: {"level":"info","ts":1712221028.7360287,"caller":"kubelet/config.go:218","msg":"Setup IP for node","ip":"10.0.3.248"}
Apr 04 08:57:08 localhost nodeadm[1693]: {"level":"info","ts":1712221028.7370236,"caller":"kubelet/config.go:351","msg":"Writing kubelet config to file..","path":"/etc/kubernetes/kubelet/config.json"}
Apr 04 08:57:09 localhost nodeadm[1693]: {"level":"info","ts":1712221029.3084157,"caller":"init/init.go:107","msg":"Configured daemon","name":"kubelet"}
Apr 04 08:57:09 localhost systemd[1]: nodeadm-config.service: Deactivated successfully.
Apr 04 08:57:09 localhost systemd[1]: Finished nodeadm-config.service - EKS Nodeadm Config.

nodeadm-run logs, via journalctl -u nodeadmrun:

Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal systemd[1]: Starting nodeadm-run.service - EKS Nodeadm Run...
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.5944886,"caller":"init/init.go:49","msg":"Checking user is root.."}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.5945692,"caller":"init/init.go:57","msg":"Loading configuration..","configSource":"imds://user-data"}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.5981271,"caller":"init/init.go:66","msg":"Loaded configuration","config":{"metadata":{"creationTimestamp":null},"spec":{"cluster":{"name":"datafy-dp>
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.5983663,"caller":"init/init.go:68","msg":"Enriching configuration.."}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.5983822,"caller":"init/init.go:148","msg":"Fetching instance details.."}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: SDK 2024/04/04 08:57:17 DEBUG attempting waiter request, attempt count: 1
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.72048,"caller":"init/init.go:161","msg":"Instance details populated","details":{"id":"i-048abcd8a00795a14","region":"eu-west-1","type":"m5.xlarge",">
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7205393,"caller":"init/init.go:162","msg":"Fetching default options..."}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7216415,"caller":"init/init.go:170","msg":"Default options populated","defaults":{"sandboxImage":"602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/p>
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.721685,"caller":"init/init.go:73","msg":"Validating configuration.."}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7216954,"caller":"init/init.go:78","msg":"Creating daemon manager.."}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7227976,"caller":"init/init.go:112","msg":"Setting up system aspects..."}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7228348,"caller":"init/init.go:115","msg":"Setting up system aspect..","name":"local-disk"}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7228491,"caller":"system/local_disk.go:26","msg":"Not configuring local disks!"}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7228668,"caller":"init/init.go:119","msg":"Set up system aspect","name":"local-disk"}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.722877,"caller":"init/init.go:115","msg":"Setting up system aspect..","name":"networking"}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.722903,"caller":"system/networking.go:79","msg":"writing eks_primary_eni_only network configuration"}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7329752,"caller":"init/init.go:119","msg":"Set up system aspect","name":"networking"}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7330172,"caller":"init/init.go:128","msg":"Ensuring daemon is running..","name":"containerd"}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7368443,"caller":"init/init.go:132","msg":"Daemon is running","name":"containerd"}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7369227,"caller":"init/init.go:134","msg":"Running post-launch tasks..","name":"containerd"}
Apr 04 08:57:17 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221037.7369375,"caller":"containerd/sandbox.go:20","msg":"Looking up current sandbox image in containerd config.."}
Apr 04 08:57:25 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221045.1754317,"caller":"containerd/sandbox.go:32","msg":"Found sandbox image","image":"602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/pause:3.5"}
Apr 04 08:57:25 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221045.1754868,"caller":"containerd/sandbox.go:34","msg":"Fetching ECR authorization token.."}
Apr 04 08:57:25 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221045.2011657,"caller":"containerd/sandbox.go:48","msg":"Pulling sandbox image..","image":"602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/pause:3.5"}
Apr 04 08:57:25 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: E0404 08:57:25.202548    1918 remote_image.go:135] PullImage "602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/pause:3.5" from image service failed: rpc error: code = Unavailable d>
Apr 04 08:57:27 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221047.207119,"caller":"containerd/sandbox.go:48","msg":"Pulling sandbox image..","image":"602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/pause:3.5"}
Apr 04 08:57:28 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221048.0419524,"caller":"containerd/sandbox.go:53","msg":"Finished pulling sandbox image","image-ref":"sha256:6996f8da07bd405c6f82a549ef041deda57d1d658ec20>
Apr 04 08:57:28 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221048.0420063,"caller":"init/init.go:138","msg":"Finished post-launch tasks","name":"containerd"}
Apr 04 08:57:28 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221048.042022,"caller":"init/init.go:128","msg":"Ensuring daemon is running..","name":"kubelet"}
Apr 04 08:57:28 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221048.0442576,"caller":"init/init.go:132","msg":"Daemon is running","name":"kubelet"}
Apr 04 08:57:28 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221048.0443828,"caller":"init/init.go:134","msg":"Running post-launch tasks..","name":"kubelet"}
Apr 04 08:57:28 ip-10-0-3-248.eu-west-1.compute.internal nodeadm[1918]: {"level":"info","ts":1712221048.044407,"caller":"init/init.go:138","msg":"Finished post-launch tasks","name":"kubelet"}
Apr 04 08:57:28 ip-10-0-3-248.eu-west-1.compute.internal systemd[1]: nodeadm-run.service: Deactivated successfully.
Apr 04 08:57:28 ip-10-0-3-248.eu-west-1.compute.internal systemd[1]: Finished nodeadm-run.service - EKS Nodeadm Run.

Originally posted by @stijndehaes in #1696 (comment)

@stijndehaes
Copy link
Author

@cartermckinnon suggested it might have something to do with EBS lazy block fetching resulting in slow startup for the executables we need: #1696 (comment)

@stijndehaes
Copy link
Author

stijndehaes commented Apr 5, 2024

One way we can speed things up is to make sure that kubelet and containerd are loaded earlier in the chain.
We could for example:

  • have containerd and kubelet systemd service started at boot, and restarted by nodeadm after configuring, but this could have a big impact
  • 2 systemd oneshot service that just call kubelet --version and containerd --version, this sounds easy to me, but we need to make sure they are started first in the chain, and don't block the rest

@cartermckinnon
Copy link
Member

cartermckinnon commented Apr 5, 2024

We can test the EBS hypothesis by enabling fast-restore on the snapshot: https://docs.aws.amazon.com/ebs/latest/userguide/ebs-fast-snapshot-restore.html

This is pretty expensive so I understand if you don't want to give it a try; but if you can update the PR description with your launch steps we can have consistency across our experiments @stijndehaes

@stijndehaes
Copy link
Author

@cartermckinnon I am willing to give it a try.
As to your launch steps question, I am not enterily sure I understand what you want. Can you elaborate?

@stijndehaes
Copy link
Author

stijndehaes commented Apr 8, 2024

Here is the new result, with fast restore on the snapshot enabled.
Sadly I see roughly the same timing, I also see similar delays in the logs.

plot

sh-5.2# journalctl -u nodeadm-run
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal systemd[1]: Starting nodeadm-run.service - EKS Nodeadm Run...
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.2066824,"caller":"init/init.go:49",">
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.206749,"caller":"init/init.go:57","m>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.2111814,"caller":"init/init.go:66",">
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.2115018,"caller":"init/init.go:68",">
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.211519,"caller":"init/init.go:148",">
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: SDK 2024/04/08 10:05:55 DEBUG attempting waiter request, attempt cou>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3542361,"caller":"init/init.go:161",>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3543026,"caller":"init/init.go:162",>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3551373,"caller":"init/init.go:170",>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3551688,"caller":"init/init.go:73",">
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3551803,"caller":"init/init.go:78",">
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3563209,"caller":"init/init.go:112",>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.356349,"caller":"init/init.go:115",">
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.356361,"caller":"system/local_disk.g>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.356374,"caller":"init/init.go:119",">
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3563807,"caller":"init/init.go:115",>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3564048,"caller":"system/networking.>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3663833,"caller":"init/init.go:119",>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3664064,"caller":"init/init.go:128",>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3696218,"caller":"init/init.go:132",>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3697126,"caller":"init/init.go:134",>
Apr 08 10:05:55 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570755.3697586,"caller":"containerd/sandbox>
Apr 08 10:06:00 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570760.477823,"caller":"containerd/sandbox.>
Apr 08 10:06:00 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570760.4778726,"caller":"containerd/sandbox>
Apr 08 10:06:00 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570760.516175,"caller":"containerd/sandbox.>
Apr 08 10:06:00 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: E0408 10:06:00.530553    1914 remote_image.go:135] PullImage "602401>
Apr 08 10:06:02 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570762.5316176,"caller":"containerd/sandbox>
Apr 08 10:06:03 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570763.0406716,"caller":"containerd/sandbox>
Apr 08 10:06:03 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570763.0407143,"caller":"init/init.go:138",>
Apr 08 10:06:03 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570763.0407345,"caller":"init/init.go:128",>
Apr 08 10:06:03 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570763.0431368,"caller":"init/init.go:132",>
Apr 08 10:06:03 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570763.0432801,"caller":"init/init.go:134",>
Apr 08 10:06:03 ip-10-0-3-27.eu-west-1.compute.internal nodeadm[1914]: {"level":"info","ts":1712570763.0433705,"caller":"init/init.go:138",>
Apr 08 10:06:03 ip-10-0-3-27.eu-west-1.compute.internal systemd[1]: nodeadm-run.service: Deactivated successfully.
Apr 08 10:06:03 ip-10-0-3-27.eu-west-1.compute.internal systemd[1]: Finished nodeadm-run.service - EKS Nodeadm Run.
sh-5.2# journalctl -u nodeadm-config
Apr 08 10:05:33 localhost systemd[1]: Starting nodeadm-config.service - EKS Nodeadm Config...
Apr 08 10:05:37 localhost nodeadm[1687]: {"level":"info","ts":1712570737.4740438,"caller":"init/init.go:49","msg":"Checking user is root.."}
Apr 08 10:05:37 localhost nodeadm[1687]: {"level":"info","ts":1712570737.4741585,"caller":"init/init.go:57","msg":"Loading configuration..">
Apr 08 10:05:37 localhost nodeadm[1687]: {"level":"info","ts":1712570737.5820394,"caller":"init/init.go:66","msg":"Loaded configuration","c>
Apr 08 10:05:37 localhost nodeadm[1687]: {"level":"info","ts":1712570737.5823655,"caller":"init/init.go:68","msg":"Enriching configuration.>
Apr 08 10:05:37 localhost nodeadm[1687]: {"level":"info","ts":1712570737.5824058,"caller":"init/init.go:148","msg":"Fetching instance detai>
Apr 08 10:05:38 localhost nodeadm[1687]: SDK 2024/04/08 10:05:38 DEBUG attempting waiter request, attempt count: 1
Apr 08 10:05:38 localhost nodeadm[1687]: {"level":"info","ts":1712570738.696702,"caller":"init/init.go:161","msg":"Instance details populat>
Apr 08 10:05:38 localhost nodeadm[1687]: {"level":"info","ts":1712570738.6967807,"caller":"init/init.go:162","msg":"Fetching default option>
Apr 08 10:05:38 localhost nodeadm[1687]: {"level":"info","ts":1712570738.6998506,"caller":"init/init.go:170","msg":"Default options populat>
Apr 08 10:05:38 localhost nodeadm[1687]: {"level":"info","ts":1712570738.699901,"caller":"init/init.go:73","msg":"Validating configuration.>
Apr 08 10:05:38 localhost nodeadm[1687]: {"level":"info","ts":1712570738.6999104,"caller":"init/init.go:78","msg":"Creating daemon manager.>
Apr 08 10:05:38 localhost nodeadm[1687]: {"level":"info","ts":1712570738.7011337,"caller":"init/init.go:96","msg":"Configuring daemons..."}
Apr 08 10:05:38 localhost nodeadm[1687]: {"level":"info","ts":1712570738.7011614,"caller":"init/init.go:103","msg":"Configuring daemon...",>
Apr 08 10:05:38 localhost nodeadm[1687]: {"level":"info","ts":1712570738.701194,"caller":"containerd/config.go:38","msg":"Writing container>
Apr 08 10:05:38 localhost nodeadm[1687]: {"level":"info","ts":1712570738.746515,"caller":"init/init.go:107","msg":"Configured daemon","name>
Apr 08 10:05:38 localhost nodeadm[1687]: {"level":"info","ts":1712570738.74656,"caller":"init/init.go:103","msg":"Configuring daemon...","n>
Apr 08 10:05:47 localhost nodeadm[1687]: {"level":"info","ts":1712570747.596923,"caller":"kubelet/config.go:300","msg":"Detected kubelet ve>
Apr 08 10:05:47 localhost nodeadm[1687]: {"level":"info","ts":1712570747.5988646,"caller":"kubelet/config.go:218","msg":"Setup IP for node">
Apr 08 10:05:47 localhost nodeadm[1687]: {"level":"info","ts":1712570747.5999613,"caller":"kubelet/config.go:351","msg":"Writing kubelet co>
Apr 08 10:05:48 localhost nodeadm[1687]: {"level":"info","ts":1712570748.362136,"caller":"init/init.go:107","msg":"Configured daemon","name>
Apr 08 10:05:48 localhost systemd[1]: nodeadm-config.service: Deactivated successfully.
Apr 08 10:05:48 localhost systemd[1]: Finished nodeadm-config.service - EKS Nodeadm Config.

@dims
Copy link
Member

dims commented Apr 8, 2024

@stijndehaes the output for journalctl -u nodeadm-run is truncated (only 140 characters per line...) can you please capture the full output? (like you did in your issue text right at the top?

from the initial log, it seems like containerd config dump could be taking time?

zap.L().Info("Looking up current sandbox image in containerd config..")
// capture the output of a `containerd config dump`, which is the final
// containerd configuration used after all of the applied transformations
dump, err := exec.Command("containerd", "config", "dump").Output()
if err != nil {
return err
}
matches := containerdSandboxImageRegex.FindSubmatch(dump)
if matches == nil {
return fmt.Errorf("sandbox image could not be found in containerd config")
}
sandboxImage := string(matches[1])
zap.L().Info("Found sandbox image", zap.String("image", sandboxImage))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants