Open
Description
Investigate whether we can rely on the CRI (container runtime) to retry pulling Docker images from different registry hosts (e.g. try Quay first, then Docker Hub).
Determine which CRI implementation the Amazon Linux 2 AMIs use. It may help to SSH into an instance; see https://eksctl.io/usage/schema/#nodeGroups-ssh-enableSsm. Update generate_eks.py to enable SSM before building the cluster:
def default_nodegroup(cluster_config):
    return {
        "ami": "auto",
        "ssh": {"enableSsm": True},
        "iam": {"withAddonPolicies": {"autoScaler": True}},
        "privateNetworking": cluster_config.get("subnet_visibility", "public") != "public",
        "kubeletExtraConfig": {
            "kubeReserved": {"cpu": "150m", "memory": "300Mi", "ephemeral-storage": "1Gi"},
            "kubeReservedCgroup": "/kube-reserved",
            "systemReserved": {"cpu": "150m", "memory": "300Mi", "ephemeral-storage": "1Gi"},
            "evictionHard": {"memory.available": "200Mi", "nodefs.available": "5%"},
        },
    }
In the AWS console, select one of the cluster's worker instances and click Connect; you should then be able to connect to the worker using AWS SSM (Session Manager).
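As an alternative to logging in to a worker, the runtime can probably be read from the node objects themselves, since each node reports it under `status.nodeInfo.containerRuntimeVersion`. The sketch below is only an illustration: the helper names `container_runtimes` and `print_cluster_runtimes` are hypothetical, and it assumes `kubectl` is already configured for the cluster.

```python
import json
import subprocess
from typing import Dict


def container_runtimes(nodes_json: str) -> Dict[str, str]:
    """Map node name -> runtime string, given `kubectl get nodes -o json` output."""
    nodes = json.loads(nodes_json)
    return {
        item["metadata"]["name"]: item["status"]["nodeInfo"]["containerRuntimeVersion"]
        for item in nodes.get("items", [])
    }


def print_cluster_runtimes() -> None:
    """Query the current kubectl context and print one line per node.

    Assumption: kubectl is installed and points at the EKS cluster.
    """
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    for name, runtime in container_runtimes(result.stdout).items():
        print(name, runtime)
```

Calling `print_cluster_runtimes()` should print values of the form `containerd://<version>` or `docker://<version>`, which would answer the question without an SSM session.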
Useful links:
https://github.com/containerd/containerd/blob/master/docs/cri/registry.md#configure-registry-endpoint
https://kubernetes.io/docs/concepts/containers/runtime-class/
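If the workers turn out to run containerd, the registry endpoint configuration from the first link might cover the Quay-then-Docker-Hub fallback: a mirror entry lists several endpoints and containerd tries them in order. The sketch below renders such an entry. It is a starting point under assumptions, not a tested setup: the helper `render_mirror_config` is hypothetical, the snippet assumes the CRI plugin config format described in that link (normally placed in `/etc/containerd/config.toml`), and the fallback only helps if the image is published under the same repository path on both registries.

```python
import json
from typing import Iterable


def render_mirror_config(registry: str, endpoints: Iterable[str]) -> str:
    """Render a containerd CRI registry mirror section as TOML text.

    `registry` is the host named in the image reference; `endpoints` are
    tried in the given order when pulling.
    """
    header = f'[plugins."io.containerd.grpc.v1.cri".registry.mirrors."{registry}"]'
    # json.dumps produces a double-quoted list, which is also valid TOML
    # for plain URL strings.
    endpoint_list = json.dumps(list(endpoints))
    return f"{header}\n  endpoint = {endpoint_list}\n"


# Assumed ordering for this ticket: Quay first, Docker Hub as fallback.
FALLBACK_ENDPOINTS = ["https://quay.io", "https://registry-1.docker.io"]

print(render_mirror_config("quay.io", FALLBACK_ENDPOINTS))
```

Whether this section can be injected into the EKS AMI (for example through a node bootstrap step) is part of what this ticket still needs to establish.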