Current EKS instructions break kubelet memory metrics (otel, etc.) #618

@dfsdevops

Spegel version

v0.0.25

Kubernetes distribution

EKS

Kubernetes version

v1.29.8-eks-a737599

CNI

Amazon VPC CNI

Describe the bug

I have been using Karpenter (as well as Terraform) to set the user data field when launching new nodes, with the following spec:

---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
  - alias: al2@v20240910
  userData: |
    set -ex
    mkdir -p /etc/containerd/config.d
    cat > /etc/containerd/config.d/spegel.toml << EOL
    [plugins."io.containerd.grpc.v1.cri".registry]
      config_path = "/etc/containerd/certs.d"
    [plugins."io.containerd.grpc.v1.cri".containerd]
      discard_unpacked_layers = false
    EOL
    /etc/eks/bootstrap.sh

I'm basing this on these instructions: https://github.com/spegel-org/spegel/blob/main/docs/COMPATIBILITY.md#eks

There seems to be an issue: all of our OTel memory usage metrics from the kubeletstats receiver are coming back as 0 rather than the number of bytes. I suspect that the patch is clobbering a containerd config table that has an effect on kubelet metrics. This comment describes a potential reason: awslabs/amazon-eks-ami#1628 (comment)

It might be related to the default_runtime_name key that's being set in the table here:

[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"
discard_unpacked_layers = true
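
One way to check this suspicion on an affected node is to look at the merged configuration that containerd actually loads. The following is a minimal sketch, not a confirmed diagnosis: `has_default_runtime_name` is a hypothetical helper name, and the commented usage assumes `containerd config dump` is run as root on the node.

```shell
# Hypothetical helper: reads a containerd config on stdin and succeeds only
# if a default_runtime_name key is still set somewhere in it.
has_default_runtime_name() {
  grep -Eq '^[[:space:]]*default_runtime_name[[:space:]]*='
}

# Assumed usage on a node (containerd on PATH, run as root):
#   containerd config dump | has_default_runtime_name \
#     && echo "default_runtime_name present" \
#     || echo "default_runtime_name missing"
```

If the key is missing from the dump while the spegel.toml drop-in is in place, that would support the theory that the drop-in replaces the table rather than merging into it.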

My suggestion is that the documentation switch to using sed to override the template, as this seems to avoid the issue. I confirmed that after switching to this in my userData field in Karpenter, I began to see memory metrics in OTel again.

sed -i 's/discard_unpacked_layers = true/discard_unpacked_layers = false/g' /etc/eks/containerd/containerd-config.toml
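
For the docs, the sed override above could be wrapped so that it fails loudly if the template ever stops containing the expected setting. This is only a sketch: `patch_containerd_config` is a hypothetical function name, and it assumes GNU sed (as on the AL2 AMI) and that the template path passed in is the one from the command above.

```shell
# Hypothetical wrapper around the sed override: flips discard_unpacked_layers
# in place and leaves every other key in the file (such as
# default_runtime_name) untouched.
patch_containerd_config() {
  local config_file="$1"
  sed -i 's/discard_unpacked_layers = true/discard_unpacked_layers = false/g' "$config_file"
  # Return non-zero if the file does not end up with the setting disabled,
  # so a silent no-op does not go unnoticed.
  grep -q 'discard_unpacked_layers = false' "$config_file"
}
```

In userData this would be called as `patch_containerd_config /etc/eks/containerd/containerd-config.toml` before `/etc/eks/bootstrap.sh`, in place of writing the spegel.toml drop-in.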

Labels

bug, help wanted
