
bug(nodeadm): containerd BaseRuntimeSpec config not merged correctly #1928

Closed
vedantrathore opened this issue Aug 19, 2024 · 3 comments · Fixed by #2003
@vedantrathore
What happened:
To set custom rlimits on containerd containers, as described in the documentation, you can pass a custom NodeConfig in the instance user data to override the defaults. However, we use Karpenter to provision nodes, and Karpenter adds its own NodeConfig to the user data to establish cluster connectivity. When we append our custom NodeConfig to the user data, the baseRuntimeSpec key is not reflected in the merged NodeConfig.

User data script:

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: application/node.eks.aws

# Karpenter Generated NodeConfig
---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
metadata:
  creationTimestamp: null
spec:
  cluster:
    apiServerEndpoint: xxxxxxxxxx
    certificateAuthority: xxxxxxxxxxx
    cidr: xxxxxxxxxx
    name: xxxxxxxxxx
  containerd: {}
  instance:
    localStorage: {}
  kubelet:
    config:
      clusterDNS:
      - xxxxxxxxxx
      maxPods: xx
    flags:
    - --node-labels="karpenter.sh/capacity-type=on-demand"

--BOUNDARY
Content-Type: application/node.eks.aws

---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  containerd:
    baseRuntimeSpec:
      process:
        rlimits:
          - type: RLIMIT_NOFILE
            soft: 1024
            hard: 1024

--BOUNDARY--

What you expected to happen:
The expected merged NodeConfig:

apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    apiServerEndpoint: xxxxxxxxxx
    certificateAuthority: xxxxxxxxxxx
    cidr: xxxxxxxxxx
    name: xxxxxxxxxx
  instance:
    localStorage: {}
  kubelet:
    config:
      clusterDNS:
      - xxxxxxxxxx
      maxPods: xx
    flags:
    - --node-labels="karpenter.sh/capacity-type=on-demand"
  containerd:
    baseRuntimeSpec:
      process:
        rlimits:
          - type: RLIMIT_NOFILE
            soft: 1024
            hard: 1024

But the actual result is:

apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    apiServerEndpoint: xxxxxxxxxx
    certificateAuthority: xxxxxxxxxxx
    cidr: xxxxxxxxxx
    name: xxxxxxxxxx
  instance:
    localStorage: {}
  containerd: {}
  kubelet:
    config:
      clusterDNS:
      - xxxxxxxxxx
      maxPods: xx
    flags:
    - --node-labels="karpenter.sh/capacity-type=on-demand"

How to reproduce it (as minimally and precisely as possible):

1. Pass the NodeConfig above in the user data of any EC2 instance, SSH in, and run nodeadm init.
2. The merged NodeConfig should be visible in the logs.

Anything else we need to know?:
I dug into the nodeadm source code to understand why this happens. My understanding is that when we call config.Merge(nodeConfig) here, the merger is set up with custom transformers via mergo.WithTransformers(nodeConfigTransformer{}).
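
For context, the merge wiring looks roughly like this (a simplified sketch; the exact mergo options nodeadm uses may differ, and the import path is assumed to be github.com/imdario/mergo):

func (dst *NodeConfig) Merge(src *NodeConfig) error {
	// mergo walks both structs and consults nodeConfigTransformer.Transformer
	// for each field type; a non-nil merge function replaces the default merge.
	return mergo.Merge(dst, src,
		mergo.WithOverride,
		mergo.WithTransformers(nodeConfigTransformer{}),
	)
}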

The transformer registers custom merge functions for the containerd and kubelet options using type reflection, but for containerd it only covers the Config field, not BaseRuntimeSpec. I reused the logic from transformKubeletConfig and tested the following changes to the transformer to confirm that the merge then works as expected. Please let me know if this makes sense; I'd be happy to send a PR with the fix.

const (
	kubeletFlagsName  = "Flags"
	kubeletConfigName = "Config"

	containerdBaseRuntimeSpecName = "BaseRuntimeSpec"
	containerdConfigName = "Config"
)

type nodeConfigTransformer struct{}

func (t nodeConfigTransformer) Transformer(typ reflect.Type) func(dst, src reflect.Value) error {
	if typ == reflect.TypeOf(ContainerdOptions{}) {
		return func(dst, src reflect.Value) error {
			// merge the containerd Config field, as before
			t.transformContainerdConfig(
				dst.FieldByName(containerdConfigName),
				src.FieldByName(containerdConfigName),
			)

			// new: also merge BaseRuntimeSpec, mirroring transformKubeletConfig
			if err := t.transformContainerdBaseRuntimeSpec(
				dst.FieldByName(containerdBaseRuntimeSpecName),
				src.FieldByName(containerdBaseRuntimeSpecName),
			); err != nil {
				return err
			}

			return nil
		}
	} else if typ == reflect.TypeOf(KubeletOptions{}) {
		return func(dst, src reflect.Value) error {
			t.transformKubeletFlags(
				dst.FieldByName(kubeletFlagsName),
				src.FieldByName(kubeletFlagsName),
			)

			if err := t.transformKubeletConfig(
				dst.FieldByName(kubeletConfigName),
				src.FieldByName(kubeletConfigName),
			); err != nil {
				return err
			}

			return nil
		}
	}
	return nil
}
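
If it mirrors transformKubeletConfig, the new merge function could look roughly like the sketch below. This is a hypothetical sketch, assuming BaseRuntimeSpec is an inline-document map (map[string]runtime.RawExtension) like the kubelet config; the top-level key override is my assumption, and as the discussion below notes, the right merge strategy is still an open question.

func (t nodeConfigTransformer) transformContainerdBaseRuntimeSpec(dst, src reflect.Value) error {
	// hypothetical sketch: both reflect.Values are assumed to wrap a
	// map[string]runtime.RawExtension (k8s.io/apimachinery/pkg/runtime)
	if src.Len() == 0 {
		// nothing in the source document; keep dst as-is
		return nil
	}
	if dst.Len() == 0 {
		// destination is empty; take the source value wholesale
		dst.Set(src)
		return nil
	}
	// both sides populated: copy source keys over destination keys
	// (top-level override; a deep merge of nested values may be preferable)
	dstMap := dst.Interface().(map[string]runtime.RawExtension)
	for key, value := range src.Interface().(map[string]runtime.RawExtension) {
		dstMap[key] = value // maps share storage, so this mutates dst in place
	}
	return nil
}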
@cartermckinnon
Member

@vedantrathore makes sense to me, feel free to open a PR!

@vedantrathore
Author

Sounds good. Would there be a different merge strategy for transformContainerdBaseRuntimeSpec? Currently I just reused the same code as transformKubeletConfig.

@cartermckinnon
Copy link
Member

@vedantrathore makes sense to me; you can add a test case to confirm.
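
A test along these lines could confirm the behavior (a hypothetical sketch; the exact type shapes, such as BaseRuntimeSpec being an inline-document map, are assumptions based on the discussion above):

func TestMergeContainerdBaseRuntimeSpec(t *testing.T) {
	dst := &NodeConfig{} // e.g. the Karpenter-generated config, with containerd: {}
	src := &NodeConfig{
		Spec: NodeConfigSpec{
			Containerd: ContainerdOptions{
				BaseRuntimeSpec: map[string]runtime.RawExtension{
					"process": {Raw: []byte(`{"rlimits":[{"type":"RLIMIT_NOFILE","soft":1024,"hard":1024}]}`)},
				},
			},
		},
	}
	if err := dst.Merge(src); err != nil {
		t.Fatal(err)
	}
	if len(dst.Spec.Containerd.BaseRuntimeSpec) == 0 {
		t.Fatal("expected BaseRuntimeSpec to survive the merge")
	}
}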

@cartermckinnon cartermckinnon changed the title Nodeadm does not merge containerd BaseRuntimeSpec config correctly bug(nodeadm): containerd BaseRuntimeSpec config not merged correctly Aug 23, 2024