fix: substitute the EC2NodeClass placeholders in staging and production #15

Merged: stxkxs merged 1 commit into main from fix/karpenter-nodeclass-placeholders on May 22, 2026

@stxkxs stxkxs commented May 22, 2026

Summary

The base karpenter-resources EC2NodeClass carries ${CLUSTER_NAME} / ${ENVIRONMENT} placeholders. The dev overlay patches them out with concrete values; staging and production never did. Instead, they defined a karpenter-config configMapGenerator that nothing consumed (no replacements: block, no substitution plugin). As a result, the rendered staging/production EC2NodeClass shipped literal ${CLUSTER_NAME} in the IAM role and in the subnet / security-group tag selectors, so Karpenter can't resolve an instance profile or match any subnet, and node provisioning fails. The unconsumed generator also emitted an orphan karpenter-config ConfigMap into kube-system.
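
For readers unfamiliar with the failure mode, the following is a rough sketch of what such an overlay might have looked like before the fix. It is not copied from the repository: the file path, the base path, and the literal values are assumptions made for illustration. The point it shows is that a configMapGenerator only produces a ConfigMap; plain kustomize performs no `${VAR}` substitution on its own.

```yaml
# overlays/staging/kustomization.yaml (hypothetical path; pre-fix shape, assumed)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base/karpenter-resources    # hypothetical base location

configMapGenerator:
  - name: karpenter-config
    namespace: kube-system
    literals:
      - CLUSTER_NAME=example-staging  # illustrative value, not the real cluster name
      - ENVIRONMENT=staging

# No `replacements:` block and no patch follows, so nothing copies these
# values into the EC2NodeClass. The ${CLUSTER_NAME} / ${ENVIRONMENT} strings
# in the base pass through to the rendered output unchanged, and the
# generated ConfigMap is emitted with no consumer.
```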

Fix: staging and production now patch the EC2NodeClass with their concrete cluster name + environment, exactly as the dev overlay does; the dead configMapGenerator is removed.
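
As an illustration of the approach, an overlay patch along these lines would replace the placeholders with concrete values. This is a minimal sketch under stated assumptions, not the actual diff: the resource name `default`, the `karpenter.k8s.aws/v1` API version, the `karpenter.sh/discovery` tag key, the `environment` tag, and the `example-staging` cluster name are all hypothetical; only the `<cluster>-karpenter-node` role pattern comes from the description above.

```yaml
# overlays/staging/kustomization.yaml (hypothetical path; post-fix shape, assumed)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base/karpenter-resources    # hypothetical base location

patches:
  - patch: |-
      apiVersion: karpenter.k8s.aws/v1   # assumed; must match the base resource
      kind: EC2NodeClass
      metadata:
        name: default                    # hypothetical resource name
      spec:
        # Concrete role name instead of ${CLUSTER_NAME}-karpenter-node
        role: example-staging-karpenter-node
        # Selector lists are restated in full with concrete tag values
        subnetSelectorTerms:
          - tags:
              karpenter.sh/discovery: example-staging   # assumed tag key
        securityGroupSelectorTerms:
          - tags:
              karpenter.sh/discovery: example-staging   # assumed tag key
        tags:
          environment: staging           # assumed home of ${ENVIRONMENT}
```

The selector lists are written out in full because, for custom resources, a merge-style patch is expected to replace list fields wholesale rather than merge individual items; any other selector terms present in the base would need to be restated in the patch.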

Pre-existing bug (predates the agent-runtime program); surfaced by the pre-deploy quality audit.

Test plan

  • kustomize build: all three overlays render a fully resolved EC2NodeClass (no ${...}) and no stray ConfigMap
  • task validate: yamllint and kustomize build pass for all overlays

The base karpenter-resources EC2NodeClass carries ${CLUSTER_NAME} and
${ENVIRONMENT} placeholders. The dev overlay patches the EC2NodeClass with
concrete values, but the staging and production overlays only defined a
karpenter-config configMapGenerator — and nothing ever consumed it (no
replacements block, no substitution plugin). The rendered staging and
production EC2NodeClass therefore shipped the literal strings:
`role: ${CLUSTER_NAME}-karpenter-node` and tag selectors keyed on
${CLUSTER_NAME}, so Karpenter could not resolve an instance-profile role
or match any subnet / security group — node provisioning would fail. The
unconsumed configMapGenerator also emitted an orphan ConfigMap into
kube-system.

Both overlays now patch the EC2NodeClass with their concrete cluster name
and environment, exactly as the dev overlay does, and the dead
configMapGenerator is removed. `kustomize build` for all three overlays
now renders a fully-resolved EC2NodeClass with no placeholders and no
stray ConfigMap.

Pre-existing bug, predating the agent-runtime program — surfaced by the
pre-deploy quality audit.
@github-actions

CI Results

| Check | Status |
|---|---|
| YAML Lint | passed |

| Environment | Kustomize Build |
|---|---|
| dev | passed |
| staging | passed |
| production | passed |

All validations passed.

@stxkxs stxkxs merged commit e92577f into main May 22, 2026
5 checks passed
@stxkxs stxkxs deleted the fix/karpenter-nodeclass-placeholders branch May 22, 2026 19:54