HookOS embedded images for Tinkerbell workflows do not work in v0.21.1 #9046

@thecloudgarage

What happened:

  • 1st Admin node: Ran cluster creation with EKS Anywhere v0.20.7 using the cluster configuration file provided below. The cluster comes up perfectly.
  • 2nd Admin node: Created another Admin node with EKS Anywhere v0.21.1 and tried creating the cluster with the same configuration. It errors out when the control plane node boots via HookOS. HookOS gets downloaded, but the docker tink logs on the control plane node show an error getting the image2stream image via 127.0.0.1 (please see the snip below).
  • 2nd Admin node: Downgraded EKS Anywhere from v0.21.1 to v0.20.7 and re-ran cluster creation. The cluster came up perfectly, which suggests a problem in v0.21.1 with the HookOS embedded images change that was introduced in v0.21.0.

Snip from the iDRAC console of the control plane node when it fails after booting into HookOS:

[image: iDRAC console screenshot]

Cluster Config YAML:

apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: poc5
spec:
  clusterNetwork:
    cniConfig:
      cilium: {}
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneConfiguration:
    count: 1
    endpoint:
      host: "172.29.198.70"
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: poc5-cp
  datacenterRef:
    kind: TinkerbellDatacenterConfig
    name: poc5
  kubernetesVersion: "1.30"
  managementCluster:
    name: poc5
  workerNodeGroupConfigurations:
  - count: 2
    machineGroupRef:
      kind: TinkerbellMachineConfig
      name: poc5-wk
    name: md-0

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellDatacenterConfig
metadata:
  name: poc5
spec:
  osImageURL: "http://172.29.198.61:8080/ubuntu-2204-kube-1-30.gz"
  tinkerbellIP: "172.29.198.71"
  hookImagesURLPath: "http://172.29.198.61:8080/hook"

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
  name: poc5-cp
spec:
  hardwareSelector:
    type: cp
  osFamily: ubuntu
  users:
  - name: prd
    sshAuthorizedKeys:
    - ssh-rsa <snipped-out>

---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellMachineConfig
metadata:
  name: poc5-wk
spec:
  hardwareSelector:
    type: worker
  osFamily: ubuntu
  users:
  - name: prd
    sshAuthorizedKeys:
    - ssh-rsa <snipped-out>
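
For anyone trying to reproduce this, the sketch below lists the URLs that the TinkerbellDatacenterConfig above points the nodes at, so they can be checked from the admin machine before cluster creation. It is only an illustration, not part of EKS Anywhere: the helper names are hypothetical, and the HookOS file names `vmlinuz-x86_64` and `initramfs-x86_64` are an assumption about what is hosted under `hookImagesURLPath`. It does not test the embedded images served from 127.0.0.1 inside HookOS.

```python
import re
import urllib.error
import urllib.request

# Assumption: the kernel and initramfs names hosted under hookImagesURLPath.
HOOK_ARTIFACTS = ("vmlinuz-x86_64", "initramfs-x86_64")

# The datacenter config from this issue, reproduced as plain text.
SAMPLE_CONFIG = '''
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellDatacenterConfig
metadata:
  name: poc5
spec:
  osImageURL: "http://172.29.198.61:8080/ubuntu-2204-kube-1-30.gz"
  tinkerbellIP: "172.29.198.71"
  hookImagesURLPath: "http://172.29.198.61:8080/hook"
'''


def datacenter_urls(config_text):
    """Pull osImageURL and hookImagesURLPath out of the config text.

    A regex keeps this stdlib-only; a real tool would use a YAML parser.
    """
    urls = {}
    for key in ("osImageURL", "hookImagesURLPath"):
        match = re.search(
            rf'^\s*{key}:\s*"?([^"\s]+)"?\s*$', config_text, re.MULTILINE
        )
        if match:
            urls[key] = match.group(1)
    return urls


def artifact_urls(urls):
    """Build the list of URLs a node would be expected to download."""
    result = []
    if "osImageURL" in urls:
        result.append(urls["osImageURL"])
    base = urls.get("hookImagesURLPath")
    if base:
        result.extend(f"{base.rstrip('/')}/{name}" for name in HOOK_ARTIFACTS)
    return result


def is_reachable(url, timeout=5.0):
    """Return True if an HTTP HEAD request to url succeeds."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        return False


# Print the URLs only; call is_reachable(url) on each one from the admin node.
for url in artifact_urls(datacenter_urls(SAMPLE_CONFIG)):
    print(url)
```

Calling `is_reachable(url)` for each printed URL from the admin node shows whether the web server at 172.29.198.61:8080 is serving the OS image and HookOS files.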

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • EKS Anywhere Release:
eksctl anywhere version
Version: v0.21.1
Release Manifest URL: https://anywhere-assets.eks.amazonaws.com/releases/eks-a/manifest.yaml
Bundle Manifest URL: https://anywhere-assets.eks.amazonaws.com/releases/bundles/83/manifest.yaml
  • EKS Distro Release: v1.30.4-eks-16b398d
