fix: pass explicit platform to krane export to fix arm64 skill loading #1487
Signed-off-by: Jeremy Alvis <jeremy.alvis@solo.io>
Pull request overview
Fixes arm64 agent startup failures by ensuring skills-init pulls the correct architecture variant from multi-platform OCI image indexes when running krane export.
Changes:
- Detect node CPU architecture at runtime via `uname -m` (normalized to OCI arch strings).
- Pass an explicit `--platform linux/<arch>` to `krane export` to avoid defaulting to `linux/amd64`.
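
The two changes above can be sketched roughly as follows. This is a minimal illustration, not the template from the PR: the helper name `platform_for`, the image reference, and the position of `--platform` on the command line are assumptions, and the command is printed rather than executed so the sketch runs without `krane` installed.

```shell
#!/bin/sh
# Hypothetical helper: map a `uname -m` value to an OCI platform string,
# using the same sed substitutions as the changed line in this PR.
platform_for() {
  _arch="$(printf '%s\n' "$1" | sed 's/x86_64/amd64/;s/aarch64/arm64/')"
  printf 'linux/%s\n' "$_arch"
}

# Placeholder image reference (assumption, not taken from the PR).
_image="registry.example.com/skills/demo:latest"

# Print the command instead of running it.
echo krane export --platform "$(platform_for "$(uname -m)")" "$_image" /tmp/oci-skill.tar
```

On an arm64 node the printed command would carry `--platform linux/arm64`, so the export no longer relies on the `linux/amd64` default.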
)"
echo "Exporting OCI image ${_image} into ${_dest}"
krane export{{ if $.InsecureOCI }} --insecure{{ end }} "$_image" '/tmp/oci-skill.tar'
_arch="$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/')"
The arch normalization uses a uname -m | sed ... pipeline with substring replacements. This can produce invalid platforms for values that contain those substrings (e.g. aarch64_be -> arm64_be) and makes it hard to fail fast on unsupported architectures. Consider switching to an explicit case mapping (x86_64/amd64 -> amd64, aarch64/arm64 -> arm64) and exiting with a clear error for anything else before invoking krane.
- _arch="$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/')"
+ _uname="$(uname -m)"
+ case "$_uname" in
+   x86_64|amd64)
+     _arch="amd64"
+     ;;
+   aarch64|arm64)
+     _arch="arm64"
+     ;;
+   *)
+     echo "Unsupported architecture for OCI export: ${_uname}" >&2
+     exit 1
+     ;;
+ esac
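
To see the difference the review comment describes, the two normalization strategies can be compared side by side. This is a sketch only: the function names `normalize_sed` and `normalize_case` are hypothetical, and the inputs are passed as arguments instead of being read from `uname -m` so that unusual values can be tried on any machine.

```shell
#!/bin/sh
# sed-based normalization, as in the original line: plain substring replacement.
normalize_sed() {
  printf '%s\n' "$1" | sed 's/x86_64/amd64/;s/aarch64/arm64/'
}

# Explicit mapping, as in the suggestion; returns non-zero for unknown values
# (the suggestion itself exits; `return` is used here so the loop can continue).
normalize_case() {
  case "$1" in
    x86_64|amd64) echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "Unsupported architecture for OCI export: $1" >&2; return 1 ;;
  esac
}

# Compare both strategies on supported and unsupported inputs.
for _uname in x86_64 aarch64 aarch64_be riscv64; do
  _strict="$(normalize_case "$_uname" 2>/dev/null)" || _strict="(rejected)"
  echo "${_uname}: sed=$(normalize_sed "$_uname") case=${_strict}"
done
```

For `aarch64_be` the sed pipeline yields `arm64_be`, which would then be passed to `krane` as a platform, while the explicit mapping rejects it before `krane` is invoked.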
Description
Fixes #1486
- `krane export` defaults to `linux/amd64` when resolving multi-platform OCI image indexes, regardless of the actual node architecture
- On `arm64` nodes, `skills-init` crashes with `no child with platform linux/amd64 in index <image>`, putting the pod in `CrashLoopBackOff` and preventing the agent from becoming `Ready`
- The fix detects the node architecture at runtime via `uname -m` and passes it explicitly to `krane` via `--platform`
Testing
- `arm64` kind cluster (Apple Silicon): `TestE2EInvokeSkillInAgent` passes