ci(release): build docker images on native arm64 runners #940
Closed
Split each cross-arch docker build into per-arch build jobs on native ubuntu-24.04-arm/ubuntu-latest runners, then fuse per-arch digests into a multi-arch manifest in a downstream merge job. Eliminates QEMU emulation which stalled the full/latest variants past the 6h job timeout (see run 24516158412). Applies to release.yaml (docker-images, docker-web) and release-beta.yaml (docker-images).
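As a rough illustration of the per-arch build stage described above, the sketch below shows one way such a job could be laid out. It is not the workflow from this PR: the image name `ghcr.io/example-org/example-image`, the action versions, the variant subset, and the use of `digest-<variant>-<arch>` as the artifact name are assumptions made for the example, and variant-specific build arguments are left out.

```yaml
# Sketch only. The job name follows the PR text; the image name, action
# versions, and step details are assumptions, not the actual workflow.
jobs:
  docker-images-build:
    strategy:
      fail-fast: false                  # one failing combination should not cancel the others
      matrix:
        variant: [base, full]           # illustrative subset of variants
        arch: [amd64, arm64]
        include:
          - arch: amd64
            runner: ubuntu-latest       # native x86_64 runner
          - arch: arm64
            runner: ubuntu-24.04-arm    # native arm64 runner, so no QEMU setup step
    runs-on: ${{ matrix.runner }}
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push by digest
        id: build
        uses: docker/build-push-action@v6
        with:
          context: .
          platforms: linux/${{ matrix.arch }}
          # Hypothetical image name; pushes an untagged, digest-addressed image.
          # Variant-specific build args are omitted from this sketch.
          outputs: type=image,name=ghcr.io/example-org/example-image,push-by-digest=true,name-canonical=true,push=true
      - name: Export digest
        run: |
          # Record the digest as an empty file named after its hex value.
          mkdir -p "${RUNNER_TEMP}/digests"
          digest="${{ steps.build.outputs.digest }}"
          touch "${RUNNER_TEMP}/digests/${digest#sha256:}"
      - uses: actions/upload-artifact@v4
        with:
          name: digest-${{ matrix.variant }}-${{ matrix.arch }}
          path: ${{ runner.temp }}/digests/*
          if-no-files-found: error
          retention-days: 1
```

Because each matrix entry builds for exactly one platform on a runner of the same architecture, no emulation is involved; the multi-arch image only comes into existence later, when a merge job combines the uploaded digests.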
vanducng added a commit to vanducng/goclaw that referenced this pull request on Apr 16, 2026
vanducng added a commit to vanducng/goclaw that referenced this pull request on Apr 17, 2026

Validation branch for upstream PR. Contains:
- Native arm64 runners (upstream PR nextlevelbuilder#940 equivalent)
- docker-registry-login composite action
- docker-multiarch reusable workflow
- release.yaml + release-beta.yaml refactored to callers
- ubuntu-24.04 pinned across workflows
- DOCKERHUB_IMAGE: dataplanelabs/goclaw (fork-local patch)
Superseded by #946, which combines this arm64 fix with a DRY refactor (composite action + reusable workflow). Fork-validated on both beta and stable paths.
Root cause

Run 24516158412 showed `docker-images (full)` and `docker-images (latest)` stalled for 6 hours before GHA cancelled them at the job timeout. The `base` (19 min) and `otel` (27 min) variants on the same workflow finished fine.

The bottleneck is QEMU emulating `linux/arm64` on an `amd64` runner. When `ENABLE_PYTHON=true`, the Alpine build layer runs pnpm and pip, and pip compiles pandas/numpy/lxml from source under QEMU, which is effectively single-threaded software emulation of a different ISA. This reliably hangs or takes 10× longer than native.

Fix
Replace the single cross-arch buildx job with a 2-stage pipeline:

Stage 1: `*-build` (per-variant × per-arch, native runners)

- `linux/amd64` jobs run on `ubuntu-latest` (existing free runner)
- `linux/arm64` jobs run on `ubuntu-24.04-arm` (GitHub-hosted ARM runner, free for public repos)
- Per-arch digests are handed to stage 2 (`digest-<variant>-<arch>`)

Stage 2: `*-merge` (per-variant, fuses digests into multi-arch manifest)

- Runs `docker buildx imagetools create` twice, once for GHCR tags and once for Docker Hub tags, using GHCR digests as the source (content-addressable, accepted by both registries)

Same pattern applied to:

- `release.yaml`: `docker-images` → `docker-images-build` + `docker-images-merge`; `docker-web` → `docker-web-build` + `docker-web-merge`
- `release-beta.yaml`: `docker-images` → `docker-images-build` + `docker-images-merge`

`notify-discord.needs` updated to reference `docker-images-merge` and `docker-web-merge`.

Validation
- Checked with `python3 yaml.safe_load` (`actionlint` not installed locally; CI is the authoritative validator)
- No `setup-qemu-action` remaining in either file
- No `linux/amd64,linux/arm64` platform lines remaining
- … `:vX.Y.Z`, `:vX.Y.Z-full`, `:full`, `:latest`, `:beta`, `:beta-full`, etc.
- … `imagetools create` invocations

Checklist

- `notify-discord` needs updated
- `fail-fast: false` on all new matrices
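The merge stage described under Fix could look roughly like the sketch below. It is an illustration under stated assumptions rather than the workflow from this PR: the image names, the `:<variant>` tag scheme, the `DOCKERHUB_USERNAME` / `DOCKERHUB_TOKEN` names, and the action versions are hypothetical, and it assumes each build job uploaded an artifact named `digest-<variant>-<arch>` containing an empty file named after the image digest's hex value.

```yaml
# Sketch only. Job names follow the PR text; image names, tags, secret
# names, and action versions are assumptions.
jobs:
  docker-images-merge:
    needs: docker-images-build
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        variant: [base, full]           # illustrative subset of variants
    permissions:
      contents: read
      packages: write
    steps:
      - name: Download per-arch digests for this variant
        uses: actions/download-artifact@v4
        with:
          path: ${{ runner.temp }}/digests
          pattern: digest-${{ matrix.variant }}-*
          merge-multiple: true
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/login-action@v3
        with:
          username: ${{ vars.DOCKERHUB_USERNAME }}     # hypothetical variable name
          password: ${{ secrets.DOCKERHUB_TOKEN }}     # hypothetical secret name
      - name: Create GHCR manifest
        working-directory: ${{ runner.temp }}/digests
        run: |
          # Each file name in this directory is a digest without the sha256: prefix.
          docker buildx imagetools create \
            -t ghcr.io/example-org/example-image:${{ matrix.variant }} \
            $(printf 'ghcr.io/example-org/example-image@sha256:%s ' *)
      - name: Create Docker Hub manifest from the same GHCR digests
        working-directory: ${{ runner.temp }}/digests
        run: |
          docker buildx imagetools create \
            -t docker.io/example-org/example-image:${{ matrix.variant }} \
            $(printf 'ghcr.io/example-org/example-image@sha256:%s ' *)
      # Downstream jobs such as notify-discord would list this job
      # (and docker-web-merge) under `needs`.
```

Using the GHCR digests as the source for both invocations means the images are built and pushed once per architecture; the second `imagetools create` call only assembles a manifest under the Docker Hub name from content that already exists.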