
perf: split stack build/push buildx driver and cache strategy #36

Merged: kaiitunnz merged 2 commits into main from kaiitunnz/perf/build-optim on May 12, 2026

@kaiitunnz (Collaborator) commented May 12, 2026

Purpose

Optimize local stack image iteration without giving up the multi-arch and cross-machine cache properties needed for publishing. Previously, stack build and stack push shared one bake invocation that always pulled the registry cache and used whatever buildx builder the user had active: fast for cold-start CI, slow for tight local edit-build-run loops. This PR splits the two paths so each uses the buildx driver and cache layout that fits its workload.

Changes

  • cli/stack/src/flowmesh_cli_stack/stack.py: split _run_bake so load uses the native docker driver with no registry cache, and push uses docker-container with registry cache in/out. Replace the old _ensure_multiplatform_builder_support check with two helpers: _ensure_buildx_builder_ready verifies that the named builder exists and runs the expected driver, and surfaces a create hint for the push builder; _switch_active_buildx_builder auto-switches the active builder, prompting for confirmation unless --force is given. stack build and stack push each gain --builder and -f/--force; the default builders are default (for build) and flowmesh-multiarch (for push).
  • tests/cli/test_stack_build.py: cover the new helpers (driver match / mismatch / missing-with-hint, switch no-op / force / prompt / decline) and assert that load passes --builder with no registry cache, and push passes --builder with cache-from and cache-to.
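
As a rough illustration of the load/push split listed above (not the actual _run_bake code), the sketch below assembles the two `docker buildx bake` command lines. The function name `build_bake_args`, the `cache_ref` parameter, the `*` target wildcard, and `mode=max` on the cache export are all assumptions made for this example; only `--builder`, `--load`, `--push`, and `--set` are standard bake flags.

```python
from typing import List, Optional


def build_bake_args(
    mode: str, builder: str, cache_ref: Optional[str] = None
) -> List[str]:
    """Assemble a `docker buildx bake` command line (hypothetical helper)."""
    if mode not in ("load", "push"):
        raise ValueError(f"unknown mode: {mode!r}")

    args = ["docker", "buildx", "bake", "--builder", builder]

    if mode == "load":
        # Local iteration: load into the daemon and rely on the local
        # layer cache only, so no registry cache flags are added.
        args.append("--load")
        return args

    if not cache_ref:
        raise ValueError("push mode needs a registry cache reference")

    # Publishing: import and export the registry cache for every target.
    # `mode=max` is an assumption of this sketch, not taken from the PR.
    args += [
        "--push",
        "--set", f"*.cache-from=type=registry,ref={cache_ref}",
        "--set", f"*.cache-to=type=registry,ref={cache_ref},mode=max",
    ]
    return args
```

With the defaults named in this PR, the load path would be built with `build_bake_args("load", "default")` and the push path with `build_bake_args("push", "flowmesh-multiarch", cache_ref)`, where `cache_ref` is whatever registry reference the project uses for its cache.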

Design

The native docker driver builds straight into the daemon's image store and reuses its layer cache, so local iteration avoids the registry roundtrip that cache-from=type=registry would force on every build. It cannot, however, export cache-to=type=registry or build multi-arch images — exactly what push needs. The docker-container driver covers both, at the cost of an extra container hop and --load no longer placing images directly into the daemon. Keeping each command on the driver that matches its goal removes the compromise.
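
Because each command depends on a specific driver, the builder's driver has to be checked before bake runs. One way such a check could work is to read the `Driver:` line from `docker buildx inspect <name>` output, as sketched below. `parse_driver`, `check_builder`, and `BuilderNotReady` are hypothetical names for this example and are not the helpers added in this PR; the inspect output is passed in as a string so the logic stays testable without Docker.

```python
from typing import Optional


class BuilderNotReady(Exception):
    """Raised when a builder is missing or runs an unexpected driver."""


def parse_driver(inspect_output: str) -> Optional[str]:
    """Return the value of the `Driver:` line, or None if there is none."""
    for line in inspect_output.splitlines():
        key, sep, value = line.partition(":")
        # Match the key exactly so lines such as "Driver Options:" are skipped.
        if sep and key.strip() == "Driver":
            return value.strip()
    return None


def check_builder(
    name: str, expected_driver: str, inspect_output: Optional[str]
) -> None:
    """Validate a builder given its inspect output (None means not found)."""
    if inspect_output is None:
        raise BuilderNotReady(f"buildx builder {name!r} not found")
    driver = parse_driver(inspect_output)
    if driver != expected_driver:
        raise BuilderNotReady(
            f"buildx builder {name!r} uses driver {driver!r}, "
            f"expected {expected_driver!r}"
        )
```

In this sketch the caller would run `docker buildx inspect` itself and pass `None` when the command fails, which keeps the missing-builder and wrong-driver cases distinct.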

Auto-switching the active buildx builder (rather than only passing --builder to bake) means follow-on docker buildx invocations the user runs by hand stay consistent with what the FlowMesh command just did. The confirmation prompt prevents a silent change to the user's environment; --force is the CI / scripted escape hatch.
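
The switch-with-confirmation flow described above can be sketched as follows. This is an illustrative outline only: `switch_active_builder` and its parameters are hypothetical, and the real helper in stack.py may differ. The prompt wording mirrors the smoke-test output in this PR, and `docker buildx use` is the standard command for changing the active builder.

```python
import subprocess
from typing import Callable, Optional


def switch_active_builder(
    target: str,
    active: str,
    force: bool = False,
    confirm: Callable[[str], str] = input,
    use: Optional[Callable[[str], None]] = None,
) -> bool:
    """Make `target` the active builder; return False if the user declines."""
    if use is None:
        # Default action: shell out to `docker buildx use <name>`.
        def use(name: str) -> None:
            subprocess.run(["docker", "buildx", "use", name], check=True)

    if active == target:
        return True  # already active, nothing to do

    if not force:
        answer = confirm(
            f"Active buildx builder is '{active}'; switch to '{target}'? [y/N]: "
        )
        if answer.strip().lower() not in ("y", "yes"):
            return False  # leave the user's environment untouched

    use(target)
    return True
```

Injecting `confirm` and `use` keeps the four cases (no-op, force, prompt accepted, prompt declined) testable without a real buildx environment.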

For push, a missing flowmesh-multiarch builder surfaces the exact docker buildx create ... command the user needs to run — same UX as before, just routed through the new helper.

Test Plan

uv run pre-commit run --all-files
uv run pytest tests --ignore=tests/worker/test_mp_executor_cleanup_gpu.py

# Smoke the new switch prompt against a real buildx environment with
# 'flowmesh-multiarch' selected as the active builder:
flowmesh stack build

Test Result

$ uv run pre-commit run --all-files
Detect hardcoded secrets.................................................Passed
isort....................................................................Passed
black....................................................................Passed
ruff check...............................................................Passed
codespell................................................................Passed
mypy.....................................................................Passed
sync requirements........................................................Passed

$ uv run pytest tests --ignore=tests/worker/test_mp_executor_cleanup_gpu.py
759 passed, 18 warnings in 38.39s

$ flowmesh stack build
Active buildx builder is 'flowmesh-multiarch'; switch to 'default'? [y/N]: y
...
Images built locally.

Pre-submission Checklist
  • I have read the contribution guidelines.
  • I have run `pre-commit run --all-files` and fixed any issues.
  • I have added or updated tests covering my changes (if applicable).
  • I have verified that `uv run pytest tests/` passes locally.
  • If I changed shared schemas or proto definitions, I have checked downstream compatibility across Server and Worker.
  • If I changed the SDK or CLI, I have verified the affected packages work (`uv sync --all-packages --group ci --frozen`).
  • If this is a breaking change, I have prefixed the PR title with `[BREAKING]` and described migration steps above.
  • I have updated documentation or config examples if user-facing behavior changed.

`stack build` now targets the native `docker` driver and relies on the
local builder cache only, so iteration no longer pays a registry cache
roundtrip on every build. `stack push` targets `docker-container` with
registry cache in/out, preserving multi-arch publishing and
cross-machine cache sharing.

Both commands gain `--builder` and `-f/--force`. The active buildx
builder is switched (with confirmation unless `--force`) so the bake
invocation can't silently run against an unintended builder, and the
selected builder's driver is verified up front.

Signed-off-by: Noppanat Wadlom <noppanat.wad@gmail.com>
@kaiitunnz kaiitunnz requested a review from timzsu May 12, 2026 10:06

@timzsu (Collaborator) left a comment

LGTM. One note worth resolving: the CLI docs are now stale. stack push enforces a docker-container buildx builder, but docs/CLI.md still says Docker Engine with the containerd image store is acceptable. The PR should update the docs to describe the new flowmesh-multiarch builder setup and the --builder and --force behavior.

Signed-off-by: Noppanat Wadlom <noppanat.wad@gmail.com>
@kaiitunnz kaiitunnz merged commit e224d6f into main May 12, 2026
12 of 13 checks passed
@kaiitunnz kaiitunnz deleted the kaiitunnz/perf/build-optim branch May 12, 2026 10:29