ci: consolidate llama-cpp-darwin into the matrix-driven Darwin flow #9731

Merged: mudler merged 1 commit into master from ci/consolidate-llama-darwin on May 9, 2026

mudler (Owner) commented on May 9, 2026

Summary

Move the bespoke llama-cpp-darwin + llama-cpp-darwin-publish top-level jobs in backend.yml into the matrix-driven Darwin flow that already handles all 34 other Darwin backends. The bespoke jobs were the only Darwin code path skipping the path filter — they rebuilt on every push/cron whether backend/cpp/llama-cpp/ was touched or not.

What changes

  • .github/backend-matrix.yml — new entry under includeDarwin: { backend: llama-cpp, tag-suffix: -metal-darwin-arm64-llama-cpp, lang: go } (no build-type, the bespoke build script doesn't read one).
  • .github/workflows/backend_build_darwin.yml — adds an if: inputs.backend == 'llama-cpp' branch in the build step that drives make backends/llama-cpp-darwin (the bespoke scripts/build/llama-cpp-darwin.sh compiles three CMake variants and bundles dylibs via otool — doesn't fit the generic build-darwin-go-backend mold). The llama-cpp-aware ccache setup blocks already in this file (lines 127, 134, 144) were the half-finished consolidation that this PR completes.
  • In scripts/changed-backends.js, inferBackendPathDarwin gains a special case so llama-cpp on Darwin maps to backend/cpp/llama-cpp/ (where the C++ sources live), not the non-existent backend/go/llama-cpp/ that the generic rule would produce.
  • .github/workflows/backend.yml — deletes ~80 lines: the bespoke llama-cpp-darwin and llama-cpp-darwin-publish jobs are gone.
  • Go version bump 1.24.x → 1.25.x in backend.yml and backend_pr.yml for the Darwin matrix. The bespoke job ran on 1.25.x, so this preserves llama-cpp's prior toolchain. The other 34 Darwin matrix backends pick this up too — none have a known reason to pin 1.24.
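
The path-mapping special case in scripts/changed-backends.js could look roughly like the sketch below. Only the function name inferBackendPathDarwin and the two directory paths come from this PR. The item shape ({ backend, lang }) and the exact form of the generic rule (backend/<lang>/<backend>/) are assumptions made for illustration, and "example-backend" is a made-up name.

```javascript
// Hypothetical sketch, not the actual contents of scripts/changed-backends.js.
// Assumption: a Darwin matrix item looks like { backend, lang } and the
// generic rule derives the source path as backend/<lang>/<backend>/.
function inferBackendPathDarwin(item) {
  // Special case: llama-cpp is registered with lang "go" in the Darwin
  // matrix, but its sources live in the C++ tree.
  if (item.backend === "llama-cpp") {
    return "backend/cpp/llama-cpp/";
  }
  // Generic rule (assumed form).
  return `backend/${item.lang}/${item.backend}/`;
}

console.log(inferBackendPathDarwin({ backend: "llama-cpp", lang: "go" }));
// → backend/cpp/llama-cpp/
console.log(inferBackendPathDarwin({ backend: "example-backend", lang: "go" }));
// → backend/go/example-backend/
```

Without the special case, the generic rule would yield backend/go/llama-cpp/, a directory that does not exist, so the path filter would have no directory to match and a change to the C++ sources would not schedule the Darwin build.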

Behavior change

  • Before: llama-cpp-darwin rebuilt on every backend.yml trigger (push, cron, tag).
  • After: same path-filter as every other Darwin backend — only rebuilds when backend/cpp/llama-cpp/ changes (or via the weekly Sunday cron / a tag push, which both force the full matrix).
  • All 34 existing Darwin backends now run on Go 1.25.x. Worth flagging in review.
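
The "after" behavior can be sketched as a small decision function. This is a simplified model, not the project's implementation: the function name shouldRebuild, its argument shape, and the example file names are hypothetical. The "schedule" event name and the refs/tags/ prefix are standard GitHub Actions conventions.

```javascript
// Hypothetical model of the path-filter decision described above.
// Assumption: the caller already knows the event name, the ref, and
// the list of files changed by the push or PR.
function shouldRebuild({ eventName, ref, changedFiles }, backendPath) {
  // The weekly cron and tag pushes force the full matrix.
  const forced = eventName === "schedule" || ref.startsWith("refs/tags/");
  if (forced) {
    return true;
  }
  // Otherwise rebuild only when a changed file lives under the
  // backend's source directory.
  return changedFiles.some((file) => file.startsWith(backendPath));
}

const llamaPath = "backend/cpp/llama-cpp/";

console.log(shouldRebuild(
  { eventName: "push", ref: "refs/heads/master", changedFiles: ["README.md"] },
  llamaPath
)); // → false

console.log(shouldRebuild(
  { eventName: "push", ref: "refs/heads/master", changedFiles: ["backend/cpp/llama-cpp/Makefile"] },
  llamaPath
)); // → true
```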

Test plan

  • PR-side backend_pr.yml runs generate-matrix cleanly with the new entry — already validated locally; smoke test confirms llama-cpp is in matrix-darwin output and the per-backend boolean llama-cpp=true lands when forced.
  • After merge, a touch commit to backend/cpp/llama-cpp/ schedules backend-jobs-darwin (llama-cpp, ...) only — not the unconditional bespoke job (which is gone).
  • The matrix entry produces the same final published tag (localai/localai-backends:master-metal-darwin-arm64-llama-cpp and the quay equivalent) — verified by reading docker/metadata-action config (unchanged).
  • Go 1.25.x compiles all 34 other Darwin backends without regression.

Revert plan

If Go 1.25.x breaks something on Darwin, revert this commit and the bespoke jobs come back. If the matrix flow's make backends/llama-cpp-darwin invocation turns out to have a subtle environment mismatch with the bespoke job, the special-case build step is the narrowest unit to revert.

Assisted-by: Claude:claude-opus-4-7

The bespoke llama-cpp-darwin + llama-cpp-darwin-publish top-level jobs
in backend.yml ran unconditionally on every backend.yml trigger
(push/cron), bypassing the path filter that all 34 other Darwin
backends already honor via backend-jobs-darwin -> backend_build_darwin.yml.

Move llama-cpp into the includeDarwin matrix:
- New entry in .github/backend-matrix.yml (lang=go, no build-type).
- backend_build_darwin.yml gains an `if: inputs.backend == 'llama-cpp'`
  build step that drives `make backends/llama-cpp-darwin`. The bespoke
  script (scripts/build/llama-cpp-darwin.sh) compiles three CMake
  variants from backend/cpp/llama-cpp and bundles dylibs via otool, so
  it doesn't fit the build-darwin-go-backend mold; the existing
  llama-cpp-aware ccache setup blocks already in this workflow are
  what motivated the consolidation in the first place.
- scripts/changed-backends.js's inferBackendPathDarwin gains a special
  case so llama-cpp on Darwin maps to backend/cpp/llama-cpp/ (the C++
  source tree) rather than the non-existent backend/go/llama-cpp/.
- Bumps Darwin go-version from 1.24.x -> 1.25.x in backend.yml and
  backend_pr.yml so llama-cpp keeps the Go toolchain it had under the
  bespoke job; the other 34 Darwin backends pick this up too with no
  known reason to pin 1.24.
- Removes ~80 lines of bespoke YAML from backend.yml.

The publish path is unchanged in shape - every Darwin backend now uses
the same crane-push leg from ubuntu-latest in
backend_build_darwin.yml; only the build target differs per backend.

After this commit, llama-cpp-darwin only rebuilds when
backend/cpp/llama-cpp/ is touched (verified locally) - same behavior
as every other Darwin backend.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
mudler merged commit 733c254 into master on May 9, 2026
52 checks passed
mudler deleted the ci/consolidate-llama-darwin branch on May 9, 2026 08:18
mudler added a commit that referenced this pull request May 9, 2026
… push

Yesterday two PRs (#9724 llama.cpp bump, #9731 llama-cpp-darwin
consolidation) merged 11 seconds apart. Both shared the same
backend.yml concurrency group (ci-backends-refs/heads/master-...) due
to "${{ github.head_ref || github.ref }}" — empty head_ref on push
events falls through to the static refs/heads/master. With
cancel-in-progress: true that meant the second merge cancelled the
first's in-flight backend builds. The first PR's CI never finished;
the second PR only touched CI files so its run was a no-op.

Two changes per workflow:
- group: replace "${{ github.head_ref || github.ref }}" with
  "${{ github.event.pull_request.number || github.sha }}". On PRs
  this groups by PR number (same as before, just keyed on number not
  branch name); on push events it groups per-commit, so two master
  pushes never share a group.
- cancel-in-progress: gate on github.event_name == 'pull_request' so
  rapid pushes to a PR still cancel old runs (newer push wins) but
  master pushes never cancel each other.
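
The effect of the two changes can be modeled in plain JavaScript, whose `||` falls through on empty values in much the same way as the Actions expression operator. This is an illustrative sketch only: the function names and the commit SHAs and PR number below are made up, and only the variable part of the group key is modeled (the real group string also carries a workflow-specific prefix).

```javascript
// Hypothetical model of the old and new concurrency settings.
// `github` mimics the relevant fields of the GitHub Actions context.
function oldGroupKey(github) {
  // head_ref is empty on push events, so this falls through to ref.
  return github.head_ref || github.ref;
}

function newGroupKey(github) {
  // PR number on pull_request events, commit SHA otherwise.
  return github.event.pull_request?.number || github.sha;
}

function cancelInProgress(github) {
  return github.event_name === "pull_request";
}

// Two pushes to master a few seconds apart (made-up SHAs).
const pushA = { event_name: "push", head_ref: "", ref: "refs/heads/master", sha: "aaa111", event: {} };
const pushB = { event_name: "push", head_ref: "", ref: "refs/heads/master", sha: "bbb222", event: {} };

console.log(oldGroupKey(pushA) === oldGroupKey(pushB)); // → true  (shared group, second cancels first)
console.log(newGroupKey(pushA) === newGroupKey(pushB)); // → false (one group per commit)
console.log(cancelInProgress(pushA));                   // → false

// A pull_request event (made-up PR number and branch name).
const pr = { event_name: "pull_request", head_ref: "feature/x", ref: "refs/pull/1234/merge", sha: "ccc333", event: { pull_request: { number: 1234 } } };
console.log(newGroupKey(pr));      // → 1234
console.log(cancelInProgress(pr)); // → true
```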

Trade-off vs alternatives:
- Merge queue would also solve this and additionally test the merged
  commit before it lands. Heavier process change; out of scope here.
- Allowing per-commit master concurrency means two simultaneous master
  runs may overlap and race on tag pushes. Each commit's manifest
  digest is unique and the registry is last-writer-wins on tags, so
  whichever run pushes last sets the tag (normally the newer commit).

Applied to 11 workflows that share the same concurrency pattern:
backend.yml, backend_pr.yml, image.yml, image-pr.yml, lint.yml,
test.yml, test-extra.yml, tests-e2e.yml, tests-aio.yml,
tests-ui-e2e.yml, generate_intel_image.yaml.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
localai-bot added the enhancement label on May 9, 2026
mudler added a commit that referenced this pull request May 9, 2026
Symptom: `ccache: command not found` in the Configure ccache step on
runs that hit the brew cache.

Root cause: actions/cache restores /opt/homebrew/Cellar/<formula> but
NOT the bin symlinks at /opt/homebrew/bin/*. The subsequent
`brew install` sees the Cellar entries present and decides "already
installed" — without re-running the link step. So on cache-hit runs
none of the cached formulas are actually on PATH.

Fix: explicit `brew link --overwrite` for every formula we install,
right after `brew install`. --overwrite tolerates leftover symlinks
from a partial earlier install. The 2>/dev/null + || true keeps the
step from failing if a formula is already correctly linked.

Pre-existing flake; surfaces more often as Darwin matrix coverage
grows after the llama-cpp-darwin consolidation in #9731.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
mudler added a commit that referenced this pull request May 9, 2026
#9742)

The migration shipped over a sequence of PRs (#9726, #9727, #9730,
#9731, #9737, #9738, plus a handful of direct-to-master fixes) and
left the .agents/ docs significantly out of date.

Updated:

- .agents/ci-caching.md (significant rewrite)
  - Cache key shape: now includes per-arch suffix (cache<suffix>-<arch>).
  - New "Workflow surfaces" overview table.
  - New "Pre-built base images (base-grpc-*)" section covering the 10
    quay.io/go-skynet/ci-cache:base-grpc-* tags, the multi-target
    Dockerfile pattern (builder-fromsource / builder-prebuilt /
    aliasing FROM), the BUILDER_BASE_IMAGE → BUILDER_TARGET derivation,
    the bootstrap-on-branch order for new variants.
  - New "Per-arch native builds + manifest merge" section: split
    matrix entries, push-by-digest, backend_merge.yml, why
    provenance: false matters.
  - New "Path filter on master push" section: changed-backends.js
    handles push events via the Compare API; weekly Sunday cron is
    the safety net for unpinned Python deps.
  - New "ccache for C++ backend builds" section.
  - New "Composite actions" section: free-disk-space and
    setup-build-disk.
  - New "Concurrency" section documenting the per-PR-per-commit group
    fix.
  - Darwin section gains the brew link --overwrite note (after-
    cache-restore symlinks weren't restored) and the llama-cpp-darwin
    consolidation context.
  - "Self-hosted runners" section confirming the matrix is free of
    arc-runner-set / bigger-runner references except the residual
    test-extra.yml vibevoice case.
  - "Touching the cache pipeline" rule list extended (provenance,
    install-base-deps.sh single-source-of-truth, base-images bootstrap
    order).

- .agents/adding-backends.md
  - Section 2 title: backend.yml -> backend-matrix.yml (path moved).
  - New paragraph on per-arch entries (platform-tag + paired matrix
    rows + auto-firing merge job).
  - New paragraph on builder-base-image for llama-cpp / ik-llama-cpp /
    turboquant.
  - Final checklist line updated accordingly.

- .agents/building-and-testing.md
  - Reference: backend.yml -> backend-matrix.yml.
  - Note about builder-base-image and BUILDER_TARGET defaulting to
    builder-fromsource for local builds.

- AGENTS.md
  - One-line description update for ci-caching.md to mention the new
    infrastructure (per-arch keys, base-grpc-*, manifest-merge,
    setup-build-disk, path filter).

Assisted-by: Claude:claude-opus-4-7

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
mudler added a commit that referenced this pull request May 10, 2026
…ash, zstd)

Symptom (run 25634195866, job 75244019809): the Configure ccache step
on the Darwin llama-cpp build aborted with:

  dyld[5647]: Library not loaded: /opt/homebrew/opt/blake3/lib/libblake3.0.dylib
    Referenced from: /opt/homebrew/Cellar/ccache/4.13.5/bin/ccache
  Abort trap: 6

The previous Darwin fix (acc5588) addressed missing /opt/homebrew/bin
symlinks after a brew cache restore by force-linking. This is a
different layer: ccache's Cellar dir IS restored from cache and IS
linked, but ccache 4.13 dynamically links against blake3 / hiredis /
xxhash / zstd at runtime, and those dependencies are NOT in the
restored Cellar paths. brew install ccache sees the ccache Cellar
present and skips the install — including skipping installation of
those transitive deps.

Two-part fix:
  - Add /opt/homebrew/Cellar/{blake3,hiredis,xxhash,zstd} to the brew
    cache restore/save paths so future cache-hit runs restore them.
  - Explicitly install + link them in the Dependencies step so even
    a fresh runner (cache miss on a new key) gets them, and the
    dylibs are on hand for ccache to load at startup.

Caught by run 25634195866. Pre-existing condition on Darwin runners;
surfaced because Darwin builds run more often after the llama-cpp-
darwin consolidation in #9731.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>