Apple Foundation Models Backend PoC

coo1white edited this page Jun 8, 2026 · 1 revision

Apple Foundation Models Backend (PoC)

Goal: prove, with minimal effort, that CW can burn Apple's free on-device fuel — wrap Apple's on-device model as a local endpoint and let CW drive it as a backend to fulfill a worker.

Metaphor: fit CW — the neutral car — with a nozzle for Apple's on-device fuel. See The Car Metaphor and Execution Backends.

Core judgment: the biggest unknown lies on the Apple side, not in CW. It is unclear whether the on-device model can be driven headlessly, and whether the ToS allows using it as a general-purpose backend. So de-risk Apple first, standalone, before touching CW.

Phase A — De-risk the Apple side first (no CW)

Write a thin macOS Swift bridge that exposes the on-device model as a 127.0.0.1 endpoint.

A1. Swift bridge (illustrative — verify API names against current docs)

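import FoundationModels   // requires macOS 26+ on Apple Intelligence-capable hardware

// Minimal HTTP server on 127.0.0.1:8765 (server code omitted).
// POST /complete  {"prompt":"..."}  ->  {"text":"...","model":"apple-on-device"}

// Handler body: takes the prompt from the request and returns the model's text.
// A new session per request keeps earlier prompts out of the context.
func complete(_ prompt: String) async throws -> String {
    let session = LanguageModelSession()
    let reply = try await session.respond(to: prompt)
    return reply.content   // plus any usage/model metadata the framework exposes
}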
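
The request/response contract in the comment above can be pinned down independently of the Swift side, which also makes it possible to stand up a stub bridge on machines without Apple Intelligence. A minimal sketch in Node.js: the helper names `parseCompleteRequest` and `buildCompleteResponse` are hypothetical, and only the `prompt`, `text`, and `model` fields come from the contract above.

```javascript
// Hypothetical helpers mirroring the bridge contract:
// POST /complete {"prompt":"..."} -> {"text":"...","model":"apple-on-device"}

// Parse and validate a request body.
// Throws unless the body is a JSON object with a non-empty string "prompt".
function parseCompleteRequest(rawBody) {
  let body;
  try {
    body = JSON.parse(rawBody);
  } catch (err) {
    throw new Error("request body is not valid JSON");
  }
  const valid =
    body !== null &&
    typeof body === "object" &&
    typeof body.prompt === "string" &&
    body.prompt.length > 0;
  if (!valid) {
    throw new Error('request must be {"prompt": "<non-empty string>"}');
  }
  return { prompt: body.prompt };
}

// Serialize a response. The model id is supplied by the caller (the bridge);
// nothing is defaulted here, so a missing id stays visibly missing.
function buildCompleteResponse(text, model) {
  return JSON.stringify({ text, model });
}
```

Keeping validation separate from the HTTP layer means the same checks can back either the real bridge or a stub used while Phase A is still open.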

A2. Raw test

curl -s http://127.0.0.1:8765/complete -H 'Content-Type: application/json' -d '{"prompt":"Summarize: ..."}'

Success = the bridge returns Apple-model text headlessly (no UI, no interactive session) and does so repeatably across runs.
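
The success criterion above can be turned into a small repeatable check. The sketch below is an assumption-laden illustration, not part of CW: `checkBridgeReply` and `smokeTest` are hypothetical names, the run count and 60-second timeout are arbitrary, and it relies on the global `fetch` available in Node 18+.

```javascript
// Judge one bridge reply: HTTP 200, valid JSON object, non-empty "text".
function checkBridgeReply(status, bodyText) {
  if (status !== 200) return { ok: false, reason: `HTTP ${status}` };
  let body;
  try {
    body = JSON.parse(bodyText);
  } catch {
    return { ok: false, reason: "reply is not JSON" };
  }
  if (body === null || typeof body !== "object") {
    return { ok: false, reason: "reply is not a JSON object" };
  }
  if (typeof body.text !== "string" || body.text.trim() === "") {
    return { ok: false, reason: "empty or missing text" };
  }
  return { ok: true, reason: "ok" };
}

// Repeatability: send the same prompt `runs` times; stop at the first failure.
async function smokeTest(url, prompt, runs = 5) {
  for (let i = 0; i < runs; i++) {
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt }),
      signal: AbortSignal.timeout(60000), // assumed per-call timeout
    });
    const verdict = checkBridgeReply(res.status, await res.text());
    if (!verdict.ok) return { ok: false, run: i + 1, reason: verdict.reason };
  }
  return { ok: true, runs };
}
```

With the bridge running, `smokeTest("http://127.0.0.1:8765/complete", "Summarize: ...")` resolves to a pass/fail verdict. Running it from a non-interactive shell (for example over SSH or from a scheduled job) exercises the headless part of the criterion.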

Decision gate: if the framework refuses headless/background calls, or the ToS forbids using the on-device model as a general backend, stop here. This risk lives entirely on the Apple side — write no CW code until Phase A passes.

Phase B — Wire the bridge into CW (after the v0.1.38 agent backend lands)

Prerequisite: today's remote/ci backends delegate commands, not natural-language prompts. Fulfilling an NL worker with Apple's model specifically needs the v0.1.38 agent backend (see Real Execution Backends and the Agent Delegation Drive that follows it). So Phase A can be done now; Phase B waits on v0.1.38.

B1. Register the bridge as an agent backend (HTTP endpoint or command template — vendor-neutral by design):

backendId: agent  ->  http://127.0.0.1:8765/complete
# or command template:  node bridge-call.js --prompt {{prompt}}
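
For the command-template variant, `bridge-call.js` only needs to forward the prompt to the bridge and report the result. The following is a sketch under stated assumptions: that CW reads the worker text from stdout and treats a non-zero exit code as failure (neither is confirmed here), and that Node 18+ provides the global `fetch`. The function names are hypothetical.

```javascript
// bridge-call.js (sketch): node bridge-call.js --prompt "<text>"
const BRIDGE_URL = "http://127.0.0.1:8765/complete"; // endpoint from Phase A

// Return the value following --prompt, or null if it is absent.
function parsePromptArg(argv) {
  const i = argv.indexOf("--prompt");
  if (i === -1 || i + 1 >= argv.length) return null;
  return argv[i + 1];
}

// Forward the prompt to the bridge; resolve to a process exit code.
async function main(argv) {
  const prompt = parsePromptArg(argv);
  if (prompt === null) {
    console.error("usage: node bridge-call.js --prompt <text>");
    return 2;
  }
  try {
    const res = await fetch(BRIDGE_URL, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt }),
    });
    if (!res.ok) {
      console.error(`bridge returned HTTP ${res.status}`);
      return 1;
    }
    const body = await res.json();
    if (body === null || typeof body.text !== "string") {
      console.error("bridge reply has no text");
      return 1;
    }
    process.stdout.write(body.text);
    return 0;
  } catch (err) {
    // Bridge down or reply unreadable: fail, never print made-up output.
    console.error(`bridge call failed: ${err.message}`);
    return 1;
  }
}

// Run only when invoked with --prompt, so the file can also be loaded in tests.
if (process.argv.includes("--prompt")) {
  main(process.argv.slice(2)).then((code) => {
    process.exitCode = code;
  });
}
```

Every failure path exits non-zero without writing to stdout, which is what lets the caller fail closed rather than accept a partial or empty result.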

B2. Run one worker (a small repo / single-worker app):

cw run architecture-review --drive --backend agent   # or dispatch a single worker

B3. Verify on the CW side — four points:

  • result.md is accepted and passes the evidence gate (recordWorkerOutput)
  • the attestation's model id comes only from what the bridge reports (apple-on-device / unreported) — never synthesized by CW
  • fail-closed: kill the bridge mid-run → the worker parks, no fabricated completion
  • replay: re-running the snapshot re-verifies digests and does not re-spawn the bridge
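
Three of the four checks above reduce to small invariants that can be written down before any CW integration exists. This is an illustrative sketch, not CW's implementation: the function names, the `completed` / `parked` state labels, the snapshot shape, and the choice of SHA-256 for digests are all assumptions.

```javascript
const { createHash } = require("node:crypto");

// Attestation: use only what the bridge reported; otherwise say "unreported".
// No model id is ever inferred or filled in on the caller's side.
function attestedModelId(bridgeReply) {
  const model = bridgeReply && bridgeReply.model;
  return typeof model === "string" && model.length > 0 ? model : "unreported";
}

// Fail-closed: anything short of a successful call with real text parks the worker.
function settleWorker(outcome) {
  if (outcome.ok === true && typeof outcome.text === "string" && outcome.text.length > 0) {
    return { state: "completed", text: outcome.text };
  }
  return { state: "parked", reason: outcome.error || "no output from bridge" };
}

// Hex SHA-256 of a UTF-8 string (assumed digest scheme).
function sha256Hex(content) {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Replay: re-verify stored content against its stored digest.
// This reads only the snapshot and never contacts the bridge.
function replayVerifies(snapshot) {
  return sha256Hex(snapshot.content) === snapshot.digest;
}
```

For example, `settleWorker({ ok: false, error: "ECONNREFUSED" })` yields a parked worker, and `attestedModelId({ text: "..." })` yields `"unreported"` rather than a guessed model name.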

Why this is the right PoC

  • The smallest action that proves "CW can burn Apple's free on-device fuel."
  • Puts the biggest unknown (Apple headless access) first and in isolation.
  • Reuses the v0.1.38 agent backend — zero new lock-in; Apple is just another backendId.

What success unlocks

  • A + B pass → CW gains a free / private / offline fuel option: burn Apple on-device for cheap workers and Claude/GPT for the hard ones, with one orchestration + audit layer across both. This is the "privacy + auditability + cheap local fuel" story.
  • A fails → Apple-as-backend is closed for now; nothing is lost on the CW side.
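
The split between cheap local fuel and cloud fuel implies some routing policy. The sketch below shows one possible shape, purely as an illustration: the `difficulty` and `localOnly` worker fields, the `pickFuel` name, and the returned labels are hypothetical and do not describe existing CW behaviour.

```javascript
// Hypothetical routing policy; worker shape and labels are illustrative only.
// Returns "apple-on-device", "cloud", or null (meaning: park the worker).
function pickFuel(worker, appleBridgeUp) {
  const wantsLocal = worker.difficulty === "cheap" || worker.localOnly === true;
  if (wantsLocal && appleBridgeUp) return "apple-on-device";
  // Private work is never sent off-device; with no bridge, it waits.
  if (worker.localOnly === true) return null;
  return "cloud";
}
```

Returning `null` for private work when the bridge is down keeps the privacy promise stricter than the cost optimisation: a cheap worker may fall back to cloud fuel, a local-only worker may not.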

Open unknowns to verify

  • Exact FoundationModels API (LanguageModelSession / respond)
  • Headless / background eligibility + hardware requirements (Apple Intelligence-capable Mac)
  • ToS for using the on-device model as a general-purpose backend
  • Whether the framework exposes usage / model metadata (for the attestation)

Related: The Car Metaphor · Execution Backends · Real Execution Backends
