
docs(gap-analysis): catalogue today's 23-PR Carl-OOTB push + chain status #1004

Merged

joelteply merged 4 commits into canary from mac/gap-analysis-day-storm-update on May 2, 2026

Conversation

@joelteply
Contributor

Pure docs. End-of-day snapshot of 23 PRs landed + the Carl-OOTB chain status post-push. Tomorrow has a clean baseline.

🤖 Generated with Claude Code

Test and others added 4 commits May 1, 2026 21:46
Per Joel's "OOTB on all architectures from Docker" + "5090 Windows box
available later." Extends the ORT GPU EP coverage from #985 (Mac/CUDA
only) to the full Carl-OOTB matrix:

  --features rocm     → AMD GPU (Linux). ROCmExecutionProvider.
  --features directml → Windows-native, any DX12 GPU (Nvidia/AMD/Intel).
  --features openvino → Intel CPU/GPU/VPU (Linux + Windows).

Each is a cfg-gated branch in build_ort_gpu_execution_providers(). The
no-GPU-EP-configured error message now lists all 5 features so a
contributor on a new arch sees the right --features incantation.
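
For reference, a sketch of the per-architecture incantations (feature
names are from this commit; the exact cargo subcommand/profile is
illustrative):

  cargo build --release --features metal,accelerate  # macOS (Apple Silicon)
  cargo build --release --features cuda              # Nvidia (Linux/Windows)
  cargo build --release --features rocm              # AMD GPU (Linux, ROCm runtime)
  cargo build --release --features directml          # Windows-native, any DX12 GPU
  cargo build --release --features openvino          # Intel CPU/GPU/VPU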

Cargo.toml feature definitions added at lines ~199-207. Per Joel's
"GPU 100%" rule the EPs only activate when explicitly built with the
matching feature flag — no runtime CPU fallback.

Build verified: cargo check --features metal,accelerate comes back
clean (the new cfg branches don't fire on this Mac, so there's no
compile cost).

Validation needed on real hardware:
  - BigMama or 5090 Windows box: --features cuda + --features directml
  - Linux+AMD box (when available): --features rocm
  - Intel-Arc Linux box (rarer): --features openvino

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… just CUDA

Per Joel's "OOTB on all architectures from Docker" + the ORT EP
coverage added in #1001. Pre-fix the script only mapped Mac→metal +
Linux+Nvidia→cuda; ROCm was commented-out, Vulkan absent, Windows-
native unhandled entirely.

Detection order on Linux (sketched below, after the Windows section):
  1. nvidia-smi → cuda (highest priority — full ORT/llama.cpp/Candle)
  2. rocminfo  → rocm (AMD with ROCm runtime, full ORT EP)
  3. vulkaninfo → vulkan (AMD/Intel without ROCm; llama.cpp Vulkan
                  path; ORT EPs absent — will hard-fail at session
                  create per #985's helper, surfacing the gap clearly)
  4. else: empty → continuum-core panics at startup per #998 (no CPU
     fallback per architectural rule)

Windows-native (MINGW/MSYS/CYGWIN):
  - DirectML always (DX12 universal on Win10+)
  - +CUDA if nvidia-smi present (ORT picks CUDA first, DirectML for
    non-CUDA-supported ops)
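
A minimal sketch of the resolution logic above (the command probes are
from this commit; the FEATURES variable and the uname check are
illustrative):

  # Linux: first hit wins
  if nvidia-smi >/dev/null 2>&1; then
    FEATURES="cuda,load-dynamic-ort"
  elif rocminfo >/dev/null 2>&1; then
    FEATURES="rocm,load-dynamic-ort"
  elif vulkaninfo >/dev/null 2>&1; then
    FEATURES="vulkan,load-dynamic-ort"
  else
    FEATURES=""   # continuum-core panics at startup per #998: no CPU fallback
  fi

  # Windows-native (MINGW/MSYS/CYGWIN): DirectML always, plus CUDA if present
  case "$(uname -s)" in
    MINGW*|MSYS*|CYGWIN*)
      FEATURES="directml"
      nvidia-smi >/dev/null 2>&1 && FEATURES="cuda,directml" ;;
  esac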

Tested on this Mac: still resolves to "--features metal,accelerate"
(unchanged — Darwin branch).

Validation needed on real hardware:
  - 5090 Windows box: should resolve to "--features cuda,directml"
  - BigMama Linux+Nvidia: still "--features cuda,load-dynamic-ort"
    (unchanged)
  - Future Linux+AMD: will resolve to "--features rocm,load-dynamic-ort"
  - Future Linux+Intel-Arc with Vulkan loader: "--features vulkan,
    load-dynamic-ort"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ok Air on up"

Per Joel's "100% free OOTB on MacBook Air on up, accessible, high
school computer" + "we are just trying to make a viable release
candidate." Pre-fix install.sh required 28GB physical RAM and rejected
16GB MBAs with "Get a 32GB+ M-series" — categorically wrong for the
stated MBA target.

Three tiers based on Mac physical RAM, plus a hard floor:

| Tier    | RAM     | Native budget | PERSONA_MODEL                                  |
|---------|---------|---------------|------------------------------------------------|
| MBA     | 16-23GB | 5GB           | qwen3.5-0.8b-general-forged (~500MB)           |
| mid     | 24-31GB | 8GB           | qwen3.5-2b-general-forged (~1.4GB)             |
| primary | 32GB+   | 12GB          | qwen3.5-4b-code-forged-GGUF (~2.7GB; original) |
| reject  | <16GB   | n/a           | hard-fail with actionable message              |

Previously the hardcoded NATIVE_RESERVE_MIB=12GB + DOCKER_FLOOR=10GB
meant 22GB of headroom alone (28GB+ total). Now the MBA tier needs
5+6+4 = 15GB total minimum, which fits a 16GB MBA with ~1GB of
headroom for working-set spikes.
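
A minimal sketch of the tier selection (thresholds, tier labels, and
model names follow the table above; reading RAM via sysctl is an
assumption about how install.sh does it):

  ram_gb=$(( $(sysctl -n hw.memsize) / 1073741824 ))

  if   [ "$ram_gb" -ge 32 ]; then CONTINUUM_TIER=primary PERSONA_MODEL=qwen3.5-4b-code-forged-GGUF
  elif [ "$ram_gb" -ge 24 ]; then CONTINUUM_TIER=mid     PERSONA_MODEL=qwen3.5-2b-general-forged
  elif [ "$ram_gb" -ge 16 ]; then CONTINUUM_TIER=MBA     PERSONA_MODEL=qwen3.5-0.8b-general-forged
  else
    echo "Need 16GB+ RAM. 16GB MBA: chat-only OOTB; 32GB+: full multimodal." >&2
    exit 1
  fi
  export CONTINUUM_TIER PERSONA_MODEL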

PERSONA_MODEL tiering uses the existing public continuum-ai org models
(all gated:False per earlier audit). All three remain HF-public so
Carl never needs an HF token regardless of tier.
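
One way to spot-check that a tier model stays token-free (the
huggingface.co model API is public; jq and the continuum-ai repo path
are assumed from the names above):

  curl -s https://huggingface.co/api/models/continuum-ai/qwen3.5-0.8b-general-forged \
    | jq .gated    # expect: false (same for the 2b and 4b repos)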

CONTINUUM_TIER env var is exported so future code paths (compose env,
runtime feature gates for Bevy/vision/audio) can consult it. This PR
doesn't yet skip Bevy/vision pull on MBA tier — that's a follow-up
once the runtime supports a chat-only mode flag.

Failure message rewritten to be actionable:
  - Names the specific minimums + what each subsystem reserves
  - Says "16GB MBA: chat-only OOTB works (smaller model). For 32GB+:
    full multimodal experience." — gives the user a sense of what
    they get at each tier instead of just a price-tag rejection.

Validation needed:
  - 16GB MBA (when available): expect tier=MBA, install completes,
    chat works with 0.8B model
  - 32GB M-series (Joel's M5 today): expect tier=primary, no
    behavior change from current (same model, same budgets)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…atus

End-of-day snapshot: 23 PRs landed today targeting "100% free OOTB
on MacBook Air on up, install→chat with AI flawlessly" (Joel). Lists
each PR + the Carl-OOTB chain status post-push, with explicit callouts
for what's known broken / unfixed (#980 Bug 9 leak — needs live RCA;
#75 echo loops dev-tab scope; NEW-A upstream tracking).

Also documents the worktree-based parallel-AI workflow lesson learned
the hard way (3× commit cross-contamination during today's session
before switching to per-AI worktrees + SHA-to-ref push escape valve).
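
The workflow, sketched (worktree paths and branch names are
illustrative; the push form is standard git):

  # One worktree per AI session: isolated checkouts, no shared index
  git worktree add ../wt-ai-a mac/feature-a
  git worktree add ../wt-ai-b mac/feature-b

  # Escape valve when a commit still lands on the wrong branch: push the
  # exact SHA to the intended ref without switching checkouts
  git push origin 1234abcd:refs/heads/mac/feature-a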

Pure docs change. Tomorrow's work has a clean baseline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply joelteply merged commit e02e86e into canary May 2, 2026
4 checks passed
@joelteply joelteply deleted the mac/gap-analysis-day-storm-update branch May 2, 2026 02:57