fix(gpu): inference-grpc hard-fail on no-GPU (companion to #1312) by joelteply · Pull Request #1314 · CambrianTech/continuum

joelteply · 2026-05-16T15:10:36Z

Summary

Closes the inference-grpc CPU-fallback path that supervisor vhsm-d1f4 flagged in audit pass 1, finding #2 (2026-05-16). Same shape and rationale as codex's #1312 for orpheus.

Before this fix, select_best_device tried CUDA, then Metal, then printed "Using CPU (no GPU acceleration)" and returned Device::Cpu. This is the same "code in fallbacks" anti-pattern Joel flagged at 900% CPU.

Change

select_best_device now returns Result<Device, Box<dyn Error + Send + Sync>>
The caller propagates the error via ?; no behavior change on GPU-available hosts
The error message names what to do: rebuild with the right --features on a host with the matching GPU
Same shape as the llama.cpp n_gpu_layers: -1 GPU-only contract
The old fallback evaded the codified no_cpu_fallback_contract.rs test, which only inspects llamacpp / ort_providers / llamacpp_adapter, not workers/inference-grpc
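
For readers without the diff at hand, the shape of the change can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Device` enum, the `Probe` struct and `start_worker` are hypothetical stand-ins so the example compiles on its own, and the real function presumably probes the actual CUDA and Metal backends instead of reading booleans.

```rust
use std::error::Error;

/// Hypothetical stand-in for the real device type. Note there is no Cpu variant
/// to fall back to in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Device {
    Cuda(usize),
    Metal(usize),
}

/// Hypothetical probe results; the real worker would try to initialise each backend.
#[derive(Debug, Clone, Copy)]
pub struct Probe {
    pub cuda: bool,
    pub metal: bool,
}

/// Tries CUDA, then Metal, and returns an error instead of silently using the CPU.
pub fn select_best_device(probe: Probe) -> Result<Device, Box<dyn Error + Send + Sync>> {
    if probe.cuda {
        return Ok(Device::Cuda(0));
    }
    if probe.metal {
        return Ok(Device::Metal(0));
    }
    // Hard-fail: the message tells the operator what to do next.
    Err("no GPU device available: rebuild with the right --features on a host with the matching GPU".into())
}

/// Hypothetical caller: `?` propagates the error, so GPU hosts behave as before.
pub fn start_worker(probe: Probe) -> Result<Device, Box<dyn Error + Send + Sync>> {
    let device = select_best_device(probe)?;
    Ok(device)
}
```

Because the failure surfaces as an `Err` at startup, a host without a GPU stops immediately with an actionable message instead of running inference on the CPU.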

Test plan

cargo check --features metal clean
CI green
no_cpu_fallback_contract.rs should be widened to grep workers/inference-grpc too (separate slice)
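
One way the widened check could work is a plain text scan over the worker's sources. The sketch below is an assumption about the approach, not the contents of no_cpu_fallback_contract.rs: `find_cpu_fallbacks` is a hypothetical helper, and the forbidden token list is supplied by the caller.

```rust
/// Returns the 1-based numbers of non-comment lines that contain any forbidden token.
/// Hypothetical helper: a contract test could run it over each file under
/// workers/inference-grpc and fail if the result is non-empty.
pub fn find_cpu_fallbacks(source: &str, forbidden: &[&str]) -> Vec<usize> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| {
            let code = line.trim_start();
            // Skip whole-line comments so historical notes do not trip the check.
            !code.starts_with("//") && forbidden.iter().any(|tok| code.contains(*tok))
        })
        .map(|(idx, _)| idx + 1)
        .collect()
}
```

A scan like this is deliberately crude: it does not parse Rust, so it would miss a fallback hidden behind an alias or a re-export.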

🤖 Generated with Claude Code

Companion to codex's #1312 (orpheus same-shape fix). Closes the inference-grpc CPU-fallback path supervisor vhsm-d1f4 flagged in audit pass 1 finding #2 (2026-05-16). Evaded the codified no_cpu_fallback_contract.rs test (only inspects llamacpp / ort_providers / llamacpp_adapter, not workers/inference-grpc). Pre-fix select_best_device tried CUDA, tried Metal, then printed 'Using CPU (no GPU acceleration)' and returned Device::Cpu.

- select_best_device now returns Result<Device, Box<dyn Error>>
- caller propagates via ?, no behavior change on GPU-available hosts
- Error message names what to do
- cargo check clean: --features metal

joelteply merged commit 95d825f into canary May 16, 2026
3 checks passed

joelteply deleted the fix/inference-grpc-no-cpu-fallback branch May 16, 2026 15:10

github-actions Bot added the size: S label May 16, 2026

joelteply mentioned this pull request May 16, 2026

docs(architecture): refresh CONTINUUM-ARCHITECTURE @ 2026-05-16 — substrate contract cross-link, lane-shaped roadmap, per-engine status notes #1317

Open

joelteply merged 1 commit into canary from fix/inference-grpc-no-cpu-fallback
