fix(gpu): inference-grpc hard-fail on no-GPU (companion to #1312) #1314

Merged

joelteply merged 1 commit into canary from fix/inference-grpc-no-cpu-fallback on May 16, 2026


Conversation

@joelteply
Contributor

Summary

Closes the inference-grpc CPU-fallback path that supervisor vhsm-d1f4 flagged in audit pass 1, finding #2 (2026-05-16). Same shape and rationale as codex's #1312 for orpheus.

Pre-fix, select_best_device tried CUDA, then Metal, then printed "Using CPU (no GPU acceleration)" and returned Device::Cpu — the same "code in fallbacks" anti-pattern Joel flagged at 900% CPU.

Change

  • select_best_device now returns Result<Device, Box<dyn Error + Send + Sync>>
  • caller propagates via ? — no behavior change on GPU-available hosts
  • Error message names what to do: rebuild with the right --features on a host with the matching GPU
  • Same shape as the llama.cpp n_gpu_layers: -1 GPU-only contract
  • The pre-fix fallback evaded the codified no_cpu_fallback_contract.rs test (it only inspects llamacpp / ort_providers / llamacpp_adapter, not workers/inference-grpc)
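A minimal sketch of the post-fix shape, under assumptions: `Device` here is a stand-in for the real device type (e.g. candle's `candle_core::Device`), the probe functions are placeholders, and the error text paraphrases the actual message:

```rust
use std::error::Error;

// Stand-in for the worker's real device type; names are illustrative.
#[derive(Debug)]
enum Device {
    Cuda(usize),
    Metal(usize),
    Cpu, // still exists as a type, but select_best_device never returns it
}

// Placeholder probes; the real code would query the CUDA/Metal backends.
fn cuda_available() -> bool { false }
fn metal_available() -> bool { false }

fn select_best_device() -> Result<Device, Box<dyn Error + Send + Sync>> {
    if cuda_available() {
        return Ok(Device::Cuda(0));
    }
    if metal_available() {
        return Ok(Device::Metal(0));
    }
    // Hard fail instead of silently returning Device::Cpu.
    Err("no GPU device available: rebuild with the matching --features \
         (cuda or metal) on a host with that GPU"
        .into())
}
```

The caller then just propagates with `let device = select_best_device()?;`, so startup aborts with an actionable error on a no-GPU host instead of quietly burning CPU.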

Test plan

  • cargo check --features metal clean
  • CI green
  • no_cpu_fallback_contract.rs should be widened to grep workers/inference-grpc too (separate slice)
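The widening could look something like this sketch; the scanned pattern (`Device::Cpu`) and the directory walk are assumptions on my part, not the real contents of no_cpu_fallback_contract.rs:

```rust
use std::fs;
use std::path::Path;

/// Returns true if the source text still contains a silent CPU fallback.
/// The banned pattern is an assumption; the real contract test may match
/// something more precise.
fn has_cpu_fallback(src: &str) -> bool {
    src.contains("Device::Cpu")
}

/// Recursively collects .rs files under `dir` that violate the contract,
/// so the test could cover workers/inference-grpc alongside the existing
/// llamacpp / ort_providers / llamacpp_adapter paths.
fn violations(dir: &Path) -> Vec<String> {
    let mut hits = Vec::new();
    let Ok(entries) = fs::read_dir(dir) else { return hits };
    for entry in entries.flatten() {
        let path = entry.path();
        if path.is_dir() {
            hits.extend(violations(&path));
        } else if path.extension().is_some_and(|e| e == "rs") {
            if let Ok(src) = fs::read_to_string(&path) {
                if has_cpu_fallback(&src) {
                    hits.push(path.display().to_string());
                }
            }
        }
    }
    hits
}
```

A test would then assert `violations(Path::new("workers/inference-grpc")).is_empty()` in addition to the existing paths.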

🤖 Generated with Claude Code

Companion to codex's #1312 (orpheus same-shape fix). Closes the
inference-grpc CPU-fallback path supervisor vhsm-d1f4 flagged in
audit pass 1 finding #2 (2026-05-16). Evaded the codified
no_cpu_fallback_contract.rs test (only inspects llamacpp /
ort_providers / llamacpp_adapter, not workers/inference-grpc).

Pre-fix select_best_device tried CUDA, tried Metal, then printed
'Using CPU (no GPU acceleration)' and returned Device::Cpu.

- select_best_device now returns Result<Device, Box<dyn Error>>
- caller propagates via ?, no behavior change on GPU-available hosts
- Error message names what to do
- cargo check clean: --features metal
@joelteply joelteply merged commit 95d825f into canary May 16, 2026
3 checks passed
@joelteply joelteply deleted the fix/inference-grpc-no-cpu-fallback branch May 16, 2026 15:10