Gap
Docker Model Runner on Windows Docker Desktop defaults to llama.cpp latest-cpu even with an RTX 5090 and InferenceCanUseGPUVariant=True in settings.
To switch to CUDA backend: Docker Desktop → Settings → AI → Enable two toggles:
- Enable GPU-backed inference
- Enable host-side TCP
Backend auto-swaps to llama.cpp latest-cuda after toggling. But these toggles exist only in the GUI — docker desktop enable model-runner --gpu cuda doesn't exist in the Windows CLI (v0.3.0), and docker model install-runner --gpu cuda is rejected with: "Standalone installation not supported with Docker Desktop."
Impact
- Carl on Windows can't get CUDA inference without manual GUI interaction
- install.sh can't automate the toggle
- install.sh can detect the gap (check docker model status for "cpu" vs "cuda" backend) and warn the user with instructions
Verified (BigMama, 2026-04-17)
- Before toggle: 32 tok/s (CPU)
- After toggle: 237 tok/s (CUDA) — 7.4x improvement
Workaround for install.sh
Detect CPU backend + NVIDIA GPU → print: "GPU detected but Docker Model Runner using CPU backend. Open Docker Desktop → Settings → AI → enable GPU-backed inference."
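The detection logic above can be sketched for install.sh roughly as follows. This is a minimal POSIX sh sketch, not a tested implementation: it assumes the output of docker model status contains the backend name ("cpu" vs "cuda"), which should be verified against the Model Runner version actually shipped; the helper function name is hypothetical.

```shell
#!/usr/bin/env sh
# Sketch for install.sh: warn when an NVIDIA GPU is present but Docker Model
# Runner reports a CPU backend. ASSUMPTION: `docker model status` output
# mentions the backend name ("cpu" vs "cuda") — verify on your version.

# Pure helper: decide whether to warn, given the status text and GPU presence.
needs_gpu_warning() {
  status_text="$1"   # output of `docker model status`
  has_nvidia="$2"    # "yes" if nvidia-smi ran successfully
  case "$status_text" in
    *cuda*) return 1 ;;          # already on the CUDA backend, no warning
  esac
  [ "$has_nvidia" = "yes" ]      # warn only if an NVIDIA GPU is present
}

# Detection (only meaningful on a machine with Docker Desktop installed):
if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
  gpu="yes"
else
  gpu="no"
fi
status="$(docker model status 2>/dev/null || true)"

if needs_gpu_warning "$status" "$gpu"; then
  echo "GPU detected but Docker Model Runner using CPU backend."
  echo "Open Docker Desktop → Settings → AI → enable GPU-backed inference."
fi
```

Since the toggle itself can't be automated (GUI-only, per the gap above), printing the instruction and continuing is the most install.sh can do.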