DMR CUDA on Windows Docker Desktop requires manual Settings toggle (not scriptable) #910

@joelteply

Description

Gap

Docker Model Runner on Windows Docker Desktop defaults to the llama.cpp latest-cpu backend, even with an RTX 5090 present and InferenceCanUseGPUVariant=True in settings.

To switch to CUDA backend: Docker Desktop → Settings → AI → Enable two toggles:

  1. Enable GPU-backed inference
  2. Enable host-side TCP

After toggling, the backend auto-swaps to llama.cpp latest-cuda. BUT these toggles exist only in the GUI: docker desktop enable model-runner --gpu cuda doesn't exist in the Windows CLI (v0.3.0), and docker model install-runner --gpu cuda is rejected with "Standalone installation not supported with Docker Desktop."

Impact

  • Carl on Windows can't get CUDA inference without manual GUI interaction
  • install.sh can't automate the toggle
  • install.sh can detect the gap (check docker model status for "cpu" vs "cuda" backend) and warn the user with instructions

Verified (BigMama, 2026-04-17)

  • Before toggle: 32 tok/s (CPU)
  • After toggle: 237 tok/s (CUDA) — 7.4x improvement

Workaround for install.sh

Detect CPU backend + NVIDIA GPU → print: "GPU detected but Docker Model Runner using CPU backend. Open Docker Desktop → Settings → AI → enable GPU-backed inference."
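The detect-and-warn logic could be sketched roughly like this in install.sh. This is only a sketch under assumptions: the function names are made up, and it assumes the `docker model status` output contains the backend name (e.g. "llama.cpp latest-cpu"); the match patterns would need to be adjusted to the real output format.

```shell
#!/usr/bin/env sh
# Hypothetical install.sh fragment (function names are illustrative).

# classify_backend: map `docker model status` output to cpu/cuda/unknown.
# Assumption: the status text contains the backend tag (latest-cpu / latest-cuda).
classify_backend() {
    case "$1" in
        *latest-cuda*) echo "cuda" ;;
        *latest-cpu*)  echo "cpu" ;;
        *)             echo "unknown" ;;
    esac
}

# warn_if_cpu_with_gpu: if an NVIDIA GPU is present but the runner is on the
# CPU backend, print manual-toggle instructions (the toggle itself is not
# scriptable, per this issue).
warn_if_cpu_with_gpu() {
    command -v nvidia-smi >/dev/null 2>&1 || return 0
    status="$(docker model status 2>/dev/null || true)"
    if [ "$(classify_backend "$status")" = "cpu" ]; then
        echo "GPU detected but Docker Model Runner is using the CPU backend."
        echo "Open Docker Desktop -> Settings -> AI and enable:"
        echo "  1. GPU-backed inference"
        echo "  2. Host-side TCP"
    fi
}
```

Splitting classification from the warning keeps the status-string parsing testable without Docker installed.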

Labels: enhancement (New feature or request)