feat(llm): add QMD_FORCE_CUDA env var to disable Vulkan offloading #338

Closed
JasonOA888 wants to merge 1 commit into
tobi:mainfrom
JasonOA888:feat/force-cuda-env

@JasonOA888

Problem

On Windows VMs with para-virtualized GPUs (e.g., ExHyperV RTX 4090), QMD may use Vulkan offloading instead of pure CUDA mode even when CUDA is working correctly:

$ qmd status
GPU: vulkan (offloading: yes)

Solution

Add QMD_FORCE_CUDA environment variable to force CUDA and disable Vulkan:

export QMD_FORCE_CUDA=1
qmd query "test"

This sets gpu: "cuda" in getLlama() options, bypassing the auto-detection that might choose Vulkan.
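
The mapping described above can be sketched as a small pure function. This is an illustrative sketch, not the PR's actual diff: `resolveGpuOption` is a hypothetical helper name, accepting `"true"` in addition to `"1"` is an assumption, and falling back to `"auto"` assumes that auto-detection is the default behavior when the variable is unset.

```typescript
// Hypothetical helper (not taken from the PR diff): decide which `gpu`
// value to pass to getLlama() based on the QMD_FORCE_CUDA env var.
type GpuOption = "auto" | "cuda";

function resolveGpuOption(env: Record<string, string | undefined>): GpuOption {
  // Normalize so that " 1 " and "TRUE" are treated like "1" and "true".
  // Accepting "true" is an assumption; the PR description only shows "1".
  const raw = (env["QMD_FORCE_CUDA"] ?? "").trim().toLowerCase();
  const forced = raw === "1" || raw === "true";

  // When forced, skip auto-detection (which might pick Vulkan) and ask
  // for CUDA explicitly; otherwise leave backend selection to the library.
  return forced ? "cuda" : "auto";
}
```

In QMD this value would presumably be passed along the lines of `getLlama({ gpu: resolveGpuOption(process.env) })`. Keeping the decision in a pure function makes the override easy to unit-test without a GPU present.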

Related

Fixes #278

On Windows VMs with para-virtualized GPUs, QMD may use Vulkan
offloading instead of pure CUDA mode even when CUDA is available.

This adds QMD_FORCE_CUDA env var to force CUDA and disable Vulkan:

  export QMD_FORCE_CUDA=1
  qmd query "test"

Fixes tobi#278
@tobi
Owner

tobi commented May 20, 2026

Closing as part of the post-v2.5.1 backlog cleanup: this item is over two months old and has either been superseded by newer QMD releases or is too stale to keep actionable. If it still reproduces on v2.5.1 or newer, please open a fresh focused issue with current repro steps.

@tobi tobi closed this May 20, 2026

Development

Successfully merging this pull request may close these issues.

Feature Request: Add --force-cuda parameter to disable Vulkan offloading