Conversation

SageMoore
Contributor

@SageMoore SageMoore commented Sep 24, 2025

Purpose

Currently, vLLM outputs the following warning when capturing DBO cudagraphs:

/home/sage/git/nm-vllm/.venv/lib/python3.12/site-packages/torch/cuda/__init__.py:1166: UserWarning: Attempting to run cuBLAS, but there was no current CUDA context! Attempting to set the primary context... (Triggered internally at /pytorch/aten/src/ATen/cuda/CublasHandlePool.cpp:179.)
(EngineCore_DP0 pid=3571065)   return torch._C._cuda_getCurrentBlasHandle()
(EngineCore_DP1 pid=3571066) /home/sage/git/nm-vllm/.venv/lib/python3.12/site-packages/torch/cuda/__init__.py:1166: UserWarning: Attempting to run cuBLAS, but there was no current CUDA context! Attempting to set the primary context... (Triggered internally at /pytorch/aten/src/ATen/cuda/CublasHandlePool.cpp:179.)
(EngineCore_DP1 pid=3571066)   return torch._C._cuda_getCurrentBlasHandle()

This warning is completely benign, so we should suppress it.

Test Plan

Spun up a vLLM server with DBO enabled and confirmed that the message no longer appears.

Test Result

VLLM_ALL2ALL_BACKEND=deepep_low_latency vllm serve --model="deepseek-ai/DeepSeek-V2-Lite" --data-parallel-size 2 --enable-expert-parallel --gpu-memory-utilization 0.75 --enable-dbo

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.3567|±  |0.0277|
|     |       |strict-match    |     5|exact_match|↑  |0.3533|±  |0.0276|

Signed-off-by: Sage Moore <sage@neuralmagic.com>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a benign cuBLAS warning that occurs during CUDA graph capture with DBO. The root cause is correctly identified as a missing CUDA context in the worker threads. The fix involves storing the device in the UBatchWrapper and explicitly setting it at the beginning of the _capture_ubatch_thread using torch.cuda.set_device(). This ensures a CUDA context is established before any cuBLAS operations are attempted, effectively resolving the warning. The change is clean, well-targeted, and correctly implemented. I have no further suggestions for improvement.
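For context, CUDA contexts are established per thread, so a freshly spawned capture thread has no current context until `torch.cuda.set_device()` (or another context-creating call) runs inside it. The snippet below is a minimal, torch-free sketch of the pattern the review describes: the names `UBatchWrapper` and `_capture_ubatch_thread` come from the review above, while `set_current_device`/`get_current_device` are hypothetical stand-ins that simulate per-thread CUDA context state (this is an illustration of the threading pattern, not the actual vLLM code):

```python
import threading

# Stand-in for per-thread CUDA context state: each thread starts with no
# "current device", mirroring how a new OS thread has no CUDA context.
_tls = threading.local()

def set_current_device(device: int) -> None:
    """Simulates torch.cuda.set_device(): binds a device in this thread."""
    _tls.device = device

def get_current_device():
    """Returns this thread's current device, or None if no context exists."""
    return getattr(_tls, "device", None)

class UBatchWrapper:
    """Sketch of the fix: store the device so worker threads can bind to it."""

    def __init__(self, device: int):
        self.device = device  # stored at construction, per the review
        self.seen_device_in_thread = None

    def _capture_ubatch_thread(self) -> None:
        # The fix: establish the device context *first*, before any
        # cuBLAS-backed work (which would otherwise warn) runs here.
        set_current_device(self.device)
        # ... graph capture / matmul work would happen here ...
        self.seen_device_in_thread = get_current_device()

    def capture(self) -> None:
        t = threading.Thread(target=self._capture_ubatch_thread)
        t.start()
        t.join()

wrapper = UBatchWrapper(device=0)
wrapper.capture()
print(wrapper.seen_device_in_thread)  # the worker thread saw device 0 bound
```

Without the explicit `set_current_device` call at the top of the thread body, `get_current_device()` would return `None` inside the worker, which is exactly the "no current CUDA context" condition that triggers the cuBLAS warning.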

@tlrmchlsmth tlrmchlsmth added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 24, 2025
@tlrmchlsmth tlrmchlsmth enabled auto-merge (squash) September 24, 2025 16:53
@tlrmchlsmth tlrmchlsmth merged commit f84a472 into vllm-project:main Sep 24, 2025
55 checks passed
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
…5596)

Signed-off-by: Sage Moore <sage@neuralmagic.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Labels
ready ONLY add when PR is ready to merge/full CI is needed v1
2 participants