[BUG FIX][NON-CUDA]quick fix to avoid call cudagraph_unsafe in attention #25298

xuechendi · 2025-09-20T02:17:25Z

Purpose

Fix a failure on non-CUDA devices when accessing torch._C.Tag.cudagraph_unsafe,
which comes from #24281.

Test Plan

Test Result

Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>

gemini-code-assist

Code Review

This pull request correctly fixes a crash on non-CUDA platforms by conditionally including torch._C.Tag.cudagraph_unsafe. The approach is valid, but I've suggested a more robust implementation using a try-except block. This change would make the code more resilient by directly checking for the attribute's existence in the PyTorch library, rather than relying on vLLM's platform detection. I also recommend adding a regression test for non-CUDA environments to prevent this issue from recurring.
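To illustrate the try-except idea from the review, here is a minimal sketch, not the code merged in this PR. The helper name `collect_optional_tags` and the `SimpleNamespace` stand-ins are hypothetical. They take the place of `torch._C.Tag` so the snippet runs without PyTorch installed. The point is only that probing for the attribute directly does not depend on platform detection.

```python
from types import SimpleNamespace
from typing import Any


def collect_optional_tags(tag_namespace: Any, *names: str) -> tuple:
    """Return the tags that exist on tag_namespace, skipping missing ones."""
    tags = []
    for name in names:
        try:
            tags.append(getattr(tag_namespace, name))
        except AttributeError:
            # This build does not define the tag, so leave it out
            # instead of failing at import time.
            pass
    return tuple(tags)


# Hypothetical stand-ins for a build that defines the tag and one that does not.
build_with_tag = SimpleNamespace(cudagraph_unsafe="cudagraph_unsafe")
build_without_tag = SimpleNamespace()

tags_present = collect_optional_tags(build_with_tag, "cudagraph_unsafe")
tags_absent = collect_optional_tags(build_without_tag, "cudagraph_unsafe")
```

With a real PyTorch install, the same pattern would presumably be applied to `torch._C.Tag` itself, and the resulting tuple passed wherever the op's tags are declared.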

vllm/attention/layer.py

xuechendi · 2025-09-20T02:22:35Z

@ProExpertProg, please help to review, thanks

bigPYJ1151

the cpu ci recovered, thanks for the fix :)

quick fix

d2a0be6

Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>

xuechendi requested a review from LucasWilkinson as a code owner September 20, 2025 02:17

xuechendi mentioned this pull request Sep 20, 2025

[torch.compile] CUDAGraph Inductor partition integration #24281

Merged

gemini-code-assist bot reviewed Sep 20, 2025

vllm/attention/layer.py (outdated, resolved)

Take advice from gemini

4b71b09

Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>

Fix for mypy

f057dca

Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>

bigPYJ1151 approved these changes Sep 20, 2025

bigPYJ1151 enabled auto-merge (squash) September 20, 2025 02:56

github-actions bot added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Sep 20, 2025

bigPYJ1151 merged commit 6c5f82e into vllm-project:main Sep 20, 2025
52 of 54 checks passed

BoyuanFeng mentioned this pull request Sep 20, 2025

[torch.compile][Minor Fix] Gate cudagraph_unsafe tag for torch>=2.9 #25304

Open

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

[BUG FIX][NON-CUDA]quick fix to avoid call cudagraph_unsafe in attent…

4f174d2

…ion (vllm-project#25298) Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>

charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025

[BUG FIX][NON-CUDA]quick fix to avoid call cudagraph_unsafe in attent…

6ddda58

…ion (vllm-project#25298) Signed-off-by: Chendi Xue <Chendi.Xue@intel.com> Signed-off-by: charlifu <charlifu@amd.com>

yewentao256 pushed a commit that referenced this pull request Oct 3, 2025

[BUG FIX][NON-CUDA]quick fix to avoid call cudagraph_unsafe in attent…

9d70c10

…ion (#25298) Signed-off-by: Chendi Xue <Chendi.Xue@intel.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>
