Conversation

BoyuanFeng (Contributor)

torch._C.Tag.cudagraph_unsafe is only available in PyTorch >= 2.9, so its use is now gated on the PyTorch version.

Signed-off-by: Boyuan Feng <boyuan@meta.com>

BoyuanFeng commented Sep 20, 2025

cc @xuechendi @bigPYJ1151 Thanks for #25298! This PR should be the proper fix; the issue is that the tag is only available in PyTorch 2.9+.
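
For readers following along, a minimal sketch of the failure this guards against, assuming only what the thread states (the tag exists starting with PyTorch 2.9):

import torch

# On PyTorch >= 2.9 this lookup succeeds; on older releases torch._C.Tag
# has no cudagraph_unsafe member, so the access raises AttributeError.
try:
    tag = torch._C.Tag.cudagraph_unsafe
except AttributeError:
    tag = None
print(tag)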

gemini-code-assist bot left a comment

Code Review

This pull request correctly gates the usage of torch._C.Tag.cudagraph_unsafe based on the PyTorch version, ensuring compatibility with versions 2.9 and newer. The implementation uses is_torch_equal_or_newer which is a clean approach. However, the version-checking logic is duplicated in both vllm/attention/layer.py and tests/compile/silly_attention.py. My review focuses on refactoring this duplication into a centralized utility to improve code maintainability and prevent potential future inconsistencies.

Comment on lines 34 to 37
if is_torch_equal_or_newer("2.9.0.dev"):
    tag_cudagraph_unsafe = (torch._C.Tag.cudagraph_unsafe, )
else:
    tag_cudagraph_unsafe = ()  # type: ignore[assignment]

gemini-code-assist bot commented (severity: high):

This logic for determining tag_cudagraph_unsafe based on the PyTorch version is also present in tests/compile/silly_attention.py. To improve maintainability and avoid potential inconsistencies in the future, it would be better to define this in a single, shared location (e.g., vllm/utils) and import it here.

BoyuanFeng (Contributor Author) replied:

Moved to vllm/utils/__init__.py.

I didn't keep it in vllm/attention/layer.py since other ops (e.g., MoE) may use it in the future, and didn't put it in vllm/compilation to avoid a circular import.
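
A sketch of what the shared definition and its users might look like after the move; the helper is_torch_equal_or_newer and the name tag_cudagraph_unsafe come from this PR, while the exact import form shown here is an assumption:

# vllm/utils/__init__.py (sketch; exact placement in the module is assumed)
import torch

if is_torch_equal_or_newer("2.9.0.dev"):
    tag_cudagraph_unsafe = (torch._C.Tag.cudagraph_unsafe, )
else:
    tag_cudagraph_unsafe = ()  # type: ignore[assignment]

# vllm/attention/layer.py, tests/compile/silly_attention.py (sketch)
from vllm.utils import tag_cudagraph_unsafe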

Signed-off-by: Boyuan Feng <boyuan@meta.com>
Signed-off-by: Boyuan Feng <boyuan@meta.com>
    return prompt_token_len


if is_torch_equal_or_newer("2.9.0.dev"):

A contributor commented:

Could you also add a condition check for the attribute? We are working on XPU (Intel GPU) and HPU (Intel Gaudi), which use torch-XPU or torch-HPU; I'm afraid those builds won't have the cudagraph-related attributes, or may assert.

BoyuanFeng (Contributor Author) replied:

Yes, added a current_platform.is_cuda_alike() check.
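
A sketch of the combined guard after that change; current_platform.is_cuda_alike() and the version check are both named in this thread, while the import paths shown here are assumptions:

import torch

from vllm.platforms import current_platform
from vllm.utils import is_torch_equal_or_newer

# Only CUDA-like platforms running PyTorch >= 2.9 expose the tag;
# torch-XPU / torch-HPU builds and older PyTorch get an empty tuple.
if current_platform.is_cuda_alike() and is_torch_equal_or_newer("2.9.0.dev"):
    tag_cudagraph_unsafe = (torch._C.Tag.cudagraph_unsafe, )
else:
    tag_cudagraph_unsafe = ()  # type: ignore[assignment]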

Signed-off-by: Boyuan Feng <boyuan@meta.com>

mergify bot commented Sep 22, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @BoyuanFeng.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot added the needs-rebase label Sep 22, 2025