[torch.compile][Minor Fix] Gate cudagraph_unsafe tag for torch>=2.9 #25304
base: main
Conversation
Signed-off-by: Boyuan Feng <boyuan@meta.com>
cc @xuechendi @bigPYJ1151 Thanks for #25298! This PR should be a proper fix. The issue is that the tag is only available in PyTorch 2.9+.
Code Review

This pull request correctly gates the usage of `torch._C.Tag.cudagraph_unsafe` based on the PyTorch version, ensuring compatibility with versions 2.9 and newer. The implementation uses `is_torch_equal_or_newer`, which is a clean approach. However, the version-checking logic is duplicated in both `vllm/attention/layer.py` and `tests/compile/silly_attention.py`. My review focuses on refactoring this duplication into a centralized utility to improve code maintainability and prevent potential future inconsistencies.
vllm/attention/layer.py
Outdated
    if is_torch_equal_or_newer("2.9.0.dev"):
        tag_cudagraph_unsafe = (torch._C.Tag.cudagraph_unsafe, )
    else:
        tag_cudagraph_unsafe = ()  # type: ignore[assignment]
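The gating pattern above can be sketched in isolation. The snippet below is a hypothetical, self-contained version: the real `is_torch_equal_or_newer` lives in vLLM and consults the installed torch package, while here the version comparison is reimplemented and torch's tag object is replaced by a string stand-in.

```python
# Self-contained sketch of the version-gated tag (assumption: vLLM's
# is_torch_equal_or_newer compares against the installed torch version;
# here we parse version strings ourselves and stub the torch version).

def _parse(version: str) -> list[int]:
    """Keep only numeric release components: '2.9.0.dev' -> [2, 9, 0]."""
    nums = []
    for piece in version.split("+")[0].split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            nums.append(int(digits))
    return nums

def is_version_equal_or_newer(current: str, target: str) -> bool:
    a, b = _parse(current), _parse(target)
    width = max(len(a), len(b))
    a += [0] * (width - len(a))  # pad so [2, 9] compares like [2, 9, 0]
    b += [0] * (width - len(b))
    return a >= b

TORCH_VERSION = "2.8.0"  # pretend this is the installed torch version

if is_version_equal_or_newer(TORCH_VERSION, "2.9.0.dev"):
    # Real code would use (torch._C.Tag.cudagraph_unsafe,)
    tag_cudagraph_unsafe = ("cudagraph_unsafe",)
else:
    tag_cudagraph_unsafe = ()
```

On torch 2.8 the tuple stays empty, so the 2.9-only attribute is never touched.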
This logic for determining `tag_cudagraph_unsafe` based on the PyTorch version is also present in `tests/compile/silly_attention.py`. To improve maintainability and avoid potential inconsistencies in the future, it would be better to define this in a single, shared location (e.g., `vllm/utils`) and import it here.
Moved to `vllm/utils/__init__.py`. Don't keep it in `vllm/attention/layer.py` since other ops (e.g., MoE) may use it in the future. Don't keep it in `vllm/compilation` to avoid a circular import.
Signed-off-by: Boyuan Feng <boyuan@meta.com>
    return prompt_token_len

    if is_torch_equal_or_newer("2.9.0.dev"):
Could you also add a check for the attribute itself? We are working on XPU (Intel GPU) and HPU (Intel Gaudi); we use torch-XPU or torch-HPU, which I am afraid will not have the CUDA-graph-related attributes and may assert.
Yes, added a `current_platform.is_cuda_alike()` check.
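The combined guard can be sketched as follows. The platform flag and torch version are passed in explicitly here because the real probes (`current_platform.is_cuda_alike()` and `is_torch_equal_or_newer`) require a vLLM install; the returned string is a stand-in for the torch tag object.

```python
# Hypothetical sketch of gating on both platform and torch version, so
# non-CUDA builds (XPU/HPU) never touch torch._C.Tag.cudagraph_unsafe.

def cudagraph_unsafe_tag(is_cuda_alike: bool,
                         torch_version: tuple[int, int]) -> tuple:
    """Return the op-tag tuple only on CUDA-like platforms with torch >= 2.9."""
    if is_cuda_alike and torch_version >= (2, 9):
        # Real code: (torch._C.Tag.cudagraph_unsafe,)
        return ("cudagraph_unsafe",)
    return ()

# XPU/HPU builds get an empty tuple regardless of the torch version:
print(cudagraph_unsafe_tag(False, (2, 9)))  # ()
print(cudagraph_unsafe_tag(True, (2, 8)))   # ()
print(cudagraph_unsafe_tag(True, (2, 9)))   # ('cudagraph_unsafe',)
```

Checking the platform first means the torch build for XPU/HPU never has to expose the CUDA-graph attribute at all, which addresses the concern above.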
Signed-off-by: Boyuan Feng <boyuan@meta.com>
This pull request has merge conflicts that must be resolved before it can be merged.
`torch._C.Tag.cudagraph_unsafe` is only used for PyTorch >= 2.9.