Skip to content

Conversation

NickLucche
Copy link
Collaborator

Small quality of life change to keep the way we interact with KV cache layout consistent as I explain a bit here #22735.
cc @zhenwei-intel to keep me true on the XPU-related change. PS that limit will prevent XPU from being compatible with heteroTP as of now.

cc @LucasWilkinson as this is another kv layout constraint which is good to keep in mind.

Signed-off-by: NickLucche <nlucches@redhat.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the handling of KV cache layout settings to be more consistent, especially for XPU which requires the 'NHD' layout. It introduces a centralized set_kv_cache_layout function, which is an improvement over direct environment variable manipulation. My review includes a suggestion to make the validation of user-provided layouts more robust. A significant part of this PR is the addition of the vllm/entrypoints/cli/coordinate.py file, which appears to be a refactoring of the server startup logic, though it's not mentioned in the PR description.

Signed-off-by: NickLucche <nlucches@redhat.com>
Comment on lines 33 to 34
KVCacheLayoutType = Literal["NHD", "HND"]
_KV_CACHE_LAYOUT_OVERRIDE: KVCacheLayoutType | None = None
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LucasWilkinson any other meaningful layout we want to allow here?


# Supported xPUs and types of kv transfer buffer.
# {xPU: tuple of supported kv buffer types}
_NIXL_SUPPORTED_XPUS = {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think xPU was a great name for a generic PU, but since Intel already claimed it this is just confusing now >.<

@zhenwei-intel
Copy link
Contributor

LGTM

Signed-off-by: NickLucche <nlucches@redhat.com>
Copy link
Collaborator

@jikunshang jikunshang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for fixing. LGTM!

@jikunshang jikunshang added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 15, 2025
@jikunshang jikunshang merged commit 2e41f5a into vllm-project:main Sep 15, 2025
58 checks passed
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: NickLucche <nlucches@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
frontend ready ONLY add when PR is ready to merge/full CI is needed v1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants