-
-
Notifications
You must be signed in to change notification settings - Fork 9.5k
[MISC] add cpu_kvcache_space_bytes to CacheConfig #19812
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MISC] add cpu_kvcache_space_bytes to CacheConfig #19812
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @andyxning, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses a previously undeclared configuration parameter by formally adding cpu_kvcache_space_bytes
to the CacheConfig
class. This change ensures that the CPU K/V cache space, which is already in use by the CPU backend, is properly defined and documented within the system's central cache configuration, improving clarity and maintainability.
Highlights
- Configuration Update: Added a new field,
cpu_kvcache_space_bytes
, to theCacheConfig
class. This parameter, which is already utilized by CPU backend components (cpu.py
andcpu_worker.py
), is now formally declared within the configuration structure.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds the cpu_kvcache_space_bytes
field to the CacheConfig
dataclass. This change seems appropriate as the PR description indicates this field is already in use in other parts of the codebase (cpu.py
, cpu_worker.py
) and this formalizes its declaration.
The main feedback is a minor stylistic improvement to the docstring of the new field for better consistency with Python conventions (PEP 257) and existing docstrings in the file.
Additionally, the PR description template has not been fully completed (checklist items are unchecked, placeholder text remains). It would be beneficial to fill this out or mark items as N/A for clarity.
vllm/config.py
Outdated
"""(CPU backend only) CPU key-value cache space. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring formatting for cpu_kvcache_space_bytes
is a bit unconventional for a single effective line of documentation and inconsistent with other docstrings in this file (e.g., lines 1506-1508 for calculate_kv_scales
).
According to PEP 257, for one-liner docstrings, the closing """
should be on the same line. The current multi-line formatting with the closing """
on a new, indented line is unusual for such a short docstring.
Suggesting a standard single-line format for conciseness and consistency.
"""(CPU backend only) CPU key-value cache space."""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
a5f8619
to
f674ccf
Compare
f674ccf
to
28b6a15
Compare
28b6a15
to
31c1842
Compare
cc @Isotr0py @bigPYJ1151 do you know about the history of this field? It doesn't seem to exist even before #16835 |
Seems this field was initially added to CPU executor in the earliest CPU support PR (#3634) with declaration missing in |
Signed-off-by: Andy Xie <andy.xning@gmail.com>
31c1842
to
040c692
Compare
In fact we thought this field is too device-specific, so didn't add it to |
Signed-off-by: Andy Xie <andy.xning@gmail.com> Signed-off-by: juncheoll <th6re8e@naver.com>
Signed-off-by: Andy Xie <andy.xning@gmail.com> Signed-off-by: fhl <2410591650@qq.com>
Signed-off-by: Andy Xie <andy.xning@gmail.com>
Signed-off-by: Andy Xie <andy.xning@gmail.com>
Signed-off-by: Andy Xie <andy.xning@gmail.com> Signed-off-by: Will Eaton <weaton@redhat.com>
Signed-off-by: Andy Xie <andy.xning@gmail.com>
Signed-off-by: Andy Xie <andy.xning@gmail.com>
Signed-off-by: Andy Xie <andy.xning@gmail.com> Signed-off-by: avigny <47987522+avigny@users.noreply.github.com>
Essential Elements of an Effective PR Description Checklist
supported_models.md
andexamples
for a new model.Purpose
cpu_kvcache_space_bytes
is used incpu.py
andcpu_worker.py
files. Such as:vllm/vllm/worker/cpu_worker.py
Lines 250 to 255 in 8b6e1d6
vllm/vllm/platforms/cpu.py
Lines 128 to 139 in 8b6e1d6
But it is not declared in
CacheConfig
.Test Plan
NA
Test Result
NA
(Optional) Documentation Update