fix(serve): respect user-set CUDA_VISIBLE_DEVICES in gpu_env#751
paxiaatucsdedu wants to merge 1 commit into lightseekorg:main from
Conversation
Code Review
This pull request correctly modifies the gpu_env method to respect an existing CUDA_VISIBLE_DEVICES environment variable. The implementation is robust, handling various formats of the variable and including a clear bounds check to prevent errors when not enough GPUs are available. The fallback to the original sequential assignment when the variable is not set is preserved. The changes are accompanied by a comprehensive set of unit tests that cover the new logic, including edge cases and different backend launchers. The code quality is high, and the changes effectively address the described problem.
Actionable comments posted: 1
Inline comments:
In `bindings/python/tests/test_serve.py`, around lines 463-505: add a unit test that asserts the `tp_size <= 0` guard in `gpu_env` raises. Construct the appropriate launcher (e.g. `SglangWorkerLauncher`), set `args = argparse.Namespace(tensor_parallel_size=0)` (and/or a negative value), and call `launcher.gpu_env(args, dp_rank=0, env={"CUDA_VISIBLE_DEVICES": "0,1"})` inside `pytest.raises(ValueError, match="tp_size|tensor_parallel_size|<= 0")` to ensure the check at line 68 of `bindings/python/src/smg/serve.py` is exercised. Name the test something like `test_gpu_env_raises_on_non_positive_tp_size` and add it alongside the existing `gpu_env` tests.
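A minimal sketch of such a test, following the review comment above. The launcher class here is a hypothetical stand-in (the review names `SglangWorkerLauncher` in `bindings/python/src/smg/serve.py`, but its real implementation is not shown in this thread), so only the guard being tested is stubbed in:

```python
import argparse

import pytest


class SglangWorkerLauncher:
    """Hypothetical stand-in; the real class lives in bindings/python/src/smg/serve.py."""

    def gpu_env(self, args, dp_rank, env):
        tp_size = args.tensor_parallel_size
        if tp_size <= 0:
            # The guard the review comment wants exercised.
            raise ValueError(f"tensor_parallel_size must be positive, got {tp_size}")
        # ... real pool-slicing / fallback logic would follow here ...
        return env


@pytest.mark.parametrize("tp_size", [0, -1])
def test_gpu_env_raises_on_non_positive_tp_size(tp_size):
    launcher = SglangWorkerLauncher()
    args = argparse.Namespace(tensor_parallel_size=tp_size)
    with pytest.raises(ValueError, match="tensor_parallel_size"):
        launcher.gpu_env(args, dp_rank=0, env={"CUDA_VISIBLE_DEVICES": "0,1"})
```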
📒 Files selected for processing (2)
bindings/python/src/smg/serve.py
bindings/python/tests/test_serve.py
When CUDA_VISIBLE_DEVICES is already set (e.g. via Docker -e), treat it as the available GPU pool instead of overriding it with sequential IDs. Adds bounds check, tp_size validation, and unit tests. Signed-off-by: paxiaatucsdedu <paxia@ucsd.edu>
Force-pushed 8dd3105 to ced6ba0 (Compare)
Description
Problem
When running `smg serve` inside a Docker container with `CUDA_VISIBLE_DEVICES` set (e.g. `docker run -e CUDA_VISIBLE_DEVICES=4 ...`), the `gpu_env` method in `WorkerLauncher` unconditionally overrides the variable with sequential IDs starting from 0. This causes workers to land on the wrong GPU, leading to OOM errors when another process is already using GPU 0.
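For illustration, the pre-fix behavior described above can be reproduced with a small sketch. The function name and signature here are invented for the example; the actual code is the `gpu_env` method on `WorkerLauncher`:

```python
def old_gpu_env(tp_size: int, dp_rank: int, env: dict) -> dict:
    # Pre-fix behavior (sketch): always assign sequential IDs starting
    # from 0, ignoring any user-set CUDA_VISIBLE_DEVICES.
    devices = [str(i) for i in range(dp_rank * tp_size, (dp_rank + 1) * tp_size)]
    return {**env, "CUDA_VISIBLE_DEVICES": ",".join(devices)}


# Even though the container was started with CUDA_VISIBLE_DEVICES=4,
# the worker ends up pointed at GPU 0:
env = old_gpu_env(1, 0, {"CUDA_VISIBLE_DEVICES": "4"})
```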
Solution
Modify `gpu_env` to check whether `CUDA_VISIBLE_DEVICES` is already set in the environment. If it is, treat it as the available GPU pool and slice into it by `dp_rank` and `tp_size`. If it is not set (or empty), fall back to the original sequential assignment (`0,1,...`). A bounds check raises a clear `ValueError` when the pool has fewer GPUs than the requested `dp_rank * tp_size` range.
Changes
- `gpu_env` now slices into an existing `CUDA_VISIBLE_DEVICES` pool instead of overriding it. Add bounds check with a descriptive error message.
- Unit tests cover pool slicing (parametrized across multiple GPU layouts), cross-backend verification (vllm), the bounds check (`ValueError`), and the empty-string fallback.
Test Plan
Manual verification (Docker):
Added unit test at bindings/python/tests/test_serve.py
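The slicing behavior described in the Solution section can be sketched as follows. This is a standalone approximation, not the actual `WorkerLauncher` method: the free-function signature and error message wording are assumptions for illustration.

```python
def gpu_env(tp_size: int, dp_rank: int, env: dict) -> dict:
    # Sketch of the fixed logic: respect a user-set CUDA_VISIBLE_DEVICES
    # as the GPU pool, slicing into it by dp_rank and tp_size.
    if tp_size <= 0:
        raise ValueError(f"tensor_parallel_size must be positive, got {tp_size}")

    visible = env.get("CUDA_VISIBLE_DEVICES", "").strip()
    if visible:
        pool = [d.strip() for d in visible.split(",") if d.strip()]
        start, end = dp_rank * tp_size, (dp_rank + 1) * tp_size
        if end > len(pool):
            raise ValueError(
                f"CUDA_VISIBLE_DEVICES has {len(pool)} GPU(s), but "
                f"dp_rank={dp_rank} with tp_size={tp_size} needs indices "
                f"{start}..{end - 1}"
            )
        devices = pool[start:end]
    else:
        # Not set (or empty): fall back to the original sequential assignment.
        devices = [str(i) for i in range(dp_rank * tp_size, (dp_rank + 1) * tp_size)]

    return {**env, "CUDA_VISIBLE_DEVICES": ",".join(devices)}
```

For example, with `CUDA_VISIBLE_DEVICES=2,3,6,7`, `tp_size=2`, and `dp_rank=1`, the worker would be assigned `6,7` rather than the sequential `2,3`.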
Checklist
- `cargo +nightly fmt` passes
- `cargo clippy --all-targets --all-features -- -D warnings` passes