@WoosukKwon WoosukKwon commented Nov 27, 2025

  • Support FULL CUDA graphs (for mixed batches)
  • Refactor for future code reuse

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refactors the CudaGraphManager to improve its structure and generalize CUDA graph capturing. The logic for determining CUDA graph sizes, preparing inputs for capture, and orchestrating the capture process is extracted into separate utility functions. This improves modularity and readability. The key change is the generalization from capturing graphs based on batch_size to num_tokens, which allows for more flexible support for different batch types (prefill, decode, mixed). I found one critical issue in the implementation that needs to be addressed.
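To make the batch_size to num_tokens generalization concrete, here is a minimal sketch of how capture sizes might be chosen and how a runtime token count could be mapped to a captured graph. It is not the code in this PR: the function names `get_capture_sizes` and `pad_to_captured_size` and the size schedule (1, 2, 4, then multiples of 8) are assumptions made purely for illustration.

```python
from bisect import bisect_left
from typing import List, Optional


def get_capture_sizes(max_num_tokens: int) -> List[int]:
    """Hypothetical schedule: 1, 2, 4, then every multiple of 8 up to the max."""
    sizes = [s for s in (1, 2, 4) if s <= max_num_tokens]
    sizes += list(range(8, max_num_tokens + 1, 8))
    return sizes


def pad_to_captured_size(num_tokens: int, sizes: List[int]) -> Optional[int]:
    """Return the smallest captured size >= num_tokens.

    Returns None when num_tokens exceeds every captured size, in which
    case the caller would fall back to running without a CUDA graph.
    """
    i = bisect_left(sizes, num_tokens)
    return sizes[i] if i < len(sizes) else None


sizes = get_capture_sizes(32)
padded = pad_to_captured_size(5, sizes)   # a 5-token batch pads up to 8
too_big = pad_to_captured_size(33, sizes)  # no graph large enough
```

Because the lookup key is the total token count rather than the number of requests, the same table can serve prefill, decode, and mixed batches, at the cost of padding each batch up to the next captured size.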

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@WoosukKwon WoosukKwon merged commit 11ea5ec into main Nov 27, 2025
8 of 9 checks passed
@WoosukKwon WoosukKwon deleted the woosuk/v2-cudagraph-refactor branch November 27, 2025 05:38
@github-project-automation github-project-automation bot moved this to Done in NVIDIA Nov 27, 2025