@WoosukKwon
Collaborator

No description provided.

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
@mergify mergify bot added the v1 label Nov 29, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a potential race condition related to the prefill_len buffer. By replacing the UvaBuffer with a CpuGpuBuffer and adding an explicit copy_to_gpu call, the change ensures that updates to prefill_len on the CPU are safely synchronized to the GPU before being used in subsequent kernels. The implementation is correct and effectively mitigates the risk of reading stale data on the GPU. This is a good correctness improvement.
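
To illustrate the distinction the review draws, here is a toy, GPU-free sketch of the two buffer styles. It is not vLLM's code: `SharedBuffer`, `StagedBuffer`, `copy_to_device`, and `demo` are hypothetical names, plain Python lists stand in for host and device memory, and the scenario (the host overwriting a value before a queued kernel has read it) is an assumption about the kind of race involved, since the PR has no description.

```python
class SharedBuffer:
    """Toy stand-in for a UVA-style buffer: one storage, two views."""

    def __init__(self, size):
        self.host = [0] * size
        # The "device" view aliases host storage, so device reads are
        # not ordered with respect to later host writes.
        self.device = self.host


class StagedBuffer:
    """Toy stand-in for a paired CPU/GPU buffer with an explicit copy."""

    def __init__(self, size):
        self.host = [0] * size
        self.device = [0] * size

    def copy_to_device(self):
        # Snapshot the host contents; later host writes do not reach
        # the device copy until this is called again.
        self.device[:] = self.host
        return self.device


def demo():
    shared = SharedBuffer(2)
    staged = StagedBuffer(2)

    # Step N: the host writes the value that step N's kernel should see.
    shared.host[0] = 5
    staged.host[0] = 5
    staged.copy_to_device()

    # Step N's kernel is queued but has not run yet; the host moves on
    # and writes the value meant for step N+1.
    shared.host[0] = 9
    staged.host[0] = 9

    # What step N's kernel would read from each buffer when it finally runs.
    return shared.device[0], staged.device[0]


print(demo())
```

In this model the shared buffer exposes the step N+1 value to the step N kernel, while the staged buffer still holds the snapshot taken at the explicit copy. On real hardware the ordering of a host-to-device copy depends on the stream it is issued on, which this sketch does not model.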

@WoosukKwon WoosukKwon merged commit 4a80ad0 into main Nov 29, 2025
10 of 11 checks passed
@WoosukKwon WoosukKwon deleted the woosuk/v2-prefill-len branch November 29, 2025 04:27