[Core] Support logprobs with spec decode + async scheduling #29223
Conversation
Code Review
This pull request adds support for logprobs with speculative decoding and asynchronous scheduling. The changes refactor how cumulative token counts are calculated and passed so that logprobs are processed correctly in these scenarios. The modifications in vllm/v1/sample/rejection_sampler.py and vllm/v1/worker/gpu_model_runner.py look correct and well-structured, and new tests are added to cover these cases. My main concern is the significant increase in the tolerance used to compare logprobs in tests/v1/sample/test_logprobs.py, which might hide numerical precision issues. Please see the specific comment for details.
This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from 3caee35 to cc55c14 (Signed-off-by: Nick Hill <nhill@redhat.com>)
```python
cu_num_tokens = None
if return_cu_num_tokens:
    cu_num_tokens = [0] + valid_mask.sum(axis=1).cumsum().tolist()
if len(discard_req_indices) > 0:
```
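To illustrate what this prefix sum produces, here is a minimal NumPy sketch. The mask shape and values are hypothetical; in the real rejection sampler, `valid_mask[i, j]` marks whether speculative slot `j` of request `i` holds an accepted token.

```python
import numpy as np

# Hypothetical example: 3 requests, up to 4 speculative token slots each.
valid_mask = np.array([
    [True, True, False, False],   # request 0: 2 accepted tokens
    [True, False, False, False],  # request 1: 1 accepted token
    [True, True, True, False],    # request 2: 3 accepted tokens
])

# Cumulative token counts with a leading 0, so the accepted tokens of
# request i occupy the slice [cu_num_tokens[i], cu_num_tokens[i + 1])
# of the flattened, valid-tokens-only logprobs tensor.
cu_num_tokens = [0] + valid_mask.sum(axis=1).cumsum().tolist()
print(cu_num_tokens)  # → [0, 2, 3, 6]
```

Each adjacent pair of entries gives a per-request slice boundary into the flattened logprobs output.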
Why is this done after computing cu_num_tokens?
Because cu_num_tokens is used to index into the logprobs tensors, which don't take the discarded indices into account; applying the discard beforehand therefore produces incorrect output.
Originally it was done beforehand, which was a bug fixed by #29216.
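A small sketch of why the ordering matters, with hypothetical per-request counts: the flattened logprobs tensor is laid out for all requests, including ones that will later be discarded, so the prefix-sum offsets must be computed over all requests before any entries are dropped.

```python
import numpy as np

num_valid = np.array([2, 1, 3])        # accepted tokens per request (hypothetical)
logprobs = np.arange(num_valid.sum())  # stand-in for the flattened logprobs tensor

# Offsets computed over ALL requests, matching the tensor layout.
cu = np.concatenate(([0], num_valid.cumsum()))  # [0, 2, 3, 6]

discard = {1}  # suppose request 1 is discarded

# Correct: slice with full-layout offsets, then skip discarded requests.
per_req = [logprobs[cu[i]:cu[i + 1]] for i in range(3) if i not in discard]
# request 0 -> [0, 1], request 2 -> [3, 4, 5]

# Had request 1 been removed from num_valid before the cumsum, the
# offsets would be [0, 2, 5] and request 2 would wrongly read [2, 3, 4].
```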
benchislett
left a comment
Looks good overall