[None][fix] Avoid pre-window SWA over-allocation on disagg gen server by Shixiaowei02 · Pull Request #13845 · NVIDIA/TensorRT-LLM

Shixiaowei02 · 2026-05-07T08:56:54Z

@coderabbitai summary

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

Shixiaowei02 · 2026-05-07T09:34:55Z

/bot run

tensorrt-cicd · 2026-05-07T09:42:11Z

PR_Github #47183 [ run ] triggered by Bot. Commit: a6a74b3 Link to invocation

tensorrt-cicd · 2026-05-07T10:49:29Z

PR_Github #47183 [ run ] completed with state SUCCESS. Commit: a6a74b3
/LLM/main/L0_MergeRequest_PR pipeline #37139 completed with status: 'SUCCESS'

CI Report

Link to invocation


…#13845) Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com> Signed-off-by: Fanrong Li <lfr-0531@users.noreply.github.com>

For a disagg gen init request entering the V2 scheduler, context_current_position is still 0: it is only advanced to prompt_len in _prepare_disagg_gen_transmission_complete after KV transfer completes. Passing history_length=context_current_position degenerates to history_length=0, the SWA stale range collapses to an empty interval, and pre-window blocks are still allocated before KV transfer, defeating the over-allocation fix from NVIDIA#13845/NVIDIA#14377.

Use req.prompt_len, which is known statically at scheduler time and correctly conveys that the entire prompt is history.

Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
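The degeneration described in the commit message above can be illustrated with a small, self-contained sketch. It is not the TensorRT-LLM implementation: `GenInitRequest`, `swa_stale_range`, `blocks_to_allocate`, the window size, and the block size are hypothetical names and values. The sketch also assumes that the stale range is simply the token positions that have already slid out of the attention window, and that only blocks lying entirely inside that range can be skipped.

```python
from dataclasses import dataclass


@dataclass
class GenInitRequest:
    """Hypothetical stand-in for a disagg gen init request (not the TensorRT-LLM class)."""
    prompt_len: int
    context_current_position: int = 0  # stays 0 until KV transfer completes


def swa_stale_range(history_length: int, window_size: int) -> range:
    """Token positions assumed to have slid out of the attention window."""
    return range(0, max(0, history_length - window_size))


def blocks_to_allocate(total_tokens: int, history_length: int,
                       window_size: int, tokens_per_block: int) -> int:
    """Blocks for total_tokens, skipping blocks that lie entirely in the stale range."""
    total_blocks = -(-total_tokens // tokens_per_block)  # ceiling division
    stale_blocks = len(swa_stale_range(history_length, window_size)) // tokens_per_block
    return total_blocks - stale_blocks


# Illustrative values only: 4096-token prompt, 1024-token window, 32 tokens per block.
req = GenInitRequest(prompt_len=4096)
window, block = 1024, 32

# history_length taken from context_current_position (still 0): nothing is stale.
before = blocks_to_allocate(req.prompt_len, req.context_current_position, window, block)

# history_length taken from prompt_len: the pre-window part of the prompt is stale.
after = blocks_to_allocate(req.prompt_len, req.prompt_len, window, block)

print(before, after)  # 128 32
```

Under these assumptions, a history length of 0 yields an empty stale range, so every block of the prompt is allocated, whereas using the prompt length lets the pre-window blocks be skipped.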

github-actions Bot assigned Shixiaowei02 May 7, 2026

Shixiaowei02 requested a review from chuangz0 May 7, 2026 08:57

Shixiaowei02 marked this pull request as ready for review May 7, 2026 09:33

Shixiaowei02 requested review from a team as code owners May 7, 2026 09:33

Shixiaowei02 requested review from joyang-nv and removed request for a team May 7, 2026 09:33

Avoid pre-window SWA over-allocation on disagg gen server

a6a74b3

Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

Shixiaowei02 force-pushed the feat/deepseek_v4_mem branch from 9a3f789 to a6a74b3 Compare May 7, 2026 09:33

Shixiaowei02 requested review from lfr-0531, lowsfer, qiaoxj07 and yizhang-nv May 7, 2026 09:35

Shixiaowei02 added the deepseek-v4 label May 7, 2026

lowsfer approved these changes May 7, 2026

View reviewed changes

Shixiaowei02 merged commit 67b0b17 into NVIDIA:feat/deepseek_v4 May 7, 2026
6 of 7 checks passed

Shixiaowei02 deleted the feat/deepseek_v4_mem branch May 7, 2026 11:06

lfr-0531 pushed a commit that referenced this pull request May 7, 2026

[None][fix] Avoid pre-window SWA over-allocation on disagg gen server (…

9c57516

…#13845) Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>

Shixiaowei02 mentioned this pull request May 27, 2026

[TRTLLM-13017][fix] disagg gen init: use prompt_len for SWA history_length #14627

Merged

1 task

Shixiaowei02 merged 1 commit into NVIDIA:feat/deepseek_v4 from Shixiaowei02:feat/deepseek_v4_mem

3 participants
