[TRTLLM-12112][feat] Support v2 KV cache stats#13953

Merged: lfr-0531 merged 10 commits into NVIDIA:feat/deepseek_v4 from yizhang-nv:feat/kv-cache-v2-stats-port on May 20, 2026

@yizhang-nv
Member

@coderabbitai summary

Description

Port V2 KV cache stats support onto the DeepSeek V4 feature branch.

This PR adds V2 KV cache stats delta tracking and wires it into the existing PyExecutor stats path so LLM.get_stats() and server polling can surface kvCacheStats / kvCacheIterationStats for KV cache manager V2. It also adds V1/V2 alignment coverage for basic block counts, SWA windows, and stacked multi-window reuse scenarios.
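
As a rough illustration of what "delta tracking" means here, the sketch below converts cumulative counters into per-iteration deltas by remembering the previous snapshot. It is not the code from this PR: the class names and counter fields (`CumulativeKvStats`, `KvStatsDeltaTracker`, `alloc_total_blocks`, and so on) are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, fields, replace


@dataclass
class CumulativeKvStats:
    """Hypothetical cumulative counters; field names are illustrative only."""
    alloc_total_blocks: int = 0
    alloc_new_blocks: int = 0
    reused_blocks: int = 0
    missed_blocks: int = 0


class KvStatsDeltaTracker:
    """Turns monotonically growing counters into per-iteration deltas."""

    def __init__(self) -> None:
        # Baseline starts at zero, so the first update reports the full totals.
        self._last = CumulativeKvStats()

    def update(self, current: CumulativeKvStats) -> dict:
        # Delta = current cumulative value minus the value seen last time.
        delta = {
            f.name: getattr(current, f.name) - getattr(self._last, f.name)
            for f in fields(CumulativeKvStats)
        }
        # Keep a copy so later mutation of `current` cannot change the baseline.
        self._last = replace(current)
        return delta


tracker = KvStatsDeltaTracker()
first = tracker.update(CumulativeKvStats(10, 10, 0, 10))
second = tracker.update(CumulativeKvStats(16, 12, 4, 12))
print(first)
print(second)
```

In this sketch the second call reports only what changed since the first call (for example, 4 newly reused blocks), which is the kind of per-iteration value a polling client would presumably want alongside the cumulative totals.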

Test Coverage

  • python3 -m py_compile tests/integration/defs/kv_cache/test_kv_cache_iteration_stats_alignment.py
  • git diff --check -- tests/integration/defs/kv_cache/test_kv_cache_iteration_stats_alignment.py
  • /home/yizhan/.local/bin/pre-commit run --files tests/integration/defs/kv_cache/test_kv_cache_iteration_stats_alignment.py
  • B200 LLM_MODELS_ROOT=/scratch.trt_llm_data/llm-models pytest -q -s tests/integration/defs/kv_cache/test_kv_cache_iteration_stats_alignment.py
    • 3 passed, 3 warnings in 153.30s
    • log: /home/scratch.yizhan_sw_1/logs/2026-05-08/6u1g-0015/LLM_MODELS_ROOT__scratch_trt_llm_data_llm-models___01-35-50.stdout.log
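
For readers unfamiliar with alignment tests, the following is a minimal sketch of how V1 and V2 stats could be compared per attention window. It does not reproduce the test file above: the helper names (`diff_stats`, `diff_per_window`), the dictionary layout keyed by window size, and the field names `used_blocks` / `free_blocks` are all assumptions made for illustration.

```python
def diff_stats(v1: dict, v2: dict, keys: tuple) -> dict:
    """Return {key: (v1_value, v2_value)} for every key whose values differ."""
    return {k: (v1.get(k), v2.get(k)) for k in keys if v1.get(k) != v2.get(k)}


def diff_per_window(v1_by_window: dict, v2_by_window: dict, keys: tuple) -> dict:
    """Compare stats window by window; a window missing on one side counts as a mismatch."""
    mismatches = {}
    for window in sorted(set(v1_by_window) | set(v2_by_window)):
        d = diff_stats(v1_by_window.get(window, {}), v2_by_window.get(window, {}), keys)
        if d:
            mismatches[window] = d
    return mismatches


# Hypothetical stats for two window sizes (a sliding window and a full window).
v1 = {128: {"used_blocks": 4, "free_blocks": 12}, 4096: {"used_blocks": 8, "free_blocks": 8}}
v2 = {128: {"used_blocks": 4, "free_blocks": 12}, 4096: {"used_blocks": 9, "free_blocks": 8}}
print(diff_per_window(v1, v2, ("used_blocks", "free_blocks")))
```

An empty result would mean the two managers agree on every compared field for every window; returning the differing pairs rather than a boolean makes a failing assertion easier to diagnose.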

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@yizhang-nv force-pushed the feat/kv-cache-v2-stats-port branch 7 times, most recently from c022723 to c02889d (May 11, 2026 16:37)
@yizhang-nv marked this pull request as ready for review (May 12, 2026 05:36)
@yizhang-nv requested review from a team as code owners (May 12, 2026 05:36)
@yizhang-nv requested review from dongxuy04, hchings and liji-nv and removed request for a team (May 12, 2026 05:36)
@yizhang-nv force-pushed the feat/kv-cache-v2-stats-port branch from c02889d to 4e40e05 (May 13, 2026 02:25)
@yizhang-nv requested a review from lowsfer (May 13, 2026 02:33)
@yizhang-nv force-pushed the feat/kv-cache-v2-stats-port branch 8 times, most recently from c705ef1 to 5273874 (May 14, 2026 06:15)
@lfr-0531 force-pushed the feat/deepseek_v4 branch from 0a93d10 to 118e7a5 (May 14, 2026 07:44)
@lfr-0531
Collaborator

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #48918 [ run ] triggered by Bot. Commit: 8cc9c6c Link to invocation

Comment thread tensorrt_llm/runtime/kv_cache_manager_v2/_core/_kv_cache.py Outdated
@tensorrt-cicd
Collaborator

PR_Github #48918 [ run ] completed with state SUCCESS. Commit: 8cc9c6c
/LLM/main/L0_MergeRequest_PR pipeline #38667 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@yizhang-nv changed the title from "[None][fix] Support v2 KV cache stats" to "[TRTLLM-12112][fix] Support v2 KV cache stats" (May 19, 2026)
@yizhang-nv
Member Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #49133 [ run ] triggered by Bot. Commit: 107553d Link to invocation

@yizhang-nv changed the title from "[TRTLLM-12112][fix] Support v2 KV cache stats" to "[TRTLLM-12112][feaet] Support v2 KV cache stats" (May 19, 2026)
@yizhang-nv changed the title from "[TRTLLM-12112][feaet] Support v2 KV cache stats" to "[TRTLLM-12112][feat] Support v2 KV cache stats" (May 19, 2026)
@yizhang-nv
Member Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #49147 [ run ] triggered by Bot. Commit: bd74724 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #49133 [ run ] completed with state ABORTED. Commit: 107553d

Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #49147 [ run ] completed with state SUCCESS. Commit: bd74724
/LLM/main/L0_MergeRequest_PR pipeline #38832 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@nvpohanh
Collaborator

@lowsfer could you review this? thanks

yizhang-nv added 10 commits May 19, 2026 19:04
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
@yizhang-nv force-pushed the feat/kv-cache-v2-stats-port branch from bd74724 to 9ec657f (May 20, 2026 02:10)
@yizhang-nv
Member Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #49308 [ run ] triggered by Bot. Commit: 9ec657f Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #49308 [ run ] completed with state SUCCESS. Commit: 9ec657f
/LLM/main/L0_MergeRequest_PR pipeline #38970 completed with status: 'SUCCESS'

CI Report

Link to invocation

@lfr-0531 merged commit 14c51a4 into NVIDIA:feat/deepseek_v4 (May 20, 2026)
6 checks passed
lfr-0531 pushed a commit to lfr-0531/TensorRT-LLM that referenced this pull request May 29, 2026
Filtered out disaggregation and AutoDeploy-only changes for user/fanrongl/dsv4_model.

Signed-off-by: Fanrong Li <lfr-0531@users.noreply.github.com>
6 participants