[None][fix] Revert 'Add TestServePrefixAwareScheduling base on LMBenchmark/synthetic-multi-round-qa'#13573

Merged
tburt-nv merged 1 commit into main from revert-13243-prefix-aware-tests-fixs
Apr 28, 2026

Conversation

@tburt-nv
Collaborator

@tburt-nv tburt-nv commented Apr 28, 2026

The kv_cache/test_prefix_aware_scheduling.py tests added by #13243 fail in CI.

Summary by CodeRabbit

  • Refactor

    • Simplified internal prefix-reuse caching logic in the scheduler to improve efficiency.
    • Optimized node deletion behavior in data structure handling.
    • Updated block accounting APIs for more streamlined resource management.
  • Tests

    • Removed integration and unit tests related to prefix-aware scheduling validation.
    • Updated test configurations to reflect changed testing scope.

@tburt-nv tburt-nv requested a review from a team as a code owner April 28, 2026 20:48
@tburt-nv tburt-nv requested a review from leslie-fang25 April 28, 2026 20:48
@tburt-nv tburt-nv changed the title from Revert "[None][tests] Add TestServePrefixAwareScheduling base on LMBenchmark/synthetic-multi-round-qa" to [None][fix] Revert 'Add TestServePrefixAwareScheduling base on LMBenchmark/synthetic-multi-round-qa' Apr 28, 2026
@tburt-nv tburt-nv changed the title from [None][fix] Revert 'Add TestServePrefixAwareScheduling base on LMBenchmark/synthetic-multi-round-qa' to [None][fix] Revert 'Add TestServePrefixAwareScheduling base on LMBenchmark/synthetic-multi-round-qa' Apr 28, 2026
@tburt-nv tburt-nv merged commit b5c41f2 into main Apr 28, 2026
11 of 14 checks passed
@tburt-nv tburt-nv deleted the revert-13243-prefix-aware-tests-fixs branch April 28, 2026 20:50
@coderabbitai
Contributor

coderabbitai Bot commented Apr 28, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: fffee15b-d8ed-48d8-a3cf-13aea8bd7122

📥 Commits

Reviewing files that changed from the base of the PR and between c30d9c7 and bf41aa5.

📒 Files selected for processing (7)
  • cpp/include/tensorrt_llm/batch_manager/templatedTrie.h
  • tensorrt_llm/_torch/pyexecutor/scheduler/scheduler.py
  • tests/integration/defs/kv_cache/test_prefix_aware_scheduling.py
  • tests/integration/test_lists/test-db/l0_b200.yml
  • tests/integration/test_lists/test-db/l0_h100.yml
  • tests/unittest/_torch/executor/test_kvcache_aware_router.py
  • tests/unittest/_torch/executor/test_py_scheduler.py

Walkthrough

The PR removes the Python-side caching layer for prefix-reuse summaries in the scheduler, simplifies related unit tests, deletes integration tests validating prefix-aware scheduling behavior, and updates test configurations to exclude these tests.
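To make the walkthrough concrete, here is a minimal, hypothetical sketch of what a Python-side cache for prefix-reuse summaries can look like, next to the same accounting without that layer. None of the names below (`PrefixSummary`, `compute_summary`, `CachingAccounting`, `DirectAccounting`) come from the TensorRT-LLM source; they are illustrative assumptions, not the scheduler's actual code.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PrefixSummary:
    """Hypothetical result of a prefix-reuse lookup for one request."""
    reusable_tokens: int
    new_tokens: int


def compute_summary(tokens: Tuple[int, ...], cached_prefix_len: int) -> PrefixSummary:
    """Stand-in for the real (assumed expensive) prefix lookup."""
    reusable = min(cached_prefix_len, len(tokens))
    return PrefixSummary(reusable_tokens=reusable, new_tokens=len(tokens) - reusable)


class CachingAccounting:
    """With a caching layer: one lookup per request id, reused afterwards."""

    def __init__(self) -> None:
        self._cache: Dict[int, PrefixSummary] = {}
        self.lookups = 0

    def summary(self, request_id: int, tokens: Tuple[int, ...],
                cached_prefix_len: int) -> PrefixSummary:
        if request_id not in self._cache:
            self.lookups += 1
            self._cache[request_id] = compute_summary(tokens, cached_prefix_len)
        return self._cache[request_id]


class DirectAccounting:
    """Without the caching layer: every call recomputes the summary."""

    def __init__(self) -> None:
        self.lookups = 0

    def summary(self, request_id: int, tokens: Tuple[int, ...],
                cached_prefix_len: int) -> PrefixSummary:
        self.lookups += 1
        return compute_summary(tokens, cached_prefix_len)
```

In this sketch the cached variant saves repeated lookups but keeps returning the first result even if the cached prefix length changes later, while the direct variant always reflects the current state at the cost of one lookup per call.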

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core Implementation: cpp/include/tensorrt_llm/batch_manager/templatedTrie.h, tensorrt_llm/_torch/pyexecutor/scheduler/scheduler.py | Removes prefix-reuse summary caching in Python scheduler and eliminates the node back-pointer reset in the C++ trie deletion path. Updates block accounting APIs to remove cached_summary parameters and adjusts draft-token presence checks from method calls to property access. Removes clamping logic for remaining_space calculation. |
| Integration Test Removal: tests/integration/defs/kv_cache/test_prefix_aware_scheduling.py | Deletes the entire test suite that validated prefix-aware scheduling end-to-end behavior, including server launch, health checks, and metrics parsing. |
| Test Configuration Updates: tests/integration/test_lists/test-db/l0_b200.yml, tests/integration/test_lists/test-db/l0_h100.yml | Removes prefix-aware scheduling test entries from GPU-tier test lists, reducing the number of executed integration tests for these configurations. |
| Unit Test Simplifications: tests/unittest/_torch/executor/test_kvcache_aware_router.py, tests/unittest/_torch/executor/test_py_scheduler.py | Simplifies mock verification in routing tests by removing call-count assertions while retaining return value validation. Removes two unit tests covering draft-token edge cases in mixed batches and zero-draft scenarios. |
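
The difference between return-value validation and call-count assertions, as described for the routing tests, can be illustrated with a small `unittest.mock` sketch. The `pick_server` function and the router's `select` method are hypothetical names chosen for this example and are not taken from test_kvcache_aware_router.py.

```python
from unittest.mock import MagicMock


def pick_server(router, request_tokens):
    """Hypothetical code under test: delegates the choice to the router."""
    return router.select(request_tokens)


def make_router(choice):
    """Build a mock router whose (assumed) select() always returns `choice`."""
    router = MagicMock()
    router.select.return_value = choice
    return router


router = make_router("server-0")
result = pick_server(router, [1, 2, 3])

# Return-value validation: checks only what the caller observes.
assert result == "server-0"

# Call-count assertion: additionally pins down how often, and with which
# arguments, the mock was invoked. This is the stricter kind of check.
router.select.assert_called_once_with([1, 2, 3])
```

Dropping the second kind of assertion makes a test less sensitive to internal changes, such as an extra or a cached lookup, at the cost of no longer detecting redundant calls.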

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


Labels

None yet

Projects

None yet

1 participant