[None][fix] Fix bugs in WindowBlockManager destructor statistics#12448
[None][fix] Fix bugs in WindowBlockManager destructor statistics#12448eopXD merged 1 commit intoNVIDIA:mainfrom
Conversation
613657e to
06dfda8
Compare
📝 WalkthroughWalkthroughModified the Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
/bot run |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp (1)
757-762: Mark write-once local variables asconst.Lines 757 and 760 declare
reusedUniqueBlocksPercentageandcacheHitRatewhich are initialized once and never reassigned. They should be declaredconstto improve intent and align with the coding guideline: "A variable that is not modified after its initialization should be declared asconst."Suggested refactor
- double reusedUniqueBlocksPercentage = mAllocNewBlocks == 0 + double const reusedUniqueBlocksPercentage = mAllocNewBlocks == 0 ? 0.0 : static_cast<double>(mReusedUniqueBlocks) / static_cast<double>(mAllocNewBlocks) * 100.0; - double cacheHitRate = (mReusedBlocks + mMissedBlocks) == 0 ? 0.0 + double const cacheHitRate = (mReusedBlocks + mMissedBlocks) == 0 ? 0.0 : static_cast<double>(mReusedBlocks) / (static_cast<double>(mReusedBlocks) + static_cast<double>(mMissedBlocks));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp` around lines 757 - 762, The two local variables reusedUniqueBlocksPercentage and cacheHitRate in kvCacheManager.cpp are initialized once and never mutated, so mark them const to express intent; locate their declarations (reusedUniqueBlocksPercentage and cacheHitRate) inside the method in class/namespace handling KV cache statistics (around the block computing reused vs new/missed blocks) and change their declarations from "double" to "const double" without altering the existing initialization expressions or numeric casts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp`:
- Around line 760-762: Compute the denominator using a wider integer type to
avoid 32-bit signed overflow: cast mReusedBlocks and mMissedBlocks to a 64-bit
unsigned or signed integer (e.g., std::uint64_t or std::int64_t) before summing
and use that sum for the zero check and division; then compute cacheHitRate by
dividing static_cast<double>(mReusedBlocks) by the wider-typed denominator (or
convert the denominator to double) to preserve precision. Also mark
reusedUniqueBlocksPercentage and cacheHitRate as const since they are assigned
only once. Update references to mReusedBlocks and mMissedBlocks in this
expression so the check uses the wider-typed sum rather than adding two
SizeType32 values directly.
---
Nitpick comments:
In `@cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp`:
- Around line 757-762: The two local variables reusedUniqueBlocksPercentage and
cacheHitRate in kvCacheManager.cpp are initialized once and never mutated, so
mark them const to express intent; locate their declarations
(reusedUniqueBlocksPercentage and cacheHitRate) inside the method in
class/namespace handling KV cache statistics (around the block computing reused
vs new/missed blocks) and change their declarations from "double" to "const
double" without altering the existing initialization expressions or numeric
casts.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e096c772-9a0e-47a4-b073-d63e99e1ea86
📒 Files selected for processing (1)
cpp/tensorrt_llm/batch_manager/kvCacheManager.cpp
|
PR_Github #39913 [ run ] triggered by Bot. Commit: |
|
PR_Github #39913 [ run ] completed with state
|
06dfda8 to
7f3ed29
Compare
|
/bot run --disable-fail-fast |
|
PR_Github #40035 [ run ] triggered by Bot. Commit: |
|
PR_Github #40035 [ run ] completed with state
|
|
/bot run --disable-fail-fast |
|
PR_Github #40264 [ run ] triggered by Bot. Commit: |
|
PR_Github #40264 [ run ] completed with state
|
7f3ed29 to
5a81a82
Compare
|
/bot run --disable-fail-fast |
|
PR_Github #40412 [ run ] triggered by Bot. Commit: |
|
PR_Github #40412 [ run ] completed with state
|
5a81a82 to
d9d08bc
Compare
|
/bot run --disable-fail-fast |
|
PR_Github #40706 [ run ] triggered by Bot. Commit: |
|
PR_Github #40706 [ run ] completed with state
|
|
/bot run --disable-fail-fast |
|
PR_Github #40829 [ run ] triggered by Bot. Commit: |
|
PR_Github #40829 [ run ] completed with state
|
d9d08bc to
b36b0b1
Compare
|
/bot run --disable-fail-fast |
|
PR_Github #42101 [ run ] triggered by Bot. Commit: |
|
PR_Github #42101 [ run ] completed with state
|
|
PR_Github #43215 [ run ] triggered by Bot. Commit: |
|
PR_Github #43215 [ run ] completed with state |
|
/bot run --disable-fail-fast |
|
PR_Github #43329 [ run ] triggered by Bot. Commit: |
|
PR_Github #43329 [ run ] completed with state
|
b36b0b1 to
e5ac14d
Compare
|
/bot run --disable-fail-fast |
|
PR_Github #43624 [ run ] triggered by Bot. Commit: |
|
PR_Github #43624 [ run ] completed with state
|
e5ac14d to
fdaa56e
Compare
|
/bot run --disable-fail-fast |
|
PR_Github #44325 [ run ] triggered by Bot. Commit: |
|
PR_Github #44325 [ run ] completed with state
|
|
/bot run --disable-fail-fast |
|
PR_Github #44456 [ run ] triggered by Bot. Commit: |
|
PR_Github #44456 [ run ] completed with state
|
|
/bot run --disable-fail-fast |
|
PR_Github #44602 [ run ] triggered by Bot. Commit: |
|
PR_Github #44602 [ run ] completed with state
|
…rting Fix several bugs in the KV cache statistics logged during WindowBlockManager destruction: 1. Division-by-zero when mTotalInputTokens is 0 (reused tokens %). 2. Wrong zero-guard for reusedUniqueBlocksPercentage: guarded mAllocTotalBlocks instead of the actual denominator mAllocNewBlocks. 3. Potential integer overflow in cacheHitRate denominator: the sum mReusedBlocks + mMissedBlocks was computed as int32 before cast. 4. Use double instead of float for computed ratios to avoid precision loss for large counter values (float has only 23-bit mantissa). 5. Fix format specifiers from %lu to %d to match SizeType32 (int32_t). Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>
fdaa56e to
9e2590e
Compare
|
/bot run --disable-fail-fast |
|
PR_Github #44916 [ run ] triggered by Bot. Commit: |
|
/bot run --disable-fail-fast |
|
PR_Github #45118 [ run ] triggered by Bot. Commit: |
|
PR_Github #45118 [ run ] completed with state
|
|
/bot run --disable-fail-fast |
|
PR_Github #45324 [ run ] triggered by Bot. Commit: |
|
PR_Github #45324 [ run ] completed with state |
Description
Fix several bugs in the KV cache statistics logged during
WindowBlockManagerdestruction:mTotalInputTokenscould be 0 when computing reused tokens percentagereusedUniqueBlocksPercentageguardedmAllocTotalBlocksinstead of the actual denominatormAllocNewBlockscacheHitRatedenominator sum was computed asint32_tbefore cast to floating pointfloat(23-bit mantissa) replaced withdoublefor computed ratios to avoid precision loss with large counter values%lureplaced with%dto matchSizeType32(int32_t)Test Coverage
pytest tests/unittest/PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment
/bot help