System Info
Using tensorrt-llm 1.3.0rc4, I was surprised to see a negative KV block count reported in the metrics:
prefillworker-1 | 2026-03-04T00:39:03.034016Z [warning ] Received negative values for kv blocks: kv_active_block: -2, kv_total_blocks: 7934. Setting them to 0 in published metrics. [common.publisher] file=/workspace/trtllm/common/publisher.py line=385
stats = await self._llm_engine.get_kv_cache_events_async(timeout=5)
async for stat in stats:
    request_active_slots = stat["numActiveRequests"]
    request_total_slots = stat["maxNumActiveRequests"]
    kv_active_block = stat["kvCacheStats"]["usedNumBlocks"]
    kv_total_blocks = stat["kvCacheStats"]["maxNumBlocks"]
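For reference, here is a minimal, self-contained sketch of the kind of guard the warning above seems to describe. The helper name `sanitize_kv_stats` is hypothetical, and this is not the actual code in publisher.py. It only assumes the stat dictionary has the `kvCacheStats` keys shown above, and it assumes that only the negative value is clamped to 0.

```python
import logging
from typing import Any, Mapping, Tuple

logger = logging.getLogger("kv_stats_sketch")


def sanitize_kv_stats(stat: Mapping[str, Any]) -> Tuple[int, int]:
    """Hypothetical helper: read KV block counts and clamp negatives to 0.

    Assumes `stat` has the same shape as the dictionaries read in the
    snippet above (a "kvCacheStats" entry with "usedNumBlocks" and
    "maxNumBlocks").
    """
    kv_stats = stat["kvCacheStats"]
    kv_active_block = kv_stats["usedNumBlocks"]
    kv_total_blocks = kv_stats["maxNumBlocks"]

    if kv_active_block < 0 or kv_total_blocks < 0:
        logger.warning(
            "Received negative values for kv blocks: "
            "kv_active_block: %d, kv_total_blocks: %d",
            kv_active_block,
            kv_total_blocks,
        )
        # Assumption: clamp each negative value independently.
        kv_active_block = max(kv_active_block, 0)
        kv_total_blocks = max(kv_total_blocks, 0)

    return kv_active_block, kv_total_blocks


# Values taken from the warning in this report.
example = {"kvCacheStats": {"usedNumBlocks": -2, "maxNumBlocks": 7934}}
active, total = sanitize_kv_stats(example)
```

A guard like this only hides the symptom in the published metrics. The question in this issue is why `usedNumBlocks` goes negative in the first place.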
Who can help?
I am just using a minimal example for disaggregated prefill, with prefill first.
Information
Tasks
Reproduction
Expected behavior
actual behavior
additional notes
Before submitting a new issue...