v0.7.0
What's New
New Features
- Single-pass native HS capture with prefix-cache fill and multi-step accumulator (#453)
  - Hook-based capture inside `VLLMWithUncertainty._raw_generate` (replaces the prior re-encode pass)
  - Two-phase `_fill_prefix_gaps`:
    - Phase 1 (within-group): copies missing prefix from the highest-prefill donor across requests with identical prompts
    - Phase 2 (cross-group): token-level longest-common-prefix against a global donor, which handles vLLM's automatic prefix caching across distinct prompts
  - Multi-step accumulator: correct per-request hidden states across iterative TTC strategies (beam search, MUR, phi-decoding) where capture spans multiple `generate()` calls
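
The two fill phases described above can be illustrated with a small sketch. This is not the library's `_fill_prefix_gaps` implementation. The `Captured` record, the position-keyed `states` dict, and the rule of picking the donor with the most captured positions are assumptions made for illustration. The sketch relies on the fact that, under causal attention, the hidden state at a position depends only on the tokens up to that position, so requests sharing a token prefix can share the states for it.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Captured:
    """Hypothetical per-request record: prompt tokens plus hidden states by position."""
    token_ids: list
    states: dict = field(default_factory=dict)  # position -> hidden-state vector


def common_prefix_len(a, b):
    """Number of leading tokens that two token sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def copy_missing(dst, donor, limit):
    """Copy donor states for positions below `limit` that dst lacks."""
    for pos in range(limit):
        if pos not in dst.states and pos in donor.states:
            dst.states[pos] = donor.states[pos]


def fill_prefix_gaps(requests):
    """Illustrative two-phase fill; mutates the requests in place."""
    if not requests:
        return
    # Phase 1 (within-group): requests with identical prompts share one donor,
    # chosen here as the request with the most captured positions.
    groups = defaultdict(list)
    for req in requests:
        groups[tuple(req.token_ids)].append(req)
    for group in groups.values():
        donor = max(group, key=lambda r: len(r.states))
        for req in group:
            if req is not donor:
                copy_missing(req, donor, len(req.token_ids))
    # Phase 2 (cross-group): fill remaining gaps from a single global donor,
    # limited to the token-level longest common prefix with that donor.
    global_donor = max(requests, key=lambda r: len(r.states))
    for req in requests:
        if req is not global_donor:
            shared = common_prefix_len(req.token_ids, global_donor.token_ids)
            copy_missing(req, global_donor, shared)
```

In this sketch, a request whose first three positions were served from the prefix cache receives them from an identical-prompt donor in Phase 1, while a request with a different prompt only receives the positions inside its common prefix with the global donor in Phase 2.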
Compatibility
- Downstream packages using `VLLMWithUncertainty` for hidden-state attribution should pin `lm-polygraph>=0.7.0`. The HS-capture contract has expanded to include per-request attribution and prefix-cache-aware filling.
- ThinkBooster's `hook_hs_extension v2` requires this release.
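
For downstream code that wants to fail early rather than rely only on the dependency pin, a runtime guard along these lines may help. It is a minimal sketch: `parse_release`, `meets_minimum`, and `installed_ok` are hypothetical helpers, not part of lm-polygraph, and the parser deliberately ignores pre-release suffixes.

```python
import re
from importlib import metadata

MINIMUM = (0, 7, 0)  # matches the lm-polygraph>=0.7.0 pin recommended above


def parse_release(version):
    """Leading numeric components of a version string, e.g. '0.7.0rc1' -> (0, 7, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version, minimum=MINIMUM):
    """True if `version` is at least `minimum` (pre-release suffixes are ignored)."""
    parts = parse_release(version)
    width = max(len(parts), len(minimum))
    # Pad with zeros so that '0.7' compares equal to (0, 7, 0).
    padded_parts = parts + (0,) * (width - len(parts))
    padded_min = tuple(minimum) + (0,) * (width - len(minimum))
    return padded_parts >= padded_min


def installed_ok(dist="lm-polygraph"):
    """True if the named distribution is installed and new enough."""
    try:
        return meets_minimum(metadata.version(dist))
    except metadata.PackageNotFoundError:
        return False
```

A package could call `installed_ok()` at import time and raise a clear error when it returns `False`.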
Full Changelog: v0.6.0...v0.7.0