What's New in v1.3.0
Added
- Smart context management:
ObservationMaskingManager replaces SummarizingConversationManager as the default conversation manager. Based on Lindenbauer et al. 2025, "The Complexity Trap", which reports that observation masking halves per-run costs with no quality loss
- Three configurable strategies via blueclaw.yaml: mask (default; replaces old tool outputs with placeholders), summarize (legacy LLM summarization), hybrid (mask first, summarize only after N turns)
context section in blueclaw.yaml with keys strategy, mask_after (default 10), and summarize_after (hybrid only, default 43)
- Context metrics in traces:
context_masked_chars and context_strategy fields on RunTrace
trace show displays context strategy and masked char count when present
trace stats shows Context Management section: runs with masking, avg/total chars masked, strategy breakdown
- BeforeModelCallEvent hook for proactive masking within multi-tool invocations (same pattern as Strands' SlidingWindowConversationManager)
- reduce_context fallback chain: aggressive masking (M=0) first, then delegation to SummarizingConversationManager
- 35 new tests (400 total)
- scripts/bench_context.py: multi-turn benchmark runner for comparing context strategies, with delta token tracking, isolated workspaces per strategy, error recovery, and response capture. Supports the --strategy, --model, --mask-after, and --output flags
- Benchmark prompt files for 3 workload categories:
search-small (small outputs), retrieval-large (full page fetches), mixed-workflow (search + fetch)
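For readers new to observation masking, the mask strategy above can be sketched roughly as follows. This is an illustration only, not the ObservationMaskingManager implementation: the Message type, the PLACEHOLDER text, and the mask_observations function are hypothetical names, and the sketch assumes that mask_after counts the most recent messages left unmasked. Under that assumption, a window of 0 masks every tool output, which is one plausible reading of the aggressive M=0 step in the reduce_context fallback chain.

```python
from dataclasses import dataclass, replace

# Hypothetical placeholder text; the real manager may use a different marker.
PLACEHOLDER = "[tool output masked]"


@dataclass(frozen=True)
class Message:
    role: str      # "user", "assistant", or "tool"
    content: str


def mask_observations(messages, mask_after=10):
    """Replace tool outputs older than the last `mask_after` messages.

    Returns the new message list and the number of characters masked.
    Assumption: each message counts as one turn.
    """
    cutoff = max(len(messages) - mask_after, 0)
    masked_chars = 0
    result = []
    for index, message in enumerate(messages):
        is_old_tool_output = index < cutoff and message.role == "tool"
        if is_old_tool_output and message.content != PLACEHOLDER:
            masked_chars += len(message.content)
            message = replace(message, content=PLACEHOLDER)
        result.append(message)
    return result, masked_chars
```

Only tool outputs are replaced; user and assistant messages stay intact, so the agent's reasoning history survives while bulky observations are dropped. The returned character count corresponds loosely to what a field such as context_masked_chars would record.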
Changed
- Default conversation manager switched from SummarizingConversationManager to ObservationMaskingManager; the previous behavior remains available via context.strategy: summarize
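To show how the documented defaults combine with an explicit override such as context.strategy: summarize, here is a minimal sketch of resolving the context section into settings. ContextSettings and load_context_settings are hypothetical names, not part of blueclaw's API; only the key names and default values (strategy mask, mask_after 10, summarize_after 43) come from the notes above. The section is passed as an already-parsed dict, so YAML loading is out of scope.

```python
from dataclasses import dataclass

VALID_STRATEGIES = ("mask", "summarize", "hybrid")


@dataclass(frozen=True)
class ContextSettings:
    # Defaults taken from the v1.3.0 notes.
    strategy: str = "mask"
    mask_after: int = 10
    summarize_after: int = 43   # only meaningful for the hybrid strategy


def load_context_settings(section=None):
    """Build settings from a parsed `context` mapping, filling in defaults."""
    settings = ContextSettings(**dict(section or {}))
    if settings.strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown context strategy: {settings.strategy!r}")
    if settings.mask_after < 0 or settings.summarize_after < 0:
        raise ValueError("mask_after and summarize_after must be non-negative")
    return settings
```

For example, load_context_settings({"strategy": "summarize"}) keeps the other defaults and selects the legacy behavior, while an omitted section yields the new mask default.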
Installation
pip install blueclaw==1.3.0
Links