2026.05.6.2
Release Summary
This release ships the fix from PR #164: local provider tokenizer and summary flow corrections for self-hosted OpenAI-compatible backends.
Highlights
- Fixed the plain-text summary reducer so it uses the effective summary prompt budget instead of over-splitting work into many small serial summary calls.
- Added a per-call summary timeout with a safe fallback that preserves prior summary state if the summary model stalls.
- Stopped self-hosted chat completions from falling back to extra `/tokenize` requests when the provider already returns `usage` metrics.
- Updated the example config with the new `summary.callTimeoutSeconds` setting.
- Added regression coverage for single-pass summarization, summary timeout fallback behavior, and self-hosted usage handling.
Included Changes
- Fix local provider tokenizer and summarize flow (#164)
Validation
go test ./internal/agent/memory ./internal/llm/openai ./internal/agentd ./internal/config -run 'Test(BuildContextForProvider_SummaryTimeoutKeepsPriorState|SummarizeChunkUsesSinglePassWhenInputFitsPromptBudget|SelfHostedChatUsesReturnedUsageWithoutTokenizeFallback)$' -count=1