2026.05.6.2

Art9681 released this 06 May 23:42

· 193 commits to develop since this release

cfe5793

Release Summary

This release ships the fix from PR #164: local provider tokenizer and summary flow corrections for self-hosted OpenAI-compatible backends.

Highlights

Fixed the plain-text summary reducer so it uses the effective summary prompt budget instead of over-splitting work into many small serial summary calls.
Added a per-call summary timeout with a safe fallback that preserves prior summary state if the summary model stalls.
Stopped self-hosted chat completions from falling back to extra /tokenize requests when the provider already returns usage metrics.
Updated the example config with the new summary.callTimeoutSeconds setting.
Added regression coverage for single-pass summarization, summary timeout fallback behavior, and self-hosted usage handling.

Included Changes

Fix local provider tokenizer and summarize flow (#164)

Validation

go test ./internal/agent/memory ./internal/llm/openai ./internal/agentd ./internal/config -run 'Test(BuildContextForProvider_SummaryTimeoutKeepsPriorState|SummarizeChunkUsesSinglePassWhenInputFitsPromptBudget|SelfHostedChatUsesReturnedUsageWithoutTokenizeFallback)$' -count=1

Assets 2