2026.05.6.2

@Art9681 released this 06 May 23:42
· 193 commits to develop since this release
cfe5793

Release Summary

This release ships the fix from PR #164, which corrects tokenizer handling and the summary flow for local providers running self-hosted OpenAI-compatible backends.

Highlights

  • Fixed the plain-text summary reducer so it uses the effective summary prompt budget instead of over-splitting work into many small serial summary calls.
  • Added a per-call summary timeout with a safe fallback that preserves prior summary state if the summary model stalls.
  • Stopped self-hosted chat completions from falling back to extra /tokenize requests when the provider already returns usage metrics.
  • Updated the example config with the new summary.callTimeoutSeconds setting.
  • Added regression coverage for single-pass summarization, summary timeout fallback behavior, and self-hosted usage handling.
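
The per-call summary timeout and its fallback can be pictured with a minimal Go sketch. This is not the code from PR #164: summarizeFunc, summarizeWithTimeout, and demoTimeout are hypothetical names chosen for illustration, and the 50 ms timeout is a placeholder for whatever summary.callTimeoutSeconds resolves to. The sketch only shows the general pattern of bounding one call with a context deadline and returning the prior summary when the call stalls or fails.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// summarizeFunc is a hypothetical stand-in for one call to the summary model.
type summarizeFunc func(ctx context.Context, input string) (string, error)

// summarizeWithTimeout bounds a single summary call by timeout. If the call
// fails or the deadline passes first, it returns the prior summary unchanged
// and reports false so the caller knows no new summary was produced.
func summarizeWithTimeout(parent context.Context, call summarizeFunc, prior, input string, timeout time.Duration) (string, bool) {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	type result struct {
		text string
		err  error
	}
	// Buffered so a call that finishes late can still send and exit.
	done := make(chan result, 1)
	go func() {
		text, err := call(ctx, input)
		done <- result{text, err}
	}()

	select {
	case r := <-done:
		if r.err != nil {
			return prior, false
		}
		return r.text, true
	case <-ctx.Done():
		return prior, false
	}
}

// demoTimeout runs one call against a fake model that either answers
// immediately or stalls until its context is cancelled.
func demoTimeout(stall bool) string {
	model := func(ctx context.Context, input string) (string, error) {
		if stall {
			<-ctx.Done()
			return "", ctx.Err()
		}
		return "summary of: " + input, nil
	}
	out, _ := summarizeWithTimeout(context.Background(), model, "prior summary", "new messages", 50*time.Millisecond)
	return out
}

func main() {
	fmt.Println(demoTimeout(false)) // fast model: the new summary is used
	fmt.Println(demoTimeout(true))  // stalled model: the prior summary is kept
}
```

Returning a boolean alongside the text lets a caller distinguish a fresh summary from a preserved one without comparing strings.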

Included Changes

  • Fix local provider tokenizer and summarize flow (#164)
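
As a rough illustration of the tokenizer side of this change, the sketch below prefers the usage figures returned with a chat completion and only falls back to a tokenizer call when they are missing. It is an assumption-laden sketch, not the project's implementation: usage, tokenCounter, promptTokens, and demoUsage are hypothetical names, and the word-count tokenizer stands in for a real /tokenize request. The field names mirror the prompt_tokens, completion_tokens, and total_tokens fields of the OpenAI-style usage object.

```go
package main

import (
	"fmt"
	"strings"
)

// usage mirrors the usage object of an OpenAI-compatible chat completion.
type usage struct {
	PromptTokens     int
	CompletionTokens int
	TotalTokens      int
}

// tokenCounter is a hypothetical hook for a provider's /tokenize endpoint.
type tokenCounter func(text string) (int, error)

// promptTokens returns the provider-reported prompt token count when it is
// available and calls the tokenizer only when the response carried no usage.
func promptTokens(u *usage, prompt string, tokenize tokenCounter) (int, error) {
	if u != nil && u.PromptTokens > 0 {
		return u.PromptTokens, nil
	}
	return tokenize(prompt)
}

// demoUsage reports the token count and how many tokenizer calls were made.
func demoUsage(withUsage bool) (tokens, tokenizeCalls int) {
	calls := 0
	tokenize := func(text string) (int, error) {
		calls++
		// Crude whitespace word count, for illustration only.
		return len(strings.Fields(text)), nil
	}

	var u *usage
	if withUsage {
		u = &usage{PromptTokens: 12, CompletionTokens: 5, TotalTokens: 17}
	}
	n, _ := promptTokens(u, "hello from a local model", tokenize)
	return n, calls
}

func main() {
	n, calls := demoUsage(true)
	fmt.Println(n, calls) // usage present: no tokenizer call is made
	n, calls = demoUsage(false)
	fmt.Println(n, calls) // usage missing: one tokenizer call is made
}
```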

Validation

  • go test ./internal/agent/memory ./internal/llm/openai ./internal/agentd ./internal/config -run 'Test(BuildContextForProvider_SummaryTimeoutKeepsPriorState|SummarizeChunkUsesSinglePassWhenInputFitsPromptBudget|SelfHostedChatUsesReturnedUsageWithoutTokenizeFallback)$' -count=1