Release Async Context Compression v1.7.0 · Fu-Jie/openwebui-extensions

Overview

This patch release makes summary generation failures non-blocking by default so transient upstream errors no longer interrupt the active chat. It also adds an explicit operator valve to preserve the old hard-failure behavior when needed.

Bug Fixes

Graceful summary failure handling (Issue #74): Background summary LLM failures now default to a silent skip instead of re-raising into the active chat flow.
Chat continuity preserved during transient upstream errors: When a short-lived upstream provider failure such as a 502 occurs, the filter now logs the summary error and continues the current chat without saving a summary for that turn.

New Features

summary_fail_mode valve: Added a new valve with silent and raise modes so operators can choose between chat-friendly degradation and strict failure visibility.
Regression coverage for both modes: Added tests for the default silent path and the opt-in raise path.
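
The two modes can be pictured with a minimal sketch. It is not the filter's actual code: run_summary_step and generate_summary are hypothetical names, and the real summary call is asynchronous, whereas this version is synchronous for brevity.

```python
import logging

logger = logging.getLogger("summary_sketch")

VALID_FAIL_MODES = ("silent", "raise")


def run_summary_step(generate_summary, summary_fail_mode="silent"):
    """Run a summary call and apply the configured failure mode.

    Returns the summary text, or None when the call failed in silent mode.
    `generate_summary` is a hypothetical zero-argument callable.
    """
    if summary_fail_mode not in VALID_FAIL_MODES:
        raise ValueError(f"unknown summary_fail_mode: {summary_fail_mode!r}")
    try:
        return generate_summary()
    except Exception as exc:
        # Both modes log the failure; only "raise" propagates it.
        logger.warning("summary generation failed: %s", exc)
        if summary_fail_mode == "raise":
            raise
        return None  # silent: skip saving a summary for this turn
```

With the default mode a failing call returns None and the chat continues; with summary_fail_mode="raise" the original exception reaches the caller.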

Migration Notes

No breaking changes. Default behavior is now summary_fail_mode="silent". Set summary_fail_mode="raise" if you need the previous hard-failure behavior for debugging.

Overview

This patch release broadens summary-response parsing so the filter can accept both classic chat-completions payloads and Responses-style output payloads. It also improves empty-summary diagnostics without persisting reasoning-only fields.

Bug Fixes

Alternate summary payload support: _call_summary_llm() now accepts summary text from choices[].message.content, output_text content parts, and Responses-style output message items.
Stale choices-only gate removed: The summary call path no longer rejects valid provider payloads just because they omit choices.
Clearer empty-summary errors: When no final summary text is present, the filter now reports a compact response-shape summary instead of a misleading generic format error.
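
As a rough illustration of the broadened parsing, the sketch below pulls final text from either payload shape. It is an assumption-laden approximation, not the body of _call_summary_llm(): extract_summary_text and _text_from_content are hypothetical helpers, and accepting parts of type "text" alongside "output_text" is also an assumption.

```python
def _text_from_content(content):
    """Collect final-answer text from a string or a list of content parts."""
    if isinstance(content, str):
        return content.strip()
    if isinstance(content, list):
        pieces = []
        for part in content:
            if isinstance(part, str):
                pieces.append(part)
            elif isinstance(part, dict) and part.get("type") in ("text", "output_text"):
                pieces.append(str(part.get("text") or ""))
        return "".join(pieces).strip()
    return ""


def extract_summary_text(response):
    """Return final summary text from a provider response, or "" if none."""
    if not isinstance(response, dict):
        return ""
    # Classic chat-completions shape: choices[].message.content
    for choice in response.get("choices") or []:
        message = choice.get("message") if isinstance(choice, dict) else None
        if isinstance(message, dict):
            text = _text_from_content(message.get("content"))
            if text:
                return text
    # Responses-style shape: output[] items of type "message".
    # Items of other types (for example "reasoning") are skipped.
    for item in response.get("output") or []:
        if isinstance(item, dict) and item.get("type") == "message":
            text = _text_from_content(item.get("content"))
            if text:
                return text
    return ""
```

Fields such as reasoning_content or thinking are never read in this sketch, so a reasoning-only response yields an empty string.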

Behavior Notes

Reasoning-only output is ignored: reasoning_content, thinking, and reasoning output items are not treated as summary text, so private chain-of-thought is not written into chat memory.
No change to 1.6.3 fail-mode behavior: summary_fail_mode continues to control whether upstream summary-call errors are silent or raised.

Migration Notes

No breaking changes. If a provider returns only reasoning fields and no final answer text, the filter will skip saving a summary for that turn and log the response shape for debugging.
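
One way to log a response shape without leaking its contents is to record only keys and value types. The helper below is a hypothetical sketch of that idea (describe_shape is not a name from the plugin, and the real log format may differ).

```python
def describe_shape(value, depth=2):
    """Describe the structure of a JSON-like value without including its data."""
    if isinstance(value, dict):
        if depth == 0:
            return "dict"
        inner = ", ".join(
            f"{key}: {describe_shape(value[key], depth - 1)}"
            for key in sorted(value, key=str)
        )
        return "{" + inner + "}"
    if isinstance(value, list):
        if not value or depth == 0:
            return f"list[{len(value)}]"
        # Only the first element is described, to keep the output compact.
        return f"list[{len(value)}] of {describe_shape(value[0], depth - 1)}"
    return type(value).__name__
```

Because string values are reduced to the type name str, reasoning text never appears in the logged shape.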

Overview

This release adds branch-aware summary storage and reuse. Cached summaries are now validated against ordered message references and payload fingerprints before they are injected, so summaries from sibling branches or edited history are rejected instead of being reused in the wrong conversation branch.

Because the persisted summary schema changed from count-only/current-summary storage to branch-aware rows, this release is versioned as 1.7.0.

New Features

Branch-aware summary reuse: Stored summaries carry message ids and fingerprints. The filter reuses only the newest summary that is valid for the current branch.
Single-table summary storage: Branch-aware rows live directly in chat_summary, so the summaries for every branch are stored in that one table.
Safer schema upgrades: Count-only legacy chat_summary tables, and tables that still enforce one row per chat, are rebuilt; their old summaries are discarded and regenerated rather than reused unsafely.

Branch Example

OpenWebUI chats can form a tree when a user edits or forks from an earlier message. Version 1.7.0 treats a summary as valid only for the branch whose ordered message ids and payload fingerprints it covers.

Example with a fork that does not align with a compression boundary:

A chat first grows on the main branch: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9 -> 10.
The first compression stores a branch-aware summary covering messages 1-5.
More messages arrive on the same branch, and a later compression derives a new summary covering 1-10 by summarizing the previous 1-5 summary plus messages 6-10.
The user then forks from message 7, which sits between the two compression boundaries, and creates a sibling branch: 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8b -> 9b.
On that sibling branch, the 1-10 summary is rejected because it contains live sibling refs 8-10 from the original branch. The filter can still reuse the nearest valid ancestor summary, 1-5, then keep 6 -> 7 -> 8b -> 9b as live tail context.
When the sibling branch is compressed, it derives and stores a separate summary such as 1-9b from the old 1-5 summary plus the sibling branch's live tail. Both 1-10 and 1-9b remain available, but only the newest summary valid for the current branch is injected.

This is why the filter records exact message identity instead of only storing a compressed message count. A count-based summary for "10 messages" cannot tell whether those 10 messages belong to the current branch or a sibling branch.
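
The branch check can be sketched as a prefix comparison over ordered (message id, fingerprint) pairs. Everything here is an assumption for illustration: the SHA-256 fingerprint over role and content, the row layout with "refs" and "text" keys, and the function names are hypothetical, and the filter's real validation may differ.

```python
import hashlib
import json


def fingerprint(message):
    """Hash the role and content of a message (hypothetical fingerprint scheme)."""
    payload = json.dumps(
        {"role": message.get("role"), "content": message.get("content")},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def refs_for(messages):
    """Build the ordered (id, fingerprint) references for a list of messages."""
    return [(m["id"], fingerprint(m)) for m in messages]


def is_valid_for_branch(summary_refs, branch_refs):
    """A summary is valid only if its refs are an exact prefix of the branch."""
    n = len(summary_refs)
    return 0 < n <= len(branch_refs) and list(branch_refs[:n]) == list(summary_refs)


def pick_summary(stored, branch_messages):
    """Return the newest stored summary valid for this branch, or None.

    `stored` is assumed to be ordered oldest to newest; each row is a dict
    with "refs" (covered message refs) and "text".
    """
    branch_refs = refs_for(branch_messages)
    for row in reversed(stored):
        if is_valid_for_branch(row["refs"], branch_refs):
            return row
    return None
```

Applied to the example above, the sibling branch 1 -> ... -> 7 -> 8b -> 9b fails the prefix check against the 1-10 summary but passes it against the 1-5 summary, while an edit to any of messages 1-5 changes a fingerprint and invalidates both.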

Configuration Notes

max_summary_tokens must be strictly less than 80% of the summary model input window. The reserved space lets the next compression pass send the previous summary plus new messages to the summary model; if the previous summary can occupy the whole input window, repeated compression cannot make meaningful progress. Invalid valve settings raise a configuration error instead of being silently lowered.
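
A minimal sketch of the 80% rule follows. The function name, the summary_input_window parameter, and the use of ValueError are assumptions; the plugin's actual error type and valve names for the input window are not stated here.

```python
def validate_max_summary_tokens(max_summary_tokens, summary_input_window):
    """Reject settings where the summary could fill 80% or more of the window.

    Integer arithmetic (x * 5 vs. window * 4) avoids float rounding at the
    exact 80% boundary.
    """
    if max_summary_tokens <= 0 or summary_input_window <= 0:
        raise ValueError("token limits must be positive")
    if max_summary_tokens * 5 >= summary_input_window * 4:
        raise ValueError(
            f"max_summary_tokens={max_summary_tokens} must be strictly less "
            f"than 80% of the summary input window ({summary_input_window})"
        )
    return max_summary_tokens
```

For a 10,000-token input window, 7,999 is accepted and 8,000 is rejected; the value is never silently lowered.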

Migration Notes

This release changes the summary database schema:

Older count-only chat_summary tables are dropped and recreated; those old summaries are discarded and regenerated on future compression.
If schema inspection fails, the plugin leaves existing tables untouched and disables summary persistence instead of running destructive DDL.
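
The inspect-before-DDL rule can be illustrated with SQLite from the Python standard library. This is only a sketch: the plugin's real database layer and column names are not shown in these notes, so REQUIRED_COLUMNS and plan_schema_action are hypothetical, and the one-row-per-chat constraint check is omitted.

```python
import sqlite3

# Hypothetical column set for a branch-aware chat_summary table.
REQUIRED_COLUMNS = {"chat_id", "message_refs", "summary"}


def plan_schema_action(conn):
    """Decide what to do with chat_summary without running any DDL.

    Returns "create", "keep", "rebuild", or "disable_persistence".
    """
    try:
        rows = conn.execute("PRAGMA table_info(chat_summary)").fetchall()
    except sqlite3.Error:
        # Inspection failed: leave tables untouched and turn persistence off.
        return "disable_persistence"
    if not rows:
        return "create"  # table does not exist yet
    columns = {row[1] for row in rows}  # row[1] is the column name
    if REQUIRED_COLUMNS <= columns:
        return "keep"
    return "rebuild"  # legacy layout: drop, recreate, regenerate summaries
```

The point of the design is that destructive steps run only after a successful inspection; any inspection error maps to the non-destructive outcome.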

Update or reinstall the filter so OpenWebUI's stored function content includes the new schema logic and validation changes.

Version Changes

Plugin Updates

Async Context Compression: v1.6.5 → v1.7.0

New Contributors

@NexZhu contributed in #94

Full Changelog: async-context-compression-v1.6.5...async-context-compression-v1.7.0
