LHA 1.0.6 is a patch release focused on making experimental Input Slimming safer for real long-running sessions, while improving compaction reliability, TUI rendering correctness, and CLI-backed agent consistency. This release also publishes lha-llm 1.0.1 with improved context-window error handling.
Highlights
- Adds a new experimental live-zone Input Slimming strategy, input_slimming_live_zone, which slims current tool outputs while preserving cached history prefixes.
- Splits Input Slimming into clear historical and live-zone modes, keeps them mutually exclusive in /experimental, and reports scoped savings as slim hist or slim live.
- Strengthens retrieval safety by protecting recent live outputs, keeping the lha_input_retrieve tool schema stable for live-zone follow-ups, and preserving retrieval for existing markers in conflict mode.
- Improves auto-compaction decisions by accounting for raw history pressure even when the slimmed request is smaller, preventing Input Slimming from hiding compaction needs.
- Makes remote compaction more resilient by trimming older thread items and retrying when the compaction prompt exceeds the model context window.
- Fixes streamed-answer TUI repainting so finalized responses cannot leave stale or visually reordered terminal cells, especially in mixed CJK/ASCII output.
- Shows the active provider in the TUI sidebar alongside the model, making custom-provider sessions easier to inspect.
- Ensures CLI-backed review and delegated agent jobs inherit the parent turn’s reasoning effort setting.
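
To illustrate the auto-compaction change in the list above, here is a minimal sketch. It is not LHA's implementation: the function name, the token-count parameters, and the 0.8 threshold are all assumptions made for this example. It only shows the idea that the decision should look at raw history size, not just the smaller slimmed request.

```python
def should_compact(raw_history_tokens: int,
                   slimmed_request_tokens: int,
                   context_window: int,
                   threshold: float = 0.8) -> bool:
    """Hypothetical check: decide whether to compact.

    The names and the 0.8 default are illustrative assumptions,
    not values taken from LHA.
    """
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    # Use the larger of the two sizes so that slimming savings
    # cannot hide pressure from the raw history.
    pressure = max(raw_history_tokens, slimmed_request_tokens) / context_window
    return pressure >= threshold
```

With a 100,000-token window, a raw history of 90,000 tokens triggers compaction in this sketch even if the slimmed request is only 40,000 tokens.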
Why It Matters
Input Slimming is still experimental and default-off, but 1.0.6 makes it more practical to evaluate in large-tool-output workflows. Historical slimming remains focused on reducing old tool-output pressure, while the new live-zone strategy targets current tool outputs without rewriting persistent history. Both paths keep originals recoverable through lha_input_retrieve, and the TUI now makes it clearer which strategy produced the savings.
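
The live-zone idea can be sketched roughly as follows. Everything here is hypothetical: `SlimStore`, `slim_live_zone`, the marker format, and the 200-character limit are invented for illustration and do not describe how lha_input_retrieve or the real strategy works internally. The sketch only shows the two properties described above: the history prefix is passed through unchanged, and slimmed originals stay recoverable.

```python
from dataclasses import dataclass, field


@dataclass
class SlimStore:
    """Hypothetical store that keeps originals of slimmed outputs."""
    originals: dict[str, str] = field(default_factory=dict)

    def slim(self, output: str, max_chars: int = 200) -> str:
        # Short outputs are left alone; there is nothing to save.
        if len(output) <= max_chars:
            return output
        key = f"slim-{len(self.originals) + 1}"
        self.originals[key] = output
        # Keep a preview plus a marker that names the stored original.
        return f"{output[:max_chars]}\n[slimmed: {key}, {len(output)} chars total]"

    def retrieve(self, key: str) -> str:
        # Stand-in for a retrieval tool: return the untouched original.
        return self.originals[key]


def slim_live_zone(history: list[str],
                   live_outputs: list[str],
                   store: SlimStore) -> list[str]:
    # The history prefix is copied as-is so a cached prefix still matches;
    # only the current (live) tool outputs are slimmed.
    return list(history) + [store.slim(out) for out in live_outputs]
```

In this sketch, a 500-character tool output in the live zone is replaced by a short preview and a marker, while `store.retrieve("slim-1")` returns the full original text.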
This release also closes several reliability gaps that show up in long sessions: compaction now considers the raw history that would actually be summarized, remote compaction can recover from context-window failures by dropping the oldest items, and provider context_length_exceeded responses are mapped to a dedicated context-window error. Together, these changes help LHA keep moving instead of letting token-saving optimizations mask the need to compact.
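
The recovery behavior described above can be sketched as a trim-and-retry loop. This is a simplified illustration under stated assumptions, not the actual LHA or lha-llm code: `ContextWindowExceeded`, `map_provider_error`, `compact_with_retry`, and the `drop_per_retry` parameter are hypothetical names. Only the context_length_exceeded code comes from the release notes.

```python
from typing import Callable, Sequence


class ContextWindowExceeded(Exception):
    """Hypothetical dedicated error for context-window overflows."""


def map_provider_error(code: str) -> Exception:
    # Map the provider's context_length_exceeded code to the dedicated
    # error; any other code falls back to a generic error in this sketch.
    if code == "context_length_exceeded":
        return ContextWindowExceeded(code)
    return RuntimeError(code)


def compact_with_retry(items: Sequence[str],
                       compact: Callable[[list[str]], str],
                       drop_per_retry: int = 1) -> str:
    """Call `compact`, dropping the oldest items after each overflow."""
    if drop_per_retry < 1:
        raise ValueError("drop_per_retry must be at least 1")
    remaining = list(items)
    while remaining:
        try:
            return compact(remaining)
        except ContextWindowExceeded:
            # Oldest items are at the front of the list; trim them first.
            remaining = remaining[drop_per_retry:]
    raise ContextWindowExceeded("no items left that fit the context window")
```

Here `compact` stands in for whatever call performs the remote compaction. The loop gives up with the same dedicated error once nothing is left to trim, so callers handle a single failure type.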
Finally, the TUI and job-execution fixes make day-to-day runs feel more consistent: streamed answers finalize cleanly on screen, provider configuration is visible in the sidebar, and spawned review/explorer jobs respect the same reasoning-effort choices as the parent session.