Problem
Long sessions accumulate tokens in the chat history until the model's
context window fills up, at which point quality degrades or responses are cut off.
Proposed behaviour — /compact
/compact
summarising conversation…
✓ compressed 42 messages → 1 summary (saved ~3.2k tokens)
The command:
- Sends the current chat history to the model asking it to produce
a compact summary of what has been discussed and decided
- Replaces Agent._chat with a new Chat containing only the
system prompt + one assistant message with the summary
- Prints how many messages were compressed and estimated token savings
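The three steps above could be sketched roughly as follows. This is only an illustration: Message and Chat here are hypothetical stand-ins for the project's own classes, summarise is an assumed callable that wraps the model request, and estimate_tokens is a crude characters-per-token heuristic rather than a real tokenizer.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


@dataclass
class Chat:
    # Hypothetical stand-in for the project's Chat class.
    messages: list[Message] = field(default_factory=list)


def estimate_tokens(text: str) -> int:
    # Rough heuristic (~4 characters per token); an assumption, not a tokenizer.
    return max(1, len(text) // 4)


def compact(chat: Chat, summarise: Callable[[str], str]) -> tuple[Chat, int, int]:
    """Return (new_chat, messages_compressed, estimated_tokens_saved)."""
    system = [m for m in chat.messages if m.role == "system"]
    history = [m for m in chat.messages if m.role != "system"]

    # Step 1: ask the model (via the assumed `summarise` callable) for a summary.
    transcript = "\n".join(f"{m.role}: {m.content}" for m in history)
    summary = summarise(transcript)

    # Step 2: keep only the system prompt plus one assistant summary message.
    new_chat = Chat(system + [Message("assistant", summary)])

    # Step 3: figures for the status line.
    before = sum(estimate_tokens(m.content) for m in history)
    after = estimate_tokens(summary)
    return new_chat, len(history), max(0, before - after)
```

A caller would assign the returned chat back to Agent._chat and print the two counts in the format shown in the example output.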
UI for context fullness (separate sub-feature)
Show context usage next to the stats line after each response:
↑ 1.2k ↓ 384 · 45 tok/s · 2.3s [████░░░░░░ 38% ctx]
Implementation note
Token counting for the context bar uses the accumulated
_session_prompt_tok + _session_pred_tok from issue #29 divided
by the model's get_context_length() (from issue #2 implementation).
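A minimal sketch of how the bar could be rendered from those counters. The function name format_ctx_bar, the ten-cell width, and the clamping to 100% are assumptions made for illustration; the counters and context length are passed in as plain integers.

```python
def format_ctx_bar(prompt_tok: int, pred_tok: int, context_length: int,
                   width: int = 10) -> str:
    """Render a usage bar such as '[████░░░░░░ 38% ctx]'."""
    if context_length <= 0:
        # Context length unknown: show nothing rather than divide by zero.
        return ""
    # Fraction of the window in use, clamped to the range 0..1.
    frac = min(1.0, max(0.0, (prompt_tok + pred_tok) / context_length))
    filled = round(frac * width)
    return f"[{'█' * filled}{'░' * (width - filled)} {round(frac * 100)}% ctx]"
```

For instance, format_ctx_bar(300, 80, 1000) yields the 38% bar shown in the stats line above.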