Context Compression
Refact compresses chat history in staged passes, preserving important markers while reducing token pressure.
The chat-history crate implements a four-stage compression pipeline in history_limit.rs:
- Deduplicate context files
- Compress tool results
- Fix tool calls
- Limit history
This pipeline is the main history reduction path before messages are sent onward.
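The four stages above can be sketched as a chain of functions over a message list. This is only an illustration under stated assumptions: the `Msg` type, the role strings, the function names, and the specific rules (keep the first copy of a context file, truncate by character count, drop tool results with no matching assistant call, keep the most recent messages) are hypothetical and are not taken from history_limit.rs.

```rust
use std::collections::HashSet;

// Hypothetical message type; the real crate's types are not shown on this page.
#[derive(Debug, Clone, PartialEq)]
pub struct Msg {
    pub role: String,
    pub content: String,
    pub tool_call_id: Option<String>,
}

pub fn msg(role: &str, content: &str, tool_call_id: Option<&str>) -> Msg {
    Msg {
        role: role.to_string(),
        content: content.to_string(),
        tool_call_id: tool_call_id.map(str::to_string),
    }
}

// Stage 1: drop context_file messages whose content was already seen.
fn dedup_context_files(history: Vec<Msg>) -> Vec<Msg> {
    let mut seen = HashSet::new();
    history
        .into_iter()
        .filter(|m| m.role != "context_file" || seen.insert(m.content.clone()))
        .collect()
}

// Stage 2: truncate tool results longer than `max_chars` characters.
fn compress_tool_results(history: Vec<Msg>, max_chars: usize) -> Vec<Msg> {
    history
        .into_iter()
        .map(|mut m| {
            if m.role == "tool" && m.content.chars().count() > max_chars {
                let head: String = m.content.chars().take(max_chars).collect();
                m.content = format!("{} [truncated]", head);
            }
            m
        })
        .collect()
}

// Stage 3: drop tool results that no assistant tool call refers to.
fn fix_tool_calls(history: Vec<Msg>) -> Vec<Msg> {
    let call_ids: HashSet<String> = history
        .iter()
        .filter(|m| m.role == "assistant")
        .filter_map(|m| m.tool_call_id.clone())
        .collect();
    history
        .into_iter()
        .filter(|m| {
            m.role != "tool"
                || m.tool_call_id.as_ref().map_or(false, |id| call_ids.contains(id))
        })
        .collect()
}

// Stage 4: keep only the most recent `max_messages` messages.
fn limit_history(history: Vec<Msg>, max_messages: usize) -> Vec<Msg> {
    let skip = history.len().saturating_sub(max_messages);
    history.into_iter().skip(skip).collect()
}

// Run the stages in the order listed above.
pub fn compress_history(history: Vec<Msg>, max_tool_chars: usize, max_messages: usize) -> Vec<Msg> {
    let history = dedup_context_files(history);
    let history = compress_tool_results(history, max_tool_chars);
    let history = fix_tool_calls(history);
    limit_history(history, max_messages)
}
```

Keeping each stage as a pure `Vec<Msg> -> Vec<Msg>` function makes the order explicit and lets each pass be tested on its own.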
CompressionStrength is the coarse control enum used by the history limiter:
- Absent
- Low
- Medium
- High
These modes help decide how aggressively the history can be reduced when context pressure rises.
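As a rough sketch of how such an enum could drive the limiter, the example below maps context pressure to a strength level. The variant names come from this page; the `escalate` and `for_pressure` helpers and the 70/85/100 percent thresholds are assumptions made for illustration, not the crate's actual logic.

```rust
// Variant names follow this page; derive order gives Absent < Low < Medium < High.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum CompressionStrength {
    Absent,
    Low,
    Medium,
    High,
}

impl CompressionStrength {
    // Hypothetical helper: step up one level, saturating at High.
    pub fn escalate(self) -> Self {
        match self {
            Self::Absent => Self::Low,
            Self::Low => Self::Medium,
            Self::Medium | Self::High => Self::High,
        }
    }

    // Hypothetical helper: pick a level from how full the context window is.
    // The thresholds are illustrative assumptions.
    pub fn for_pressure(used_tokens: usize, limit_tokens: usize) -> Self {
        if limit_tokens == 0 {
            return Self::High;
        }
        let percent = used_tokens.saturating_mul(100) / limit_tokens;
        if percent <= 70 {
            Self::Absent
        } else if percent <= 85 {
            Self::Low
        } else if percent <= 100 {
            Self::Medium
        } else {
            Self::High
        }
    }
}
```

A caller could start from `for_pressure` and call `escalate` if the history still does not fit after a pass.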
A visible compression_report role is used for deterministic/reactive compression reporting.
The report carries extra schema data in extra.compression_report with fields such as:
- kind: "chat_compression_report"
- context_files_removed
- context_messages_dropped
- tool_results_truncated
- tokens_before
- tokens_after
- estimated_tokens_saved
- reduction_percent
The report is meant to be preserved across model-switch/new-thread sanitization and inserted at a stable boundary.
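To make the report fields concrete, here is a minimal sketch of a struct that carries them. The field names follow the list above. The Rust types, the `new` constructor, and the way `estimated_tokens_saved` and `reduction_percent` are derived from the token counts are assumptions, since this page does not say how the real report computes them.

```rust
// Field names follow the page; types and derivations are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct CompressionReport {
    pub kind: &'static str,
    pub context_files_removed: usize,
    pub context_messages_dropped: usize,
    pub tool_results_truncated: usize,
    pub tokens_before: usize,
    pub tokens_after: usize,
    pub estimated_tokens_saved: usize,
    pub reduction_percent: f64,
}

impl CompressionReport {
    // Hypothetical constructor: derives the savings fields from the token counts.
    pub fn new(
        context_files_removed: usize,
        context_messages_dropped: usize,
        tool_results_truncated: usize,
        tokens_before: usize,
        tokens_after: usize,
    ) -> Self {
        let saved = tokens_before.saturating_sub(tokens_after);
        // Avoid dividing by zero for an empty history.
        let reduction_percent = if tokens_before == 0 {
            0.0
        } else {
            saved as f64 * 100.0 / tokens_before as f64
        };
        Self {
            kind: "chat_compression_report",
            context_files_removed,
            context_messages_dropped,
            tool_results_truncated,
            tokens_before,
            tokens_after,
            estimated_tokens_saved: saved,
            reduction_percent,
        }
    }
}
```

For example, under these assumptions a history reduced from 8000 to 6000 tokens would report 2000 tokens saved and a 25 percent reduction.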
The runtime tracks compression state on both chat session and runtime snapshots via:
- is_compressing
- compression_phase
- compression_reason
The prompt’s runtime lifecycle is:
- checking
- running
- applied
- skipped
- failed
These fields are surfaced through runtime updates so the UI and consumers can tell whether compression is in progress and why.
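The state fields and lifecycle values above could be modeled roughly as follows. The field names and phase strings come from this page. The `CompressionPhase` and `CompressionState` type names, the `enter` method, and the rule that `is_compressing` is true only during `checking` and `running` are assumptions made for this sketch.

```rust
// Phase names follow the lifecycle values listed on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompressionPhase {
    Checking,
    Running,
    Applied,
    Skipped,
    Failed,
}

impl CompressionPhase {
    pub fn as_str(self) -> &'static str {
        match self {
            Self::Checking => "checking",
            Self::Running => "running",
            Self::Applied => "applied",
            Self::Skipped => "skipped",
            Self::Failed => "failed",
        }
    }

    // Assumption: applied, skipped, and failed end the lifecycle.
    pub fn is_terminal(self) -> bool {
        matches!(self, Self::Applied | Self::Skipped | Self::Failed)
    }
}

// Hypothetical container for the three fields named on this page.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct CompressionState {
    pub is_compressing: bool,
    pub compression_phase: Option<CompressionPhase>,
    pub compression_reason: Option<String>,
}

impl CompressionState {
    // Hypothetical transition: record the phase and reason, and keep
    // is_compressing true only while the phase is non-terminal.
    pub fn enter(&mut self, phase: CompressionPhase, reason: &str) {
        self.is_compressing = !phase.is_terminal();
        self.compression_phase = Some(phase);
        self.compression_reason = Some(reason.to_string());
    }
}
```

Deriving `is_compressing` from the phase in one place keeps the three fields from drifting out of sync when a UI reads them.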
When the system summarizes portions of history, the extra metadata uses:
extra.compression.kind = "llm_segment_summary"
That marks summaries as a distinct compression artifact rather than ordinary chat text.
Compression previews redact secrets before exposing them in generated summaries or UI previews. This keeps token-saving previews useful without leaking sensitive contents.
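One simple way to picture preview redaction is a pass that masks the value of any `key=value` token whose key looks sensitive. This page does not describe the actual redaction rules, so the `redact_secrets` function, the key list, and the `[REDACTED]` placeholder below are all hypothetical.

```rust
// Hypothetical list of key fragments treated as sensitive.
const SENSITIVE_KEYS: [&str; 4] = ["password", "secret", "token", "api_key"];

// Mask the value of space-separated `key=value` tokens with a sensitive key.
// Tokens without '=' or with a non-sensitive key pass through unchanged.
pub fn redact_secrets(preview: &str) -> String {
    preview
        .split(' ')
        .map(|word| match word.split_once('=') {
            Some((key, _))
                if SENSITIVE_KEYS
                    .iter()
                    .any(|k| key.to_lowercase().contains(*k)) =>
            {
                format!("{}=[REDACTED]", key)
            }
            _ => word.to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

A real implementation would likely need broader detection (for example, secrets that are not in `key=value` form); this sketch only shows where a redaction pass sits between the raw content and the preview.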
Compression has to respect special internal message roles, especially plans and plan deltas. See Hidden Roles and Plans for the hidden-role contract, and Memory and Knowledge for the broader project-state picture.
flowchart LR
A[history_limit] --> B[dedup context files]
B --> C[compress tool results]
C --> D[fix tool calls]
D --> E[limit history]
E --> F[compression_report]
See also Chat System.
Refact on GitHub: https://github.com/JegernOUTT/refact