Context Compression

refact-planner edited this page Jun 7, 2026 · 1 revision

Refact compresses chat history in staged passes, preserving important markers while reducing token pressure.

4-stage history_limit pipeline

The chat-history crate implements a four-stage compression pipeline in history_limit.rs:

  1. Deduplicate context files
  2. Compress tool results
  3. Fix tool calls
  4. Limit history

This pipeline is the main history reduction path before messages are sent onward.
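The four stages above can be sketched as a chain of functions over the message history. This is a minimal illustration, not the crate's code: the `Message` type, the `context_file` and `tool` role names, the function names, and both limits are assumptions made for the example, and `fix_tool_calls` is left as a placeholder because this page does not describe what the repair step does.

```rust
use std::collections::HashSet;

// Hypothetical message type; the real crate's types are not shown on this page.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: String,
    content: String,
}

// Stage 1 (assumed behavior): keep only the most recent copy of identical context files.
fn dedup_context_files(history: Vec<Message>) -> Vec<Message> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut kept: Vec<Message> = Vec::new();
    // Walk newest to oldest so the latest copy is the one that survives.
    for msg in history.into_iter().rev() {
        if msg.role == "context_file" && !seen.insert(msg.content.clone()) {
            continue; // an identical, more recent copy was already kept
        }
        kept.push(msg);
    }
    kept.reverse();
    kept
}

// Stage 2 (assumed behavior): truncate long tool results to `max_chars` characters.
fn compress_tool_results(history: Vec<Message>, max_chars: usize) -> Vec<Message> {
    history
        .into_iter()
        .map(|mut msg| {
            if msg.role == "tool" && msg.content.chars().count() > max_chars {
                msg.content = msg.content.chars().take(max_chars).collect();
            }
            msg
        })
        .collect()
}

// Stage 3: placeholder only; the actual repair logic is not described here.
fn fix_tool_calls(history: Vec<Message>) -> Vec<Message> {
    history
}

// Stage 4 (assumed behavior): keep the newest `max_messages` messages.
fn limit_history(history: Vec<Message>, max_messages: usize) -> Vec<Message> {
    let skip = history.len().saturating_sub(max_messages);
    history.into_iter().skip(skip).collect()
}

// Run the stages in the documented order.
fn run_pipeline(history: Vec<Message>, max_tool_chars: usize, max_messages: usize) -> Vec<Message> {
    let history = dedup_context_files(history);
    let history = compress_tool_results(history, max_tool_chars);
    let history = fix_tool_calls(history);
    limit_history(history, max_messages)
}
```

Running cheaper, lossless steps (deduplication) before lossy ones (truncation, dropping messages) means the final limit stage has less to discard.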

Compression strength

CompressionStrength is the coarse control enum used by the history limiter:

  • Absent
  • Low
  • Medium
  • High

These modes help decide how aggressively the history can be reduced when context pressure rises.
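As a rough sketch of how such an enum might be used, the variants below match the list above, while the `strength_for_pressure` helper and its thresholds are purely illustrative assumptions and are not taken from the crate.

```rust
// Variants follow the list above; deriving Ord lets callers compare strengths.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum CompressionStrength {
    Absent,
    Low,
    Medium,
    High,
}

// Hypothetical helper: map context usage to a strength.
// The percentage thresholds are made up for this example.
fn strength_for_pressure(tokens_used: usize, token_budget: usize) -> CompressionStrength {
    if token_budget == 0 {
        return CompressionStrength::High;
    }
    let percent = tokens_used.saturating_mul(100) / token_budget;
    match percent {
        0..=69 => CompressionStrength::Absent,
        70..=84 => CompressionStrength::Low,
        85..=99 => CompressionStrength::Medium,
        _ => CompressionStrength::High,
    }
}
```

Because the variants are declared from weakest to strongest, the derived ordering gives `Absent < Low < Medium < High`.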

Compression report

Deterministic/reactive compression is reported through a visible message role, compression_report.

The report carries extra schema data in extra.compression_report with fields such as:

  • kind: "chat_compression_report"
  • context_files_removed
  • context_messages_dropped
  • tool_results_truncated
  • tokens_before
  • tokens_after
  • estimated_tokens_saved
  • reduction_percent

The report is meant to be preserved across model-switch/new-thread sanitization and inserted at a stable boundary.
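The fields listed above could be modeled roughly as follows. The field names come from the list; the field types, the constructor, and the way `estimated_tokens_saved` and `reduction_percent` are derived from the before/after counts are assumptions for illustration.

```rust
// Field names follow the list above; the types are assumed.
#[derive(Debug, Clone, PartialEq)]
struct CompressionReport {
    kind: &'static str,
    context_files_removed: usize,
    context_messages_dropped: usize,
    tool_results_truncated: usize,
    tokens_before: usize,
    tokens_after: usize,
    estimated_tokens_saved: usize,
    reduction_percent: f64,
}

impl CompressionReport {
    // Hypothetical constructor: derives the savings fields from the token counts.
    fn new(
        context_files_removed: usize,
        context_messages_dropped: usize,
        tool_results_truncated: usize,
        tokens_before: usize,
        tokens_after: usize,
    ) -> Self {
        let estimated_tokens_saved = tokens_before.saturating_sub(tokens_after);
        // Guard against division by zero for an empty history.
        let reduction_percent = if tokens_before == 0 {
            0.0
        } else {
            estimated_tokens_saved as f64 * 100.0 / tokens_before as f64
        };
        CompressionReport {
            kind: "chat_compression_report",
            context_files_removed,
            context_messages_dropped,
            tool_results_truncated,
            tokens_before,
            tokens_after,
            estimated_tokens_saved,
            reduction_percent,
        }
    }
}
```

For example, under these assumptions a pass that reduces 8000 tokens to 6000 would report 2000 tokens saved and a 25 percent reduction.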

Runtime compression state

The runtime tracks compression state on both the chat session and runtime snapshots through three fields:

  • is_compressing
  • compression_phase
  • compression_reason

The runtime lifecycle uses these phase values:

  • checking
  • running
  • applied
  • skipped
  • failed

These fields are surfaced through runtime updates so the UI and consumers can tell whether compression is in progress and why.
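One way to picture this state is sketched below. The three field names and the five lifecycle values come from this page; everything else is an assumption, including the idea that the lifecycle values are what `compression_phase` holds, that `is_compressing` is true only during `checking` and `running`, and the `enter` helper itself.

```rust
// Lifecycle values from the list above, modeled as an enum (assumed representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CompressionPhase {
    Checking,
    Running,
    Applied,
    Skipped,
    Failed,
}

impl CompressionPhase {
    // Assumption: only checking and running count as "in progress".
    fn is_in_progress(self) -> bool {
        matches!(self, CompressionPhase::Checking | CompressionPhase::Running)
    }

    fn as_str(self) -> &'static str {
        match self {
            CompressionPhase::Checking => "checking",
            CompressionPhase::Running => "running",
            CompressionPhase::Applied => "applied",
            CompressionPhase::Skipped => "skipped",
            CompressionPhase::Failed => "failed",
        }
    }
}

// Field names follow the page; the types are assumed.
#[derive(Debug, Clone, Default, PartialEq)]
struct CompressionState {
    is_compressing: bool,
    compression_phase: Option<CompressionPhase>,
    compression_reason: Option<String>,
}

impl CompressionState {
    // Hypothetical helper: record a phase change and keep is_compressing in sync.
    fn enter(&mut self, phase: CompressionPhase, reason: &str) {
        self.is_compressing = phase.is_in_progress();
        self.compression_phase = Some(phase);
        self.compression_reason = Some(reason.to_string());
    }
}
```

Deriving `is_compressing` from the phase in a single place avoids the two fields drifting apart when a consumer reads a snapshot.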

Segment summarization

When the system summarizes portions of history, the extra metadata uses:

  • extra.compression.kind = "llm_segment_summary"

That marks summaries as a distinct compression artifact rather than ordinary chat text.
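A consumer could use that marker to tell summaries apart from ordinary messages. The sketch below assumes a simplified typed shape for `extra`; the struct names and the `is_segment_summary` helper are hypothetical, and only the `extra.compression.kind` path and the `"llm_segment_summary"` value come from this page.

```rust
// Marker value taken from the page.
const LLM_SEGMENT_SUMMARY: &str = "llm_segment_summary";

// Hypothetical typed view of extra.compression.
#[derive(Debug, Clone, Default)]
struct CompressionMeta {
    kind: String,
}

// Hypothetical typed view of the message's extra metadata.
#[derive(Debug, Clone, Default)]
struct Extra {
    compression: Option<CompressionMeta>,
}

// Hypothetical message type for this example only.
#[derive(Debug, Clone)]
struct Message {
    role: String,
    content: String,
    extra: Extra,
}

// True when the message carries the segment-summary marker.
fn is_segment_summary(msg: &Message) -> bool {
    msg.extra
        .compression
        .as_ref()
        .map_or(false, |c| c.kind == LLM_SEGMENT_SUMMARY)
}
```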

Secret redaction in previews

Secrets are redacted from compression previews before the previews appear in generated summaries or the UI. This keeps token-saving previews useful without leaking sensitive contents.
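The page does not say how secrets are detected, so the following is only a toy illustration of the idea: it masks the value of `key=value` words whose key looks sensitive. The key list, the `[REDACTED]` placeholder, and the `redact_preview` function are all assumptions, and real detection would need to cover many more patterns.

```rust
// Hypothetical list of key names treated as sensitive.
const SENSITIVE_KEYS: [&str; 4] = ["api_key", "token", "password", "secret"];

// Replace the value of sensitive `key=value` words with a placeholder.
// Note: whitespace runs are collapsed to single spaces, which is acceptable
// for a short preview but not for exact text reproduction.
fn redact_preview(text: &str) -> String {
    text.split_whitespace()
        .map(|word| match word.split_once('=') {
            Some((key, _value))
                if SENSITIVE_KEYS.iter().any(|k| key.eq_ignore_ascii_case(k)) =>
            {
                format!("{}=[REDACTED]", key)
            }
            _ => word.to_string(),
        })
        .collect::<Vec<String>>()
        .join(" ")
}
```

Redacting before the preview text is built, rather than when it is displayed, means every downstream consumer sees only the masked form.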

Compression and special roles

Compression has to respect special internal message roles, especially plans and plan deltas. See Hidden Roles and Plans for the hidden-role contract, and Memory and Knowledge for the broader project-state picture.

Sketch of the pipeline

flowchart LR
  A[history_limit] --> B[dedup context files]
  B --> C[compress tool results]
  C --> D[fix tool calls]
  D --> E[limit history]
  E --> F[compression_report]

See also Chat System.