Chat System
Stateful chat orchestration, SSE delivery, tool execution, compression, and trajectory persistence in the engine.
The chat subsystem is centered on ChatSession, which owns the thread transcript, runtime state, command queue, SSE event stream, compression flags, and trajectory persistence metadata. It coordinates the full path from a user command to LLM streaming, tool execution, and eventual persistence of the chat trajectory.
Authoritative behavior comes from AGENTS.md and the src/chat/ modules, especially session.rs, queue.rs, stream_core.rs, linearize.rs, history_limit.rs, and trajectories.rs.
ChatSession is the mutable in-memory representation of a chat thread. Relevant state includes:
- chat_id
- messages
- runtime: RuntimeState
- command_queue: VecDeque<CommandRequest>
- event_tx for SSE delivery
- event_seq monotonic counter
- compression fields: is_compressing, compression_phase, compression_reason
- draft fields: draft_message, draft_usage
- trajectory fields: trajectory_dirty, trajectory_version, trajectory_save_in_flight, trajectory_save_queued
- abort / wakeup coordination: abort_flag, abort_notify, user_interrupt_flag, queue_notify
SessionState enum values are:
- Idle
- Generating
- ExecutingTools
- Paused
- WaitingIde
- WaitingUserInput
- Completed
- Error
stateDiagram-v2
[*] --> Idle
Idle --> Generating: UserMessage / RetryFromIndex / Regenerate
Generating --> ExecutingTools: model emits tool calls
Generating --> Idle: stream finishes without tool calls
Generating --> Completed: assistant/tool flow ends with completion
Generating --> WaitingUserInput: tool asks questions / wait_agents / ask_questions
Generating --> WaitingIde: IDE-dependent tool result required
Generating --> Paused: pause-required / user decision boundary
ExecutingTools --> Generating: tool results processed, continue loop
ExecutingTools --> Completed: task_done / agent_finish
ExecutingTools --> WaitingUserInput: tool decision path needs user input
ExecutingTools --> Idle: abort
Paused --> Generating: ApproveTools / RejectTools outcome resumes
WaitingIde --> Generating: IdeToolResult
WaitingIde --> Idle: Abort
WaitingUserInput --> Generating: queued resume command
WaitingUserInput --> Idle: Abort
Completed --> Idle: new command / regenerate path
Error --> Idle: recovery / next queued command
Evidence for terminal/active runtime logic appears in session.rs (is_terminal_runtime_state) and queue.rs where the queue processor gates on Generating, ExecutingTools, Paused, and WaitingIde.
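The edges of the state diagram can be transcribed into a small lookup. This is a minimal sketch for reading the diagram, not the engine's code: the helper name diagram_allows is hypothetical, and only the edges drawn above are encoded.

```rust
/// Session states as listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Idle,
    Generating,
    ExecutingTools,
    Paused,
    WaitingIde,
    WaitingUserInput,
    Completed,
    Error,
}

/// Hypothetical helper: returns true when the state diagram draws an
/// edge `from -> to`. Edges not drawn in the diagram are reported as false.
pub fn diagram_allows(from: SessionState, to: SessionState) -> bool {
    use SessionState::*;
    matches!(
        (from, to),
        (Idle, Generating)
            | (
                Generating,
                ExecutingTools | Idle | Completed | WaitingUserInput | WaitingIde | Paused
            )
            | (ExecutingTools, Generating | Completed | WaitingUserInput | Idle)
            | (Paused, Generating)
            | (WaitingIde, Generating | Idle)
            | (WaitingUserInput, Generating | Idle)
            | (Completed, Idle)
            | (Error, Idle)
    )
}
```

A table like this is mainly useful in tests or debug assertions that flag a transition the diagram does not describe.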
Canonical flow from user input to model output is:
flowchart LR
U[UserMessage] --> Q[command queue]
Q --> P[prepare: system prompt + knowledge RAG + history limit]
P --> L[linearize]
L --> S[LLM stream]
S --> C[StreamCollector]
C --> T[tool calls]
T -->|continue| P
T -->|finish| R[save trajectory / update runtime]
- UserMessage enters the queue
  - Commands are enqueued and processed by queue.rs.
  - Priority user messages may be injected first.
- Prepare phase
  - prepare_session_preamble_and_knowledge() builds the prompt.
  - Authoritative plan in AGENTS.md states preparation includes:
    - system prompt
    - knowledge RAG
    - history limit
- Linearization
  - linearize.rs merges consecutive user messages.
  - It strips linearization-only messages such as summarization artifacts and compression reports.
  - It also strips thinking blocks for LLM cache compatibility.
- LLM streaming
  - stream_core.rs drives the HTTP/SSE or websocket stream.
  - Stream deltas are accumulated via a StreamCollector implementation.
- Tool calls
  - Tool call deltas are collected, finalized, and executed.
  - The loop returns to prepare/stream when tool output requires further model turns.
- Persistence and loop termination
  - Trajectories are saved during or after each major boundary.
  - Final runtime state becomes Idle, Completed, WaitingUserInput, or Error depending on outcome.
Chat SSE is served from the subscription endpoint documented in AGENTS.md:
GET /v1/chats/subscribe?chat_id={id}
Events carry a monotonic seq: u64. Clients must reconnect if they detect a gap in sequence numbers.
Authoritative event types listed in AGENTS.md:
- Snapshot
- StreamStarted
- StreamDelta
- StreamFinished
- MessageAdded
- MessageUpdated
- MessageRemoved
- MessagesTruncated
- ThreadUpdated
- QueueUpdated
- RuntimeUpdated
- PauseRequired
The sequence number is incremented on each emitted chat event. Because subscribers receive an ordered stream, any missing sequence number indicates the client missed one or more events and should resubscribe and request a fresh snapshot.
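On the client side, gap detection can be sketched as a small tracker. The types SeqTracker and SeqCheck are hypothetical names for illustration; the sketch assumes seq grows by exactly one per event and that the first event seen after subscribing sets the baseline.

```rust
/// Outcome of observing one event's sequence number.
#[derive(Debug, PartialEq, Eq)]
pub enum SeqCheck {
    /// First event seen, or exactly the previous seq plus one.
    InOrder,
    /// The seq did not follow the previous one; the client should
    /// resubscribe and request a fresh snapshot.
    Gap { expected: u64, got: u64 },
}

/// Hypothetical client-side tracker for the monotonic `seq` field.
#[derive(Debug, Default)]
pub struct SeqTracker {
    last: Option<u64>,
}

impl SeqTracker {
    /// Records `seq` and reports whether it directly follows the last one.
    /// Repeated or older values are reported as a gap as well.
    pub fn observe(&mut self, seq: u64) -> SeqCheck {
        let check = match self.last {
            Some(prev) if seq != prev.wrapping_add(1) => SeqCheck::Gap {
                expected: prev.wrapping_add(1),
                got: seq,
            },
            _ => SeqCheck::InOrder,
        };
        self.last = Some(seq);
        check
    }

    /// Forgets the baseline, e.g. right after resubscribing.
    pub fn reset(&mut self) {
        self.last = None;
    }
}
```

A client would call observe() for every received event and, on Gap, drop the connection, call reset(), and subscribe again.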
Background process completion is not a dedicated SSE envelope in the authoritative contract; it is represented as a hidden event(process_completed) message delivered through MessageAdded.
POST /v1/chats/{chat_id}/commands accepts queued chat commands. The authoritative command set is:
- UserMessage
- SetParams
- UpdateMessage
- RemoveMessage
- TruncateMessages
- RetryFromIndex
- Abort
- ApproveTools
- RejectTools
- BranchFromChat
- RestoreFromTrajectory
- ClearDraft
- SetDraft
- Regenerate
- UserMessage, RetryFromIndex, and Regenerate can trigger generation.
- ApproveTools / RejectTools are used when the session is paused for user decision.
- BranchFromChat and RestoreFromTrajectory are trajectory-oriented commands that interact with persisted chat history.
- ClearDraft and SetDraft manipulate the transient draft state.
The delta op set documented in AGENTS.md is:
- AppendContent
- AppendReasoning
- SetToolCalls
- SetThinkingBlocks
- AddCitation
- AddServerContentBlock
- SetUsage
- MergeExtra
These operations represent the incremental assembly of a streamed assistant message.
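Incremental assembly can be illustrated with a reduced model of four of the ops. The op names come from the list above, but every payload shape here, the DraftMessage struct, and apply_delta are assumptions made for the sketch; the real types in the engine may differ.

```rust
/// Reduced, hypothetical model of four delta ops. Payload shapes are assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum DeltaOp {
    AppendContent(String),
    AppendReasoning(String),
    /// Assumed to carry the full current list of tool calls.
    SetToolCalls(Vec<String>),
    SetUsage { prompt_tokens: u64, completion_tokens: u64 },
}

/// Hypothetical in-progress assistant message.
#[derive(Debug, Default, PartialEq)]
pub struct DraftMessage {
    pub content: String,
    pub reasoning: String,
    pub tool_calls: Vec<String>,
    /// (prompt_tokens, completion_tokens) once reported.
    pub usage: Option<(u64, u64)>,
}

/// Applies one op: Append* ops extend text, Set* ops overwrite a field.
pub fn apply_delta(draft: &mut DraftMessage, op: DeltaOp) {
    match op {
        DeltaOp::AppendContent(chunk) => draft.content.push_str(&chunk),
        DeltaOp::AppendReasoning(chunk) => draft.reasoning.push_str(&chunk),
        DeltaOp::SetToolCalls(calls) => draft.tool_calls = calls,
        DeltaOp::SetUsage { prompt_tokens, completion_tokens } => {
            draft.usage = Some((prompt_tokens, completion_tokens));
        }
    }
}
```

The split between Append and Set ops matters for replay: Append ops depend on the order in which they are applied, while a Set op carries the full new value of its field.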
stream_core.rs contains the low-level LLM transport and delta processing.
SetThinkingBlocks deltas are merged via merge_thinking_blocks() rather than replaced naively. The merge logic is designed to preserve Anthropic-style thinking content and signatures across streaming updates.
The authoritative merge order is:
- match by (type, index)
- then (type, id)
- then (type, signature)
- signatures are opaque and latest-wins replacement is used when needed
This preserves the stability of thinking/signature blocks while still allowing incremental streaming updates.
The stream core explicitly handles Anthropic reasoning/thinking blocks so that signatures are not lost during merge and finalization. This is important for provider compatibility and for replaying assistant state accurately.
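The match order can be sketched as follows. This is not the actual merge_thinking_blocks() code: the ThinkingBlock struct, its field names, and the field-level merge policy (non-empty incoming text replaces the old text, an incoming signature replaces the old one, a missing signature keeps the existing one) are assumptions chosen to illustrate the lookup order and the latest-wins rule.

```rust
/// Hypothetical thinking block; `kind` stands for the block's `type`.
#[derive(Debug, Clone, PartialEq)]
pub struct ThinkingBlock {
    pub kind: String,
    pub index: Option<usize>,
    pub id: Option<String>,
    pub signature: Option<String>,
    pub text: String,
}

/// Finds the existing block to update: (type, index) first,
/// then (type, id), then (type, signature).
fn find_match(existing: &[ThinkingBlock], inc: &ThinkingBlock) -> Option<usize> {
    if inc.index.is_some() {
        let hit = existing
            .iter()
            .position(|b| b.kind == inc.kind && b.index == inc.index);
        if hit.is_some() {
            return hit;
        }
    }
    if inc.id.is_some() {
        let hit = existing
            .iter()
            .position(|b| b.kind == inc.kind && b.id == inc.id);
        if hit.is_some() {
            return hit;
        }
    }
    if inc.signature.is_some() {
        return existing
            .iter()
            .position(|b| b.kind == inc.kind && b.signature == inc.signature);
    }
    None
}

/// Merges `incoming` into `existing` instead of replacing the whole list.
pub fn merge_blocks_sketch(existing: &mut Vec<ThinkingBlock>, incoming: Vec<ThinkingBlock>) {
    for inc in incoming {
        match find_match(existing.as_slice(), &inc) {
            Some(pos) => {
                let cur = &mut existing[pos];
                if !inc.text.is_empty() {
                    cur.text = inc.text;
                }
                // Signatures are opaque: the latest one wins, but an update
                // without a signature must not erase the stored one.
                if inc.signature.is_some() {
                    cur.signature = inc.signature;
                }
                if cur.id.is_none() {
                    cur.id = inc.id;
                }
            }
            None => existing.push(inc),
        }
    }
}
```

Under these assumptions, an update that carries new text but no signature leaves the stored signature in place, which is the property the merge is meant to protect.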
linearize.rs is responsible for converting the stored conversation into a model-friendly sequence.
Key behaviors:
- merges consecutive user messages
- strips thinking blocks for LLM cache compatibility
- suppresses linearization-only messages such as summarization artifacts
- preserves required anchor messages according to compression exemptions
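The first three behaviors can be sketched on a simplified message type. The Msg struct, the strip_on_linearize marker, and the blank-line separator used when merging are assumptions for illustration; anchor preservation and compression exemptions are not modeled.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Role {
    User,
    Assistant,
    Tool,
}

/// Hypothetical stored message.
#[derive(Debug, Clone, PartialEq)]
pub struct Msg {
    pub role: Role,
    pub content: String,
    pub thinking_blocks: Vec<String>,
    /// Assumed marker for summarization artifacts and similar messages
    /// that must not reach the model.
    pub strip_on_linearize: bool,
}

/// Builds the model-facing sequence from stored history.
pub fn linearize_sketch(history: &[Msg]) -> Vec<Msg> {
    let mut out: Vec<Msg> = Vec::new();
    for original in history.iter().filter(|m| !m.strip_on_linearize) {
        let mut msg = original.clone();
        // Thinking blocks are dropped from the model-facing copy only.
        msg.thinking_blocks.clear();

        let merge_into_prev = msg.role == Role::User
            && matches!(out.last(), Some(prev) if prev.role == Role::User);
        if merge_into_prev {
            if let Some(prev) = out.last_mut() {
                // Assumed separator between merged user messages.
                prev.content.push_str("\n\n");
                prev.content.push_str(&msg.content);
            }
        } else {
            out.push(msg);
        }
    }
    out
}
```

Because filtering happens before merging in this sketch, two user messages separated only by a stripped artifact end up merged; whether linearize.rs orders the steps the same way is not stated here.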
history_limit.rs re-exports the shared history limiting logic from refact_chat_history::history_limit.
The authoritative AGENTS.md summary describes a 4-stage compression pipeline:
- deduplicate context files
- compress tool results
- fix tool calls
- limit history
CompressionStrength values are:
- Absent
- Low
- Medium
- High
Trajectories are persisted under:
.refact/trajectories/{chat_id}.json
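As a small illustration of the layout, the path for a chat can be derived like this. The function name and the idea that the path is resolved against a workspace root are assumptions; the sketch also does no validation of chat_id.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: builds `<root>/.refact/trajectories/<chat_id>.json`.
/// `root` is assumed to be the workspace directory.
pub fn trajectory_path(root: &Path, chat_id: &str) -> PathBuf {
    root.join(".refact")
        .join("trajectories")
        .join(format!("{}.json", chat_id))
}
```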
trajectories.rs handles:
- saving and loading trajectory snapshots
- restoring sessions from trajectory data
- listing and subscribing to trajectory events
- repairing and validating trajectory identity
ChatSession::new_with_trajectory() rebuilds an in-memory session from persisted data. This is the mechanism behind restore/reload flows and chat continuity across restarts.
The queue processor in queue.rs is the execution engine for commands. It:
- drains priority user messages
- handles allowed commands while paused or waiting on IDE input
- prepares preamble and knowledge before generation
- invokes start_generation()
- saves trajectories at important boundaries
Observed runtime gating includes:
- WaitingIde only accepts IDE result commands and Abort
- Paused only accepts tool-decision commands and Abort
- busy states are Generating and ExecutingTools
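The gating rules above can be sketched as a single decision function. The Command enum is a reduced subset, IdeToolResult is taken from the state diagram rather than the command list, and the Gate type and gate() function are hypothetical. How busy states treat commands other than Abort is not stated here, so the sketch assumes they stay queued.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Idle,
    Generating,
    ExecutingTools,
    Paused,
    WaitingIde,
    WaitingUserInput,
    Completed,
    Error,
}

/// Reduced command set for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Command {
    UserMessage,
    SetParams,
    Abort,
    ApproveTools,
    RejectTools,
    IdeToolResult,
}

/// Hypothetical decision for one queued command.
#[derive(Debug, PartialEq, Eq)]
pub enum Gate {
    /// Process the command now.
    Handle,
    /// Leave the command in the queue.
    Defer,
}

pub fn gate(state: SessionState, cmd: Command) -> Gate {
    let allowed = match state {
        // WaitingIde only accepts IDE result commands and Abort.
        SessionState::WaitingIde => matches!(cmd, Command::IdeToolResult | Command::Abort),
        // Paused only accepts tool-decision commands and Abort.
        SessionState::Paused => {
            matches!(cmd, Command::ApproveTools | Command::RejectTools | Command::Abort)
        }
        // Assumption: while busy, only Abort is handled immediately.
        SessionState::Generating | SessionState::ExecutingTools => matches!(cmd, Command::Abort),
        // Assumption: all other states process commands as they arrive.
        _ => true,
    };
    if allowed { Gate::Handle } else { Gate::Defer }
}
```

Keeping the rules in one function makes it straightforward to test each state against each command without running a full session.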
Refact on GitHub: https://github.com/JegernOUTT/refact