Memory and Knowledge

refact-planner edited this page Jun 7, 2026 · 1 revision

How Refact stores durable project memory: graph facts, vector search, autoinjected context, and typed task memories.

Overview

Refact keeps project memory in several complementary layers:

  • a knowledge graph for structured relationships and facts,
  • a VecDB for semantic retrieval over project content,
  • autoinjected context files that turn search results into model-visible context,
  • and typed task memories stored under the project state directory.

These layers keep their state under the project-scoped .refact/ directory and are used alongside chat history and tool output.

Project state on disk

Project-local state is rooted under <project>/.refact/ and includes:

  • knowledge/ — knowledge graph and memory artifacts
  • trajectories/ — chat/agent trajectories
  • tasks/ — task storage
  • integrations.d/ — project integrations
  • vecdb/ — vector database state

The engine AGENTS file describes the same project state as .refact/ with trajectories/, knowledge/, tasks/, and integrations/. Note that it names integrations/ rather than integrations.d/ and does not list vecdb/.
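
As a rough illustration of the layout above, the sketch below derives the expected state directories from a project root. The helper name `project_state_dirs` and the constant `STATE_SUBDIRS` are hypothetical and only mirror the directory names listed on this page; they are not taken from the Refact codebase.

```rust
use std::path::{Path, PathBuf};

/// Subdirectory names listed on this page for `<project>/.refact/`.
const STATE_SUBDIRS: [&str; 5] = [
    "knowledge",
    "trajectories",
    "tasks",
    "integrations.d",
    "vecdb",
];

/// Hypothetical helper: returns the expected state directories for a project root.
/// It only builds paths; it does not touch the filesystem.
fn project_state_dirs(project_root: &Path) -> Vec<PathBuf> {
    let state_root = project_root.join(".refact");
    STATE_SUBDIRS
        .iter()
        .map(|name| state_root.join(name))
        .collect()
}
```

Calling `project_state_dirs(Path::new("/work/demo"))` would yield five paths, all under `/work/demo/.refact`.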

Knowledge graph

The knowledge graph implementation is in src/knowledge_graph/ and is backed by petgraph::DiGraph.

Core pieces:

  • kg_structs.rs — graph data structures and node/edge metadata
  • kg_builder.rs — build/update graph content from project inputs
  • kg_cleanup.rs — cleanup and pruning logic
  • kg_staleness.rs — freshness/staleness tracking
  • kg_query.rs — query helpers over the graph

Typical responsibilities include:

  • building the graph from facts and relationships,
  • removing stale or duplicate nodes,
  • determining when nodes need refresh,
  • and answering graph queries for downstream tools.
flowchart LR
  Inputs[project files / memories / tasks] --> Builder[kg_builder]
  Builder --> G[(petgraph DiGraph)]
  G --> Cleanup[kg_cleanup]
  G --> Staleness[kg_staleness]
  G --> Query[kg_query]
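
The build, staleness, cleanup, and query responsibilities can be sketched with a small directed graph. This is a std-only stand-in, not the petgraph-backed implementation: the types `FactNode` and `FactGraph`, the timestamp unit, and the label-based dedup rule are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical node payload: a fact plus the time it was last confirmed.
#[derive(Debug, Clone, PartialEq)]
struct FactNode {
    label: String,
    last_seen: u64, // assumed unit: seconds on some monotonic clock
}

/// Minimal directed graph: ids map to payloads, edges are (from, to, relation).
#[derive(Default)]
struct FactGraph {
    nodes: HashMap<u32, FactNode>,
    edges: Vec<(u32, u32, String)>,
    next_id: u32,
}

impl FactGraph {
    /// Builder step: add a fact, or refresh it if the label already exists (dedup).
    fn upsert(&mut self, label: &str, now: u64) -> u32 {
        if let Some((&id, node)) = self.nodes.iter_mut().find(|(_, n)| n.label == label) {
            node.last_seen = now;
            return id;
        }
        let id = self.next_id;
        self.next_id += 1;
        self.nodes.insert(id, FactNode { label: label.to_string(), last_seen: now });
        id
    }

    /// Builder step: record a directed relationship between two facts.
    fn relate(&mut self, from: u32, to: u32, relation: &str) {
        self.edges.push((from, to, relation.to_string()));
    }

    /// Staleness step: sorted ids of nodes not confirmed within `max_age`.
    fn stale_ids(&self, now: u64, max_age: u64) -> Vec<u32> {
        let mut ids: Vec<u32> = self
            .nodes
            .iter()
            .filter(|(_, n)| now.saturating_sub(n.last_seen) > max_age)
            .map(|(&id, _)| id)
            .collect();
        ids.sort();
        ids
    }

    /// Cleanup step: drop stale nodes and every edge that touches them.
    fn prune_stale(&mut self, now: u64, max_age: u64) -> usize {
        let stale = self.stale_ids(now, max_age);
        for id in &stale {
            self.nodes.remove(id);
        }
        self.edges.retain(|(f, t, _)| !stale.contains(f) && !stale.contains(t));
        stale.len()
    }

    /// Query step: labels of nodes directly reachable from `from`.
    fn neighbors(&self, from: u32) -> Vec<&str> {
        self.edges
            .iter()
            .filter(|(f, _, _)| *f == from)
            .filter_map(|(_, t, _)| self.nodes.get(t).map(|n| n.label.as_str()))
            .collect()
    }
}
```

A graph library such as petgraph provides node and edge indices, traversal, and removal directly, so the hand-rolled adjacency list here exists only to keep the sketch dependency-free.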

VecDB

VecDB is the semantic search layer. In the engine AGENTS file it is described as SQLite + vec0 semantic search.

The refact-vecdb crate includes:

  • vdb_sqlite.rs — SQLite/vec0 storage operations
  • vdb_highlev.rs — higher-level VecDB orchestration
  • vdb_structs.rs — search and index data structures
  • vdb_file_splitter.rs — generic file chunking
  • vdb_markdown_splitter.rs — markdown-aware chunking
  • vdb_trajectory_splitter.rs — trajectory/text chunking
  • ast_file_splitter.rs — AST-aware file chunking
  • fetch_embedding.rs — embedding fetch logic
  • vdb_emb_aux.rs — embedding table helpers and cleanup utilities

The core VecDB trait in crates/refact-core/src/vecdb_types.rs exposes operations like:

  • vecdb_search(...)
  • vecdb_search_with_embedding(...)
  • embed_query(...)
  • vectorizer_enqueue_files(...)
  • remove_file(...)
  • get_status(...)

Splitters and embeddings

VecDB does not index every file as one blob. It uses splitters specialized by content type:

  • trajectory splitter for conversation/agent traces,
  • markdown splitter for markdown documents,
  • file splitter for general text,
  • AST splitter for code-aware chunks.
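
One way to picture splitter selection is a dispatch on the file path. The rule below is a guess for illustration: the enum `SplitterKind`, the function `pick_splitter`, the `trajectories` directory check, and the extension lists are assumptions, and the actual selection logic in refact-vecdb may differ.

```rust
use std::path::Path;

/// The four splitter families named on this page.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SplitterKind {
    Trajectory,
    Markdown,
    Ast,
    File,
}

/// Hypothetical dispatch rule for choosing a splitter from a path.
fn pick_splitter(path: &Path) -> SplitterKind {
    // Assumption: trajectories are recognized by a `trajectories` path component.
    if path.components().any(|c| c.as_os_str() == "trajectories") {
        return SplitterKind::Trajectory;
    }
    match path.extension().and_then(|e| e.to_str()) {
        Some("md") | Some("markdown") => SplitterKind::Markdown,
        // Assumption: a small illustrative set of languages with AST support.
        Some("rs") | Some("py") | Some("ts") => SplitterKind::Ast,
        // Everything else falls back to generic text chunking.
        _ => SplitterKind::File,
    }
}
```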

Embeddings are fetched through the configured embedding provider, then stored in SQLite/vec0, which serves semantic retrieval as a KNN-style nearest-neighbor search (for example by cosine similarity).
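
To make the retrieval step concrete, here is a brute-force sketch of cosine similarity and top-k selection over in-memory vectors. It illustrates the idea only: in Refact the search runs inside SQLite/vec0, and the function names `cosine_similarity` and `top_k` are hypothetical.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embedding dimensions must match");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Brute-force KNN: indices of the `k` chunks most similar to the query, best first.
fn top_k(query: &[f32], chunks: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = chunks
        .iter()
        .enumerate()
        .map(|(i, c)| (i, cosine_similarity(query, c)))
        .collect();
    // Sort by descending similarity; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}
```

A linear scan like this costs one similarity computation per stored chunk, which is why a real index delegates the search to the storage layer.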

Autoinjection pipeline

Refact autoinjects useful content into chat through a pipeline that starts from @ commands and ends with token-aware postprocessing.

High-level flow:

  1. @ commands are parsed from the user/chat input.
  2. Matching content is turned into context_file messages.
  3. Postprocessing resolves paths, merges/normalizes context, and trims content.
  4. Token-aware truncation prioritizes useful parts, with AST-aware ordering when available.

The relevant code areas include:

  • src/tools/ and src/at_commands/ for command parsing and tool routing,
  • crates/refact-postprocessing/src/pp_context_files.rs for context-file postprocessing,
  • crates/refact-vecdb/src/ast_file_splitter.rs for AST-guided chunking,
  • crates/refact-agentic/src/mode_transition.rs for context file rendering/budgeting.
flowchart LR
  AtCmd[@ commands] --> Context[context_file messages]
  Context --> Resolve[path resolution + dedup]
  Resolve --> Post[postprocessing]
  Post --> Trim[token-aware truncation]
  Trim --> Model[LLM input]
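
The dedup and token-aware trimming stages can be approximated as follows. This is a simplified sketch under stated assumptions: `ContextFile`, its `usefulness` score, the whitespace-based token estimate, and `fit_to_budget` are hypothetical, and the sketch drops whole files where the real postprocessing may trim within a file and apply AST-aware ordering.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a context_file message.
#[derive(Debug, Clone, PartialEq)]
struct ContextFile {
    path: String,
    text: String,
    usefulness: f32, // higher means more relevant; the scoring is an assumption
}

/// Crude token estimate (assumption: one token per whitespace-separated word).
fn estimate_tokens(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Deduplicate by path, then keep the most useful files that fit in `budget` tokens.
fn fit_to_budget(mut files: Vec<ContextFile>, budget: usize) -> Vec<ContextFile> {
    // Most useful first, so dedup keeps the best copy of each path
    // and the budget is spent on the highest-value content.
    files.sort_by(|a, b| b.usefulness.total_cmp(&a.usefulness));
    let mut seen = HashSet::new();
    let mut used = 0;
    let mut kept = Vec::new();
    for f in files {
        // Skip any path that was already considered.
        if !seen.insert(f.path.clone()) {
            continue;
        }
        let cost = estimate_tokens(&f.text);
        if used + cost <= budget {
            used += cost;
            kept.push(f);
        }
    }
    kept
}
```

Sorting before deduplication is a deliberate choice in this sketch: when the same path appears twice, the higher-scored copy survives.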

Typed task memories

The tasks/memory system stores typed memory records rather than only freeform notes. The task-memory categories are:

  • decision
  • spec
  • gotcha
  • risk
  • handoff
  • progress
  • postmortem
  • brief

These are used to preserve structured task knowledge across sessions and to support project-scoped recall.
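
A typed record could be modeled with an enum over the eight categories above. The sketch is illustrative only: `MemoryKind`, `TaskMemory`, and the field names are hypothetical and do not reflect the actual on-disk format.

```rust
use std::str::FromStr;

/// The eight task-memory categories listed on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MemoryKind {
    Decision,
    Spec,
    Gotcha,
    Risk,
    Handoff,
    Progress,
    Postmortem,
    Brief,
}

impl FromStr for MemoryKind {
    type Err = String;

    /// Parses a category name, ignoring case and surrounding whitespace.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "decision" => Ok(Self::Decision),
            "spec" => Ok(Self::Spec),
            "gotcha" => Ok(Self::Gotcha),
            "risk" => Ok(Self::Risk),
            "handoff" => Ok(Self::Handoff),
            "progress" => Ok(Self::Progress),
            "postmortem" => Ok(Self::Postmortem),
            "brief" => Ok(Self::Brief),
            other => Err(format!("unknown memory kind: {other}")),
        }
    }
}

/// Hypothetical record shape; the field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct TaskMemory {
    kind: MemoryKind,
    task_id: String,
    body: String,
}
```

Rejecting unknown category names at parse time is what distinguishes a typed record from a freeform note: a misspelled kind fails early instead of being stored silently.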

Relationship to chat history

The knowledge graph, VecDB, and task memories sit alongside chat trajectory storage, but each layer holds a different kind of data:

  • knowledge graph = structured nodes/edges
  • VecDB = semantic retrieval over chunks/embeddings
  • trajectory = chronological agent/chat history
  • task memories = durable typed notes for work items

For chat-side compression rules and hidden roles, see Hidden Roles and Plans. For history reduction behavior, see Context Compression.
