Agent Memory Engine

Local-first persistent memory and project knowledge runtime for coding agents.

Agent Memory Engine

A local-first MCP runtime that gives coding agents persistent, evidence-backed project memory and grounded project knowledge across sessions.

Why it exists

Coding agents face a fundamental problem: every session starts cold.

They forget project context between sessions.
They repeatedly scan repositories to re-learn what modules do.
They lose debugging lessons and historical decisions.
Flat RAG cannot distinguish stable constraints, past incidents, architecture decisions, and raw code evidence.
Even large context windows still require intelligent prioritization and token budgeting.

Memory Engine solves this by maintaining a structured, evidence-backed memory tree alongside an indexed project knowledge base — both local, both automatic, no infrastructure required.

Core capabilities

Capability	Details
Persistent memory tree	MemoryNode hierarchy: constraints, architecture, modules, decisions, incidents, procedures
Evidence-backed memory	Each node links to source Evidence entries (test output, code references, review notes)
Candidate staging	Reflection generates MemoryCandidates before promoting to the live tree
Confidence-aware promotion	create / update / merge / supersede / discard / needs_review
Conflict detection	High-risk areas (auth, schema, state-machine, retry) flagged for review
Ancestor consolidation	Parent node summaries auto-updated after each promotion
Agent-native recall	Intent-aware retrieval before coding tasks — no manual queries
Progressive inspection	Drill down into any memory node, its children, and linked evidence
Automatic post-task reflection	Agent reports outcome → system decides whether and how to retain knowledge
Knowledge ingestion	Markdown, code, ADR, test reports, runtime logs, git diffs
Local FTS5 search	SQLite FTS5 with porter tokenizer; no external search engine
Optional semantic retrieval	Persistent local sqlite-vec backend with sentence-transformers / Ollama embeddings (Phase 13); default OFF, no required deps
Lexical structured fallback	Full retrieval without vector backend or Docker
Unified ContextPack	Memory + knowledge merged, deduplicated, token-budgeted
Retrieval traceability	Per-signal score breakdown in every response
Local-first privacy	All data stays inside `.memory-engine/`; no telemetry, no cloud calls
Python MCP server	stdio transport; no TypeScript, no Docker, no external daemon
Zero-touch bootstrap	Auto-initializes on first MCP connection
Incremental indexing	JSON manifest; only changed files re-indexed on subsequent runs
Git-aware synchronization	Detects branch, HEAD commit, staged/modified files via safe read-only Git commands
Branch-aware retrieval	Prefers memory from the current branch; falls back to mainline then global
Branch-scoped memory writes	New memories stamped with branch name and scope; mainline promotion is explicit
Multi-granularity memory	Four retrieval layers (proposition → paragraph → chunk → module summary) created at write-time; selected at query-time by intent
Deterministic proposition extraction	Atomic facts extracted from docstrings, security comments, raise statements, markdown bullets — no LLM required
Intent-aware granularity routing	`bug_fix` retrieves constraint/risk propositions; `architecture_review` retrieves module summaries
Query-time context assembly	Proposition hits optionally expand to parent paragraphs; architecture queries attach module summaries
Memory retention & compaction	Candidate expiry, stale/superseded archival, multi-source compaction with full lineage — no physical deletion
Protected memory types	`constraint`, `security_rule`, `architecture`, `decision` excluded from all auto-archive and auto-compaction
Agent memory policy	Canonical `AGENT_MEMORY_POLICY.md` auto-installed to `CLAUDE.md` / `.cursor/rules/` on first bootstrap; CLI available for manual control
Project context seeding	`seed_project_context` MCP tool + `memory seed` CLI wizard — write initial constraints, decisions, and project overview on day 1, eliminating the cold-start problem
Windows installer	PowerShell installer (`scripts/install.ps1`) for Windows-native setup without Docker, WSL, or cloud services

Quick Start

macOS / Linux:

git clone https://github.com/uudam42/agent-memory-engine.git
cd agent-memory-engine
bash scripts/install.sh

Windows (PowerShell):

git clone https://github.com/uudam42/agent-memory-engine.git
cd agent-memory-engine
.\scripts\install.ps1

The installer checks Git and Python 3.11+, installs uv when needed, resolves dependencies, runs a health check, and prints a ready-to-copy MCP configuration for Cursor or Claude Code. No Docker, WSL, or cloud services required.

Default Deployment Model

Agent Memory Engine runs locally as a stdio MCP server. The default setup uses Python, uv, and local SQLite/FTS5 storage. Docker, cloud databases, and external embedding services are not required for the standard local workflow.

MCP Stdio Mode (recommended)

Runs locally through the MCP client using uv run. Supported by Cursor, Claude Code, and any client that implements the MCP stdio transport.

Option A — explicit project root (Cursor, most clients):

{
  "mcpServers": {
    "memory-engine": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/absolute/path/to/agent-memory-engine",
        "memory-engine-mcp",
        "--project-root",
        "/absolute/path/to/your-project"
      ]
    }
  }
}

Option B — project root via environment variable (Claude Code):

{
  "mcpServers": {
    "memory-engine": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/absolute/path/to/agent-memory-engine",
        "memory-engine-mcp"
      ],
      "env": {
        "MEMORY_ENGINE_PROJECT_ROOT": "/absolute/path/to/your-project"
      }
    }
  }
}

Note: Replace all /absolute/path/to/... with real paths on your machine. Config file location and workspace-variable support differ by client:

Cursor: .cursor/mcp.json or global Cursor MCP settings

Claude Code: ~/.claude.json or project-level config

Run bash scripts/install.sh to get a pre-filled config block with your actual paths.

FastAPI / HTTP Mode (optional)

A FastAPI application (memory_engine/main.py) is included for direct API experimentation, demos, or containerized environments. It is not required for the standard MCP workflow.

uvicorn memory_engine.main:app --reload
# API docs at http://localhost:8000/docs

Manual setup (advanced)

# Install uv once
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync

# Run MCP server directly
uv run memory-engine-mcp --project-root /path/to/project --log-level DEBUG

Open your target project and start coding

That's it. Memory Engine starts automatically on the first MCP connection and handles bootstrap, indexing, and memory management from there.

What happens automatically

User opens project
     │
     ▼
MCP client starts memory-engine-mcp via stdio
     │
     ▼
Project root resolved (.git / pyproject.toml / package.json marker)
     │
     ▼
.memory-engine/ created (if first use)
     │
     ▼
README, ADRs, architecture docs, constraints indexed first
     │
     ▼
Broader source files indexed incrementally in background
     │
     ▼
Agent starts non-trivial coding task
     │
     ▼  [automatic]
retrieve_agent_context called
     │   → relevant constraints, incidents, decisions, procedures, source refs
     │
     ▼
Agent implements and validates
     │
     ▼  [automatic, on success]
reflect_and_write called
     │   → system evaluates retention gates
     │   → creates MemoryCandidates if worthy
     │   → promotes to memory tree
     │   → consolidates ancestor summaries
     │
     ▼
memory_status shows updated counts

Architecture

graph TD
    A[User / Coding Agent] --> B[MCP Client]
    B -->|stdio| C[Python MCP Server]
    C --> D[Agent Skills]
    C --> E[Service Layer]
    C --> F[Knowledge Layer]
    D -->|recall / inspect / reflect| E
    E -->|memory lifecycle| G[(Memory Tree\nSQLite)]
    F -->|ingest / search| H[(Knowledge Base\nSQLite + FTS5)]
    F --> I[Optional Vector Backend]
    E --> G
    F --> H
    C --> J[Bootstrap & Incremental Index]
    J --> K[.memory-engine/\nproject-local storage]
    G --> K
    H --> K

Agent calling chain

sequenceDiagram
    participant Agent
    participant MCP as MCP Server
    participant Skills as Agent Skills
    participant Services as Service Layer
    participant DB as SQLite / FTS5

    Agent->>MCP: retrieve_agent_context(task, files, symbols)
    MCP->>Skills: QueryAnalyzer.analyze()
    Skills->>DB: MemoryNode recall (intent-weighted SQL)
    Skills->>DB: KnowledgeSearch (FTS5 + vector RRF)
    Skills->>Skills: Rank, compose, dedup, token-trim
    MCP-->>Agent: ContextPack (memory + knowledge + trace)

    Note over Agent: implements and validates

    Agent->>MCP: reflect_and_write(task, outcome, verification_status)
    MCP->>Skills: ReflectionSkill.analyze() — gate check
    Skills->>Services: PostTaskService.reflect_and_write()
    Services->>Services: PromotionService.promote()
    Services->>Services: ConsolidationService.update_ancestors()
    Services->>DB: persist MemoryNode updates
    MCP-->>Agent: {outcome: "persisted", candidates_promoted: 2}

Memory lifecycle

Task result
    │
    ▼
ReflectionSkill.analyze()
    │  gates: outcome ≠ failed/reverted, verification_status, confidence ≥ threshold,
    │         summary word count, known-trivial patterns
    │
    ├─ skip → return {skip_reason}
    │
    └─ pass ▼
    │
MemoryCandidate generation
    ├─ constraint     (importance 0.92)
    ├─ procedure      (importance 0.72)
    ├─ incident/debug (importance 0.85)
    ├─ module         (importance 0.62)
    └─ decision       (importance 0.82)
    │
    ▼
PromotionService.promote()
    ├─ create     — new node
    ├─ update     — same title, content refreshed
    ├─ merge      — near-duplicate (Jaccard ≥ 0.80)
    ├─ supersede  — existing node confirmed wrong
    ├─ discard    — low value / already known
    └─ needs_review — conflicts with high-confidence existing node
    │
    ▼
ConsolidationService.update_ancestors()
    │  parent.summary = concat(children.summaries)
    ▼
cache invalidated + memory_revision bumped

Node statuses

Status	Meaning	Default retrieval
`candidate`	Staged, pending promotion decision	Excluded
`active`	Live, returned in recall	Included
`stale`	Outdated; preserved for history	Penalized
`superseded`	Replaced by newer node	Excluded
`needs_review`	Flagged conflict; human review recommended	Excluded
`archived`	Historical / audit-only (Phase 11)	Excluded
`compacted`	Synthesis of source memories (Phase 11)	Included with lineage
`expired`	Candidate never promoted past window (Phase 11)	Excluded

Phase 11: Memory Retention

Memory does not grow indefinitely. The MemoryRetentionService runs lifecycle transitions:

# Diagnose
memory retention status <project>
memory retention report <project>

# Apply (dry-run by default)
memory retention run <project>
memory retention run <project> --no-dry-run

# Restore an archived memory
memory retention restore <project> <memory-id>

Protected types (constraint, security_rule, architecture, decision) are never auto-archived or auto-compacted.

See docs/memory_retention_compaction.md.

Knowledge lifecycle

Documents / code / ADRs / tests / logs / diffs
    │
    ▼
redact()  ← 8 patterns: API keys, tokens, passwords, private keys,
           │             connection strings, JWTs, AWS keys, Slack tokens
    ▼
SHA-256 content hash → dedup check
    │
    ▼
Source-type chunker
    ├─ Markdown    → heading-based sections (≤1200 tokens)
    ├─ Code        → class/function blocks   (≤1000 tokens)
    ├─ Test report → result windows
    ├─ Log         → sliding windows         (≤600 tokens)
    └─ Diff/Patch  → hunk-based chunks
    │
    ▼
KnowledgeDocument + KnowledgeChunk (SQLite)
    │
    ├─ FTS5 insert    (lexical — always available)
    └─ Vector upsert  (optional — InMemoryVectorIndex or Qdrant)
    │
    ▼
hybrid retrieval → RRF fusion → source-quality ranking
    │
    ▼
UnifiedContextPack (40% of token budget)

─── Phase 10: multi-granularity write path (runs in parallel) ───────────────

same raw content
    │
    ├─ paragraph_segmenter  → KnowledgeParagraphORM  → knowledge_paragraphs_fts
    ├─ proposition_extractor → KnowledgePropositionORM → knowledge_propositions_fts
    └─ summarizer           → KnowledgeChunkSummaryORM → knowledge_summaries_fts
    │
    ▼
multigranular retrieval (25% of knowledge budget)
    │
    ▼
UnifiedContextPack.multigranular_chunks (independent of knowledge_chunks budget)

Directory structure

memory_engine/
├── main.py                  ← FastAPI app (dev / direct API use)
├── cli.py                   ← Debug CLI
├── config.py                ← Pydantic Settings
│
├── agent/                   ← Stage 8 namespace (re-exports)
│   ├── skills/              → memory_engine.skills
│   ├── policies/            → reflection gate constants
│   └── contracts/           → agent I/O domain models
│
├── skills/                  ← agent-facing behaviors (recall, inspect, reflect)
├── services/                ← domain orchestration (promotion, consolidation)
├── knowledge/               ← ingestion, chunking, FTS5, vector, search, fusion, cache
│                               proposition_extractor, paragraph_segmenter, summarizer,
│                               granularity_router, multigranular_search  (Phase 10)
├── repositories/            ← persistence abstraction (memory_node, candidate, evidence)
├── models/                  ← Pydantic domain + SQLAlchemy ORM
│
├── bootstrap/               ← local runtime (project_root, storage, security, state)
├── runtime/                 ← Stage 8 namespace (re-exports bootstrap + cache + config)
│
├── mcp/                     ← MCP adapter (tools, resources, server, project_context)
├── api/                     ← FastAPI routes
└── db/                      ← SQLite session + init

docs/
├── architecture/            ← system-overview, memory-lifecycle, knowledge-pipeline,
│                               retrieval-pipeline, mcp-integration, local-runtime,
│                               multigranular_memory_architecture  (Phase 10)
└── guides/                  ← quickstart, configuration, privacy-and-security

tests/
├── test_phase4.py – test_phase7.py   ← phase integration tests
└── test_*.py                          ← unit and component tests

MCP tools

Tool	Purpose
`seed_project_context`	Call once on new projects. Write initial constraints, decisions, project description, and conventions directly as active memory nodes. Auto-extracts from README.md when fields are omitted. Eliminates cold-start problem.
`retrieve_agent_context`	Retrieve memory + knowledge before a coding task. Phase 10: pass `task_intent`, `preferred_layers`, `proposition_types` to guide granularity routing.
`inspect_memory`	Drill into a MemoryNode, its children, and evidence
`inspect_knowledge`	Inspect a KnowledgeChunk or source file range (redacted)
`reflect_and_write`	Report validated work to the reflection pipeline
`memory_status`	Project health, retrieval mode, index counts, revisions
`refresh_project_knowledge`	Trigger incremental rescan; also builds Phase 10 proposition/paragraph/summary layers

MCP resources

Resource	Content
`memory://project/current/constraints`	Active project constraints
`memory://project/current/architecture`	Architecture and module summaries
`memory://project/current/status`	Bootstrap state, retrieval mode, health
`memory://project/current/recent-incidents`	Recent debug incidents
`memory://project/current/memory-tree-summary`	Memory tree outline
`memory://project/current/agent-policy`	Generated agent policy
`memory://project/current/git-context`	Current Git state (branch, HEAD, staged/modified files — no remote URLs)
`memory://project/current/branch-memory-summary`	Memories organized by branch scope
`memory://project/current/sync-status`	Incremental sync and index freshness
`memory://project/current/retention-status`	Lifecycle counts, archive/expiry candidates (Phase 11)
`memory://project/current/compaction-report`	Compaction group candidates (Phase 11)

Phase 11: Agent Memory Policy

The policy is installed automatically. On first bootstrap (first MCP tool call after setup), Memory Engine writes a workflow policy block directly into the project's CLAUDE.md (Claude Code) and generates .memory-engine/generated/AGENT_MEMORY_POLICY.md. No manual step required.

The policy block contains explicit call/skip decision rules for both tools:

retrieve_agent_context — call before editing production code, tests, CI, dependencies, or config; before debugging; when touching ≥ 2 files or any unfamiliar subsystem. Skip for pure explanations, single-line typo fixes, or when already called for the same task this session.
reflect_and_write — call after tests_passed or build_success on ≥ 2 changed files or a non-trivial architectural decision. Skip if tests failed, task was reverted, only one trivial file changed, or no validation ran.

For manual control or Cursor adapter installation:

# Re-generate policy (idempotent, preserves user content outside markers)
memory policy generate --project-root .

# Install or update client adapters explicitly
memory policy install --project-root . --client claude-code
memory policy install --project-root . --client cursor

# Check adapter installation status
memory policy status --project-root .

See docs/agent_memory_policy.md.

Local storage

your-project/.memory-engine/
├── config.yaml              ← edit to customize; never overwritten
├── project_state.json       ← bootstrap status, revisions
├── memory.db                ← all data (memories, knowledge, candidates)
├── indexes/manifests/       ← incremental indexing file manifest
├── generated/
│   └── AGENT_MEMORY_POLICY.md
├── bootstrap/bootstrap_report.json
├── constraints.md           ← human-authored; safe to commit
├── team-rules.md            ← human-authored; safe to commit
└── decisions.md             ← human-authored; safe to commit

Add .memory-engine/ to .gitignore (generated hint on first bootstrap). The three human-authored .md files may optionally be committed.

Reset: rm -rf your-project/.memory-engine/

Project context seeding (Phase 12)

On a brand-new project the memory database is empty, so retrieve_agent_context returns nothing useful for the first several sessions. Seeding eliminates this cold-start problem by writing initial memory nodes on day one.

Option A — MCP tool `seed_project_context` (agent-driven)

Call once when memory_status shows active_memories == 0:

{
  "description": "Task scheduling system with retry and lifecycle management.",
  "constraints": [
    "Terminal states (COMPLETED, FAILED) must never be exited.",
    "Concurrent execution of the same task is forbidden."
  ],
  "decisions": [
    "SQLite over PostgreSQL — zero-infrastructure local deployment.",
    "Event sourcing for audit trail."
  ],
  "tech_stack": ["Python", "FastAPI", "SQLite"],
  "conventions": ["PRs require one reviewer.", "No force-push to main."]
}

All fields are optional. When description is omitted, README.md is auto-scanned for constraint/decision headings and bullet lists (no LLM). Nodes are written directly as active with confidence=1.0 — authoritative human input bypasses the candidate/promotion pipeline.

Option B — CLI wizard `memory seed` (interactive)

memory seed my-project --project-root /path/to/project

Runs a step-by-step prompt for description, tech stack, constraints, decisions, and conventions, then writes nodes immediately.

Phase 13 — Local Persistent Vector Retrieval

Lexical FTS5 retrieval misses cross-wording matches: a query for "DB schema upgrade" will not rank a chunk that says "database migration". Phase 13 adds a persistent local semantic index so retrieval also matches on meaning, while keeping the default install lightweight.

Query
  ├── FTS5 lexical retrieval        (always active)
  ├── sqlite-vec semantic retrieval (optional, default OFF)
  ├── Branch / revision / lifecycle filters
  ├── RRF candidate fusion
  ├── Deterministic multi-signal ranker
  └── Token-budgeted UnifiedContextPack

Design guarantees

Default uv sync installs nothing extra — no model downloads, no new required deps.
sqlite-vec is the persistent backend: no Docker, no external service.
Sentence-transformers and Ollama are optional local embedding providers. No cloud APIs.
FTS5 lexical fallback is unchanged when semantic is disabled.
Branch / revision / lifecycle safety filters are never bypassed by semantic results.
Content is redacted before embedding (same redact() path as the rest of the engine).

Enable it

Option A — installer prompt (recommended for new installs)

The installer asks whether to enable semantic retrieval at the end:

bash scripts/install.sh
# → "Enable semantic retrieval? [y/N]"
# → y: installs sentence-transformers + sqlite-vec

Then activate per project (writes to .memory-engine/config.yaml, persists across restarts):

memory semantic status --enable --project-root /your/project

Option B — manual

# 1. Install deps
uv pip install 'memory-engine[semantic-transformers]'   # sentence-transformers + sqlite-vec
# or: uv pip install 'memory-engine[semantic-ollama]'   # Ollama + sqlite-vec

# 2. Persist to project config (survives terminal restarts)
memory semantic status --enable --project-root /your/project

Option C — env vars (CI / temporary override)

export MEMORY_ENGINE_SEMANTIC_ENABLED=1
export MEMORY_ENGINE_EMBEDDING_PROVIDER=sentence_transformers
export MEMORY_ENGINE_EMBEDDING_MODEL=BAAI/bge-small-en-v1.5

Env vars always take precedence over config.yaml. When enabled and available, retrieval_mode becomes hybrid; otherwise the engine stays in lexical_structured_fallback and memory_status returns a suggestions field explaining how to activate it.

CLI

memory semantic status  --project-root .   # provider/backend/model + embedded counts
memory semantic doctor  --project-root .   # availability, dimension, orphan checks
memory semantic reindex --project-root .   # incremental embedding (--full to rebuild)
memory semantic clear   --project-root . --confirm   # clears vectors only

Vectors live in .memory-engine/vector.db. clear never touches source knowledge or memory trees. Full details: docs/local_vector_retrieval.md.

Human-authored seed knowledge

Create these files to provide stable project knowledge that cannot be safely inferred from code alone:

.memory-engine/constraints.md

# Project Constraints

## Auth
Do not bypass JWT validation. All routes require Bearer token.

## Database
Never use raw SQL. SQLAlchemy ORM only.

## Scheduler
Terminal task states (COMPLETED, FAILED, CANCELLED) are immutable.

.memory-engine/team-rules.md

# Team Rules

- PRs require 2 approvals before merge
- All public APIs must have OpenAPI documentation
- Log structured JSON only (no print statements in production code)

These files are indexed as high-priority knowledge on bootstrap and returned in context before relevant tasks.

Retrieval modes

Default local mode: `lexical_structured_fallback`

Active when no persistent vector backend is available (default for local use).

Signals used for ranking:

SQLite FTS5 lexical match (BM25)
Module-path overlap with current task files
Symbol overlap with current task symbols
Memory tree proximity
Node importance and confidence
Freshness (recency weighting)
Project-scoped TTL cache

Enhanced mode: `hybrid` (Phase 13)

Active when semantic retrieval is enabled and both sqlite-vec and a local embedding provider are available. Adds real cosine similarity over chunk, paragraph, proposition, and summary embeddings via RRF fusion. The semantic_similarity score in the retrieval trace reflects the actual vector score (no longer always 0.0). See the Phase 13 section above.

Vector retrieval is optional. The default local mode works without sqlite-vec, Qdrant, Docker, or any external service.

Phase 10: multi-granularity retrieval

Active automatically when Phase 10 knowledge is indexed (created by refresh_project_knowledge or the normal ingest path).

Four retrieval layers, created deterministically at write-time (no LLM):

Layer	Unit	Typical use
Proposition	One atomic fact (`shell=False is enforced`)	Bug fixes, security audits
Paragraph	One function or heading section	Feature work, code explanation
Chunk	Fixed-size content block (existing)	General keyword search
Module summary	Whole-file digest with key symbols	Architecture review, onboarding

The layer selected for a query is determined by task_intent:

# in retrieve_agent_context:
task_intent = "bug_fix"        # → propositions first (constraint/security_rule/risk)
task_intent = "architecture_review"  # → module summaries first
task_intent = "feature_implementation"  # → paragraphs + propositions + summaries

Callers can also override directly:

preferred_layers = ["proposition"]          # force proposition-only
proposition_types = ["security_rule"]       # sub-filter by type

Results appear in multigranular_chunks alongside the existing knowledge_chunks.

See docs/architecture/multigranular_memory_architecture.md for the full design.

Privacy and security

Local-only: all data stays in .memory-engine/; nothing leaves your machine
No telemetry: no usage data sent anywhere
No cloud embedding: no external API calls by default
No Docker: not required for any feature
Path boundaries: all file reads restricted to resolved project root
Symlink protection: links escaping project root rejected
Secret redaction: runs before persistence and before MCP output
Default exclusions: .env, secrets/, *.pem, *.key, node_modules/, .git/, binary files, files over 5 MB
No auto Git commits: never — Git is read-only (rev-parse, branch, status, merge-base only)
No remote URL exposure: Git remote URLs are never returned in any output
No Git identity exposure: user name and email are never collected or returned
No destructive Git commands: commit, reset, push, clean, checkout, merge, rebase, fetch are all blocked
No writes outside .memory-engine/: guaranteed

Configuration

Generated at .memory-engine/config.yaml on first bootstrap:

project:
  name: auto
  root_path: auto

runtime:
  auto_bootstrap: true
  auto_recall: true
  auto_reflect: true
  auto_index_on_start: true
  incremental_indexing: true

privacy:
  mode: local
  redact_secrets: true
  allow_network_embedding: false

knowledge:
  include:
    - README.md
    - docs/**
    - src/**
    - app/**
    - lib/**
    - tests/**
  exclude:
    - node_modules/**
    - .git/**
    - .venv/**
    - dist/**
    - build/**
    - .env
    - secrets/**
  max_file_size_mb: 5

retrieval:
  default_token_budget: 6000
  cache_enabled: true
  vector_backend: auto
  allow_degraded_fallback: true
  # Phase 9: branch-aware retrieval
  branch_aware_ranking: true
  prefer_current_branch: true
  include_ancestor_branch_memory: true
  include_mainline_fallback: true

runtime:
  # Phase 9: Git-aware synchronization
  git_aware_sync: true
  check_git_status_on_retrieval: true
  auto_incremental_sync: true

memory:
  # Phase 9: branch-scoped write policy
  branch_scope_on_feature_work: current_branch
  mainline_promotion_requires_confirmation: true

privacy:
  # Phase 9: Git identity protection
  expose_git_remote_url: false
  redact_git_identity: true

User edits are preserved on re-bootstrap.

Demo scenario

Scheduler project. Task: Add exponential retry backoff without breaking terminal task state semantics.

Agent calls retrieve_agent_context:

{
  "constraints": [
    {
      "title": "Terminal State Immutability",
      "summary": "COMPLETED, FAILED, CANCELLED are terminal states. Any operation that transitions out of a terminal state is a critical bug.",
      "importance": 0.95
    }
  ],
  "incidents": [
    {
      "title": "Retry Loop Re-entered Terminal Task",
      "summary": "In v0.8.2, a retry race condition re-entered a COMPLETED task. Root cause: retry check did not verify terminal status before re-queuing.",
      "importance": 0.88
    }
  ],
  "knowledge_chunks": [
    {
      "source_path": "docs/adr/003-retry-policy.md",
      "preview": "Decision: use exponential backoff with jitter. Max 5 retries..."
    }
  ],
  "retrieval_trace": [...],
  "meta": {
    "retrieval_mode": "lexical_structured_fallback",
    "vector_backend": "ephemeral",
    "warnings": ["Semantic vector retrieval is unavailable..."]
  }
}

Agent implements retry logic with terminal-state guard.
Tests pass. Agent calls reflect_and_write:

{
  "outcome": "persisted",
  "candidates_promoted": 2,
  "consolidation_notes": ["Parent 'Scheduler Core' summary updated"]
}

Debug CLI

For maintainers, demos, and troubleshooting only. Not the normal workflow.

memory-engine debug status --project-root /path/to/project
memory-engine debug bootstrap --project-root /path/to/project
memory-engine debug index --project-root /path/to/project
memory-engine debug recall "add retry backoff" --project-root /path/to/project
memory-engine debug inspect <node-id>
memory-engine debug reset-project --project-root /path/to/project

Testing

# Run all tests
pytest -v

# Run focused
pytest tests/test_phase7.py -v
pytest tests/test_phase6.py -v
pytest -k "recall" -v

511 tests currently passing. All deterministic. No external services required. (Semantic retrieval tests use mocked providers or pytest.importorskip("sqlite_vec") — no models are downloaded during the test run.)

Limitations and future work

Persistent local vector backend — current InMemoryVectorIndex does not survive process restarts
Optional Qdrant backend — interface exists; client not installed by default; enables semantic cross-lingual retrieval
PyPI publishing — pip install memory-engine-mcp not yet available
Binary packaging — no binary installer yet
Streamable HTTP remote mode — stdio only; no team-shared HTTP transport yet
Team-shared memory — each project has isolated local storage; no shared team memory yet
Authentication and permissions — no per-user or per-team access control yet
Richer code parsing — chunking is line-range based; AST-aware parsing is future work
Larger repository benchmarks — not yet validated at monorepo scale
Retention scheduling — memory retention run is explicit; no background daemon

Contributing

Read docs/architecture/system-overview.md first.
Preserve service-layer boundaries: MCP/API layers stay thin.
Keep business logic in skills/, services/, knowledge/.
New knowledge source types → knowledge/chunkers.py + new SourceType enum value.
New MCP tools → mcp/tools.py, thin wrapper only; delegate to services.
New domain model → models/domain.py or models/knowledge_domain.py.
Add tests for all new behavior.
Never weaken project-root security boundaries.
Run pytest -v and confirm all tests pass before opening a PR.

Development

# Install dev dependencies (includes pytest, httpx)
uv sync --extra dev

# Run tests
uv run pytest -v

# Start FastAPI service (dev / direct API use — not required for MCP)
uvicorn memory_engine.main:app --reload
# API docs at http://localhost:8000/docs

# Run MCP server directly (stdio, blocks until client disconnects)
uv run memory-engine-mcp --project-root /path/to/project --log-level DEBUG

Name		Name	Last commit message	Last commit date
Latest commit History 59 Commits
.github/workflows		.github/workflows
docs		docs
examples/sample-mcp-configs		examples/sample-mcp-configs
memory_engine		memory_engine
scripts		scripts
tests		tests
.env.example		.env.example
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
README.zh-CN.md		README.zh-CN.md
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Folders and files

Latest commit

History

Repository files navigation

Agent Memory Engine

Why it exists

Core capabilities

Quick Start

Default Deployment Model

MCP Stdio Mode (recommended)

FastAPI / HTTP Mode (optional)

Manual setup (advanced)

Open your target project and start coding

What happens automatically

Architecture

Agent calling chain

Memory lifecycle

Node statuses

Phase 11: Memory Retention

Knowledge lifecycle

Directory structure

MCP tools

MCP resources

Phase 11: Agent Memory Policy

Local storage

Project context seeding (Phase 12)

Option A — MCP tool seed_project_context (agent-driven)

Option B — CLI wizard memory seed (interactive)

Phase 13 — Local Persistent Vector Retrieval

Enable it

CLI

Human-authored seed knowledge

Retrieval modes

Default local mode: lexical_structured_fallback

Enhanced mode: hybrid (Phase 13)

Phase 10: multi-granularity retrieval

Privacy and security

Configuration

Demo scenario

Debug CLI

Testing

Limitations and future work

Contributing

Development

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Option A — MCP tool `seed_project_context` (agent-driven)

Option B — CLI wizard `memory seed` (interactive)

Default local mode: `lexical_structured_fallback`

Enhanced mode: `hybrid` (Phase 13)

Packages