infektyd/minni

Minni

Local-first memory and governance layer for AI agents.

Identity loads whole. Knowledge loads chunked.

Minni gives long-running agent work a durable spine: identity, working state, retrieval, evidence, handoffs, learning proposals, and audit trails that stay inspectable on the host machine. It sits between chat-history-as-memory and pure RAG: agents resume with typed state, verified evidence, open loops, and a clear next action instead of rediscovering context from scratch.

Note: This project is pre-v1. Core subsystems work and are tested, but integration depth varies across components. The status table below shows what's solid, what's early, and what's stubbed.


Highlights

| Feature | What it does |
| --- | --- |
| ♻️ Session rehydration | Resume with verified facts, remembered-but-unverified state, open loops, and a first verification action |
| 🧩 Agent-agnostic MCP plugin | One protocol-standard MCP server that works with any agent that speaks MCP. Ships convenience manifests for Codex, Claude Code, Gemini, and KiloCode |
| 🔒 Proposal-first learning | No silent writes: learn requests stage candidates; only operator-gated resolution writes durable memory |
| 🔍 Hybrid retrieval | FTS5 + FAISS + reranking, query expansion, HyDE, token budgets, and centralized read gates |
| 🍎 Native AFM support | Apple Foundation Models through a local JSON helper, with bridge fallback and opt-out modes |
| 📓 Obsidian vaults | Human-readable wiki pages, logs, raw material, inbox/outbox handoffs per agent |
| 🤝 Cross-agent contracts | Vault-backed ping contracts with explicit approve/deny; no agent reads another's private memory directly |
| 🛡️ Local-first governance | Stamped identity, read policy, audit trails, threat model, and memory hygiene contracts |
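
The read-policy idea in the table above (no agent reads another agent's private memory directly) can be illustrated with a small path check. This is a minimal sketch, not Minni's implementation: the `Principal` class, the `"read"` capability name, and the `/vaults/...` paths are all hypothetical stand-ins for the stamped identity the README describes.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Principal:
    """Hypothetical stand-in for a stamped identity (not Minni's EffectivePrincipal)."""
    agent_id: str
    vault_roots: tuple[Path, ...]
    capabilities: frozenset[str]


def can_read(principal: Principal, target: Path) -> bool:
    """Allow a read only with the capability and inside one of the principal's vault roots."""
    if "read" not in principal.capabilities:
        return False
    resolved = target.resolve()  # normalizes ".." so traversal cannot escape a root
    return any(resolved.is_relative_to(root.resolve()) for root in principal.vault_roots)


codex = Principal("codex", (Path("/vaults/codex"),), frozenset({"read"}))
print(can_read(codex, Path("/vaults/codex/wiki/page.md")))    # True
print(can_read(codex, Path("/vaults/claude/wiki/page.md")))   # False
print(can_read(codex, Path("/vaults/codex/../claude/x.md")))  # False
```

Routing every read through one function like this is what makes the gate "centralized": retrieval code never decides access on its own.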

Project status

Minni is in active development toward a v1 release. Components are at different levels of maturity. This table reflects the honest state of each subsystem — inspired by the OpenTelemetry component stability model.

| Component | Status | Notes |
| --- | --- | --- |
| SQLite runtime + migrations | stable | 333 engine tests, WAL mode, additive migrations tracked by PRAGMA user_version |
| Hybrid retrieval (FTS5 + FAISS) | beta | Core pipeline works (FTS5 → FAISS → RRF → rerank); needs comparative eval against baselines |
| MCP plugin server | beta | 26 tools, 121 tests; agent-agnostic: any MCP-compatible agent can connect |
| Proposal-first learning | beta | Stage → list → resolve pipeline works; operator-gated writes enforced |
| Vault model + wiki indexer | beta | Structure stable; WikiIndexer → VaultIndexer → SQLite/FAISS pipeline works |
| Identity + read policy | beta | EffectivePrincipal stamps identity, vault roots, capabilities; centralized read gate |
| Handoff (inbox/outbox) | beta | Vault-backed handoff pages; ack and await flows work |
| Cross-agent ping contracts | alpha | Protocol works (request → inbox → decide → status); limited real-world testing |
| AFM provider (native/bridge) | alpha | macOS-only; bridge is default; native requires Foundation Models framework |
| Compile passes (AFM) | alpha | 5 passes (session, synthesis, procedure, reorg, pruning); dry-run only by default |
| Team coordination | alpha | 3 tools registered (runtime, evidence, promotion); multi-agent scenarios untested |
| Memory decay + scoring | alpha | Exponential decay with access-based reinforcement; needs tuning |
| Qdrant / Lance backends | stub | Non-functional placeholders; FAISS is the only active vector backend |
| Comparative eval vs baselines | not started | Eval harness exists; no head-to-head comparison against RAG or wiki-only yet |

What the levels mean: Stable — tested, relied upon, breaking changes require migration. Beta — works and is tested, but API or behavior may shift before v1. Alpha — functional but early; expect rough edges and limited real-world validation. Stub — interface exists, implementation is placeholder only.
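
The "Memory decay + scoring" row describes exponential decay with access-based reinforcement. As a rough illustration of that idea only, the sketch below halves a score every half-life and bumps it on access. The function names, the 30-day half-life, and the additive boost are assumptions chosen for the example; the README states that the real parameters still need tuning.

```python
import math


def decayed_score(base: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: the score halves every `half_life_days` (assumed value)."""
    return base * math.exp(-math.log(2) * age_days / half_life_days)


def reinforce(score: float, boost: float = 0.1, cap: float = 1.0) -> float:
    """Access-based reinforcement: each access nudges the score up, bounded by `cap`."""
    return min(cap, score + boost)


score = decayed_score(1.0, age_days=30.0)  # one half-life has elapsed
print(round(score, 3))                     # 0.5
print(round(reinforce(score), 3))          # 0.6
```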


Why this exists

Long-running AI work usually falls into one of three brittle patterns:

| Pattern | Limitation |
| --- | --- |
| Chat history as memory | Opaque, bloated, and hard to audit over time |
| RAG over files | Useful for lookup, but rediscovers context instead of preserving working state |
| Markdown/wiki notes | Human-readable, but weak at provenance, contradiction handling, and session rehydration |

Minni combines the strengths: structured runtime state in SQLite, human-readable vault pages for review, and typed evidence with explicit learning gates. If a simpler model achieves the same recovery quality, the right move is to delete complexity.

The important boundary: SQLite is runtime truth. Vault pages, graph exports, FAISS files, context packs, and compile drafts are derived or review surfaces.
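
Because SQLite holds the runtime truth, schema changes need to be safe to re-run. The status table says migrations are additive and tracked by `PRAGMA user_version`; a minimal sketch of that pattern with Python's standard `sqlite3` module might look like the following. The `memories` table and the two migration statements are hypothetical and are not taken from `engine/db.py`.

```python
import sqlite3

# Hypothetical additive migrations; entry i upgrades the schema from version i to i + 1.
MIGRATIONS = [
    "CREATE TABLE IF NOT EXISTS memories (id INTEGER PRIMARY KEY, body TEXT NOT NULL)",
    "ALTER TABLE memories ADD COLUMN source TEXT",
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply any pending migrations and return the resulting schema version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version in range(current, len(MIGRATIONS)):
        conn.execute(MIGRATIONS[version])
        # PRAGMA does not accept bound parameters; the value here is a trusted int.
        conn.execute(f"PRAGMA user_version = {version + 1}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
print(migrate(conn))  # 2
print(migrate(conn))  # 2 (the second run finds nothing pending)
```

Migrations that only add tables or columns never invalidate existing rows, which is why the version counter alone is enough to decide what still needs to run.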


How it works

Memory is layered state, not one flat blob:

| Layer | Loading rule | Purpose |
| --- | --- | --- |
| Identity | Load whole | Agent identity, role, constraints, standing operating rules |
| Standing principles | Load whole / pinned | Durable rules that guide behavior across sessions |
| Project state | Compact packet | Active branch, status, blockers, recent decisions, next checks |
| Evidence | Retrieve by need | Source-backed facts, artifacts, logs, traces, citations |
| Knowledge | Retrieve chunked | Larger wiki/docs/history; cited and validated, never assumed in context |

A resumed session doesn't just retrieve documents. It produces:

Verified now:            — facts checked against current artifacts
Remembered (unverified): — plausible memory needing confirmation
Open loops:              — tasks left incomplete
First verification:      — the next concrete check before acting
Do-not-claim:            — stale, contradicted, or unsupported claims

The goal is the smallest packet that lets an agent resume safely.
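
One way to picture the resume packet is as a small typed structure with one field per section listed above. The sketch below is illustrative only: the `RehydrationPacket` class, its field names, and the sample entries are hypothetical and do not describe Minni's actual packet schema.

```python
from dataclasses import dataclass, field


@dataclass
class RehydrationPacket:
    """Hypothetical container mirroring the five sections of a resumed session."""
    verified_now: list[str] = field(default_factory=list)
    remembered_unverified: list[str] = field(default_factory=list)
    open_loops: list[str] = field(default_factory=list)
    first_verification: str = ""
    do_not_claim: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the packet as plain text, one titled section per field."""
        first = [self.first_verification] if self.first_verification else []
        sections = [
            ("Verified now", self.verified_now),
            ("Remembered (unverified)", self.remembered_unverified),
            ("Open loops", self.open_loops),
            ("First verification", first),
            ("Do-not-claim", self.do_not_claim),
        ]
        lines: list[str] = []
        for title, items in sections:
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in items)
        return "\n".join(lines)


# Placeholder content, for illustration only.
packet = RehydrationPacket(
    verified_now=["tests pass on the current branch"],
    open_loops=["finish handoff page"],
    first_verification="re-run the engine test suite",
)
print(packet.render())
```

Keeping verified and unverified items in separate fields makes it harder for a resuming agent to present remembered state as checked fact.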


Getting started

Prerequisites

  • Python 3.11+ with pip
  • Node.js 18+ with npm

Install and run

# Engine (Python)
cd engine
python3 -m pip install -r requirements.txt

# Start the daemon (leave it running)
python3 sovrd.py --socket ~/.sovereign-memory/run/sovrd.sock

# Plugin (TypeScript): run from the repo root, in a second terminal
cd plugins/sovereign-memory
npm install
npm test
npm run console        # local console UI

Verify it works

# From another terminal
cd engine
python3 sovrd_client.py --socket ~/.sovereign-memory/run/sovrd.sock status
python3 sovrd_client.py --socket ~/.sovereign-memory/run/sovrd.sock search "memory handoff"

Tip: For reproducible NumPy/FAISS, use a clean venv from the repo root:

python3.12 -m venv .venv && source .venv/bin/activate
pip install --upgrade pip && pip install -r engine/requirements.txt

Architecture

flowchart TD
    Agents["🔌 Agent surfaces\nCodex · Claude Code · Gemini\nKiloCode · OpenClaw · Console"]
    Plugin["📡 MCP plugin layer\ntools · hooks · console API"]
    Daemon["⚙️ Sovereign daemon\nsovrd — JSON-RPC"]
    Identity["🪪 Identity & policy\nprincipal · read gates · audit"]
    Retrieval["🔍 Hybrid retrieval\nFTS5 + FAISS + rerank + HyDE"]
    Learning["📝 Learning pipeline\npropose → list → resolve"]
    Storage[("💾 SQLite + FAISS\nruntime truth")]
    Vaults["📓 Obsidian vaults\nwiki · raw · logs · schema\ninbox · outbox"]
    Indexer["📥 Indexer\nvault files → SQLite + FAISS"]
    Compile["🔄 Compile passes\ndry-run → reviewable drafts"]

    Agents --> Plugin
    Plugin --> Daemon
    Daemon --> Identity
    Daemon --> Retrieval
    Daemon --> Learning
    Identity --> Retrieval
    Retrieval --> Storage
    Learning --> Storage
    Daemon --> Compile
    Compile -.-> Vaults
    Vaults --> Indexer
    Indexer --> Storage

    style Agents fill:#E6F1FB,stroke:#85B7EB,color:#042C53
    style Plugin fill:#E1F5EE,stroke:#5DCAA5,color:#04342C
    style Daemon fill:#EEEDFE,stroke:#AFA9EC,color:#26215C
    style Identity fill:#EEEDFE,stroke:#AFA9EC,color:#26215C
    style Retrieval fill:#EEEDFE,stroke:#AFA9EC,color:#26215C
    style Learning fill:#EEEDFE,stroke:#AFA9EC,color:#26215C
    style Storage fill:#FAEEDA,stroke:#EF9F27,color:#412402
    style Vaults fill:#FAEEDA,stroke:#EF9F27,color:#412402
    style Indexer fill:#FAEEDA,stroke:#EF9F27,color:#412402
    style Compile fill:#FAECE7,stroke:#F0997B,color:#4A1B0C

View the detailed architecture diagram → (D2 source · PNG version · dark mode SVG)


Plugin surfaces

The plugin is agent-agnostic — it implements the Model Context Protocol (MCP) standard, so any agent or tool that speaks MCP can connect. The convenience manifests below are thin wrappers that register the same MCP server with specific agent runtimes:

| Integration | Manifest | Notes |
| --- | --- | --- |
| Any MCP client | .mcp.json | Direct registration; the canonical entry point |
| Codex | .codex-plugin/ | Codex-specific manifest |
| Claude Code | .claude-plugin/ + hooks.json | Includes lifecycle hooks |
| Gemini | .gemini-plugin/ | Gemini extension format |
| KiloCode | .kilocode-plugin/ | KiloCode manifest + hooks |

Exposed tools (26 tools)

| Tool | Purpose |
| --- | --- |
| sovereign_status | Daemon health and state summary |
| sovereign_recall | Query hybrid retrieval |
| sovereign_drill | Deep-drill into a specific memory |
| sovereign_prepare_task | Build a task context packet |
| sovereign_prepare_outcome | Build an outcome context packet |
| sovereign_route | Route a request to the right handler |
| sovereign_export_pack | Export a portable context pack |
| sovereign_learning_quality | Assess learning candidate quality |
| sovereign_learn | Stage a candidate learning proposal |
| sovereign_resolve_candidate | Approve or reject a staged candidate |
| sovereign_vault_write | Write to vault wiki pages |
| sovereign_audit_report | Generate an audit report |
| sovereign_audit_tail | Tail the audit log |
| sovereign_compile_vault | Run compile passes on vault content |
| sovereign_negotiate_handoff | Initiate a cross-agent handoff |
| sovereign_ack_handoff | Acknowledge a received handoff |
| sovereign_list_pending_handoffs | List pending handoff deliveries |
| sovereign_await_handoff | Wait for a handoff to complete |
| sovereign_ping_agent_request | Create a ping contract for another agent |
| sovereign_ping_agent_inbox | Check incoming ping requests |
| sovereign_ping_agent_decide | Approve or deny a ping request |
| sovereign_ping_agent_status | Check ping contract status |
| sovereign_subscribe_contradictions | Subscribe to contradiction alerts |
| sovereign_team_runtime | Team runtime coordination |
| sovereign_team_evidence | Share evidence across team agents |
| sovereign_team_promotion | Promote team profile data |

Key behavior: Automatic behavior is recall-only. Durable learning follows a proposal-first path — learn requests stage candidates, and only operator-gated resolution writes permanent memory. Cross-agent info sharing requires explicit vault-backed ping contracts with approve/deny.
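
The proposal-first path can be pictured as a store with two separate areas, where only an explicit decision moves an item from staged to durable. The sketch below is an in-memory illustration under that assumption; `LearningStore` and its methods are hypothetical and do not reflect the signatures of `sovereign_learn` or `sovereign_resolve_candidate`.

```python
from dataclasses import dataclass, field


@dataclass
class LearningStore:
    """In-memory illustration: learn() only stages; resolve() is the sole durable write."""
    durable: list[str] = field(default_factory=list)
    staged: dict[int, str] = field(default_factory=dict)
    _next_id: int = 1

    def learn(self, text: str) -> int:
        """Stage a candidate and return its id; nothing durable is written here."""
        candidate_id = self._next_id
        self._next_id += 1
        self.staged[candidate_id] = text
        return candidate_id

    def list_candidates(self) -> dict[int, str]:
        """Return a copy of the staged candidates for operator review."""
        return dict(self.staged)

    def resolve(self, candidate_id: int, approve: bool) -> bool:
        """Operator decision: approval promotes the candidate, rejection discards it."""
        text = self.staged.pop(candidate_id)  # raises KeyError for unknown ids
        if approve:
            self.durable.append(text)
        return approve


store = LearningStore()
cid = store.learn("Daemon socket lives under ~/.sovereign-memory/run")
print(store.durable)  # []
store.resolve(cid, approve=True)
print(store.durable)  # ['Daemon socket lives under ~/.sovereign-memory/run']
```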


Evaluation

The project should be judged by recovery quality, not by how elaborate the memory machinery looks.

Baselines to compare against:

  1. No memory — only the new prompt
  2. Raw chat summary
  3. Plain RAG over repo/docs
  4. Wiki-only filesystem memory
  5. Minni rehydration with typed state and open loops

Metrics that matter: correct next action after restart, unsupported claims made during restart, evidence coverage, token cost of rehydration, time to resume useful work, and contradiction handling.

If the wiki-only or plain-RAG baseline matches Minni on these metrics, the right engineering answer is to delete complexity.


Repository map

| Directory | Contents |
| --- | --- |
| engine/ | Python daemon, retrieval, migrations, compile passes, eval harness |
| plugins/sovereign-memory/ | Agent-agnostic MCP plugin with convenience manifests |
| openclaw-extension/ | OpenClaw bridge and import tooling |
| docs/contracts/ | Policy, threat model, page types, capabilities, workflow contracts |
| docs/plans/execution/ | Rollout PR specs and resume ledger |
| eval/ | Recall fixtures and generated evaluation reports |

Core engine files:

| File | Role |
| --- | --- |
| engine/sovrd.py | Local JSON-RPC daemon |
| engine/sovereign_memory.py | CLI for indexing, stats, hygiene, vector status, compile dry-runs |
| engine/db.py | Schema creation and additive migrations (PRAGMA user_version) |
| engine/principal.py | Runtime identity, vault roots, capabilities, read authorization |
| engine/retrieval.py | FTS5 + semantic vectors, reranking, feedback, query expansion, HyDE, token budgets, read gate |
| engine/afm_passes/ | Review-only self-organization passes (default dry-run) |
| engine/afm_provider.py | Normalized AFM contracts for query expansion, neighborhood summary, HyDE |
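
The status table describes the retrieval pipeline as FTS5 → FAISS → RRF → rerank. For readers unfamiliar with the RRF step, here is a minimal sketch of reciprocal rank fusion over two ranked lists. It is a generic illustration, not the code in `engine/retrieval.py`; the constant `k = 60` is the value commonly used in the literature, and the document ids are placeholders.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each id scores sum(1 / (k + rank)) over the lists it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties break on the id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = ["a", "b", "c"]  # stands in for a keyword (FTS5-style) ranking
vector_hits = ["b", "c", "d"]   # stands in for a vector (FAISS-style) ranking
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))  # ['b', 'c', 'a', 'd']
```

Documents that appear in both lists rise to the top even when neither list ranks them first, which is the reason to fuse before reranking.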

See also: docs/CANONICAL-PATHS.md (path layout), docs/TROUBLESHOOTING.md (daemon/socket/protocol fixes), docs/ENGINEERING-REVIEW.md (abstraction review), docs/OBSERVED-USAGE.md (usage signals).


AFM provider modes

AFM (Apple Foundation Models) calls are optional and local-only.

| Mode | Behavior |
| --- | --- |
| off | Skip AFM calls, use deterministic fallback |
| bridge | Use localhost OpenAI-compatible bridge (default) |
| native | Use local JSON helper via Foundation Models framework |
| auto | Prefer native when available, fall back to bridge |

Configure with SOVEREIGN_AFM_PROVIDER_MODE or the per-call afmProviderMode option.

# Run AFM tests
cd plugins/sovereign-memory
SOVEREIGN_AFM_PROVIDER_MODE=auto npm test -- tests/afm.test.mjs

# Run compile passes as review-only dry-runs (from the repo root)
cd engine
SOVEREIGN_AFM_LOOP=on python3 -m sovereign_memory compile --pass session_distillation --dry-run
SOVEREIGN_AFM_LOOP=on python3 -m sovereign_memory compile --pass synthesis --dry-run

Native provider metadata is sanitized before reaching status reports or model packets. Adapter configuration is reported as adapter_configured / adapterConfigured (boolean only) — private adapter paths are never emitted.


Vault model

Each agent can have its own Obsidian vault while sharing the same daemon and database. The vault is the readable memory surface:

vault/
  index.md
  log.md
  logs/
  raw/
  wiki/
  wiki/handoffs/
  inbox/
  outbox/
  schema/

Use short, sourced wiki pages with frontmatter for durable knowledge. Raw session material and private logs should stay local and out of public git unless explicitly sanitized.
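
For anyone setting up a vault by hand, the layout above can be scaffolded with a few lines of `pathlib`. This is a convenience sketch, not a tool shipped with Minni; `scaffold_vault` is a hypothetical helper, and the example writes into a temporary directory.

```python
import tempfile
from pathlib import Path

# Directory and file names taken from the layout shown above.
VAULT_DIRS = ["logs", "raw", "wiki", "wiki/handoffs", "inbox", "outbox", "schema"]
VAULT_FILES = ["index.md", "log.md"]


def scaffold_vault(root: Path) -> Path:
    """Create the vault layout under `root` without overwriting existing file contents."""
    for rel in VAULT_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    for name in VAULT_FILES:
        (root / name).touch(exist_ok=True)
    return root


with tempfile.TemporaryDirectory() as tmp:
    vault = scaffold_vault(Path(tmp) / "vault")
    print(sorted(p.name for p in vault.iterdir()))
    # ['inbox', 'index.md', 'log.md', 'logs', 'outbox', 'raw', 'schema', 'wiki']
```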


Local-first security

Minni is local-first only when four assumptions hold on the host:

  1. macOS user account is the security perimeter for a single-user box
  2. FileVault is enabled — database and vault are encrypted at rest
  3. No cloud sync — vault and sovereign_memory.db (with -wal/-shm sidecars) are not under iCloud, Dropbox, Google Drive, or OneDrive
  4. Local-only transports — no remote JSON-RPC fallback at v1
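
Assumption 3 is easy to violate by accident, so a quick path check can help before pointing the daemon at a database or vault. The sketch below flags paths whose components look like common sync folders. The marker names are assumptions about typical macOS folder names (for example, iCloud Drive under `Mobile Documents` and third-party clients under `CloudStorage`) and should be adjusted for the host; `is_under_cloud_sync` is a hypothetical helper, not part of Minni.

```python
from pathlib import Path

# Assumed folder-name prefixes for common sync clients; adjust for the host.
SYNC_MARKERS = ("Mobile Documents", "CloudStorage", "Dropbox", "Google Drive", "OneDrive")


def is_under_cloud_sync(path: Path) -> bool:
    """Return True if any path component starts with a known sync-folder marker."""
    parts = path.expanduser().parts
    return any(part.startswith(marker) for part in parts for marker in SYNC_MARKERS)


print(is_under_cloud_sync(Path("/Users/me/Library/Mobile Documents/vault")))          # True
print(is_under_cloud_sync(Path("/Users/me/.sovereign-memory/sovereign_memory.db")))  # False
```

A name-based check like this is a heuristic only; it cannot detect sync clients configured with custom folder names.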

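Exclude the database and vault from Time Machine and Spotlight. The commands below set only the Time Machine exclusion attribute; Spotlight exclusions are configured separately (for example, under the Spotlight privacy settings):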

xattr -w com.apple.metadata:com_apple_backup_excludeItem true ~/path/to/sovereign_memory.db
xattr -w com.apple.metadata:com_apple_backup_excludeItem true ~/path/to/codex-vault

A sample launchd agent lives at engine/launchd/com.openclaw.sovrd.plist.example with Umask 077 to keep daemon logs mode 0600.


Verification gate

Before pushing a release candidate:

cd engine && pytest -q                          # expect 333 passed
cd ../plugins/sovereign-memory && npm test      # expect 121 passed
npm run smoke:hook

Also run a temp-state live smoke:

  • Start sovrd.py on a temporary Unix socket
  • Call plugin helpers for status, recall, compile dry-run, and handoff
  • Verify redaction, traceability, and clean SIGTERM shutdown
  • Run migration safety on a SQLite backup — never the live DB

Minni is a local-first project. No telemetry, no cloud sync, no remote endpoints needed.
