mema

Verifiable seven-layer memory infrastructure for AI agents.

Markdown-vault substrate + bi-temporal facts + epistemic cognitive layer + purpose-bound governance + hybrid retrieval (keyword + IDF + vector + graph + temporal + policy) + SHA-256-hash-chained audit log + verifiable memory assets (UAL + content hash + anchor lifecycle).

Designed for regulated enterprise contexts — Swiss / EU financial services, healthcare, public sector — where audit replay, hard erasure, multi-tenant isolation, jurisdiction-aware governance, and inspectable storage matter as much as benchmark recall.

Status

v2.10.0 — Architecture-complete checkpoint for the v3.0 evidence-package push. RRF is now wired into /v2/recall as an opt-in fusion mode (fusion: "rrf"); cognitive records have approve/reject parity with facts and entities; the LongMemEval harness gains --retrieval-mode {hybrid,bm25,vector,full-context} and --fusion {weighted,rrf} for one-flag ablations; new bench/locomo-harness.ts runs LoCoMo QA; new bench/swiss-trust-bench.ts ships 9 end-to-end trust scenarios (9/9 passing) — the differentiator no other memory system has. v3.0 itself waits on the full benchmark evidence (LongMemEval N=500 with judge, LoCoMo QA full run, Zep/Hindsight apples-to-apples).

231 tests passing across 26 test files (551 expect() assertions)
96.0% Precision@1 on a 25-query retrieval benchmark over a real 347-document corpus (vs 44.0% for the v1 baseline — +52 percentage points). External harness for LongMemEval included in bench/.
Acceptance gate rejected ~27% of LLM-proposed facts for failing source-evidence checks on a real 20-episode smoke run of Ardin's vault (111 drafts → 55 auto-approved / 30 auto-rejected for evidence failure / 26 held for human review). Honest framing: evidence-check failure can mean hallucination, alias/synonym mismatch, or extraction phrasing mismatch — a "hallucination caught" claim needs human-labeled ground truth before it's defensible.
First LongMemEval result (Wu et al., ICLR 2025, retrieval-only, 176 questions): Overall Hit@1=46.0%, Hit@5=84.1%, Hit@10=91.5%; knowledge-update 65.8 / 88.2 / 93.4; temporal-reasoning 30.0 / 86.7 / 96.7; multi-session 32.5 / 72.5 / 80.0. Honest framing: this is session-level retrieval recall, not LongMemEval official answer-correctness — the LLM judge layer is the next milestone.
Draft → approved/rejected lifecycle on L2 facts and entities, with PROPOSE/APPROVE/REJECT audit ops in the hash chain.
Strict policy mode (MEMA_POLICY_MODE=strict) denies missing governance, jurisdiction mismatches, and regulated-cloud routing without human review — Swiss enterprise mode.
Hard-erase audit provenance captures pre-erasure record_id + content/metadata hashes + legal_basis without retaining content.
Atomic writes everywhere in v2 — tmp + fsync + rename via src/v2/atomic.ts. The README invariant now holds.
Graph-influenced ranking — derived_from in-degree, temporal recency, contradiction penalty added to score components.
Ollama embedder — opt-in transformer-quality local embeddings.
MCP v2 surface live — 9 tools for Claude Code / Cursor / any MCP client
Full architecture documented in docs/WHITEPAPER.md

v1 is preserved unchanged at /v1/* endpoints for backwards compatibility. New deployments should use v2.
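
For readers unfamiliar with the opt-in fusion mode named in the status note, the sketch below shows standard Reciprocal Rank Fusion, assuming that is what `fusion: "rrf"` refers to. The function name `rrfFuse`, the record ids, and the constant `k = 60` are illustrative assumptions, not values taken from the mema source.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over rankers of 1 / (k + rank(d)).
// k = 60 is a commonly used default; the value mema uses is not stated in this README.
type Ranking = string[]; // record ids, best match first

function rrfFuse(rankings: Ranking[], k = 60): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first; ties broken by id for a stable order.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Hypothetical per-ranker results for one query.
const keywordHits: Ranking = ["fact-a", "fact-b", "fact-c"];
const vectorHits: Ranking = ["fact-b", "fact-d", "fact-a"];

const fused = rrfFuse([keywordHits, vectorHits]);
console.log(fused.map((hit) => hit.id)); // [ 'fact-b', 'fact-a', 'fact-d', 'fact-c' ]
```

RRF only needs each ranker's ordering, so it avoids calibrating keyword and vector scores against each other, which is the usual reason to offer it next to weighted fusion.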

The seven layers

┌─────────────────────────────────────────────────────────────────┐
│  L7  Asset       (content_hash + metadata_hash + UAL + anchor)  │
├─────────────────────────────────────────────────────────────────┤
│  L6  Audit       (SHA-256 hash-chained log + sealed witness)    │
├─────────────────────────────────────────────────────────────────┤
│  L5  Retrieval   (keyword + IDF + vector + graph + policy)      │
├─────────────────────────────────────────────────────────────────┤
│  L4  Governance  (purpose, retention, provenance, hard-erase)   │
├─────────────────────────────────────────────────────────────────┤
│  L3  Cognitive   (experiences, observations, beliefs, reflect)  │
├─────────────────────────────────────────────────────────────────┤
│  L2  Semantic    (entities + facts + bi-temporal validity)      │
├─────────────────────────────────────────────────────────────────┤
│  L1  Episodic    (raw conversations, documents, tool calls)     │
└─────────────────────────────────────────────────────────────────┘

Each layer is one or more TypeScript files under src/v2/layer{N}-*.ts. The filesystem layout mirrors the architecture: data/episodes/, data/facts/, data/cognitive/, data/v2-entities/, data/_meta/audit.sqlite, data/_meta/vectors.sqlite, data/_meta/anchors.sqlite.

Inspired by Zep (bi-temporal facts), Hindsight (epistemic separation), Mem0 (production memory pipeline), and OriginTrail/DKG (verifiable knowledge assets). Ships without graph-DB substrate, online LLM extraction, or blockchain dependencies. See docs/WHITEPAPER.md for full related-work positioning.

Quick start

git clone https://github.com/machtsinnch/mema && cd mema
bun install

# Start the server (with permissive rate-limit for development)
MACHTSINN_RATE_LIMIT_BURST=10000 ./scripts/start.sh

# Verify
curl http://localhost:3001/health

# Run the full test suite
bun test

# Import a corpus
bun scripts/import-tree.ts /path/to/your/markdown/folders

# Build the vector index (idempotent, one-time per corpus change)
curl -X POST http://localhost:3001/v2/vector/reindex -H "x-api-key: dev-ardin"

# Run the v2 recall benchmark
python3 bench/recall-benchmark-v2.py

Acceptance lifecycle for untrusted producers (v2.7+)

LLM extractors and other untrusted producers do not write directly into the retrieval surface. They propose drafts; an evidence-checked review step promotes them to approved (or marks them rejected).

raw episode ─▶ LLM extractor ─▶ DRAFT fact/entity (status: "draft")
                                        │
                                        ▼
                              evidence-check guard
                                        │
                            ┌───────────┴───────────┐
                            ▼                       ▼
                       APPROVED                  REJECTED
                  (visible in recall)       (kept for audit,
                                             never retrievable)

Pipeline:

# 1) Extract drafts (status="draft", with evidence excerpts)
bun scripts/extract-facts-llm.ts --owner ardin

# 2a) Auto-review high-confidence drafts (>=0.9 + evidence passes)
bun scripts/review-proposals.ts --owner ardin --auto

# 2b) Interactively review the remainder
bun scripts/review-proposals.ts --owner ardin

# 3) Wire + reindex only after drafts have been resolved
bun scripts/wire-entity-graph.ts
curl -X POST http://localhost:3001/v2/vector/reindex -H "x-api-key: dev-ardin"

The evidence-check guard runs server-side on /v2/fact/:id/approve and returns 422 evidence_check_failed when the proposed fact's subject or object strings do not appear (case-insensitive substring) in the source episode body. Pass force: true to override for synonym/alias cases. Every state transition appends an APPROVE or REJECT entry to the hash-chained audit log; verifyChain() includes these in the chain.
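
The guard described above can be sketched as a pure function. This is a minimal illustration, assuming both the subject and the object must appear in the episode body; the type names, the `missing` field, and the function name `checkEvidence` are hypothetical and not taken from the mema source.

```typescript
// Hypothetical shapes for illustration only.
interface DraftFact {
  subject: string;
  object: string;
}

type EvidenceResult =
  | { ok: true }
  | { ok: false; status: 422; error: "evidence_check_failed"; missing: string[] };

// Case-insensitive substring check of subject and object against the episode body.
// Assumption: both strings must be present; `force` skips the check entirely.
function checkEvidence(fact: DraftFact, episodeBody: string, force = false): EvidenceResult {
  if (force) return { ok: true };
  const haystack = episodeBody.toLowerCase();
  const missing = [fact.subject, fact.object].filter(
    (term) => !haystack.includes(term.trim().toLowerCase()),
  );
  if (missing.length === 0) return { ok: true };
  return { ok: false, status: 422, error: "evidence_check_failed", missing };
}

const episode = "Notes on Pillar 3a and the tax optimization strategy for 2026.";
console.log(checkEvidence({ subject: "pillar 3a", object: "Tax Optimization" }, episode));
console.log(checkEvidence({ subject: "Pillar 3a", object: "mortgage" }, episode));
```

A substring test is deliberately strict: an alias such as "3rd pillar" would fail it, which is the synonym/alias case the `force: true` override exists for.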

Records written through /v2/fact and /v2/entity without an explicit status field default to approved — the lifecycle is opt-in and fully backward-compatible with existing vaults.

Architecture invariants (DO NOT BREAK)

Filesystem is the source of truth. SQLite (audit, vectors, anchors) and any future index is derived state, rebuildable from the markdown vault.
All write paths use atomic write (temp + fsync + rename). As of v2.8.0 every v2 layer writer uses atomicWriteFile from src/v2/atomic.ts; no direct writeFileSync remains in the v2 surface.
All read endpoints filter through canRead (v1) or owner !== query.owner → deny (v2). No exceptions.
Uniform 404 for not-found vs not-readable.
Path sanitization on every user-supplied path segment (including inside UALs after URL-decode).
N=3 promotion rule for v1 generalized layer is server-side enforced.
Audit log is append-only with hash chain + external sealed witness.
Untrusted producers write drafts only (v2.7+). LLM-derived facts and entities are gated by the acceptance lifecycle before they enter the retrieval surface.
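
The atomic-write invariant in the list above follows the common temp + fsync + rename pattern. The sketch below illustrates that pattern with Node-compatible `node:fs` calls; it is not the contents of src/v2/atomic.ts, and the name `atomicWriteSketch` and the temp-file naming scheme are assumptions.

```typescript
import { closeSync, fsyncSync, mkdtempSync, openSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { randomUUID } from "node:crypto";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Write to a temp file in the same directory, flush it to disk, then rename over the target.
// A rename within one filesystem replaces the target in a single step, so readers
// see either the old file or the new one, never a partially written file.
function atomicWriteSketch(targetPath: string, contents: string): void {
  const tmpPath = join(dirname(targetPath), `.${randomUUID()}.tmp`);
  const fd = openSync(tmpPath, "w");
  try {
    writeFileSync(fd, contents); // accepts a file descriptor as well as a path
    fsyncSync(fd); // flush file data before the rename makes it visible
  } finally {
    closeSync(fd);
  }
  renameSync(tmpPath, targetPath);
}

// Usage against a throwaway directory.
const demoDir = mkdtempSync(join(tmpdir(), "atomic-demo-"));
const demoPath = join(demoDir, "fact.md");
atomicWriteSketch(demoPath, "status: draft\n");
atomicWriteSketch(demoPath, "status: approved\n");
console.log(readFileSync(demoPath, "utf8"));
```

The temp file lives next to the target on purpose: a rename across filesystems is not atomic.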

HTTP API surface

v2 (recommended)

Method	Endpoint	Layer	Purpose
POST	`/v2/observe`	L1	Ingest a raw episode
POST	`/v2/fact`	L2	Record a semantic fact (bi-temporal). Pass `status: "draft"` + `evidence_excerpt` for untrusted producers
POST	`/v2/fact/:id/invalidate`	L2	Mark a fact invalidated/superseded
POST	`/v2/fact/:id/approve`	L2	Promote a draft fact to `approved` (runs server-side evidence check unless `force:true`)
POST	`/v2/fact/:id/reject`	L2	Reject a draft fact (requires `reason`)
GET	`/v2/facts/drafts`	L2	List all draft facts for the owner (review tools)
GET	`/v2/facts/valid-at?at=...&include_drafts=true`	L2	Facts valid at a given timestamp
POST	`/v2/entity`	L2	Create an entity. Pass `status: "draft"` for untrusted producers
POST	`/v2/entity/:id/approve`	L2	Promote a draft entity to `approved`
POST	`/v2/entity/:id/reject`	L2	Reject a draft entity (requires `reason`)
GET	`/v2/entities/drafts`	L2	List all draft entities for the owner
GET	`/v2/entity/find/:name`	L2	Resolve name/alias to entity
POST	`/v2/entity/:keeperId/merge/:mergedId`	L2	Merge two entities
POST	`/v2/cognitive`	L3	Record an experience/observation/belief
POST	`/v2/reflect`	L3	Run automated reflection
POST	`/v2/governance/build`	L4	Compute a governance block from source
POST	`/v2/erase`	L4	Hard-erase a record (tombstone + audit)
POST	`/v2/recall`	L5	Hybrid retrieval (returns verifiable packets)
POST	`/v2/vector/reindex`	L5	Rebuild vector index
GET	`/v2/graph/derived-from/:id`	L5	Walk supporting records
GET	`/v2/audit/log`	L6	Query the audit log
GET	`/v2/audit/verify`	L6	Verify the hash chain integrity
POST	`/v2/asset/wrap`	L7	Wrap a record as a verifiable asset
POST	`/v2/asset/anchor`	L7	Anchor an asset to a target
GET	`/v2/asset/anchors?ual=...`	L7	List anchors for caller
POST	`/v2/asset/verification-status`	L7	Transition lifecycle state

v1 (legacy, preserved)

Op	Endpoint
WRITE	`POST /v1/remember`
RETRIEVE	`POST /v1/recall`
READ	`GET /v1/memory/:id`
UPDATE	`PUT /v1/memory/:id`
FORGET	`POST /v1/forget` (soft)
PROMOTE	`POST /v1/promote`
LINK	`POST /v1/link`
HEALTH	`GET /v1/topology/health`
STATS	`GET /v1/stats`
AUDIT	`GET /v1/log`

Auth: x-api-key header. Dev keys: dev-ardin / dev-marcel / dev-founder3.
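
As an illustration of the auth convention above, the sketch below builds an authenticated request for `GET /v2/audit/verify` using the standard `URL` and `Request` globals available in Bun. The helper name `buildAuditVerifyRequest` is hypothetical, and the shape of the response body is not documented here, so the example stops before reading it.

```typescript
// Hypothetical helper: builds (but does not send) an authenticated GET request.
function buildAuditVerifyRequest(baseUrl: string, apiKey: string): Request {
  const url = new URL("/v2/audit/verify", baseUrl);
  return new Request(url.toString(), {
    method: "GET",
    headers: { "x-api-key": apiKey },
  });
}

const auditRequest = buildAuditVerifyRequest("http://localhost:3001", "dev-ardin");
console.log(auditRequest.method, auditRequest.url);

// Against a running server, send it with:
//   const response = await fetch(auditRequest);
//   console.log(response.status);
```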

MCP server (Claude Code / Cursor / any MCP client)

Add to ~/.claude.json or ~/.cursor/mcp.json:

{
  "mcpServers": {
    "machtsinn": {
      "command": "bun",
      "args": ["/absolute/path/to/mema/src/mcp.ts"],
      "env": {
        "MACHTSINN_URL": "http://localhost:3001",
        "MACHTSINN_KEY": "dev-ardin",
        "MACHTSINN_ACTOR": "claude-code"
      }
    }
  }
}

v2 tools: memory_v2_observe · memory_v2_fact · memory_v2_recall · memory_v2_reflect · memory_v2_audit_log · memory_v2_audit_verify · memory_v2_erase · memory_v2_asset_wrap · memory_v2_asset_anchor

Graph view

Three ways to see the network of memories:

Obsidian (with layer coloring)

Open data/ as an Obsidian vault, then install the layer-coloring config:

./scripts/install-obsidian-config.sh           # writes data/.obsidian/graph.json
# or, by hand: cp docs/obsidian-graph.example.json data/.obsidian/graph.json

Cmd+G to open the graph. Each layer renders in its own colour:

Layer	Path	Colour
L1 episodes	`episodes/`	cyan `#66cccc`
L2 facts	`facts/`	amber `#ffcc66`
L3 cognitive	`cognitive/`	purple `#cc99ff`
L2 v2 entities	`v2-entities/`	green `#99cc99`
v1 generalized hubs	`generalized/`	gold `#daa520`
v1 user notes	`users/`	white `#ffffff`
v1 entity-scoped	`entities/`	gray `#888888`

The same palette is used by the built-in viewer below.

Built-in `/graph` viewer (zero-dependency)

http://localhost:3001/graph

Loads in any browser. Enter your API key in the form, click Load graph. Canvas force-directed layout with pan / zoom / drag / hover tooltips. Same colour palette as the Obsidian config.

Any external tool via JSON

curl 'http://localhost:3001/v2/graph?limit=2000' -H 'x-api-key: dev-ardin'

Returns {nodes, edges, stats} ready for cytoscape, vis-network, Gephi, D3.

MCP v1 tools (preserved): memory_remember · memory_recall · memory_show · memory_forget · memory_promote · memory_stats · memory_health · memory_log

Verifiable Memory Assets (Layer 7)

Every record can be wrapped as an asset — promoting it from a plain markdown file to a versioned, hash-stamped, UAL-addressable verifiable artifact:

ual: mema://owner/ardin/fact/marcel-r/memory/01KR...
content_hash: sha256:abc...
metadata_hash: sha256:def...
asset_version: 1
verification_status: anchored   # unverified | verified | anchored
anchored_at: 2026-05-15T14:32:11Z
anchor_targets: [local, customer-audit-bundle]

What recall returns

/v2/recall always returns score, score_components, governance, why_retrieved, and excerpt for every hit. Cryptographic asset metadata is opt-in — ual, content_hash, metadata_hash, asset_version, and verification_status are populated only after the record has been wrapped via /v2/asset/wrap:

{
  "kind": "fact",
  "score": 0.86,
  "score_components": { "idf": 0.72, "title": 0.80, "vector": 0.41, ... },
  "why_retrieved": "rare-term keyword match + title match + semantic similarity (0.41)",
  "governance": { "allowed": true, "reason": "policy_pass" },
  "excerpt": "Pillar 3a tax optimization strategy — ...",

  // Present only when the record was wrapped:
  "ual": "mema://owner/ardin/fact/marcel-r/memory/01KR...",
  "content_hash": "sha256:abc...",
  "metadata_hash": "sha256:def...",
  "asset_version": 1,
  "verification_status": "anchored"
}

The system-wide verifiability guarantee lives in the L6 audit log — every recall is hash-chained with prev_hash / curr_hash + external sealed witness. The L7 per-record verifiability is the upgrade for records you want to expose externally with provenance.
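
To make the hash-chain idea concrete, here is a minimal in-memory sketch. Only `prev_hash`, `curr_hash`, and the notion of a contiguous sequence number come from this README; the entry layout, the genesis value, the serialization, the op names used in the demo, and the function names are assumptions, and the external sealed witness is not modeled.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry layout for illustration.
interface AuditEntry {
  seq: number;
  op: string;
  payload: string;
  prev_hash: string;
  curr_hash: string;
}

const GENESIS_HASH = "0".repeat(64); // assumed starting value for the first entry

function hashEntry(seq: number, op: string, payload: string, prevHash: string): string {
  return createHash("sha256").update(JSON.stringify([seq, op, payload, prevHash])).digest("hex");
}

function appendEntry(log: AuditEntry[], op: string, payload: string): AuditEntry[] {
  const last = log[log.length - 1];
  const seq = last ? last.seq + 1 : 1;
  const prev_hash = last ? last.curr_hash : GENESIS_HASH;
  return [...log, { seq, op, payload, prev_hash, curr_hash: hashEntry(seq, op, payload, prev_hash) }];
}

// Recompute every link; any edited, reordered, or deleted mid-stream row breaks the chain.
function verifyChainSketch(log: AuditEntry[]): boolean {
  let expectedPrev = GENESIS_HASH;
  let expectedSeq = 1;
  for (const entry of log) {
    if (entry.seq !== expectedSeq) return false; // gap: a row was removed
    if (entry.prev_hash !== expectedPrev) return false; // link to predecessor broken
    if (entry.curr_hash !== hashEntry(entry.seq, entry.op, entry.payload, entry.prev_hash)) return false;
    expectedPrev = entry.curr_hash;
    expectedSeq += 1;
  }
  return true;
}

let auditLog: AuditEntry[] = [];
auditLog = appendEntry(auditLog, "PROPOSE", "fact:demo-1");
auditLog = appendEntry(auditLog, "APPROVE", "fact:demo-1");
auditLog = appendEntry(auditLog, "RECALL", "query:demo");
console.log(verifyChainSketch(auditLog)); // true

const tamperedLog = auditLog.map((e) => (e.seq === 2 ? { ...e, op: "REJECT" } : e));
console.log(verifyChainSketch(tamperedLog)); // false
```

A chain like this cannot detect removal of the most recent rows on its own, which is why the threat model below pairs it with an external witness.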

A downstream consumer of a wrapped record can independently verify the hit by re-hashing the file at ual and comparing to content_hash. Inspired by OriginTrail's DKG Knowledge Asset model, without the blockchain dependency.
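
A consumer-side check along these lines might look like the sketch below. It assumes `content_hash` is the string `sha256:` followed by the hex digest of the exact content bytes; mema may hash a normalized body or exclude frontmatter, and how a UAL resolves to a file is not covered here. The function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Assumed format: "sha256:" + lowercase hex digest of the content bytes.
function sha256Tag(content: string | Uint8Array): string {
  return "sha256:" + createHash("sha256").update(content).digest("hex");
}

// Compare a freshly computed hash with the content_hash returned by recall.
function verifyContentHash(content: string | Uint8Array, expectedHash: string): boolean {
  return sha256Tag(content) === expectedHash.trim().toLowerCase();
}

// Demo with a well-known test vector rather than a real vault record.
const knownHash = "sha256:ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";
console.log(verifyContentHash("abc", knownHash)); // true
console.log(verifyContentHash("abd", knownHash)); // false
```

In practice the `content` argument would be the bytes of the wrapped record read from disk, for example via `readFileSync(path)`.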

Threat model & adversarial hardening

mema v2 underwent three independent adversarial reviews. Mitigations shipped:

Attack	Mitigation
Audit row deletion (mid-stream)	seq-contiguity check
Audit suffix-drop	`sqlite_sequence` comparison + external witness file
`sqlite_sequence` reset bypass	external sealed witness (`data/_meta/audit-witness.log`) cross-checked at verifyChain time
Audit chain fork via race	`appendAudit` wrapped in `db.transaction()` (BEGIN IMMEDIATE)
Cross-tenant recall leak	`recall()` owner filter is deny-by-default for missing owner
Cross-tenant anchor leak	`listAnchors(owner, ual?)` is owner-scoped
UAL path traversal	`SAFE_SEGMENT` regex `/^[A-Za-z0-9_.\-]+$/` after URL-decode
NaN/Inf confidence poisoning	`clampConfidence()` at every write boundary + defensive clamp at read
Disk-fill DoS	2 MB body cap per v2 request (configurable via `MACHTSINN_V2_MAX_BODY_BYTES`)
Silent retrieval failure (rg missing)	`ripgrepAcross` checks exit code, throws on missing binary
Vector cross-embedder pollution	`vectorSearch` filters by embedder name

Full details in docs/WHITEPAPER.md §4.4–4.5.
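
Two of the mitigations in the table above are small enough to sketch. The regex is copied from the table; everything else is an assumption for illustration: the function names, the explicit rejection of `.` and `..` (both would otherwise match the regex), and the choice to map NaN and infinite confidences to 0.

```typescript
const SAFE_SEGMENT = /^[A-Za-z0-9_.\-]+$/;

// URL-decode each path segment first, then validate the decoded form.
// Returns the decoded segments, or null if any segment is unsafe.
function sanitizeUalSegments(rawPath: string): string[] | null {
  const segments: string[] = [];
  for (const raw of rawPath.split("/")) {
    let decoded: string;
    try {
      decoded = decodeURIComponent(raw);
    } catch {
      return null; // malformed percent-encoding
    }
    if (!SAFE_SEGMENT.test(decoded)) return null;
    if (decoded === "." || decoded === "..") return null; // extra guard (assumption)
    segments.push(decoded);
  }
  return segments;
}

// Coerce untrusted confidence values into [0, 1]; non-finite or non-numeric input
// falls back to a fixed value (assumed to be 0 here).
function clampConfidenceSketch(value: unknown, fallback = 0): number {
  if (typeof value !== "number" || !Number.isFinite(value)) return fallback;
  return Math.min(1, Math.max(0, value));
}

console.log(sanitizeUalSegments("owner/ardin/fact")); // [ 'owner', 'ardin', 'fact' ]
console.log(sanitizeUalSegments("owner/%2e%2e/secret")); // null
console.log(clampConfidenceSketch(Number.NaN), clampConfidenceSketch(1.7)); // 0 1
```

Validating after decoding matters: `%2e%2e` and `a%2Fb` look harmless as raw text and only reveal `..` and `a/b` once decoded.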

Test coverage

v1 isolation + security:          38 tests (5 files)
v2 six-layer smoke (end-to-end):   3 tests
v2 professional:                  18 tests
v2 verifiable assets:             12 tests
v2 security-hardening round 1:    12 tests
v2 security-hardening round 2:    14 tests
─────────────────────────────────────────────
Total:                            97 tests, all green

bun test runs them all in ~300 ms.

Stack

Bun (>= 1.1.0) + TypeScript
Hono for HTTP
bun:sqlite (audit, vectors, anchors stores)
@modelcontextprotocol/sdk for MCP server
ripgrep for keyword search (system dependency)
gray-matter + js-yaml for frontmatter
ulid for IDs

No graph DB, no vector DB extension, no blockchain. Optional: OPENAI_API_KEY enables the OpenAIEmbedder for semantic retrieval (auto-detected; falls back to LocalHashEmbedder when absent).

License

Business Source License 1.1, converting to Apache 2.0 on 2030-05-15. Non-production use (evaluation, academic research, internal development) is free. Production use requires a commercial license — contact the Licensor.

Versions v2.0.0 through v2.8.0 remain MIT-licensed at their git tags on this repo. See LICENSE, NOTICE-LICENSE-HISTORY.md, and LICENSE-MIT-PRE-V2.9.md for the full history.
