Memoria

Episodic memory layer for LLMs — temporal-first, spatial-aware, narrative-preserving.

Where mem0 stores facts, Memoria stores episodes — raw entries kept verbatim, with extracted facts and reflections layered on top, and time/place modeled as first-class citizens.

Architecture (3 layers)

Layer	Table	Purpose
L1 Episodic	`episodic_memory`	Raw entries + timestamp + lat/lon (PostGIS) + embedding. Append-only.
L2 Semantic	`semantic_fact`	Distilled facts (subject/predicate/object). Bi-temporal (`valid_from`/`valid_to` + `superseded_by` chain) — old facts preserved, never overwritten.
L3 Reflection	`reflection`	LLM-generated weekly/monthly summaries with FK back to source episodes.

Retrieval is intent-routed: an LLM router classifies each query into one of temporal_recall, spatial_recall, emotional_recall, pattern_insight, factual_lookup, open — and picks layers + filters accordingly.
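As a rough illustration of the six intent labels listed above, the sketch below normalizes a raw router label onto a known intent. The helper name `normalize_intent` and the fallback to `open` for unrecognized labels are assumptions made for this example, not documented Memoria behavior.

```python
from typing import Literal, get_args

# The six intent labels named in the paragraph above.
Intent = Literal[
    "temporal_recall", "spatial_recall", "emotional_recall",
    "pattern_insight", "factual_lookup", "open",
]
VALID_INTENTS = frozenset(get_args(Intent))


def normalize_intent(raw_label: str) -> str:
    """Map a raw router label onto a known intent.

    Falling back to "open" for unknown labels is an assumption of this sketch.
    """
    label = raw_label.strip().lower()
    return label if label in VALID_INTENTS else "open"
```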

Quickstart (dev)

# 1. Start Postgres (with pgvector + PostGIS)
docker compose up -d
# 2. Configure
cp .env.example .env
# (smart endpoints + MCP need MEMORIA_OPENAI_API_KEY; core endpoints don't)
# 3. Install
pip install -e ".[dev]"          # core only
pip install -e ".[smart,dev]"    # add LLM-driven endpoints
# 4. Migrate
alembic upgrade head
# 5. Run
uvicorn memoria.api.main:app --reload --port 8080
# 6. Mint your first bearer token (UUID can be any uuid4 you choose)
memoria keys create --user 00000000-0000-0000-0000-000000000001
# → prints key_id, scopes, and the plaintext token (shown once — save it)

Every API request (REST or MCP) must carry Authorization: Bearer <token>. The token is hashed before storage; only the plaintext printed by memoria keys create can be used. Lost it? Revoke and mint a new one — memoria keys revoke <key_id>.
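The hash-before-storage idea can be sketched with the standard library. The repo layout mentions sha256 key hashing in auth.py; the token format and the function names `mint_token` and `verify_token` below are hypothetical and only illustrate the principle.

```python
import hashlib
import hmac
import secrets


def mint_token() -> tuple[str, str]:
    """Return (plaintext, stored_hash); only the hash would be persisted."""
    plaintext = secrets.token_urlsafe(32)  # token format is an assumption
    stored_hash = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, stored_hash


def verify_token(presented: str, stored_hash: str) -> bool:
    """Hash the presented bearer token and compare in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

Because only the digest is kept, a lost plaintext cannot be recovered from storage, which is why the flow above is revoke and re-mint.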

API — two modes

Memoria offers core (pure data, no LLM) and smart (LLM-orchestrated) endpoints. Smart endpoints are thin wrappers around core. Pick whichever fits your integration:

Caller wants…	Use
Full control: own LLM, own embedding, own intent routing	core
Drop-in convenience: pass natural language, get answers	smart (requires `pip install 'memoria[smart]'`)

All four endpoints below require Authorization: Bearer <token>. user_id is derived from the authenticated key and is no longer accepted in the request body (T-001).

Core — `POST /v1/raw-memories`

{
  "text": "오늘 서울에서 핫도그를 먹었어",            // "I ate a hot dog in Seoul today"
  "occurred_at": "2026-05-11T18:00:00Z",
  "embedding": [0.012, -0.034, ...],          // caller-supplied, 1536-dim
  "location": { "name": "서울", "lat": 37.566, "lon": 126.978 },   // 서울 = Seoul
  "entities": { "food": ["핫도그"], "activities": ["eat"] },        // 핫도그 = hot dog
  "importance": 4
}
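A caller using the core endpoint could assemble the request roughly as follows. This is a sketch under assumptions: the helper name `build_raw_memory_request` is hypothetical, and which payload fields are optional is not stated above, so only text, occurred_at, embedding, and location are included here. The request is built but not sent.

```python
import json
import urllib.request

EMBEDDING_DIM = 1536  # from the "1536-dim" note in the example payload


def build_raw_memory_request(base_url, token, text, occurred_at, embedding,
                             location=None):
    """Build (but do not send) a POST /v1/raw-memories request."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dims, got {len(embedding)}")
    payload = {"text": text, "occurred_at": occurred_at, "embedding": embedding}
    if location is not None:
        payload["location"] = location
    return urllib.request.Request(
        url=f"{base_url}/v1/raw-memories",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Usage: pass the result to urllib.request.urlopen(...) against a running server.
req = build_raw_memory_request(
    "http://localhost:8080", "<token>", "I ate a hot dog in Seoul today",
    "2026-05-11T18:00:00Z", [0.0] * EMBEDDING_DIM,
    location={"name": "Seoul", "lat": 37.566, "lon": 126.978},
)
```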

Core — `POST /v1/retrieve`

{
  "query_embedding": [...],                   // caller-supplied
  "routed": {
    "intent": "spatial_recall",
    "locations": [{ "name": "인천", "radius_hint": "근처",   // 인천 = Incheon, 근처 = "nearby"
                    "lat": 37.456, "lon": 126.706 }]
  },
  "top_k": 5
}

Smart — `POST /v1/memories`

{ "text": "오늘 서울에서 핫도그를 먹었어" }   // "I ate a hot dog in Seoul today"

→ runs LLM extract → geocode → embed → persist L1 → returns {episode_id, extracted}.

Smart — `POST /v1/query`

{
  "query": "인천 근처에서 밥을 먹었던거 같은데 뭘 먹었지?",   // "I think I ate somewhere near Incheon; what did I eat?"
  "top_k": 5
}

→ router classifies as spatial_recall (anchor=인천 [Incheon], radius=20km) → geocode → PostGIS ST_DWithin on L1 → vector re-rank → returns hits with distance_m. A downstream LLM, prompted with prompts/compose.py, can then ask "혹시 인천이 아니라 서울에서 핫도그 드신 거 말씀이신가요?" ("Do you perhaps mean the hot dog you had in Seoul, not Incheon?")
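The filter-then-re-rank step can be imitated in memory to show the shape of the pipeline. This is only a sketch: the real path uses PostGIS ST_DWithin, whereas the code below swaps in a spherical haversine distance, and the `Episode` class and `spatial_recall` function are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class Episode:
    text: str
    lat: float
    lon: float
    embedding: list[float]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a spherical Earth (radius 6371 km)."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def spatial_recall(episodes, anchor, radius_m, query_embedding, top_k=5):
    """Keep episodes within radius_m of anchor, then rank by cosine similarity."""
    hits = []
    for ep in episodes:
        d = haversine_m(anchor[0], anchor[1], ep.lat, ep.lon)
        if d <= radius_m:
            hits.append({"text": ep.text, "distance_m": d,
                         "score": cosine(query_embedding, ep.embedding)})
    hits.sort(key=lambda h: h["score"], reverse=True)
    return hits[:top_k]
```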

Image attachments (Phase 5 v1 — T-008 / #74)

Image-only v1: upload a binary, get back a mem-asset://<uuid> URI, and attach it to any episode via attachments[]. The retrieval response inlines the resolved MediaRef (mime, bytes, width, height, download_url) for any hit that carries attachments; text-only hits skip the join entirely.

# 1. Upload a binary.
ASSET=$(curl -sf -X POST http://localhost:8080/v1/media-assets \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@./photo.jpg" | jq -r '.uri')   # → mem-asset://<uuid>

# 2. Save an episode that references it (the text means "the hot dog I bought today").
curl -sf -X POST http://localhost:8080/v1/memories \
  -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d "{\"text\":\"오늘 산 핫도그\",\"attachments\":[\"$ASSET\"]}"

# 3. Retrieve — hit.attachments[].download_url is what you GET back.

MEMORIA_MEDIA_BACKEND=local (default) — writes under MEMORIA_MEDIA_LOCAL_DIR (./.media).
MEMORIA_MEDIA_BACKEND=s3 — requires MEMORIA_MEDIA_S3_BUCKET + standard AWS credential chain. Install with pip install 'memoria[media]' for boto3.

VLM caption, visual embedding, video/audio, web/extension UI, and non-owner sharing are explicit follow-ups — see PLAN.md Phase 5 roadmap.

Web onboarding (Phase 1.5 — backend)

Magic-link email auth + a dashboard API mirroring the CLI. The browser frontend itself (Next.js / React SPA) is a separate follow-up; the endpoints below are fully usable from curl/httpx today.

Magic-link sign-in

# Request a sign-in link (email is sent or logged depending on provider)
curl -X POST http://localhost:8080/v1/auth/magic-link/request \
  -H 'Content-Type: application/json' -d '{"email":"you@example.com"}'

# Open the URL from the email (or stdout in dev) — browser is redirected
# to /dashboard with the session cookie set:
curl -i -c cookies.txt 'http://localhost:8080/v1/auth/magic-link/consume?token=...'

Provider:

MEMORIA_EMAIL_PROVIDER=log (default) — prints the email to stdout for dev. The magic link URL is in the body.
MEMORIA_EMAIL_PROVIDER=resend + MEMORIA_RESEND_API_KEY=... — sends through Resend's HTTP API.

Dashboard API (web session)

# After consume, curl --cookie-jar / --cookie keeps the session cookie:
curl -b cookies.txt http://localhost:8080/v1/web/me
curl -b cookies.txt http://localhost:8080/v1/web/keys
curl -b cookies.txt -X POST http://localhost:8080/v1/web/keys \
  -H 'Content-Type: application/json' \
  -d '{"name":"laptop","scopes":["memories:read","memories:write"]}'
# → response includes one-time plaintext token

Data rights

# Export — returns a job row whose download_url is a data: URI
curl -b cookies.txt -X POST http://localhost:8080/v1/me/export

# Account deletion — typed-confirmation gate; soft-deletes immediately,
# hard-purges 30 days later via the cron command below.
curl -b cookies.txt -X POST http://localhost:8080/v1/me/delete-account \
  -H 'Content-Type: application/json' \
  -d '{"confirm_email":"you@example.com"}'
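Since the export job's download_url is described as a data: URI, a client can decode it without a second HTTP round trip. The sketch below assumes nothing about the export's MIME type or contents; `read_export` is a hypothetical helper name.

```python
import urllib.request


def read_export(download_url: str) -> bytes:
    """Decode an export delivered as a data: URI (RFC 2397) into raw bytes."""
    if not download_url.startswith("data:"):
        raise ValueError("expected a data: URI")
    # urllib's built-in data: handler decodes the URI without network access.
    with urllib.request.urlopen(download_url) as resp:
        return resp.read()
```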

Support / cron CLI

# Reverse a soft-delete during the 30-day grace window
memoria account undelete <user_id>

# Hard-purge expired soft-deleted accounts (daily cron)
memoria account purge
memoria account purge --dry-run   # preview only, no commit

Phase 2 — Hook ingest (auto-capture)

POST /v1/raw-memories/from-hook is the auto-capture endpoint every client (Claude Code hook, mobile app, browser extension) POSTs into. Source-agnostic envelope:

{
  "envelope_version": "v1",                      // unknown → 400 + Memoria-Min-Version header
  "source": "claude-code:personal/main",         // tag for the per-key allow/deny filter
  "session_id": "abc-123",                       // (session_id, prompt_hash) → idempotency
  "text": "what's the weather"                   // server redacts sk/AWS/JWT before persisting
  // attachments?: [{filename, mime, bytes}]     // metadata only — never the binary
  // occurred_at?: "2026-05-12T..."              // server defaults to now()
}
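The envelope comment above notes that the server redacts sk/AWS/JWT-shaped secrets before persisting. A minimal sketch of that kind of redaction follows; the exact regexes, minimum lengths, and the `[REDACTED]` placeholder are assumptions, not the server's actual rules.

```python
import re

# Pattern details (minimum lengths, character classes) are assumptions.
_SECRET_PATTERNS = [
    re.compile(r"\bsk-(?:ant-)?[A-Za-z0-9_-]{16,}"),   # sk-... / sk-ant-...
    re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),       # AWS access key IDs
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),  # JWT shape
]


def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every secret-shaped substring with the placeholder."""
    for pattern in _SECRET_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```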

Idempotency: a retry with the same (session_id, prompt_hash) returns {deduped: true, episode_id: ...} pointing at the original row. Clients MAY retry as aggressively as backoff permits.
Source filter: PUT /v1/web/keys/{id}/sources from the dashboard configures source_allowlist / source_denylist. A pattern is either an exact match or a prefix ending in a * wildcard (for example claude-code:*); deny wins on overlap.
Redaction: PLAN-default ship list — sk-..., sk-ant-..., AKIA... / ASIA..., JWT shape. Email / phone masking deferred to per-key opt-in (loses the "who did I talk to" signal too easily).
Rate limit: default 60 req/min per key, counted in memory within a single process.
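The source-filter rule (exact match or trailing-wildcard prefix, deny wins) can be sketched in a few lines. Treating an empty allowlist as "allow everything" is an assumption of this example, and `source_allowed` is a hypothetical name.

```python
def _matches(pattern: str, source: str) -> bool:
    """Exact match, or prefix match when the pattern ends with '*'."""
    if pattern.endswith("*"):
        return source.startswith(pattern[:-1])
    return source == pattern


def source_allowed(source, allowlist=(), denylist=()):
    """Apply deny-wins filtering to a source tag.

    An empty allowlist is treated as "allow everything" (an assumption).
    """
    if any(_matches(p, source) for p in denylist):
        return False
    if not allowlist:
        return True
    return any(_matches(p, source) for p in allowlist)
```

For example, with an allowlist of `claude-code:*` and a denylist of `claude-code:work/*`, the source `claude-code:personal/main` passes while `claude-code:work/api` is rejected.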

Sample hook configs

See docs/hooks/ for Claude Code UserPromptSubmit configs across macOS / Linux / Windows. The Tier 2 installer at scripts/install/ merges the right one for the OS into the user's existing ~/.claude/settings.json and stores the bearer token in the OS credential store (Keychain / Credential Manager / libsecret).

# macOS / Linux (installer reads MEMORIA_TOKEN env var)
curl -fsSL http://localhost:8080/install.sh | MEMORIA_TOKEN=... sh

# Windows
$env:MEMORIA_TOKEN = '<token>'
iwr http://localhost:8080/install.ps1 | iex

Production hardening (script signature verification, atomic settings merge, end-to-end against a running Claude Code instance) is tracked as a follow-up.

Phase 3 — Background workers

L2 semantic-fact extractor

Polls L1 (episodic_memory.l2_processed_at IS NULL), extracts durable subject/predicate/object facts via the LLM, and writes them to semantic_fact with bi-temporal preservation (old contradicted rows keep their data — only valid_to and superseded_by_id are set).

# Run one batch (intended to be cron'd every 60s)
memoria worker l2-poll
memoria worker l2-poll --batch 200          # bigger batch

# Linux: cron entry
* * * * * /usr/local/bin/memoria worker l2-poll >> /var/log/memoria-l2.log

# systemd-timer alternative (preferred for clean restart semantics)
# memoria-l2.service
[Unit]
Description=Memoria L2 extractor

[Service]
ExecStart=/usr/local/bin/memoria worker l2-poll

# memoria-l2.timer
[Timer]
OnCalendar=*:0/1
Unit=memoria-l2.service

mcp:tool-call:* rows are skipped (stamped processed but no facts extracted — those are server-side lookup logs, not user content).
An entry with zero durable facts is still stamped processed (the PLAN "no facts marker" contract).
LLM failures leave the row pending for the next poll.
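The bi-temporal preservation described above (contradicted rows keep their data; only valid_to and superseded_by_id are set) can be modeled in memory as below. This is a sketch, not the worker's code: the `Fact` class and `assert_fact` function are hypothetical, and detecting a contradiction as "same subject and predicate, different object" is an assumption.

```python
import itertools
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

_ids = itertools.count(1)


@dataclass
class Fact:
    subject: str
    predicate: str
    object: str
    valid_from: datetime
    valid_to: Optional[datetime] = None
    superseded_by_id: Optional[int] = None
    id: int = field(default_factory=lambda: next(_ids))


def assert_fact(store, subject, predicate, obj, now):
    """Insert a fact; close out (never overwrite) any contradicted current row."""
    new = Fact(subject, predicate, obj, valid_from=now)
    for old in store:
        same_slot = (old.subject, old.predicate) == (subject, predicate)
        if old.valid_to is None and same_slot and old.object != obj:
            old.valid_to = now             # only these two fields change
            old.superseded_by_id = new.id
    store.append(new)
    return new


store = []
old = assert_fact(store, "user", "lives_in", "Seoul",
                  datetime(2026, 1, 1, tzinfo=timezone.utc))
new = assert_fact(store, "user", "lives_in", "Incheon",
                  datetime(2026, 2, 1, tzinfo=timezone.utc))
```

After the second call, the Seoul row is still present with its original object; it simply carries an end date and a pointer to its successor.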

L3 reflection cron

Pulls L1 entries from a sliding window (weekly = last 7 days, monthly = last 30 days), summarizes via LLM, writes one reflection row per active user with source_episode_ids pointing back. Tool-call rows are filtered from the input and from source_episode_ids.

# Weekly — cron on Mondays 04:00
memoria worker l3-poll --period weekly
0 4 * * 1 /usr/local/bin/memoria worker l3-poll --period weekly

# Monthly — cron on the 1st 04:00
memoria worker l3-poll --period monthly
0 4 1 * * /usr/local/bin/memoria worker l3-poll --period monthly

# Debug / backfill: restrict to one user
memoria worker l3-poll --period weekly --user <uuid>

v1 simplifications: rolling window (not ISO-calendar boundaries), no idempotency dedup (cron schedules the run; running twice creates two rows), embedding column left NULL on the reflection row until a later embed pass lands.
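The rolling-window selection can be sketched as follows. The row shape (dicts with id, source, and occurred_at keys), the inclusive bounds, and the function names are assumptions for illustration; the 7-day and 30-day windows and the tool-call exclusion come from the description above.

```python
from datetime import datetime, timedelta, timezone

WINDOW_DAYS = {"weekly": 7, "monthly": 30}  # rolling, not calendar-aligned
TOOL_CALL_PREFIX = "mcp:tool-call:"


def reflection_window(period: str, now: datetime) -> tuple[datetime, datetime]:
    """Return the (start, end) of the rolling window ending at `now`."""
    return now - timedelta(days=WINDOW_DAYS[period]), now


def select_inputs(rows, period, now):
    """Return ids of in-window rows, excluding MCP tool-call logs."""
    start, end = reflection_window(period, now)
    return [
        r["id"] for r in rows
        if start <= r["occurred_at"] <= end   # inclusive bounds are an assumption
        and not r["source"].startswith(TOOL_CALL_PREFIX)
    ]


now = datetime(2026, 5, 11, 4, 0, tzinfo=timezone.utc)
rows = [
    {"id": 1, "source": "claude-code:personal/main", "occurred_at": now - timedelta(days=2)},
    {"id": 2, "source": "mcp:tool-call:recall_memories", "occurred_at": now - timedelta(days=1)},
    {"id": 3, "source": "claude-code:personal/main", "occurred_at": now - timedelta(days=10)},
]
```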

OAuth 2.0 for spec-compliant MCP clients (T-011)

Some MCP clients (notably claude.ai Custom Connectors, newer Claude Desktop versions) only support OAuth 2.0 — they don't expose a "paste a bearer token" field. Memoria implements Authorization Code + PKCE + Dynamic Client Registration per the MCP authorization spec:

Endpoint	Purpose
`GET /.well-known/oauth-authorization-server`	RFC 8414 metadata — the client discovers the rest from here.
`POST /v1/oauth/register`	Dynamic Client Registration (RFC 7591). Public clients only (PKCE). Returns a fresh `client_id`.
`GET /v1/oauth/authorize`	Browser entry point. Redirects to magic-link sign-in if no session, else to the consent screen.
`POST /v1/oauth/authorize`	Consent submission (approve / deny). Approve mints an authorization code.
`POST /v1/oauth/token`	Code + PKCE verifier → `{access_token, refresh_token, expires_in}`. Also accepts `grant_type=refresh_token`.
`POST /v1/oauth/revoke`	RFC 7009. Always 200, even on unknown token.

The issued access_token is accepted as a Bearer on every protected route — same code path as api-key plaintexts. Lifetimes: 1-hour access, 30-day refresh, 10-minute authorization code.
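For readers unfamiliar with PKCE, the client-side verifier and challenge, and the server-side check at the token endpoint, look roughly like this. The sketch follows the S256 method from RFC 7636; that Memoria uses S256 specifically is an assumption, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import secrets


def make_verifier() -> str:
    """Random code_verifier (43 chars, within RFC 7636's 43-128 range)."""
    return secrets.token_urlsafe(32)


def s256_challenge(verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(verifier: str, stored_challenge: str) -> bool:
    """Server side: recompute the challenge and compare in constant time."""
    return hmac.compare_digest(s256_challenge(verifier), stored_challenge)
```

The client sends the challenge with the authorize request and the verifier with the token request, so an intercepted authorization code is useless without the verifier.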

Connecting claude.ai

claude.ai → Custom Connectors → Add.
URL: your public Memoria URL + /mcp/ (trailing slash). For local testing, expose :8080 via cloudflared tunnel --url http://localhost:8080.
Submit. claude.ai will DCR-register, redirect you through magic-link sign-in, show the Memoria consent screen, and complete the token exchange. The MCP tools (recall_memories, save_memory, list_recent, list_near) appear in the connector panel.

The dashboard /oauth/consent page shows which client is asking for access and which scopes it wants before you approve.

MCP — connecting Claude Code / Cursor / ChatGPT

Memoria exposes a narrow MCP surface at POST /mcp (streamable HTTP transport). Every request carries the same bearer token as the REST endpoints.

Tools

Tool	When to call
`recall_memories(query, top_k=5)`	Natural-language recall. Routes via LLM (temporal / spatial / semantic / pattern / factual) and returns ranked hits. Excludes its own tool-call logs by default to avoid log-of-log noise.
`save_memory(text, occurred_at?)`	Only when the user explicitly says "remember this". Everything else is captured automatically by hooks (Phase 2).
`list_recent(since?, limit=20)`	Chronological listing. Includes MCP tool-call logs, so "어제 누구 생일 물었지?" ("Whose birthday did I ask about yesterday?") finds yesterday's `recall_memories` call.
`list_near(place, radius="근처")`	Spatial listing around a named place (the default radius "근처" means "nearby"). Geocoded server-side; skips the LLM intent classifier.

Tool-call logging

Every successful MCP tool invocation produces one episodic_memory row tagged source="mcp:tool-call:<tool_name>". This is a deterministic side-effect — the server records calls already in flight, not the LLM's discretionary decision to save user prompts (that anti-pattern is what Phase 2 hooks avoid). The logs become queryable like any other memory: list_recent includes them, recall_memories excludes them by default.

Result payloads larger than 4 KB (JSON-encoded) are truncated with a truncated=true marker in entities.
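One way to apply a 4 KB cap with a truncated marker is sketched below. Only the limit and the truncated=true marker in entities come from the text above; the `tool` and `result` keys, the prefix-keeping strategy, and the name `log_entities` are assumptions.

```python
import json

MAX_RESULT_BYTES = 4 * 1024  # the 4 KB limit mentioned above


def log_entities(tool_name: str, result) -> dict:
    """Build the entities dict for a tool-call log row, truncating big results."""
    encoded = json.dumps(result, ensure_ascii=False).encode("utf-8")
    entities = {"tool": tool_name}
    if len(encoded) > MAX_RESULT_BYTES:
        # Keep a byte-bounded prefix; errors="ignore" drops a split UTF-8 char.
        entities["result"] = encoded[:MAX_RESULT_BYTES].decode("utf-8", errors="ignore")
        entities["truncated"] = True
    else:
        entities["result"] = result
    return entities
```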

Claude Code

--transport http is required — without it, claude mcp add treats the URL as a stdio command and the registration silently does nothing useful.

Trailing slash matters. The Starlette mount at /mcp issues a 307 redirect to /mcp/, and the Claude Code MCP client (like most HTTP MCP clients) does not follow POST redirects. Always register the URL with the trailing slash.

claude mcp add --transport http memoria http://localhost:8080/mcp/ \
  --header "Authorization: Bearer <your-token>"

Cursor / ChatGPT custom GPT / other MCP clients

Configure an HTTP MCP server with:

URL: http://localhost:8080/mcp/
Header: Authorization: Bearer <your-token>

Read-side tools (recall_memories, list_recent, list_near) require the key's scopes to include memories:read. save_memory requires memories:write. The CLI defaults to both: memoria keys create --user <uuid> --scopes memories:read,memories:write.

Testing

pytest                        # smoke tests, no DB needed
pytest tests/integration -m e2e   # TODO — requires docker-compose up

Repo layout

src/memoria/
  api/             FastAPI routes
  services/        ingest · retrieval · listings · tool_call_log · llm · embeddings · geocoding
  prompts/         extract · route · compose (system fragments)
  models.py        SQLAlchemy 2.0 — L1/L2/L3 + api_key + geocode_cache
  schemas.py       Pydantic API contracts
  auth.py          Principal + require_principal + sha256 key hashing
  cli.py           `memoria keys create | revoke | pause | list`
  mcp_server.py    FastMCP — recall / save / list_recent / list_near + tool-call logging
migrations/        Alembic — pgvector + PostGIS DDL + api_key + source column
tickets/           file-based ticket tracker (T-001..T-010)

Status

T-001 — auth foundation: done. api_key + Principal + bearer-token routes; user_id removed from request bodies; CLI mints / revokes / pauses keys.
T-002 — MCP server: done. 4 tools at /mcp + deterministic tool-call logging tagged source="mcp:tool-call:<tool>".
T-010 — web onboarding & dashboard backend: done (frontend deferred). Magic-link sign-in, web-session dashboard API, data export, soft-delete + grace + cron purge.
T-003 — hook ingest (server): done. /v1/raw-memories/from-hook with envelope versioning + redaction + idempotency + source filter + rate limit; sample hook configs + installer skeletons in docs/hooks/ and scripts/install/.
T-009 — browser extension (skeleton): done. Manifest V3 WebExtension at extension/ capturing ChatGPT.com / Claude.ai prompts. Manual sideload only — see extension/README.md.
T-004 — L2 semantic-fact extractor: done. memoria worker l2-poll runs the cron-poll worker; bi-temporal supersedence preserved; tool-call rows skipped.
T-005 — L3 reflection cron: done. memoria worker l3-poll --period weekly|monthly; tool-call rows excluded from reflection input + source_episode_ids.
T-006 — importance signal: done. Heuristic over the LLM extractor's entities + emotions replaces the hardcoded 0.5 in _rerank; high-importance hits actually re-rank now.
T-010 frontend: done. Vite + React SPA at web/ with four dashboard pages: keys (mint / revoke / pause / source-allowlist editor), memories (browse + per-row delete), installer (copy-to-clipboard one-liner with token templated in), account (data export download + soft-delete with email confirmation). Build via cd web && npm install && npm run build; FastAPI serves web/dist/ at root.
T-011 OAuth 2.0: done. Authorization Code + PKCE + DCR per the MCP authorization spec. /oauth/authorize, /oauth/token, /oauth/register, /oauth/revoke, plus .well-known/oauth-authorization-server. Issued access tokens authenticate every protected route the same way api-key plaintexts do. Unblocks claude.ai Custom Connectors + spec-compliant MCP clients that won't accept raw bearer tokens.

Next tickets (see tickets/README.md):

T-007 (A2A) / T-008 (Media) — deferred until Phase 3 stabilizes
Production hardening: deployment target, Resend prod key, pricing / legal / TOS
