Episodic memory layer for LLMs — temporal-first, spatial-aware, narrative-preserving.
Where mem0 stores facts, Memoria stores episodes — raw entries kept verbatim, with extracted facts and reflections layered on top, and time/place modeled as first-class citizens.
| Layer | Table | Purpose |
|---|---|---|
| L1 Episodic | episodic_memory | Raw entries + timestamp + lat/lon (PostGIS) + embedding. Append-only. |
| L2 Semantic | semantic_fact | Distilled facts (subject/predicate/object). Bi-temporal (valid_from/valid_to + superseded_by chain): old facts are preserved, never overwritten. |
| L3 Reflection | reflection | LLM-generated weekly/monthly summaries with FK back to source episodes. |
Retrieval is intent-routed: an LLM router classifies each query into one of
temporal_recall, spatial_recall, emotional_recall, pattern_insight,
factual_lookup, open — and picks layers + filters accordingly.
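
As a rough illustration of intent routing, the sketch below validates a router reply against the six intent labels and falls back to `open` when the reply is malformed. The function name `plan_from_router_output` and the intent-to-layer mapping are hypothetical, chosen only to show the shape of the idea; Memoria's actual mapping may differ.

```python
import json

# The six intent labels named in the text above.
INTENTS = {
    "temporal_recall", "spatial_recall", "emotional_recall",
    "pattern_insight", "factual_lookup", "open",
}

# Hypothetical intent -> layers mapping, for illustration only.
LAYERS_BY_INTENT = {
    "temporal_recall": ["L1"],
    "spatial_recall": ["L1"],
    "emotional_recall": ["L1", "L3"],
    "pattern_insight": ["L3", "L2"],
    "factual_lookup": ["L2", "L1"],
    "open": ["L1", "L2", "L3"],
}


def plan_from_router_output(raw: str) -> dict:
    """Parse the router's JSON reply; fall back to "open" on anything unexpected."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}
    intent = parsed.get("intent")
    if not isinstance(intent, str) or intent not in INTENTS:
        intent = "open"
    return {
        "intent": intent,
        "layers": LAYERS_BY_INTENT[intent],
        "locations": parsed.get("locations", []),
    }
```

Falling back to `open` keeps a bad router reply from failing the whole query; the cost is a broader, less targeted search.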
# 1. Start Postgres (with pgvector + PostGIS)
docker compose up -d
# 2. Configure
cp .env.example .env
# (smart endpoints + MCP need MEMORIA_OPENAI_API_KEY; core endpoints don't)
# 3. Install
pip install -e ".[dev]" # core only
pip install -e ".[smart,dev]" # add LLM-driven endpoints
# 4. Migrate
alembic upgrade head
# 5. Run
uvicorn memoria.api.main:app --reload --port 8080
# 6. Mint your first bearer token (UUID can be any uuid4 you choose)
memoria keys create --user 00000000-0000-0000-0000-000000000001
# → prints key_id, scopes, and the plaintext token (shown once; save it)

Every API request (REST or MCP) must carry
Authorization: Bearer <token>. The token is hashed before storage; only
the plaintext printed by memoria keys create can be used. Lost it?
Revoke and mint a new one — memoria keys revoke <key_id>.
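
Because only a hash is stored, a lost token cannot be recovered. The sketch below shows the general pattern, assuming a plain SHA-256 digest (the project layout mentions sha256 key hashing in auth.py). The function names `mint_token` and `verify_token` are hypothetical, not Memoria's own.

```python
import hashlib
import hmac
import secrets


def mint_token() -> tuple[str, str]:
    """Return (plaintext, sha256_hex). Only the hash would be persisted."""
    plaintext = secrets.token_urlsafe(32)
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, digest


def verify_token(presented: str, stored_hash: str) -> bool:
    """Hash the presented bearer token and compare in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

The plaintext exists only at mint time, which is why the CLI prints it once and the recovery path is revoke-and-remint.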
Memoria offers core (pure data, no LLM) and smart (LLM-orchestrated) endpoints. Smart endpoints are thin wrappers around core. Pick whichever fits your integration:
| Caller wants… | Use |
|---|---|
| Full control: own LLM, own embedding, own intent routing | core |
| Drop-in convenience: pass natural language, get answers | smart (requires pip install 'memoria[smart]') |
All four endpoints below require Authorization: Bearer <token>. user_id is bound from the authenticated key and is no longer accepted in the request body (T-001).
{
"query_embedding": [...], // caller-supplied
"routed": {
"intent": "spatial_recall",
"locations": [{ "name": "인천", "radius_hint": "근처",
"lat": 37.456, "lon": 126.706 }]
},
"top_k": 5
}

{ "text": "오늘 서울에서 핫도그를 먹었어" }

→ runs LLM extract → geocode → embed → persist L1 → returns {episode_id, extracted}. (The example text reads "I ate a hot dog in Seoul today.")
{
"query": "인천 근처에서 밥을 먹었던거 같은데 뭘 먹었지?",
"top_k": 5
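}

→ router classifies as spatial_recall (anchor=인천, i.e. Incheon; radius=20km) →
geocode → PostGIS ST_DWithin on L1 → vector re-rank → returns hits with
distance_m. The example query reads "I think I ate somewhere near Incheon; what did I eat?" A downstream LLM, prompted with prompts/compose.py, can then
ask "혹시 인천이 아니라 서울에서 핫도그 드신 거 말씀이신가요?" ("Did you perhaps mean the hot dog you had in Seoul, not Incheon?")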
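
To make the radius step concrete, here is a pure-Python stand-in for the ST_DWithin filter, using the haversine formula rather than PostGIS. The helper names and the in-memory episode shape are hypothetical; the coordinates are the ones used in this README's examples.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def within_radius(episodes: list[dict], anchor: tuple[float, float], radius_m: float) -> list[dict]:
    """Keep episodes inside the radius, nearest first, annotated with distance_m."""
    hits = []
    for ep in episodes:
        d = haversine_m(anchor[0], anchor[1], ep["lat"], ep["lon"])
        if d <= radius_m:
            hits.append({**ep, "distance_m": round(d)})
    return sorted(hits, key=lambda h: h["distance_m"])


INCHEON = (37.456, 126.706)
SEOUL_EPISODE = {"text": "오늘 서울에서 핫도그를 먹었어", "lat": 37.566, "lon": 126.978}
```

With these coordinates the Seoul episode sits roughly 27 km from the Incheon anchor, outside a 20 km radius. That is consistent with the compose step asking whether the user meant Seoul instead.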
Image-only v1 — upload a binary, get back a mem-asset://<uuid> URI,
embed it into any episode via attachments[]. The retrieval response
inlines the resolved MediaRef (mime, bytes, width, height,
download_url) for any hit that carries attachments; text-only hits
skip the join entirely.
# 1. Upload a binary.
ASSET=$(curl -sf -X POST http://localhost:8080/v1/media-assets \
-H "Authorization: Bearer $TOKEN" \
-F "file=@./photo.jpg" | jq -r '.uri') # → mem-asset://<uuid>
# 2. Save an episode that references it.
curl -sf -X POST http://localhost:8080/v1/memories \
-H "Authorization: Bearer $TOKEN" \
-H 'Content-Type: application/json' \
-d "{\"text\":\"오늘 산 핫도그\",\"attachments\":[\"$ASSET\"]}"
# 3. Retrieve: hit.attachments[].download_url is what you GET back.

Allowed mimes: image/jpeg|png|webp|gif|heic|heif. Size cap:
MEMORIA_MEDIA_MAX_BYTES (default 20 MiB). Backends:

- MEMORIA_MEDIA_BACKEND=local (default): writes under MEMORIA_MEDIA_LOCAL_DIR (./.media).
- MEMORIA_MEDIA_BACKEND=s3: requires MEMORIA_MEDIA_S3_BUCKET + the standard AWS credential chain. Install with pip install 'memoria[media]' for boto3.

VLM caption, visual embedding, video/audio, web/extension UI, and non-owner sharing are explicit follow-ups — see PLAN.md Phase 5 roadmap.
Magic-link email auth + a dashboard API mirroring the CLI. The browser frontend itself (Next.js / React SPA) is a separate follow-up; the endpoints below are fully usable from curl/httpx today.
# Request a sign-in link (email is sent or logged depending on provider)
curl -X POST http://localhost:8080/v1/auth/magic-link/request \
-H 'Content-Type: application/json' -d '{"email":"you@example.com"}'
# Open the URL from the email (or stdout in dev) — browser is redirected
# to /dashboard with the session cookie set:
curl -i -c cookies.txt "http://localhost:8080/v1/auth/magic-link/consume?token=..."

Provider:

- MEMORIA_EMAIL_PROVIDER=log (default): prints the email to stdout for dev. The magic link URL is in the body.
- MEMORIA_EMAIL_PROVIDER=resend + MEMORIA_RESEND_API_KEY=...: sends through Resend's HTTP API.

# After consume, curl --cookie-jar / --cookie keeps the session cookie:
curl -b cookies.txt http://localhost:8080/v1/web/me
curl -b cookies.txt http://localhost:8080/v1/web/keys
curl -b cookies.txt -X POST http://localhost:8080/v1/web/keys \
-H 'Content-Type: application/json' \
-d '{"name":"laptop","scopes":["memories:read","memories:write"]}'
# → response includes one-time plaintext token

# Export: returns a job row whose download_url is a data: URI
curl -b cookies.txt -X POST http://localhost:8080/v1/me/export
# Account deletion — typed-confirmation gate; soft-deletes immediately,
# hard-purges 30 days later via the cron command below.
curl -b cookies.txt -X POST http://localhost:8080/v1/me/delete-account \
-H 'Content-Type: application/json' \
-d '{"confirm_email":"you@example.com"}'

# Reverse a soft-delete during the 30-day grace window
memoria account undelete <user_id>
# Hard-purge expired soft-deleted accounts (daily cron)
memoria account purge
memoria account purge --dry-run # preview only, no commit

POST /v1/raw-memories/from-hook is the auto-capture endpoint every
client (Claude Code hook, mobile app, browser extension) POSTs into.
Source-agnostic envelope:
{
"envelope_version": "v1", // unknown → 400 + Memoria-Min-Version header
"source": "claude-code:personal/main", // tag for the per-key allow/deny filter
"session_id": "abc-123", // (session_id, prompt_hash) → idempotency
"text": "what's the weather" // server redacts sk/AWS/JWT before persisting
// attachments?: [{filename, mime, bytes}] // metadata only — never the binary
// occurred_at?: "2026-05-12T..." // server defaults to now()
}

- Idempotency: a retry with the same (session_id, prompt_hash) returns {deduped: true, episode_id: ...} pointing at the original row. Clients MAY retry as aggressively as backoff permits.
- Source filter: PUT /v1/web/keys/{id}/sources from the dashboard configures source_allowlist / source_denylist. Patterns are exact match or prefix* suffix wildcard; deny wins on overlap.
- Redaction: PLAN-default ship list: sk-..., sk-ant-..., AKIA... / ASIA..., JWT shape. Email / phone masking deferred to per-key opt-in (loses the "who did I talk to" signal too easily).
- Rate limit: default 60 req/min per key, in-memory single-process.

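
The source filter and redaction rules above can be sketched as follows. This is a minimal illustration, not the server's implementation. The regexes are assumed approximations of the named secret shapes, a trailing `*` is read as a prefix wildcard, and an empty allowlist is assumed to mean "allow everything". All function names are hypothetical.

```python
import re

# Assumed patterns for the secret shapes listed above; the real rules may differ.
SECRET_PATTERNS = [
    re.compile(r"\bsk-ant-[A-Za-z0-9_-]{8,}"),
    re.compile(r"\bsk-[A-Za-z0-9_-]{8,}"),
    re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
]


def redact(text: str) -> str:
    """Replace anything that looks like a known secret shape with a marker."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def _matches(pattern: str, source: str) -> bool:
    """Exact match, or prefix match when the pattern ends with '*'."""
    if pattern.endswith("*"):
        return source.startswith(pattern[:-1])
    return source == pattern


def source_allowed(source: str, allowlist: list[str], denylist: list[str]) -> bool:
    """Deny wins on overlap; an empty allowlist is assumed to allow all sources."""
    if any(_matches(p, source) for p in denylist):
        return False
    if not allowlist:
        return True
    return any(_matches(p, source) for p in allowlist)
```

For example, with an allowlist of `claude-code:*` and a denylist of `claude-code:work/*`, the source `claude-code:personal/main` passes while `claude-code:work/repo` is rejected.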
See docs/hooks/ for Claude Code UserPromptSubmit
configs across macOS / Linux / Windows. The Tier 2 installer at
scripts/install/ merges the right one for the OS
into the user's existing ~/.claude/settings.json and stores the
bearer token in the OS credential store (Keychain / Credential Manager /
libsecret).
# macOS / Linux (installer reads MEMORIA_TOKEN env var)
curl -fsSL http://localhost:8080/install.sh | MEMORIA_TOKEN=... sh

# Windows
$env:MEMORIA_TOKEN = '<token>'
iwr http://localhost:8080/install.ps1 | iex

Production hardening (script signature verification, atomic settings merge, end-to-end against a running Claude Code instance) is tracked as a follow-up.

Polls L1 (episodic_memory.l2_processed_at IS NULL), extracts durable
subject/predicate/object facts via the LLM, and writes them to
semantic_fact with bi-temporal preservation (old contradicted rows
keep their data — only valid_to and superseded_by_id are set).
# Run one batch (intended to be cron'd every 60s)
memoria worker l2-poll
memoria worker l2-poll --batch 200 # bigger batch
# Linux: cron entry
* * * * * /usr/local/bin/memoria worker l2-poll >> /var/log/memoria-l2.log
# systemd-timer alternative (preferred for clean restart semantics)
[Unit]
Description=Memoria L2 extractor

[Service]
ExecStart=/usr/local/bin/memoria worker l2-poll

[Timer]
OnCalendar=*:0/1
Unit=memoria-l2.service

- mcp:tool-call:* rows are skipped (stamped processed but no facts extracted; those are server-side lookup logs, not user content).
- An entry with zero durable facts is still stamped processed (the PLAN "no facts marker" contract).
- LLM failures leave the row pending for the next poll.
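
The bi-temporal rule (close the old row, never rewrite it) can be sketched in memory as below. This assumes a "contradiction" means the same subject and predicate with a different object, which is a simplification. The `Fact` dataclass and `assert_fact` helper are hypothetical stand-ins for the semantic_fact table and the worker's write path.

```python
import itertools
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    id: int
    subject: str
    predicate: str
    object: str
    valid_from: datetime
    valid_to: Optional[datetime] = None
    superseded_by_id: Optional[int] = None


_ids = itertools.count(1)


def assert_fact(store: list[Fact], subject: str, predicate: str, obj: str, now: datetime) -> Fact:
    """Insert a fact; close (but keep) any current fact it contradicts."""
    current = [f for f in store
               if (f.subject, f.predicate) == (subject, predicate) and f.valid_to is None]
    for f in current:
        if f.object == obj:
            return f  # already known and still valid: nothing to write
    new = Fact(next(_ids), subject, predicate, obj, valid_from=now)
    for old in current:
        # Only the two bookkeeping columns change; the old data stays intact.
        old.valid_to = now
        old.superseded_by_id = new.id
    store.append(new)
    return new
```

Because closed rows keep their data, a question like "where did I live last spring?" can still be answered from rows whose validity interval covers that date.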
Pulls L1 entries from a sliding window (weekly = last 7 days, monthly
= last 30 days), summarizes via LLM, writes one reflection row per
active user with source_episode_ids pointing back. Tool-call rows
are filtered from the input and from source_episode_ids.
# Weekly — cron on Mondays 04:00
memoria worker l3-poll --period weekly
0 4 * * 1 /usr/local/bin/memoria worker l3-poll --period weekly
# Monthly — cron on the 1st 04:00
memoria worker l3-poll --period monthly
0 4 1 * * /usr/local/bin/memoria worker l3-poll --period monthly
# Debug / backfill: restrict to one user
memoria worker l3-poll --period weekly --user <uuid>

v1 simplifications: rolling window (not ISO-calendar boundaries), no idempotency dedup (cron schedules the run; running twice creates two rows), embedding column left NULL on the reflection row until a later embed pass lands.
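
A rolling window of this kind might be computed as in the sketch below. It assumes the window is measured on each episode's occurred_at and that tool-call rows are identified by their source prefix. The helper names and the dict-based episode shape are hypothetical.

```python
from datetime import datetime, timedelta, timezone

WINDOW_DAYS = {"weekly": 7, "monthly": 30}


def rolling_window(period: str, now: datetime) -> tuple[datetime, datetime]:
    """Return (start, end) for a window ending at `now`, not at a calendar boundary."""
    return now - timedelta(days=WINDOW_DAYS[period]), now


def select_source_ids(episodes: list[dict], period: str, now: datetime) -> list:
    """IDs of in-window episodes, with MCP tool-call logs filtered out."""
    start, end = rolling_window(period, now)
    return [
        e["id"]
        for e in episodes
        if start <= e["occurred_at"] <= end
        and not e["source"].startswith("mcp:tool-call:")
    ]
```

Since the window is anchored to the run time, two runs a few hours apart cover slightly different spans, which is one reason running twice yields two distinct rows.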
Some MCP clients (notably claude.ai Custom Connectors, newer Claude Desktop versions) only support OAuth 2.0 — they don't expose a "paste a bearer token" field. Memoria implements Authorization Code + PKCE + Dynamic Client Registration per the MCP authorization spec:
| Endpoint | Purpose |
|---|---|
| GET /.well-known/oauth-authorization-server | RFC 8414 metadata: the client discovers the rest from here. |
| POST /v1/oauth/register | Dynamic Client Registration (RFC 7591). Public clients only (PKCE). Returns a fresh client_id. |
| GET /v1/oauth/authorize | Browser entry point. Redirects to magic-link sign-in if no session, else to the consent screen. |
| POST /v1/oauth/authorize | Consent submission (approve / deny). Approve mints an authorization code. |
| POST /v1/oauth/token | Code + PKCE verifier → {access_token, refresh_token, expires_in}. Also accepts grant_type=refresh_token. |
| POST /v1/oauth/revoke | RFC 7009. Always 200, even on unknown token. |

The issued access_token is accepted as a Bearer on every protected
route — same code path as api-key plaintexts. Lifetimes: 1-hour access,
30-day refresh, 10-minute authorization code.
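
For readers writing their own client, the PKCE half of the flow looks roughly like this, assuming the S256 challenge method from RFC 7636. The function names are hypothetical. The client sends the challenge on the authorize request and the verifier on the token request.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Random URL-safe verifier (43 chars from 32 random bytes, no padding)."""
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")


def s256_challenge(verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(code_verifier)), without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The server can recompute the challenge from the verifier at token time, so an intercepted authorization code is useless without the verifier that never left the client.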
- claude.ai → Custom Connectors → Add.
- URL: your public Memoria URL + /mcp/ (trailing slash). For local testing, expose :8080 via cloudflared tunnel --url http://localhost:8080.
- Submit. claude.ai will DCR-register, redirect you through magic-link sign-in, show the Memoria consent screen, and complete the token exchange. The MCP tools (recall_memories, save_memory, list_recent, list_near) appear in the connector panel.

The dashboard /oauth/consent page shows which client is asking for
access and which scopes it wants before you approve.
Memoria exposes a narrow MCP surface at POST /mcp (streamable HTTP
transport). Every request carries the same bearer token as the REST
endpoints.
| Tool | When to call |
|---|---|
| recall_memories(query, top_k=5) | Natural-language recall. Routes via LLM (temporal / spatial / semantic / pattern / factual) and returns ranked hits. Excludes its own tool-call logs by default to avoid log-of-log noise. |
| save_memory(text, occurred_at?) | Only when the user explicitly says "remember this". Everything else is captured automatically by hooks (Phase 2). |
| list_recent(since?, limit=20) | Chronological listing. Includes MCP tool-call logs, so "어제 누구 생일 물었지?" ("Whose birthday did I ask about yesterday?") finds yesterday's recall_memories call. |
| list_near(place, radius="근처") | Spatial listing around a named place (the default radius hint "근처" means "nearby"). Geocoded server-side; skips the LLM intent classifier. |

Every successful MCP tool invocation produces one episodic_memory row
tagged source="mcp:tool-call:<tool_name>". This is a deterministic
side-effect — the server records calls already in flight, not the
LLM's discretionary decision to save user prompts (that anti-pattern
is what Phase 2 hooks avoid). The logs become queryable like any other
memory: list_recent includes them, recall_memories excludes them
by default.
Result payloads larger than 4 KB (JSON-encoded) are truncated with a
truncated=true marker in entities.
--transport http is required — without it, claude mcp add treats
the URL as a stdio command and the registration silently does nothing
useful.
Trailing slash matters. The Starlette mount at /mcp issues a
307 redirect to /mcp/, and the Claude Code MCP client (and most
HTTP MCP clients) do not follow POST redirects. Always register the
URL with the trailing slash.
claude mcp add --transport http memoria http://localhost:8080/mcp/ \
--header "Authorization: Bearer <your-token>"

Configure an HTTP MCP server with:

- URL: http://localhost:8080/mcp/
- Header: Authorization: Bearer <your-token>

Read-side tools (recall_memories, list_recent, list_near) require
the key's scopes to include memories:read. save_memory requires
memories:write. The CLI defaults to both:
memoria keys create --user <uuid> --scopes memories:read,memories:write.
pytest # smoke tests, no DB needed
pytest tests/integration -m e2e # TODO: requires docker-compose up

src/memoria/
api/ FastAPI routes
services/ ingest · retrieval · listings · tool_call_log · llm · embeddings · geocoding
prompts/ extract · route · compose (system fragments)
models.py SQLAlchemy 2.0 — L1/L2/L3 + api_key + geocode_cache
schemas.py Pydantic API contracts
auth.py Principal + require_principal + sha256 key hashing
cli.py `memoria keys create | revoke | pause | list`
mcp_server.py FastMCP — recall / save / list_recent / list_near + tool-call logging
migrations/ Alembic — pgvector + PostGIS DDL + api_key + source column
tickets/ file-based ticket tracker (T-001..T-010)
- T-001 (auth foundation): done. api_key + Principal + bearer-token routes; user_id removed from request bodies; CLI mints / revokes / pauses keys.
- T-002 (MCP server): done. 4 tools at /mcp + deterministic tool-call logging tagged source="mcp:tool-call:<tool>".
- T-010 (web onboarding & dashboard backend): done (frontend deferred). Magic-link sign-in, web-session dashboard API, data export, soft-delete + grace + cron purge.
- T-003 (hook ingest, server): done. /v1/raw-memories/from-hook with envelope versioning + redaction + idempotency + source filter + rate limit; sample hook configs + installer skeletons in docs/hooks/ and scripts/install/.
- T-009 (browser extension, skeleton): done. Manifest V3 WebExtension at extension/ capturing ChatGPT.com / Claude.ai prompts. Manual sideload only; see extension/README.md.
- T-004 (L2 semantic-fact extractor): done. memoria worker l2-poll runs the cron-poll worker; bi-temporal supersedence preserved; tool-call rows skipped.
- T-005 (L3 reflection cron): done. memoria worker l3-poll --period weekly|monthly; tool-call rows excluded from reflection input + source_episode_ids.
- T-006 (importance signal): done. Heuristic over the LLM extractor's entities + emotions replaces the hardcoded 0.5 in _rerank; high-importance hits actually re-rank now.
- T-010 frontend: done. Vite + React SPA at web/ with four dashboard pages: keys (mint / revoke / pause / source-allowlist editor), memories (browse + per-row delete), installer (copy-to-clipboard one-liner with token templated in), account (data export download + soft-delete with email confirmation). Build via cd web && npm install && npm run build; FastAPI serves web/dist/ at root.
- T-011 OAuth 2.0: done. Authorization Code + PKCE + DCR per the MCP authorization spec. /oauth/authorize, /oauth/token, /oauth/register, /oauth/revoke, plus .well-known/oauth-authorization-server. Issued access tokens authenticate every protected route same as api-key plaintexts. Unblocks claude.ai Custom Connectors + spec-compliant MCP clients that won't accept raw bearer tokens.

Next tickets (see tickets/README.md):
- T-007 (A2A) / T-008 (Media) — deferred until Phase 3 stabilizes
- Production hardening: deployment target, Resend prod key, pricing / legal / TOS
{
  "text": "오늘 서울에서 핫도그를 먹었어",
  "occurred_at": "2026-05-11T18:00:00Z",
  "embedding": [0.012, -0.034, ...], // caller-supplied, 1536-dim
  "location": { "name": "서울", "lat": 37.566, "lon": 126.978 },
  "entities": { "food": ["핫도그"], "activities": ["eat"] },
  "importance": 4
}