bradyoo12/memoria

Memoria

Episodic memory layer for LLMs — temporal-first, spatial-aware, narrative-preserving.

Where mem0 stores facts, Memoria stores episodes — raw entries kept verbatim, with extracted facts and reflections layered on top, and time/place modeled as first-class citizens.

Architecture (3 layers)

| Layer | Table | Purpose |
| --- | --- | --- |
| L1 Episodic | episodic_memory | Raw entries + timestamp + lat/lon (PostGIS) + embedding. Append-only. |
| L2 Semantic | semantic_fact | Distilled facts (subject/predicate/object). Bi-temporal (valid_from/valid_to + superseded_by chain): old facts are preserved, never overwritten. |
| L3 Reflection | reflection | LLM-generated weekly/monthly summaries with FK back to source episodes. |

Retrieval is intent-routed: an LLM router classifies each query into one of temporal_recall, spatial_recall, emotional_recall, pattern_insight, factual_lookup, open — and picks layers + filters accordingly.
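
A minimal sketch of how a caller might handle the router's output, assuming the router returns one of the six labels above as a plain string. The `Intent` enum, `parse_intent`, and the fallback to `open` on an unrecognized label are illustrative assumptions, not Memoria's actual code.

```python
from enum import Enum


class Intent(str, Enum):
    """The six intent labels listed above."""
    TEMPORAL_RECALL = "temporal_recall"
    SPATIAL_RECALL = "spatial_recall"
    EMOTIONAL_RECALL = "emotional_recall"
    PATTERN_INSIGHT = "pattern_insight"
    FACTUAL_LOOKUP = "factual_lookup"
    OPEN = "open"


def parse_intent(label: str) -> Intent:
    """Map a raw router label to an Intent.

    Falling back to OPEN for unknown labels is an assumption made for this sketch.
    """
    try:
        return Intent(label.strip().lower())
    except ValueError:
        return Intent.OPEN
```

Normalizing the label before the lookup keeps a slightly malformed LLM response (extra whitespace, different casing) from failing the whole query.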

Quickstart (dev)

# 1. Start Postgres (with pgvector + PostGIS)
docker compose up -d
# 2. Configure
cp .env.example .env
# (smart endpoints + MCP need MEMORIA_OPENAI_API_KEY; core endpoints don't)
# 3. Install
pip install -e ".[dev]"          # core only
pip install -e ".[smart,dev]"    # add LLM-driven endpoints
# 4. Migrate
alembic upgrade head
# 5. Run
uvicorn memoria.api.main:app --reload --port 8080
# 6. Mint your first bearer token (UUID can be any uuid4 you choose)
memoria keys create --user 00000000-0000-0000-0000-000000000001
# → prints key_id, scopes, and the plaintext token (shown once — save it)

Every API request (REST or MCP) must carry Authorization: Bearer <token>. The token is hashed before storage; only the plaintext printed by memoria keys create can be used. Lost it? Revoke and mint a new one — memoria keys revoke <key_id>.
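
To illustrate why a lost token cannot be recovered, here is a sketch of hash-before-store token handling. SHA-256 matches the "sha256 key hashing" noted under Repo layout; the function names, the token length, and the hex encoding are assumptions for illustration only.

```python
import hashlib
import hmac
import secrets


def mint_token() -> tuple[str, str]:
    """Return (plaintext, digest). Only the digest would be persisted."""
    plaintext = secrets.token_urlsafe(32)  # token format is assumed
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, digest


def verify_token(presented: str, stored_digest: str) -> bool:
    """Hash the presented bearer token and compare in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)
```

Because only the digest is kept, the server can check a presented token but cannot print it again, which is why the plaintext is shown once.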

API — two modes

Memoria offers core (pure data, no LLM) and smart (LLM-orchestrated) endpoints. Smart endpoints are thin wrappers around core. Pick whichever fits your integration:

| Caller wants… | Use |
| --- | --- |
| Full control: own LLM, own embedding, own intent routing | core |
| Drop-in convenience: pass natural language, get answers | smart (requires pip install 'memoria[smart]') |

All four endpoints below require Authorization: Bearer <token>. user_id is derived from the authenticated key and is no longer accepted in the request body (T-001).

Core — POST /v1/raw-memories

{
  "text": "I ate a hot dog in Seoul today",
  "occurred_at": "2026-05-11T18:00:00Z",
  "embedding": [0.012, -0.034, ...],          // caller-supplied, 1536-dim
  "location": { "name": "Seoul", "lat": 37.566, "lon": 126.978 },
  "entities": { "food": ["hot dog"], "activities": ["eat"] },
  "importance": 4
}
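
Since the core endpoint expects the caller to supply the embedding, a client may want to check the body before sending it. The helper below is a sketch: `build_raw_memory` and its validation rules are assumptions, and only the field names and the 1536-dim size come from the example above.

```python
from datetime import datetime, timezone

EMBEDDING_DIM = 1536  # dimension stated in the example above


def build_raw_memory(text, occurred_at, embedding, location=None, importance=None):
    """Assemble a request body for POST /v1/raw-memories (checks are assumed)."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dims, got {len(embedding)}")
    if occurred_at.tzinfo is None:
        raise ValueError("occurred_at must be timezone-aware")
    body = {
        "text": text,
        # Serialize as UTC with a trailing "Z", as in the example above.
        "occurred_at": occurred_at.astimezone(timezone.utc)
        .isoformat()
        .replace("+00:00", "Z"),
        "embedding": [float(x) for x in embedding],
    }
    if location is not None:
        body["location"] = location
    if importance is not None:
        body["importance"] = importance
    return body
```

The resulting dict can be sent as JSON with any HTTP client, with the bearer token in the Authorization header. No user_id goes in the body.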

Core — POST /v1/retrieve

{
  "query_embedding": [...],                   // caller-supplied
  "routed": {
    "intent": "spatial_recall",
    "locations": [{ "name": "Incheon", "radius_hint": "근처",   // "근처" means "nearby"
                    "lat": 37.456, "lon": 126.706 }]
  },
  "top_k": 5
}

Smart — POST /v1/memories

{ "text": "I ate a hot dog in Seoul today" }

→ runs LLM extract → geocode → embed → persist L1 → returns {episode_id, extracted}.

Smart — POST /v1/query

{
  "query": "I think I ate somewhere near Incheon, but what did I eat?",
  "top_k": 5
}

→ router classifies as spatial_recall (anchor=Incheon, radius=20km) → geocode → PostGIS ST_DWithin on L1 → vector re-rank → returns hits with distance_m. A downstream LLM, prompted with prompts/compose.py, can then ask a clarifying question such as "Did you perhaps mean the hot dog you had in Seoul, not Incheon?"
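
The filter-then-re-rank step can be pictured with an in-memory analogue. This is only a sketch: the real path uses PostGIS ST_DWithin, whereas the code below swaps in a haversine distance check, and the episode dict shape and function names are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def spatial_recall(episodes, anchor, radius_m, query_embedding, top_k=5):
    """Keep episodes within radius_m of anchor, then rank by vector similarity."""
    hits = []
    for ep in episodes:
        d = haversine_m(anchor[0], anchor[1], ep["lat"], ep["lon"])
        if d <= radius_m:
            score = cosine(query_embedding, ep["embedding"])
            hits.append({**ep, "distance_m": d, "score": score})
    hits.sort(key=lambda h: h["score"], reverse=True)
    return hits[:top_k]
```

Filtering by distance first keeps the similarity ranking confined to episodes that are spatially plausible, and each hit carries its distance_m so a downstream prompt can reason about near misses.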

Image attachments (Phase 5 v1 — T-008 / #74)

Image-only in v1: upload a binary, get back a mem-asset://<uuid> URI, and attach it to any episode via attachments[]. The retrieval response inlines the resolved MediaRef (mime, bytes, width, height, download_url) for any hit that carries attachments; text-only hits skip the join entirely.

# 1. Upload a binary.
ASSET=$(curl -sf -X POST http://localhost:8080/v1/media-assets \
  -H "Authorization: Bearer $TOKEN" \
  -F "file=@./photo.jpg" | jq -r '.uri')   # → mem-asset://<uuid>

# 2. Save an episode that references it.
curl -sf -X POST http://localhost:8080/v1/memories \
  -H "Authorization: Bearer $TOKEN" \
  -H 'Content-Type: application/json' \
  -d "{\"text\":\"The hot dog I bought today\",\"attachments\":[\"$ASSET\"]}"

# 3. Retrieve — hit.attachments[].download_url is what you GET back.

Allowed mimes: image/jpeg|png|webp|gif|heic|heif. Size cap: MEMORIA_MEDIA_MAX_BYTES (default 20 MiB). Backends:

  • MEMORIA_MEDIA_BACKEND=local (default) — writes under MEMORIA_MEDIA_LOCAL_DIR (./.media).
  • MEMORIA_MEDIA_BACKEND=s3 — requires MEMORIA_MEDIA_S3_BUCKET + standard AWS credential chain. Install with pip install 'memoria[media]' for boto3.
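
A client can apply the same limits before uploading. The sketch below assumes the cap is inclusive and that mime matching is case-insensitive; `check_upload` and `max_bytes_from_env` are hypothetical helpers, not part of Memoria.

```python
ALLOWED_MIMES = frozenset({
    "image/jpeg", "image/png", "image/webp",
    "image/gif", "image/heic", "image/heif",
})
DEFAULT_MAX_BYTES = 20 * 1024 * 1024  # 20 MiB default stated above


def max_bytes_from_env(env):
    """Read MEMORIA_MEDIA_MAX_BYTES from a mapping such as os.environ."""
    raw = env.get("MEMORIA_MEDIA_MAX_BYTES")
    return int(raw) if raw else DEFAULT_MAX_BYTES


def check_upload(mime, size_bytes, max_bytes=DEFAULT_MAX_BYTES):
    """Return None if the upload looks acceptable, else a short reason."""
    if mime.lower() not in ALLOWED_MIMES:
        return f"unsupported mime: {mime}"
    if size_bytes > max_bytes:  # inclusive cap is an assumption
        return f"too large: {size_bytes} > {max_bytes} bytes"
    return None
```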

VLM caption, visual embedding, video/audio, web/extension UI, and non-owner sharing are explicit follow-ups — see PLAN.md Phase 5 roadmap.

Web onboarding (Phase 1.5 — backend)

Magic-link email auth + a dashboard API mirroring the CLI. The browser frontend itself (Next.js / React SPA) is a separate follow-up; the endpoints below are fully usable from curl/httpx today.

Magic-link sign-in

# Request a sign-in link (email is sent or logged depending on provider)
curl -X POST http://localhost:8080/v1/auth/magic-link/request \
  -H 'Content-Type: application/json' -d '{"email":"you@example.com"}'

# Open the URL from the email (or stdout in dev) — browser is redirected
# to /dashboard with the session cookie set:
curl -i -c cookies.txt 'http://localhost:8080/v1/auth/magic-link/consume?token=...'

Provider:

  • MEMORIA_EMAIL_PROVIDER=log (default) — prints the email to stdout for dev. The magic link URL is in the body.
  • MEMORIA_EMAIL_PROVIDER=resend + MEMORIA_RESEND_API_KEY=... — sends through Resend's HTTP API.

Dashboard API (web session)

# After consume, curl --cookie-jar / --cookie keeps the session cookie:
curl -b cookies.txt http://localhost:8080/v1/web/me
curl -b cookies.txt http://localhost:8080/v1/web/keys
curl -b cookies.txt -X POST http://localhost:8080/v1/web/keys \
  -H 'Content-Type: application/json' \
  -d '{"name":"laptop","scopes":["memories:read","memories:write"]}'
# → response includes one-time plaintext token

Data rights

# Export — returns a job row whose download_url is a data: URI
curl -b cookies.txt -X POST http://localhost:8080/v1/me/export

# Account deletion — typed-confirmation gate; soft-deletes immediately,
# hard-purges 30 days later via the cron command below.
curl -b cookies.txt -X POST http://localhost:8080/v1/me/delete-account \
  -H 'Content-Type: application/json' \
  -d '{"confirm_email":"you@example.com"}'

Support / cron CLI

# Reverse a soft-delete during the 30-day grace window
memoria account undelete <user_id>

# Hard-purge expired soft-deleted accounts (daily cron)
memoria account purge
memoria account purge --dry-run   # preview only, no commit
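
The grace-window rule behind undelete and purge can be expressed as two small predicates. This is a sketch under stated assumptions: the function names are hypothetical, and treating exactly 30 days as already purgeable is a guess about the boundary.

```python
from datetime import datetime, timedelta, timezone

GRACE = timedelta(days=30)  # grace window stated above


def purge_due(deleted_at, now):
    """True once a soft-deleted account has aged past the grace window."""
    return deleted_at is not None and now - deleted_at >= GRACE


def can_undelete(deleted_at, now):
    """True while a soft-deleted account is still inside the grace window."""
    return deleted_at is not None and now - deleted_at < GRACE
```

Keeping the two predicates exact complements of each other means a soft-deleted account is always in exactly one state: recoverable or due for purge.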

Phase 2 — Hook ingest (auto-capture)

POST /v1/raw-memories/from-hook is the auto-capture endpoint every client (Claude Code hook, mobile app, browser extension) POSTs into. Source-agnostic envelope:

{
  "envelope_version": "v1",                      // unknown → 400 + Memoria-Min-Version header
  "source": "claude-code:personal/main",         // tag for the per-key allow/deny filter
  "session_id": "abc-123",                       // (session_id, prompt_hash) → idempotency
  "text": "what's the weather"                   // server redacts sk/AWS/JWT before persisting
  // attachments?: [{filename, mime, bytes}]     // metadata only — never the binary
  // occurred_at?: "2026-05-12T..."              // server defaults to now()
}
  • Idempotency: a retry with the same (session_id, prompt_hash) returns {deduped: true, episode_id: ...} pointing at the original row, so clients MAY retry as aggressively as their backoff permits.
  • Source filter: PUT /v1/web/keys/{id}/sources (from the dashboard) configures source_allowlist / source_denylist. A pattern is either an exact match or a prefix with a trailing * wildcard (prefix*); deny wins on overlap.
  • Redaction: the default list shipped per the PLAN covers sk-..., sk-ant-..., AKIA... / ASIA..., and JWT-shaped strings. Email / phone masking is deferred to a per-key opt-in, because masking by default loses the "who did I talk to" signal too easily.
  • Rate limit: default 60 req/min per key, tracked in memory within a single process.
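
The source filter and idempotency rules above can be sketched as follows. Several details are assumptions made for illustration: an empty allowlist is treated as "allow everything", prompt_hash is taken to be the SHA-256 of the text, and `HookStore` is a hypothetical in-memory stand-in for the database.

```python
import hashlib


def pattern_matches(pattern, source):
    """Exact match, or prefix match when the pattern ends with '*'."""
    if pattern.endswith("*"):
        return source.startswith(pattern[:-1])
    return source == pattern


def source_allowed(source, allowlist, denylist):
    """Deny wins on overlap; an empty allowlist allows all (assumed)."""
    if any(pattern_matches(p, source) for p in denylist):
        return False
    if not allowlist:
        return True
    return any(pattern_matches(p, source) for p in allowlist)


def prompt_hash(text):
    """Hash of the prompt text (the exact hash function is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class HookStore:
    """In-memory stand-in that dedups on (session_id, prompt_hash)."""

    def __init__(self):
        self._by_key = {}
        self._next_id = 1

    def ingest(self, session_id, text):
        key = (session_id, prompt_hash(text))
        if key in self._by_key:
            # A retry returns the original row instead of creating a new one.
            return {"deduped": True, "episode_id": self._by_key[key]}
        episode_id = self._next_id
        self._next_id += 1
        self._by_key[key] = episode_id
        return {"deduped": False, "episode_id": episode_id}
```

Checking the denylist first is what makes deny win when a source matches both lists.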

Sample hook configs

See docs/hooks/ for Claude Code UserPromptSubmit configs across macOS / Linux / Windows. The Tier 2 installer at scripts/install/ merges the right one for the OS into the user's existing ~/.claude/settings.json and stores the bearer token in the OS credential store (Keychain / Credential Manager / libsecret).

# macOS / Linux (installer reads MEMORIA_TOKEN env var)
curl -fsSL http://localhost:8080/install.sh | MEMORIA_TOKEN=... sh
# Windows
$env:MEMORIA_TOKEN = '<token>'
iwr http://localhost:8080/install.ps1 | iex

Production hardening (script signature verification, atomic settings merge, end-to-end against a running Claude Code instance) is tracked as a follow-up.

Phase 3 — Background workers

L2 semantic-fact extractor

Polls L1 (episodic_memory.l2_processed_at IS NULL), extracts durable subject/predicate/object facts via the LLM, and writes them to semantic_fact with bi-temporal preservation (old contradicted rows keep their data — only valid_to and superseded_by_id are set).

# Run one batch (intended to be cron'd every 60s)
memoria worker l2-poll
memoria worker l2-poll --batch 200          # bigger batch

# Linux: cron entry
* * * * * /usr/local/bin/memoria worker l2-poll >> /var/log/memoria-l2.log

# systemd-timer alternative (preferred for clean restart semantics)
# memoria-l2.service
[Unit]
Description=Memoria L2 extractor
[Service]
ExecStart=/usr/local/bin/memoria worker l2-poll
# memoria-l2.timer (separate unit file; file name assumed)
[Timer]
OnCalendar=*:0/1
Unit=memoria-l2.service
  • mcp:tool-call:* rows are skipped (stamped processed but no facts extracted — those are server-side lookup logs, not user content).
  • An entry with zero durable facts is still stamped processed (the PLAN "no facts marker" contract).
  • LLM failures leave the row pending for the next poll.
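
The bi-temporal preservation rule can be illustrated in memory. This is a sketch, not the worker's code: the dataclass and function names are hypothetical, and treating "same subject and predicate, different object" as a contradiction is an assumption. Only valid_to and superseded_by_id are touched on the old row, as described above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SemanticFact:
    id: int
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None
    superseded_by_id: Optional[int] = None


def record_fact(facts, new_fact):
    """Append new_fact, closing any open fact that it contradicts."""
    for old in facts:
        contradicts = (
            old.valid_to is None
            and old.subject == new_fact.subject
            and old.predicate == new_fact.predicate
            and old.obj != new_fact.obj
        )
        if contradicts:
            # The old row keeps its data; only these two fields change.
            old.valid_to = new_fact.valid_from
            old.superseded_by_id = new_fact.id
    facts.append(new_fact)


def current_facts(facts):
    """Facts that have not been superseded."""
    return [f for f in facts if f.valid_to is None]
```

Because nothing is deleted, a query about the past can still follow the superseded_by_id chain to see what was believed at a given time.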

L3 reflection cron

Pulls L1 entries from a sliding window (weekly = last 7 days, monthly = last 30 days), summarizes via LLM, writes one reflection row per active user with source_episode_ids pointing back. Tool-call rows are filtered from the input and from source_episode_ids.

# Weekly — cron on Mondays 04:00
memoria worker l3-poll --period weekly
0 4 * * 1 /usr/local/bin/memoria worker l3-poll --period weekly

# Monthly — cron on the 1st 04:00
memoria worker l3-poll --period monthly
0 4 1 * * /usr/local/bin/memoria worker l3-poll --period monthly

# Debug / backfill: restrict to one user
memoria worker l3-poll --period weekly --user <uuid>

v1 simplifications: rolling window (not ISO-calendar boundaries), no idempotency dedup (cron schedules the run; running twice creates two rows), embedding column left NULL on the reflection row until a later embed pass lands.
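
The rolling-window selection can be sketched as below. The episode dict shape and the function names are assumptions; the window lengths and the tool-call exclusion follow the description above.

```python
from datetime import datetime, timedelta, timezone

WINDOW_DAYS = {"weekly": 7, "monthly": 30}
TOOL_CALL_PREFIX = "mcp:tool-call:"


def window_start(period, now):
    """Start of the rolling window (not an ISO-calendar boundary)."""
    return now - timedelta(days=WINDOW_DAYS[period])


def select_source_ids(episodes, period, now):
    """IDs of episodes in the window, excluding MCP tool-call rows."""
    start = window_start(period, now)
    return [
        ep["id"]
        for ep in episodes
        if start <= ep["occurred_at"] <= now
        and not (ep.get("source") or "").startswith(TOOL_CALL_PREFIX)
    ]
```

Since the window is measured back from the run time, running the job twice at different times yields overlapping inputs, which is consistent with the "no idempotency dedup" note.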

OAuth 2.0 for spec-compliant MCP clients (T-011)

Some MCP clients (notably claude.ai Custom Connectors, newer Claude Desktop versions) only support OAuth 2.0 — they don't expose a "paste a bearer token" field. Memoria implements Authorization Code + PKCE + Dynamic Client Registration per the MCP authorization spec:

| Endpoint | Purpose |
| --- | --- |
| GET /.well-known/oauth-authorization-server | RFC 8414 metadata; the client discovers the rest from here. |
| POST /v1/oauth/register | Dynamic Client Registration (RFC 7591). Public clients only (PKCE). Returns a fresh client_id. |
| GET /v1/oauth/authorize | Browser entry point. Redirects to magic-link sign-in if no session, else to the consent screen. |
| POST /v1/oauth/authorize | Consent submission (approve / deny). Approve mints an authorization code. |
| POST /v1/oauth/token | Code + PKCE verifier → {access_token, refresh_token, expires_in}. Also accepts grant_type=refresh_token. |
| POST /v1/oauth/revoke | RFC 7009. Always 200, even on unknown token. |

The issued access_token is accepted as a Bearer on every protected route — same code path as api-key plaintexts. Lifetimes: 1-hour access, 30-day refresh, 10-minute authorization code.
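
For readers writing their own client, the PKCE half of the flow looks roughly like this. The sketch follows the S256 method from RFC 7636; that Memoria uses S256 specifically is an assumption, and the function names are illustrative.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Random URL-safe verifier (32 random bytes → 43 characters)."""
    raw = secrets.token_bytes(32)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def code_challenge_s256(verifier: str) -> str:
    """BASE64URL(SHA256(verifier)) without padding, per RFC 7636."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorize request and the verifier with the token request; the server recomputes the challenge to confirm both came from the same client.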

Connecting claude.ai

  1. claude.ai → Custom Connectors → Add.
  2. URL: your public Memoria URL + /mcp/ (trailing slash). For local testing, expose :8080 via cloudflared tunnel --url http://localhost:8080.
  3. Submit. claude.ai will DCR-register, redirect you through magic-link sign-in, show the Memoria consent screen, and complete the token exchange. The MCP tools (recall_memories, save_memory, list_recent, list_near) appear in the connector panel.

The dashboard /oauth/consent page shows which client is asking for access and which scopes it wants before you approve.

MCP — connecting Claude Code / Cursor / ChatGPT

Memoria exposes a narrow MCP surface at POST /mcp/ (streamable HTTP transport; note the trailing slash). Every request carries the same bearer token as the REST endpoints.

Tools

| Tool | When to call |
| --- | --- |
| recall_memories(query, top_k=5) | Natural-language recall. Routes via LLM (temporal / spatial / semantic / pattern / factual) and returns ranked hits. Excludes its own tool-call logs by default to avoid log-of-log noise. |
| save_memory(text, occurred_at?) | Only when the user explicitly says "remember this". Everything else is captured automatically by hooks (Phase 2). |
| list_recent(since?, limit=20) | Chronological listing. Includes MCP tool-call logs, so "Whose birthday did I ask about yesterday?" finds yesterday's recall_memories call. |
| list_near(place, radius="근처") | Spatial listing around a named place (the default radius "근처" is Korean for "nearby"). Geocoded server-side; skips the LLM intent classifier. |

Tool-call logging

Every successful MCP tool invocation produces one episodic_memory row tagged source="mcp:tool-call:<tool_name>". This is a deterministic side-effect — the server records calls already in flight, not the LLM's discretionary decision to save user prompts (that anti-pattern is what Phase 2 hooks avoid). The logs become queryable like any other memory: list_recent includes them, recall_memories excludes them by default.

Result payloads larger than 4 KB (JSON-encoded) are truncated with a truncated=true marker in entities.
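
The truncation rule might look like the sketch below. It assumes "4 KB" means 4096 bytes of UTF-8 JSON and that the clipped text is stored next to the marker; `build_log_entities` and the "tool" / "result" keys are hypothetical.

```python
import json

MAX_RESULT_BYTES = 4 * 1024  # "4 KB" read as 4096 bytes (assumption)


def build_log_entities(tool_name, result):
    """Build the entities dict for a tool-call log row, clipping large results."""
    encoded = json.dumps(result, ensure_ascii=False)
    raw = encoded.encode("utf-8")
    if len(raw) <= MAX_RESULT_BYTES:
        return {"tool": tool_name, "result": encoded}
    # Clip on a byte boundary and drop any partial trailing character.
    clipped = raw[:MAX_RESULT_BYTES].decode("utf-8", errors="ignore")
    return {"tool": tool_name, "result": clipped, "truncated": True}
```

Measuring the encoded bytes rather than the character count keeps the cap meaningful for non-ASCII text, where one character can take several bytes.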

Claude Code

--transport http is required — without it, claude mcp add treats the URL as a stdio command and the registration silently does nothing useful.

Trailing slash matters. The Starlette mount at /mcp issues a 307 redirect to /mcp/, and the Claude Code MCP client (like most HTTP MCP clients) does not follow POST redirects. Always register the URL with the trailing slash.

claude mcp add --transport http memoria http://localhost:8080/mcp/ \
  --header "Authorization: Bearer <your-token>"

Cursor / ChatGPT custom GPT / other MCP clients

Configure an HTTP MCP server with:

  • URL: http://localhost:8080/mcp/
  • Header: Authorization: Bearer <your-token>

Read-side tools (recall_memories, list_recent, list_near) require the key's scopes to include memories:read. save_memory requires memories:write. The CLI defaults to both: memoria keys create --user <uuid> --scopes memories:read,memories:write.
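
The scope rule above reduces to a small lookup. The mapping mirrors the sentence above; `can_call` and the choice to reject unknown tool names are illustrative assumptions.

```python
REQUIRED_SCOPE = {
    "recall_memories": "memories:read",
    "list_recent": "memories:read",
    "list_near": "memories:read",
    "save_memory": "memories:write",
}


def can_call(tool, key_scopes):
    """True if the key's scopes cover the tool; unknown tools are rejected."""
    required = REQUIRED_SCOPE.get(tool)
    return required is not None and required in key_scopes
```

A read-only key (scopes memories:read) can therefore recall and list but cannot call save_memory.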

Testing

pytest                        # smoke tests, no DB needed
pytest tests/integration -m e2e   # TODO: requires docker compose up

Repo layout

src/memoria/
  api/             FastAPI routes
  services/        ingest · retrieval · listings · tool_call_log · llm · embeddings · geocoding
  prompts/         extract · route · compose (system fragments)
  models.py        SQLAlchemy 2.0 — L1/L2/L3 + api_key + geocode_cache
  schemas.py       Pydantic API contracts
  auth.py          Principal + require_principal + sha256 key hashing
  cli.py           `memoria keys create | revoke | pause | list`
  mcp_server.py    FastMCP — recall / save / list_recent / list_near + tool-call logging
migrations/        Alembic — pgvector + PostGIS DDL + api_key + source column
tickets/           file-based ticket tracker (T-001..T-010)

Status

  • T-001 — auth foundation: done. api_key + Principal + bearer-token routes; user_id removed from request bodies; CLI mints / revokes / pauses keys.
  • T-002 — MCP server: done. 4 tools at /mcp + deterministic tool-call logging tagged source="mcp:tool-call:<tool>".
  • T-010 — web onboarding & dashboard backend: done (frontend deferred). Magic-link sign-in, web-session dashboard API, data export, soft-delete + grace + cron purge.
  • T-003 — hook ingest (server): done. /v1/raw-memories/from-hook with envelope versioning + redaction + idempotency + source filter + rate limit; sample hook configs + installer skeletons in docs/hooks/ and scripts/install/.
  • T-009 — browser extension (skeleton): done. Manifest V3 WebExtension at extension/ capturing ChatGPT.com / Claude.ai prompts. Manual sideload only — see extension/README.md.
  • T-004 — L2 semantic-fact extractor: done. memoria worker l2-poll runs the cron-poll worker; bi-temporal supersedence preserved; tool-call rows skipped.
  • T-005 — L3 reflection cron: done. memoria worker l3-poll --period weekly|monthly; tool-call rows excluded from reflection input + source_episode_ids.
  • T-006 — importance signal: done. Heuristic over the LLM extractor's entities + emotions replaces the hardcoded 0.5 in _rerank; high-importance hits actually re-rank now.
  • T-010 frontend: done. Vite + React SPA at web/ with four dashboard pages: keys (mint / revoke / pause / source-allowlist editor), memories (browse + per-row delete), installer (copy-to-clipboard one-liner with token templated in), account (data export download + soft-delete with email confirmation). Build via cd web && npm install && npm run build; FastAPI serves web/dist/ at root.
  • T-011 OAuth 2.0: done. Authorization Code + PKCE + DCR per the MCP authorization spec. /v1/oauth/authorize, /v1/oauth/token, /v1/oauth/register, /v1/oauth/revoke, plus /.well-known/oauth-authorization-server. Issued access tokens authenticate every protected route the same way as api-key plaintexts. Unblocks claude.ai Custom Connectors + spec-compliant MCP clients that won't accept raw bearer tokens.

Next tickets (see tickets/README.md):

  • T-007 (A2A) / T-008 (Media) — deferred until Phase 3 stabilizes
  • Production hardening: deployment target, Resend prod key, pricing / legal / TOS
