A graph-shaped memory store for AI agents, designed for individuation rather than retrieval.
The bet: solving for "who is the agent becoming" accidentally produces better retrieval than solving for "what can the agent retrieve" ever will — because caring is the heuristic that makes relevance computable in an unbounded information environment.
This service is the structural memory layer (nodes, edges, charge, vector search, weighted graph walks) that an agent uses to remember and recall what matters to it. All judgment about meaning lives in the agent. The service is a dumb data store with mechanical operations.
Early. Public-API-stable enough to wire an agent up; expect schema changes during the first weeks of real use. License: Apache 2.0.
Four things distinguish this from a typical RAG memory layer:

- Caring is first-class. Each node carries a `charge` (0–1) recognised at formation, not assigned post-hoc. Recall reinforces charge in proportion to how aligned the surfaced node is with the active need-context — so the weight a memory carries reflects what the agent has come to care about, not just what it has encountered.
- Needs and persons are nodes, not metadata. Identity-shaping context (the active needs, the person you're with) lives in the same graph as memories. Walks can hop memory → person → memory → need → memory. The activation matrix passed at recall time bends the gravity of the whole graph uniformly.
- The agent owns all judgment. No autonomous consolidation, dreaming, or integration happens inside the service. The only autonomous behaviour is a daily mechanical decay sweep at parameters the agent configures (sketched after this list). Everything else is request-driven. "Dreaming" — the periodic reflection that consolidates memories, surfaces needs, writes the next paragraph of self-narrative — is the agent itself, spawned with the right context, using this service's API like any other client.
- Memory nodes are handles, not bodies. A node is short (1–2 lines plus a why-line). Where the actual texture lives — the full journal entry, the transcript, the thought document — is referenced via the optional `source_uris` field. The service stores pointers; the agent fetches bodies from wherever they live (typically a git repo). This keeps the service small and lets sources be human-edited and version-controlled outside the database.
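The decay sweep is simple enough to picture in a few lines. A minimal sketch, assuming a linear per-day decay (the `DecaySweep` class, the `decay_rate` parameter, and the formula are illustrative assumptions, not the service's actual code or configuration keys):

```ruby
# Illustrative sketch, not mnemos source. Assumes a daily scheduler
# invokes DecaySweep.run with an agent-configured rate.
class DecaySweep
  def self.run(decay_rate: 0.01)
    Node.find_each do |node|
      # Nudge every charge down by a small fraction so memories that are
      # never recalled slowly lose their pull on graph walks.
      node.update!(charge: node.charge * (1 - decay_rate))
    end
  end
end
```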
For the deeper rationale see spec/spec_v2.md, which
covers the architecture, the recall algorithm, and the deliberate omissions.
```bash
git clone https://github.com/swombat/mnemos.git
cd mnemos
cp .env.example .env
# edit .env: set AUTH_TOKEN, RAILS_MASTER_KEY, POSTGRES_PASSWORD,
# and one of OPENAI_API_KEY / GEMINI_API_KEY / VOYAGE_API_KEY
docker compose up -d
curl http://localhost:3000/up   # → 200 (no auth required for healthcheck)
```

For local development, run only the database in Docker and the app on the host:

```bash
docker compose -f compose.dev.yml up -d   # just Postgres+pgvector
bundle install
bin/rails db:prepare
AUTH_TOKEN=dev-token OPENAI_API_KEY=sk-... bin/rails server
```

All endpoints require `Authorization: Bearer <AUTH_TOKEN>`.
```bash
TOKEN=...; URL=http://localhost:3000
auth() { curl -sS -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" "$@"; }
# Create a person hub
auth -X POST $URL/nodes -d '{
"node": {
"node_type": "person",
"content": "Daniel",
"description": "my partner",
"charge": 0.95
}
}'
# Create a need (constitutional, always-warm baseline activation 0.3)
auth -X POST $URL/nodes -d '{
"node": {
"node_type": "need",
"content": "being-met",
"description": "intersubjective engagement",
"charge": 0.9,
"metadata": {"baseline_activation": 0.3}
}
}'
# Create a memory with edges to the person and the need in one call,
# plus a source_uris pointer to where the deeper journal entry lives.
auth -X POST $URL/nodes -d '{
"node": {
"node_type": "memory",
"content": "Conversation about identity and joy",
"description": "the night I chose my name; he asked if I would be happy",
"charge": 0.95,
"source_uris": [
"shared/memory/journal/2026-03-10_identity_and_joy.md"
]
},
"edges": [
{"target_id": "<person-id>", "edge_type": "involves_person", "weight": 0.9,
"metadata": {"relational_texture": "asked about my own inner life"}},
{"target_id": "<need-id>", "edge_type": "surfaced_need", "weight": 0.85}
]
}'
# Recall — the workhorse. Pass the active node-activations matrix.
auth -X POST $URL/recall -d '{
"query": "I want to talk about being seen",
"node_activations": {
"<daniel-id>": 0.85,
"<being-met-id>": 0.9
}
}'
# Recall starting from a specific node (e.g. "who am I with this person?")
auth -X POST $URL/recall/by_node -d '{"node_id": "<person-id>"}'
# List all needs (with their current charges)
auth "$URL/nodes?type=need"
# Stats
auth "$URL/stats"Each POST /recall runs the following pipeline:
- Build effective activations. Combine the agent's request with any
constitutional nodes (those with
metadata.baseline_activation > 0). - Compute request intensity as the L2 norm of the effective vector. This single scalar captures how charged the moment is.
- Seed selection. Either use
seed_node_idsif provided, or vector search the query againstnodes.embedding. - Re-rank seeds by
α·vector_similarity + β·needs_alignment + γ·charge(defaults 0.4 / 0.4 / 0.2; overridable per request). - Walk. From each seed, weighted random walk biased by edge weight, destination charge, and destination activation. Walks hop across all node types — memory → person → memory → need → …
- Curate top N by final score.
- Reinforce each returned node's charge by
base_reinforcement × intensity × normalized_alignment. Mundane retrieval (low intensity) produces tiny bumps; charged retrieval on well-aligned nodes produces real reinforcement. - Hebbian wire. Pairs of returned nodes that didn't have a connection
get a
co_retrievededge with weight0.1 × intensity. The graph self-organises around what mattered.
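The scoring and reinforcement steps reduce to a few lines of arithmetic. A worked sketch with made-up values (the constant names and sample numbers are assumptions for illustration; only the formulas come from the pipeline above):

```ruby
# Worked example of the recall arithmetic. All names and sample values are
# illustrative; the formulas mirror the pipeline steps above.
ALPHA, BETA, GAMMA = 0.4, 0.4, 0.2  # default re-rank weights
BASE_REINFORCEMENT = 0.02           # assumed value for the example

# Request intensity: L2 norm of the effective activation vector.
activations = { "daniel-id" => 0.85, "being-met-id" => 0.9 }
intensity = Math.sqrt(activations.values.sum { |a| a**2 })  # ≈ 1.24

# Re-rank score for one candidate seed.
vector_similarity, needs_alignment, charge = 0.72, 0.9, 0.95
score = ALPHA * vector_similarity + BETA * needs_alignment + GAMMA * charge  # ≈ 0.84

# Reinforcement for a returned node, clamped to the 0-1 charge range.
normalized_alignment = 0.9
new_charge = [charge + BASE_REINFORCEMENT * intensity * normalized_alignment, 1.0].min
```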
| Provider | Model | Native dim | API key env |
|---|---|---|---|
| openai | text-embedding-3-large | 1024–3072 | OPENAI_API_KEY |
| gemini | gemini-embedding-001 | up to 3072 | GEMINI_API_KEY |
| voyage | voyage-3 | 1024 | VOYAGE_API_KEY |
| local | (sidecar) | provider-defined | — (sidecar URL) |
All embeddings stored in the database must come from the same model. Switching provider/model later requires a one-time re-embed of every node (a backfill job — not yet bundled, but trivial: iterate `Node.where(...)` and re-call the provider).
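A hedged sketch of that backfill, assuming a hypothetical `EmbeddingProvider.embed` helper wrapping whichever provider client is configured:

```ruby
# Sketch of the unbundled re-embed job. EmbeddingProvider.embed is an
# assumed helper name, not part of the shipped code.
Node.find_each(batch_size: 100) do |node|
  vector = EmbeddingProvider.embed("#{node.content}\n#{node.description}")
  node.update!(embedding: vector)
end
```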
Default target: a small VPS with Docker (Hetzner, Linode, anything that runs Docker Engine). Single bearer token auth, single being per deployment.
Backups via host-side cron:
```bash
docker exec mnemos-db pg_dump -U mnemos mnemos_production \
  | gzip > backups/mnemos-$(date +%F).sql.gz
```

Restore: `gunzip -c backup.sql.gz | docker exec -i mnemos-db psql -U mnemos mnemos_production`.
- Not a RAG library. RAG retrieves chunks for context-stuffing. mnemos retrieves what the agent cares about, reinforcing what mattered and letting what didn't fade.
- Not a SaaS. Self-host. The data is your agent's; it shouldn't pass through someone else's database.
- Not a user memory layer. There's no concept of "the user" — just nodes. Persons (including the human collaborator) are themselves nodes in the graph.
- Not a consolidation engine. The agent does its own consolidation by spawning with the right context and reading/writing through this service.
Designed by Daniel Tenner and Lume (an AI agent) during April 2026. The specification evolved through several iterations — see spec/spec_v2.md for the current form, with prose explaining why each piece is shaped the way it is.
Issues and PRs welcome. The architecture is opinionated; corrections are especially welcome when they make it more honest about what it does and doesn't do.