MemBurrow

Long-term memory service for AI agents. SQL as the source of truth, vectors as the acceleration layer.

Not another vector-only RAG.

Why Not Vector-Only?

Without long-term memory, agents repeatedly ask for context, forget user preferences, and violate previously stated rules — burning tokens replaying chat history every turn.

Vector-only memory makes this worse: it retrieves by surface similarity, not by operational correctness.

Aspect	Vector-Only	MemBurrow
Source of truth	Embeddings	✅ SQL
Retrieval strategy	Cosine similarity only	✅ Intent-based routing
Ranking	Single similarity score	✅ Multi-factor scoring
Rule / preference	Embedded in text chunks	✅ First-class structured types
Vector DB down	❌ No recall	✅ SQL-only fallback
Audit trail	❌ Opaque	✅ Full explainability
Write semantics	Fire-and-forget	✅ Outbox, exactly-once

Architecture

Write Flow

Client → API (event + outbox tx, <60 ms, 202) → Worker → LLM extract → Embed → Qdrant upsert

Recall Flow

Client → API → Intent Router → SQL-first / Hybrid → Rerank → Response

flowchart LR
    C([Client]) --> |ingest| API
    API --> |event + outbox tx| PG[(PostgreSQL)]
    API --> |202 Accepted| C
    PG --> |poll| W[Worker]
    W --> |extract| LLM[LLM API]
    W --> |embed| EMB[Embedding API]
    W --> |upsert| QD[(Qdrant)]
    W --> |update status| PG

    C --> |recall| API2[API]
    API2 --> |intent route| IR{Intent Router}
    IR --> |policy / rule / preference| SQL[SQL-first]
    IR --> |other| HY[Hybrid]
    SQL --> RR[Rerank]
    HY --> RR
    RR --> |response| C
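The recall routing above can be sketched roughly as follows. This is a minimal illustration, not MemBurrow's implementation: the names `route` and `recall`, the `sql_search` and `vector_search` callables, and the use of `ConnectionError` to signal an unreachable vector index are all assumptions. SQL-first is simplified here to SQL-only, and the intent set is the one given under Key Features.

```python
from typing import Callable, List

# Intents the README lists as SQL-first; modelling them as a plain set is an assumption.
SQL_FIRST_INTENTS = {"policy", "rule", "preference", "constraint", "safety", "decision"}


def route(intent: str) -> str:
    """Return the retrieval strategy name for an intent label."""
    return "sql_first" if intent.strip().lower() in SQL_FIRST_INTENTS else "hybrid"


def recall(
    query: str,
    intent: str,
    sql_search: Callable[[str], List[str]],
    vector_search: Callable[[str], List[str]],
) -> List[str]:
    """Collect candidate memory IDs via the routed strategy (hypothetical helper)."""
    # SQL is always consulted, since it is the source of truth.
    candidates = list(sql_search(query))
    if route(intent) == "hybrid":
        try:
            # Merge vector hits, keeping first-seen order and dropping duplicates.
            for hit in vector_search(query):
                if hit not in candidates:
                    candidates.append(hit)
        except ConnectionError:
            # Vector index unreachable: degrade to the SQL-only candidates.
            pass
    return candidates
```

In this sketch a `rule` query never touches the vector index, while a `recommendation` query uses both sources and still returns the SQL candidates if the vector search raises.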

Scoring formula:

score = semantic(0.40) + importance(0.16) + confidence(0.12) + freshness(0.22) + scope(0.10)

semantic = vector(0.70) + lexical(0.30)
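To make the formula concrete, here is a small sketch that reads each parenthesized number as a weight in a weighted sum. That reading, the `Signals` type, and the assumption that every signal is normalized to [0, 1] are illustrative choices, not details confirmed by the project.

```python
from dataclasses import dataclass

# Weights copied from the scoring formula above.
WEIGHTS = {
    "semantic": 0.40,
    "importance": 0.16,
    "confidence": 0.12,
    "freshness": 0.22,
    "scope": 0.10,
}
VECTOR_WEIGHT = 0.70
LEXICAL_WEIGHT = 0.30


@dataclass
class Signals:
    """Per-candidate signals, each assumed to be normalized to [0, 1]."""
    vector: float
    lexical: float
    importance: float
    confidence: float
    freshness: float
    scope: float


def semantic(s: Signals) -> float:
    """Blend vector similarity and lexical match into one semantic signal."""
    return VECTOR_WEIGHT * s.vector + LEXICAL_WEIGHT * s.lexical


def score(s: Signals) -> float:
    """Weighted sum of the five ranking factors."""
    return (
        WEIGHTS["semantic"] * semantic(s)
        + WEIGHTS["importance"] * s.importance
        + WEIGHTS["confidence"] * s.confidence
        + WEIGHTS["freshness"] * s.freshness
        + WEIGHTS["scope"] * s.scope
    )
```

Both weight sets sum to 1, so a candidate with every signal at 1.0 scores 1.0. A candidate with a perfect vector match and nothing else scores 0.40 × 0.70 = 0.28, which shows how the other factors keep raw similarity from dominating the ranking.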

Key Features

5 Memory Types — fact, preference, rule, skill, event
Intent-Based Routing — policy/rule/preference/constraint/safety/decision queries go SQL-first; others go hybrid
Outbox Pattern — event + outbox written in a single transaction, exactly-once delivery
Graceful Degradation — vector DB down? recall falls back to SQL-only
Scope Isolation — tenant / entity / process isolation with internal namespace unification and SQL+Qdrant scoped filtering
Audit Trail — full traceability for every memory operation
Async Write Path — ingest returns in <60 ms; extraction happens in background

Trade-off: async write path means recall is eventually consistent — a freshly ingested memory may not appear immediately.
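The outbox pattern and the resulting eventual consistency can be illustrated with a short sketch. It uses an in-memory SQLite database from Python's standard library; the table names, column names, and the `ingest`, `poll_once`, and `open_db` helpers are hypothetical and do not reflect MemBurrow's actual schema or worker.

```python
import json
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT NOT NULL);
CREATE TABLE outbox (
    id INTEGER PRIMARY KEY,
    event_id INTEGER NOT NULL REFERENCES events(id),
    status TEXT NOT NULL DEFAULT 'pending'
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the two sketch tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def ingest(conn: sqlite3.Connection, payload: dict) -> int:
    """Write the event and its outbox row in one transaction."""
    with conn:  # commits on success, rolls back if either insert fails
        cur = conn.execute(
            "INSERT INTO events (payload) VALUES (?)", (json.dumps(payload),)
        )
        event_id = cur.lastrowid
        conn.execute("INSERT INTO outbox (event_id) VALUES (?)", (event_id,))
    return event_id


def poll_once(conn: sqlite3.Connection, handle, batch_size: int = 32) -> int:
    """Process up to batch_size pending outbox rows; return how many were handled."""
    rows = conn.execute(
        "SELECT o.id, e.payload FROM outbox o JOIN events e ON e.id = o.event_id "
        "WHERE o.status = 'pending' ORDER BY o.id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for outbox_id, payload in rows:
        handle(json.loads(payload))  # stands in for extract, embed, and upsert
        with conn:
            conn.execute("UPDATE outbox SET status = 'done' WHERE id = ?", (outbox_id,))
    return len(rows)
```

Between `ingest` returning and the next `poll_once`, the event exists but has not been processed, which is the window in which a fresh memory is not yet recallable. In this sketch a row stays `pending` if `handle` raises, so it would be retried on the next poll; a real handler would therefore need to be idempotent.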

Quick Start

git clone https://github.com/Mgrsc/MemBurrow.git && cd MemBurrow
cp .env.lite.example .env  # lite profile by default, set OPENAI_API_KEY
docker compose up -d

If you bind-mount a host directory for SQLite data (for example ./data:/app/data), check the following:

Use an absolute in-container SQLite URL: SQLITE_DATABASE_URL=sqlite:///app/data/memburrow.db
Create the host directory before starting the stack and keep it writable: mkdir -p ./data
Make the mounted directory writable by UID 10001, because the image runs as appuser (UID 10001); otherwise SQLite may fail with (code: 14) unable to open database file

Distributed profile:

cp .env.distributed.example .env
docker compose --profile distributed up -d

Verify:

curl -s http://localhost:8080/v1/memory/health
# {"status":"ok","timestamp":"..."}

API Examples

Ingest

curl -s -X POST http://localhost:8080/v1/memory/ingest \
  -H 'Authorization: Bearer dev-token' \
  -H 'X-Tenant-ID: acme' \
  -H 'Content-Type: application/json' \
  -d '{
    "tenant_id": "acme",
    "entity_id": "user_123",
    "process_id": "planner",
    "session_id": "sess_001",
    "turn_id": "turn_001",
    "messages": [
      {"role": "user", "content": "I do not drink espresso."},
      {"role": "assistant", "content": "Noted."}
    ]
  }'

Recall

curl -s -X POST http://localhost:8080/v1/memory/recall \
  -H 'Authorization: Bearer dev-token' \
  -H 'X-Tenant-ID: acme' \
  -H 'Content-Type: application/json' \
  -d '{
    "tenant_id": "acme",
    "entity_id": "user_123",
    "process_id": "planner",
    "query": "Recommend a low-caffeine drink for this afternoon.",
    "intent": "recommendation",
    "top_k": 8
  }'
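For callers that prefer a script over curl, the same recall request can be built with Python's standard library. This is a sketch under assumptions: `BASE_URL`, `build_request`, `recall_request`, and `send` are hypothetical helper names, and the response is returned as raw bytes because the response schema is documented elsewhere.

```python
import json
import urllib.request
from typing import Tuple

BASE_URL = "http://localhost:8080"  # host and port used by the curl examples


def build_request(path: str, body: dict, token: str = "dev-token") -> urllib.request.Request:
    """Build a POST request carrying the same headers as the curl examples."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "X-Tenant-ID": body["tenant_id"],
            "Content-Type": "application/json",
        },
        method="POST",
    )


def recall_request(
    tenant_id: str,
    entity_id: str,
    process_id: str,
    query: str,
    intent: str,
    top_k: int = 8,
) -> urllib.request.Request:
    """Build the recall call with the fields shown in the curl example."""
    body = {
        "tenant_id": tenant_id,
        "entity_id": entity_id,
        "process_id": process_id,
        "query": query,
        "intent": intent,
        "top_k": top_k,
    }
    return build_request("/v1/memory/recall", body)


def send(req: urllib.request.Request, timeout: float = 10.0) -> Tuple[int, bytes]:
    """Send the request; return the HTTP status and the undecoded body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read()
```

A call such as `send(recall_request("acme", "user_123", "planner", "Recommend a low-caffeine drink for this afternoon.", "recommendation"))` mirrors the curl command above. Deriving `X-Tenant-ID` from the body keeps the header and the `tenant_id` field from drifting apart.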

Full API documentation: LLM_README.md

Tech Stack

Layer	Technology
Language	Rust
HTTP framework	Axum
Database	SQLite (lite) / PostgreSQL (distributed) + SQLx
Vector index	sqlite-vector (lite) / Qdrant (distributed)
LLM / Embedding	OpenAI-compatible API
Deployment	Docker Compose

Project Structure

MemBurrow/
├── api/
│   └── openapi.yaml
├── cmd/
│   ├── memory-api/          # HTTP API server
│   ├── memory-migrator/     # Database migration runner
│   └── memory-worker/       # Async extraction & embedding worker
├── crates/
│   └── memory-core/         # Shared domain logic, models, store
├── migrations/              # SQL migration files
├── migrations_sqlite/       # SQLite migration files
├── docker-compose.yml
├── Dockerfile
├── .env.lite.example
└── .env.distributed.example

Configuration

Environment Variables

Required

Variable	Description
`BACKEND_PROFILE`	`lite` or `distributed`
`API_AUTH_TOKEN`	Bearer token for API authentication
`OPENAI_API_KEY`	API key for LLM and embedding calls
`OPENAI_BASE_URL`	OpenAI-compatible API base URL (must end with `/v1` if set)

Optional

Variable	Default	Description
`SQLITE_DATABASE_URL`	`sqlite:///app/data/memburrow.db`	SQLite database URL (lite profile, Docker recommended)
`SQLITE_VECTOR_EXTENSION_PATH`	`/usr/local/lib/sqlite-vector/vector.so`	sqlite-vector extension path (lite profile)
`SQLITE_BUSY_TIMEOUT_MS`	`5000`	SQLite busy timeout in milliseconds
`DATABASE_URL`	(empty)	PostgreSQL connection string (distributed profile)
`QDRANT_URL`	`http://qdrant:6333`	Qdrant gRPC/HTTP endpoint (distributed profile)
`API_BIND_ADDR`	`0.0.0.0:8080`	API listen address
`OPENAI_EXTRACT_MODEL`	`gpt-4o-mini`	Model for memory extraction
`OPENAI_EMBEDDING_MODEL`	`text-embedding-3-small`	Model for embeddings
`EMBEDDING_DIMS`	`1536`	Embedding dimensions (must match model output)
`WORKER_POLL_INTERVAL_MS`	`1500`	Worker poll interval
`WORKER_BATCH_SIZE`	`32`	Worker batch size
`WORKER_MAX_RETRY`	`8`	Max retry attempts
`RECALL_CANDIDATE_LIMIT`	`64`	Max recall candidates
`RECONCILE_ENABLED`	`true`	Enable reconciliation
`RECONCILE_INTERVAL_SECONDS`	`120`	Reconciliation interval
`RECONCILE_BATCH_SIZE`	`200`	Reconciliation batch size
`LOG_FORMAT`	`json`	Log format (`json` or `pretty`)
`RUST_LOG`	`info`	Log level
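Several of these settings constrain each other, so a pre-flight check before `docker compose up` can catch mistakes early. The sketch below is a hypothetical helper, not part of MemBurrow. It encodes only rules stated in the tables above, plus one assumption: that the distributed profile needs a non-empty `DATABASE_URL`.

```python
import os
from typing import List, Mapping

# OPENAI_BASE_URL is checked separately because its rule only applies when it is set.
REQUIRED = ("BACKEND_PROFILE", "API_AUTH_TOKEN", "OPENAI_API_KEY")
DEFAULT_EMBEDDING_DIMS = "1536"  # default from the Optional table


def check_config(env: Mapping[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    profile = env.get("BACKEND_PROFILE")
    if profile and profile not in ("lite", "distributed"):
        problems.append(f"BACKEND_PROFILE must be 'lite' or 'distributed', got {profile!r}")

    base_url = env.get("OPENAI_BASE_URL")
    if base_url and not base_url.endswith("/v1"):
        problems.append("OPENAI_BASE_URL must end with /v1")

    # Assumption: the distributed profile cannot start without a PostgreSQL URL.
    if profile == "distributed" and not env.get("DATABASE_URL"):
        problems.append("DATABASE_URL is empty but the distributed profile uses PostgreSQL")

    dims = env.get("EMBEDDING_DIMS", DEFAULT_EMBEDDING_DIMS)
    if not dims.isdigit() or int(dims) <= 0:
        problems.append(f"EMBEDDING_DIMS must be a positive integer, got {dims!r}")

    return problems


if __name__ == "__main__":
    for problem in check_config(os.environ):
        print(problem)
```

The function takes any mapping rather than reading `os.environ` directly, so it can also be run against a parsed `.env` file.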

License

Apache-2.0
