Local semantic memory with PGlite + pgvector. Budget Qdrant that runs anywhere Bun runs.
You want semantic search for your AI agents but don't want to run a vector database server. This gives you:
- Zero infrastructure - PGlite is Postgres compiled to WASM, runs in-process
- Real vector search - pgvector with HNSW indexes, not some janky cosine similarity loop
- Collection-based organization - Different collections for different contexts (codebase, research, notes)
- Configurable tool descriptions - The Qdrant MCP pattern: same tool, different behaviors via env vars
- Effect-TS - Proper error handling, resource management, composable services
```bash
# npm/bun/pnpm
npm install semantic-memory
# Need Ollama for embeddings
brew install ollama
ollama pull mxbai-embed-large
```

Store and search from the CLI:

```bash
# Via npx
npx semantic-memory store "The auth flow uses JWT tokens stored in httpOnly cookies"
npx semantic-memory find "how does authentication work"
# Or install globally
npm install -g semantic-memory
semantic-memory store "React component patterns" --collection code
semantic-memory find "components" --collection code
# Full-text search (no embeddings)
semantic-memory find "JWT" --fts
# Add metadata
semantic-memory store "API rate limits are 100 req/min" --metadata '{"source":"docs","priority":"high"}'
# List, get, delete
semantic-memory list
semantic-memory get <id>
semantic-memory delete <id>
# Stats
semantic-memory stats
```

Collections let you organize memories by purpose. The collection name carries semantic meaning:

```bash
# Codebase analysis - store patterns, architecture notes, API quirks
semantic-memory store "Auth uses httpOnly JWT cookies with 7-day refresh" --collection codebase
semantic-memory store "The useOptimistic hook requires a reducer pattern" --collection codebase
semantic-memory find "authentication" --collection codebase
# Research/learning - concepts, connections, questions
semantic-memory store "Effect-TS uses generators for async, not Promises" --collection research
semantic-memory find "effect async patterns" --collection research
# Project onboarding - gotchas, tribal knowledge, "why is it like this"
semantic-memory store "Don't use React.memo on components with children - causes stale closures" --collection gotchas
semantic-memory find "performance issues" --collection gotchas
# Personal knowledge - decisions, preferences, breakthroughs
semantic-memory store "Prefer composition over inheritance for React components" --collection decisions
semantic-memory find "react patterns" --collection decisionsSearch across all collections or within one:
# Search everything
semantic-memory find "authentication"
# Search specific collection
semantic-memory find "authentication" --collection codebaseThe killer feature: tool descriptions are configurable.
Same semantic memory, different agent behaviors:
# Codebase assistant - searches before generating, stores patterns found
TOOL_STORE_DESCRIPTION="Store code patterns, architecture decisions, and API quirks discovered while analyzing the codebase. Include file paths and context." \
TOOL_FIND_DESCRIPTION="Search codebase knowledge. Query BEFORE making changes to understand existing patterns." \
semantic-memory find "auth patterns"
# Research assistant - accumulates and connects ideas
TOOL_STORE_DESCRIPTION="Store concepts, insights, and connections between ideas. Include source references." \
TOOL_FIND_DESCRIPTION="Search research notes. Use to find related concepts and prior findings." \
semantic-memory find "async patterns"
# Onboarding assistant - captures tribal knowledge
TOOL_STORE_DESCRIPTION="Store gotchas, workarounds, and 'why is it like this' explanations. Future devs will thank you." \
TOOL_FIND_DESCRIPTION="Search for known issues and gotchas. Check BEFORE debugging to avoid known pitfalls." \
semantic-memory find "common mistakes"The description tells the LLM when and how to use the tool. Change the description, change the behavior. No code changes.
Drop this in `~/.config/opencode/tool/semantic-memory.ts`:

```ts
import { tool } from "@opencode-ai/plugin";
import { $ } from "bun";
// Rich descriptions that shape agent behavior
// Override via env vars for different contexts
const STORE_DESC =
process.env.TOOL_STORE_DESCRIPTION ||
"Persist important discoveries, decisions, and learnings for future sessions. Use for: architectural decisions, debugging breakthroughs, user preferences, project-specific patterns. Include context about WHY something matters.";
const FIND_DESC =
process.env.TOOL_FIND_DESCRIPTION ||
"Search your persistent memory for relevant context. Query BEFORE making architectural decisions, when hitting familiar-feeling bugs, or when you need project history. Returns semantically similar memories ranked by relevance.";
async function run(args: string[]): Promise<string> {
const result = await $`npx semantic-memory ${args}`.text();
return result.trim();
}
export const store = tool({
description: STORE_DESC,
args: {
information: tool.schema.string().describe("The information to store"),
collection: tool.schema
.string()
.optional()
.describe("Collection name (e.g., 'codebase', 'research', 'gotchas')"),
metadata: tool.schema
.string()
.optional()
.describe("Optional JSON metadata"),
},
async execute({ information, collection, metadata }) {
const args = ["store", information];
if (collection) args.push("--collection", collection);
if (metadata) args.push("--metadata", metadata);
return run(args);
},
});
export const find = tool({
description: FIND_DESC,
args: {
query: tool.schema.string().describe("Natural language search query"),
collection: tool.schema
.string()
.optional()
.describe("Collection to search (omit for all)"),
limit: tool.schema
.number()
.optional()
.describe("Max results (default: 10)"),
},
async execute({ query, collection, limit }) {
const args = ["find", query];
if (collection) args.push("--collection", collection);
if (limit) args.push("--limit", String(limit));
return run(args);
},
});
```

For project-specific behavior, create a wrapper script (see the sketch below) or use direnv:

```bash
# .envrc (with direnv)
export TOOL_STORE_DESCRIPTION="Store patterns found in this Next.js codebase. Include file paths."
export TOOL_FIND_DESCRIPTION="Search codebase patterns. Check before implementing new features."
```
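The wrapper-script route pins the same variables without direnv. A minimal Bun sketch; the filename `memory-wrapper.ts` is hypothetical, not part of the package:

```ts
#!/usr/bin/env bun
// memory-wrapper.ts (hypothetical): pin project-specific descriptions, then delegate.
import { $ } from "bun";

process.env.TOOL_STORE_DESCRIPTION ??=
  "Store patterns found in this Next.js codebase. Include file paths.";
process.env.TOOL_FIND_DESCRIPTION ??=
  "Search codebase patterns. Check before implementing new features.";

// Bun's $ inherits process.env, so the descriptions apply to the child process.
await $`npx semantic-memory ${Bun.argv.slice(2)}`;
```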
Or create project-specific OpenCode tools that hardcode the collection:

```ts
// .opencode/tool/codebase-memory.ts
import { tool } from "@opencode-ai/plugin";
import { $ } from "bun";

export const remember = tool({
description: "Store a pattern or insight about this codebase",
args: { info: tool.schema.string() },
async execute({ info }) {
return $`npx semantic-memory store ${info} --collection ${process.cwd()}`.text();
},
});
```

Configuration is all via environment variables:
| Variable | Default | Description |
|---|---|---|
| `SEMANTIC_MEMORY_PATH` | `~/.semantic-memory` | Where to store the database |
| `OLLAMA_HOST` | `http://localhost:11434` | Ollama API endpoint |
| `OLLAMA_MODEL` | `mxbai-embed-large` | Embedding model (1024 dims) |
| `COLLECTION_NAME` | `default` | Default collection |
| `TOOL_STORE_DESCRIPTION` | (see code) | MCP tool description for store |
| `TOOL_FIND_DESCRIPTION` | (see code) | MCP tool description for find |
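A sketch of how these defaults might resolve at startup; the `config` object and field names are illustrative, not the package's internals:

```ts
// Illustrative only: mapping the env vars above onto runtime defaults.
const config = {
  dataDir: process.env.SEMANTIC_MEMORY_PATH ?? `${process.env.HOME}/.semantic-memory`,
  ollamaHost: process.env.OLLAMA_HOST ?? "http://localhost:11434",
  ollamaModel: process.env.OLLAMA_MODEL ?? "mxbai-embed-large",
  collection: process.env.COLLECTION_NAME ?? "default",
};
```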
How it fits together:

```
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│   Ollama    │─────▶│   PGlite    │─────▶│  pgvector   │
│ (embeddings)│      │  (WASM PG)  │      │ (HNSW idx)  │
└─────────────┘      └─────────────┘      └─────────────┘
       │                    │                    │
       │             memories table       memory_embeddings
       │             - id                 - memory_id (FK)
       │             - content            - embedding vector(1024)
       │             - metadata (JSONB)
       │             - collection
       └─────────────────────────────────────────┘
                  cosine similarity search
```
- Ollama generates embeddings locally with `mxbai-embed-large` (1024 dimensions)
- PGlite is Postgres compiled to WASM - no server, runs in your process
- pgvector provides real vector operations with HNSW indexes for fast approximate nearest neighbor search
- Effect-TS handles errors, retries, and resource cleanup properly
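For a feel of the pipeline, here is a minimal sketch using `@electric-sql/pglite` with its bundled pgvector extension and Ollama's `/api/embeddings` endpoint. The table and column names mirror the diagram above, not necessarily the package's exact schema:

```ts
import { PGlite } from "@electric-sql/pglite";
import { vector } from "@electric-sql/pglite/vector";

// In-process Postgres with the pgvector extension loaded.
const db = new PGlite("./memory-db", { extensions: { vector } });

await db.exec(`
  CREATE EXTENSION IF NOT EXISTS vector;
  CREATE TABLE IF NOT EXISTS memories (
    id         SERIAL PRIMARY KEY,
    content    TEXT NOT NULL,
    metadata   JSONB DEFAULT '{}',
    collection TEXT NOT NULL DEFAULT 'default'
  );
  CREATE TABLE IF NOT EXISTS memory_embeddings (
    memory_id INT REFERENCES memories(id) ON DELETE CASCADE,
    embedding vector(1024)
  );
  CREATE INDEX IF NOT EXISTS embeddings_hnsw
    ON memory_embeddings USING hnsw (embedding vector_cosine_ops);
`);

// mxbai-embed-large via Ollama's local embeddings endpoint.
async function embed(text: string): Promise<number[]> {
  const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    body: JSON.stringify({ model: "mxbai-embed-large", prompt: text }),
  });
  const { embedding } = (await res.json()) as { embedding: number[] };
  return embedding;
}

// <=> is pgvector's cosine distance operator; 1 - distance gives similarity.
async function find(query: string, limit = 10) {
  const q = `[${(await embed(query)).join(",")}]`;
  const { rows } = await db.query(
    `SELECT m.content, 1 - (e.embedding <=> $1::vector) AS similarity
       FROM memory_embeddings e
       JOIN memories m ON m.id = e.memory_id
      ORDER BY e.embedding <=> $1::vector
      LIMIT $2`,
    [q, limit],
  );
  return rows;
}
```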
Use cases:

- Codebase analysis - store patterns, architecture decisions, and API quirks as you explore a new codebase. Query before making changes.
- Agent memory - remember facts across AI sessions. No more re-explaining context every conversation.
- Documentation - pre-load docs into a collection, search before hallucinating answers (loader sketch below).
- Research - accumulate findings, connect ideas across sources, build up domain knowledge.
- Team onboarding - capture the "why" behind decisions, known gotchas, and tribal knowledge for future team members.
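For the documentation case, pre-loading can be a short script. A Bun sketch, where the `docs/**/*.md` glob and the `docs` collection are arbitrary choices:

```ts
// load-docs.ts (illustrative): bulk-load markdown files into a "docs" collection.
import { $, Glob } from "bun";

for await (const path of new Glob("docs/**/*.md").scan(".")) {
  const content = await Bun.file(path).text();
  // Record the source path as metadata so search results can cite files.
  await $`npx semantic-memory store ${content} --collection docs --metadata ${JSON.stringify({ source: path })}`;
}
```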
License: MIT