Feature Request: Model-Agnostic Memory Layer (Global & Per-Project) #8043
Problem
When using OpenCode across multiple LLM providers (Claude, Gemini, Kimi, OpenRouter, etc.), there's no persistent memory between sessions or across models. Each conversation starts fresh, losing valuable context about:
- User preferences and coding style
- Project-specific decisions and architecture
- Previously solved problems and their solutions
- Domain knowledge accumulated over time
Proposed Solution
Add a model-agnostic memory layer to OpenCode that works transparently regardless of which LLM provider is being used.
Key Features
- Two memory scopes (see the sketch after this list):
  - Global memory (~/.opencode/memory/) - personal preferences, cross-project knowledge
  - Project memory (.opencode/memory/) - codebase-specific context, architecture decisions
- Provider independence:
  - Use a local embedding model (e.g., Qwen3-Embedding-0.6B) - no API dependency
  - Local vector storage (LanceDB, SQLite, or similar)
  - Optional: a configurable "memory LLM" separate from the main model (could be a cheap/fast model)
- Transparent operation:
  - Automatically extract relevant facts from conversations
  - Inject relevant context before each LLM call
  - Works the same whether using Claude, Gemini, Kimi, or any other provider
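To make the two scopes concrete, here is a minimal sketch (TypeScript, for illustration only) of what a scope-aware memory store could look like. The names MemoryScope, MemoryEntry, and MemoryStore are hypothetical, not existing OpenCode APIs:

// Illustrative sketch only - these types are not part of OpenCode today.
type MemoryScope = "global" | "project";

interface MemoryEntry {
  id: string;
  scope: MemoryScope;     // "global" -> ~/.opencode/memory/, "project" -> .opencode/memory/
  text: string;           // the remembered fact, e.g. "prefers functional style"
  embedding: number[];    // produced by the local embedding model
  createdAt: string;      // ISO timestamp
}

interface MemoryStore {
  // Persist a new fact under the appropriate scope directory
  add(entry: Omit<MemoryEntry, "id" | "createdAt">): Promise<MemoryEntry>;
  // Vector-search both scopes and return the top-k most similar facts
  search(queryEmbedding: number[], k: number): Promise<MemoryEntry[]>;
}

Keeping the store behind a small interface like this is what makes the layer provider-independent: only the embedding model and the vector backend need to be local, while the main LLM can be anything.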
Example Architecture
User Query
│
▼
┌──────────────────────┐
│ Memory Retrieval │ ◄── Local embeddings + vector search
│ (relevant context) │
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ Context Injection │ ◄── Prepend relevant memories to prompt
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ LLM Call (any) │ ◄── Claude, Gemini, Kimi, etc.
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ Memory Extraction │ ◄── Extract facts from response (async)
│ (background task) │
└──────────────────────┘
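The flow above could look roughly like the following sketch (TypeScript, illustrative only). It reuses the hypothetical MemoryStore interface from the earlier sketch; embed, callProvider, and extractFacts are placeholder functions, not existing OpenCode internals:

// Placeholders standing in for the local embedder, the active provider, and the extraction step.
declare function embed(text: string): Promise<number[]>;
declare function callProvider(prompt: string): Promise<string>;
declare function extractFacts(query: string, response: string): Promise<string[]>;
declare const memoryStore: MemoryStore;

async function handleQuery(userQuery: string): Promise<string> {
  // 1. Memory retrieval: embed the query locally and search the vector store
  const queryEmbedding = await embed(userQuery);
  const memories = await memoryStore.search(queryEmbedding, 5);

  // 2. Context injection: prepend relevant memories to the prompt
  const prompt = [
    "Relevant memories:",
    ...memories.map((m) => `- ${m.text}`),
    "",
    userQuery,
  ].join("\n");

  // 3. LLM call: any provider (Claude, Gemini, Kimi, ...)
  const response = await callProvider(prompt);

  // 4. Memory extraction: run in the background so it never blocks the reply
  void extractFacts(userQuery, response).then((facts) =>
    Promise.all(
      facts.map(async (text) =>
        memoryStore.add({ scope: "project", text, embedding: await embed(text) })
      )
    )
  );

  return response;
}

Because the provider only appears in step 3, swapping Claude for Gemini or Kimi mid-project leaves the memory layer untouched.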
Use Cases
- Preference persistence: "I prefer functional style" remembered across all projects and models
- Project context: Architecture decisions stay available even when switching models mid-project
- Cross-session continuity: Resume work without re-explaining context
- Knowledge accumulation: Build up domain expertise over time
Configuration (example)
# opencode.toml
[memory]
enabled = true
global_enabled = true
project_enabled = true
# Optional: dedicated model for memory operations (cheap/fast)
extraction_model = "openai/gpt-4.1-mini"
# Local embedding model (no API needed)
embedding_model = "Qwen/Qwen3-Embedding-0.6B"
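A possible shape for the parsed [memory] table, sketched in TypeScript. The MemoryConfig type, the defaults, and loadMemoryConfig are all assumptions for illustration; only the field names come from the example above:

// Hypothetical config type mirroring the [memory] table above.
interface MemoryConfig {
  enabled: boolean;
  global_enabled: boolean;
  project_enabled: boolean;
  extraction_model?: string;  // optional dedicated cheap/fast model
  embedding_model: string;    // local model, no API needed
}

// Assumed defaults for this sketch (off unless explicitly enabled).
const defaults: MemoryConfig = {
  enabled: false,
  global_enabled: true,
  project_enabled: true,
  embedding_model: "Qwen/Qwen3-Embedding-0.6B",
};

// Merge user-provided values from opencode.toml over the defaults
function loadMemoryConfig(raw: Partial<MemoryConfig>): MemoryConfig {
  return { ...defaults, ...raw };
}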
Prior Art / Inspiration
- SimpleMem (https://github.com/aiming-lab/SimpleMem) - Efficient lifelong memory for LLM agents using semantic compression
- Mem0 (https://github.com/mem0ai/mem0) - Memory layer for AI applications
Additional Context
This would be especially valuable for users who:
- Switch between providers based on task (coding vs reasoning vs speed)
- Work on multiple projects with different contexts
- Want continuity without provider lock-in