MCP server providing semantic memory with FAISS + SQLite hybrid storage and optional GPU acceleration.

## Features

- Semantic search via a FAISS vector index
- Persistent storage via SQLite
- GPU bridge support for remote GPU computation
- Ollama embeddings (`nomic-embed-text` by default)
- Fallback to hash-based embeddings when Ollama is unavailable
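The README doesn't show the hash-based fallback itself; the sketch below is one way such a deterministic fallback could work (the function name and hashing scheme are assumptions for illustration, not the package's actual code):

```python
import hashlib
import math

def hash_embedding(text: str, dim: int = 768) -> list[float]:
    """Deterministic pseudo-embedding derived from SHA-256 digests.

    Not semantically meaningful like a real model, but stable across runs,
    which keeps the FAISS index usable when Ollama is down.
    """
    vec = []
    for i in range(dim):
        # Salt the text with the dimension index so each component differs
        h = hashlib.sha256(f"{i}:{text}".encode()).digest()
        # Map the first 8 digest bytes to a float in [-1, 1)
        vec.append(int.from_bytes(h[:8], "big") / 2**63 - 1.0)
    # L2-normalize so cosine/inner-product similarity behaves sensibly
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]
```

The key property is determinism: the same text always maps to the same vector, so stored memories remain retrievable by exact or near-exact queries even without a real embedding model.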
## Installation

```bash
# From PyPI (when published)
pip install mcp-memory-gpu

# From GitHub
pip install git+https://github.com/YOUR_USERNAME/mcp-memory-gpu.git

# With GPU support (quoted so shells don't glob the brackets)
pip install "mcp-memory-gpu[gpu]"
```

## Configuration

Add to `%APPDATA%\Claude\claude_desktop_config.json` (Windows) or `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS):
```json
{
  "mcpServers": {
    "memory": {
      "command": "mcp-memory-gpu",
      "env": {
        "MCP_EMBEDDING_URL": "http://localhost:11434",
        "MCP_EMBEDDING_MODEL": "nomic-embed-text",
        "MCP_GPU_BRIDGE": "http://your-gpu-server:5000",
        "MCP_GPU_TOKEN": "your-secret-token"
      }
    }
  }
}
```

## Environment Variables

| Variable | Default | Description |
|---|---|---|
| MCP_MEMORY_DB | `~/.mcp-memory/memory.db` | SQLite database path |
| MCP_MEMORY_INDEX | `~/.mcp-memory/memory.faiss` | FAISS index path |
| MCP_EMBEDDING_URL | `http://localhost:11434` | Ollama API URL |
| MCP_EMBEDDING_MODEL | `nomic-embed-text` | Embedding model |
| MCP_EMBEDDING_DIM | `768` | Embedding dimension |
| MCP_GPU_BRIDGE | (none) | GPU bridge URL for remote computation |
| MCP_GPU_TOKEN | (none) | Bearer token for GPU bridge auth |
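As a sketch of how these variables might be consumed on the server side (a hypothetical helper mirroring the defaults in the table above; the function and key names are assumptions, not the package's actual API):

```python
import os
from pathlib import Path

def load_config() -> dict:
    """Read MCP_* environment variables, falling back to the documented defaults."""
    home = Path.home() / ".mcp-memory"
    return {
        "db_path": os.environ.get("MCP_MEMORY_DB", str(home / "memory.db")),
        "index_path": os.environ.get("MCP_MEMORY_INDEX", str(home / "memory.faiss")),
        "embedding_url": os.environ.get("MCP_EMBEDDING_URL", "http://localhost:11434"),
        "embedding_model": os.environ.get("MCP_EMBEDDING_MODEL", "nomic-embed-text"),
        "embedding_dim": int(os.environ.get("MCP_EMBEDDING_DIM", "768")),
        # These two have no default: the GPU bridge is optional
        "gpu_bridge": os.environ.get("MCP_GPU_BRIDGE"),
        "gpu_token": os.environ.get("MCP_GPU_TOKEN"),
    }
```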
## Tools

**Store** — store information with category/key organization:

```json
{"category": "config", "key": "api_url", "value": "https://api.example.com"}
```

**Search** — semantic search across all memories:

```json
{"query": "how to connect to the API", "limit": 5}
```

**Get** — retrieve a specific memory by category and key:

```json
{"category": "config", "key": "api_url"}
```

**Delete** — delete a memory entry:

```json
{"category": "config", "key": "api_url"}
```

**List** — list all categories, or the items in a category:

```json
{"category": "config"}
```

**Stats** — get memory statistics.
## GPU Bridge

For GPU-accelerated embeddings, run the bridge server on your GPU machine:

```python
# bridge/server.py on GPU machine
from flask import Flask, request, jsonify
from sentence_transformers import SentenceTransformer

app = Flask(__name__)

# nomic-embed-text-v1 ships custom model code, so trust_remote_code=True is required
model = SentenceTransformer('nomic-ai/nomic-embed-text-v1', device='cuda',
                            trust_remote_code=True)
AUTH_TOKEN = 'your-secret-token'

@app.route('/embedding', methods=['POST'])
def embedding():
    auth = request.headers.get('Authorization', '')
    if auth != f'Bearer {AUTH_TOKEN}':
        return jsonify({'error': 'unauthorized'}), 401
    text = request.json.get('text', '')
    vec = model.encode(text).tolist()
    return jsonify({'embedding': vec})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```

## Architecture
```
Windows/macOS (CPU)           GPU Server (Pop!_OS, etc.)
┌─────────────────┐           ┌─────────────────────┐
│  Claude Code    │           │  GPU Bridge         │
│  MCP Server     │◄────────► │  FAISS GPU          │
│  SQLite         │   HTTP    │  Sentence Transform │
└─────────────────┘           └─────────────────────┘
```
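A client talking to the bridge must send the same Bearer token the server checks. A stdlib-only sketch (the function names here are illustrative, not the package's actual API):

```python
import json
import urllib.request

def build_request(bridge_url: str, token: str, text: str) -> urllib.request.Request:
    """Build a POST to the bridge's /embedding endpoint with Bearer auth."""
    return urllib.request.Request(
        f"{bridge_url}/embedding",
        data=json.dumps({"text": text}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def fetch_embedding(bridge_url: str, token: str, text: str) -> list[float]:
    req = build_request(bridge_url, token, text)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())["embedding"]
```

Note the token travels in plaintext over HTTP; if the bridge is reachable beyond your LAN, put it behind TLS (e.g. a reverse proxy).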
## License

MIT