Memsy adapter and runner for the MemoryBench evaluation suite.
Evaluated on the full LoCoMo dataset (1,540 questions). Judge and answerer: gpt-4.1-mini.
| Run | k | Accuracy | Hit@10 | MRR |
|---|---|---|---|---|
| cdyi | 10 | 88.90% | 95.97% | 0.9089 |
| 2n68 | 20 | 88.83% | — | — |
| rik3 | 20 | 88.12% | — | — |
Full breakdown by question type, retrieval quality, and latency: results/BENCHMARK_RESULTS.md.
For reference, mem0 scores 82.7% at k=50 on the same dataset. All three Memsy runs above exceed that with k ≤ 20.
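For readers unfamiliar with the retrieval columns, here is a minimal sketch of how Hit@k and MRR are conventionally computed. It is not taken from MemoryBench, which may compute them differently. The input format (one line per question holding the 1-based rank of the first relevant retrieved memory, or 0 when none was retrieved) and the function name `retrieval_metrics` are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads one rank per line on stdin.
#   rank > 0  -> position of the first relevant result (1-based)
#   rank == 0 -> no relevant result was retrieved
retrieval_metrics() {
  local k="$1"
  awk -v k="$k" '
    { n++ }                          # count every question
    $1 > 0 { rr += 1 / $1 }          # reciprocal rank; misses contribute 0
    $1 > 0 && $1 <= k { hits++ }     # relevant result within the top k
    END {
      if (n == 0) { print "no data"; exit 1 }
      printf "Hit@%d=%.4f MRR=%.4f\n", k, hits / n, rr / n
    }'
}

# Four example questions: ranks 1, 3, miss, 2.
printf '1\n3\n0\n2\n' | retrieval_metrics 10
# prints: Hit@10=0.7500 MRR=0.4583
```

Under these definitions, an MRR of 0.9089 means the first relevant memory usually sits at or very near rank 1.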
# 1. Configure env
cp .env.example .env
# Fill in MEMSY_API_KEY and OPENAI_API_KEY in .env
# 2. One-time setup (clones MemoryBench, injects adapter, installs deps)
./setup.sh
# 3. Run benchmark
./run.sh 5 # 5 questions (standard)
./run.sh 1 # 1 question (smoke test)

Get your Memsy API key at app.memsy.io → Settings → API Keys.
For the full guide — configuration options, troubleshooting, manual commands — see BENCHMARKING.md.
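Before spending API credits on a run, it can help to confirm that both keys are actually filled in. The following is a minimal sketch, not part of the repository: the function name `check_env_file` is hypothetical, and it assumes `.env` uses plain `KEY=value` lines without an `export` prefix.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check: report required keys that are missing or empty.
check_env_file() {
  local file="$1" key missing=0
  for key in MEMSY_API_KEY OPENAI_API_KEY; do
    # Require KEY=<at least one character>; a bare "KEY=" counts as unset.
    # Quoted empty values such as KEY="" are not detected by this simple check.
    if ! grep -Eq "^${key}=.+" "$file" 2>/dev/null; then
      echo "missing: ${key}"
      missing=1
    fi
  done
  return "$missing"
}

# Example usage (run the smoke test only if both keys are present):
#   check_env_file .env && ./run.sh 1
```

The function returns a non-zero status when any key is missing, so it composes with `&&` in front of `./run.sh`.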
benchmark/
├── adapter/ # Memsy provider for MemoryBench (index.ts, prompts.ts)
├── patches/ # Patches applied to MemoryBench at setup time
├── results/ # Saved benchmark runs and history
├── setup.sh # One-time setup script
├── run.sh # Benchmark runner with result logging
├── update_patches.sh # Regenerates patches/ from a working memorybench/ checkout
└── BENCHMARKING.md # Full how-to guide
memorybench/ # Cloned by setup.sh (gitignored — not committed)
setup.sh clones supermemoryai/memorybench, copies the Memsy adapter from adapter/ into it, applies the patches from patches/, and installs npm dependencies. run.sh then drives the benchmark and appends results to BENCHMARK_HISTORY.md.
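The setup steps described above can be sketched roughly as follows. This is an illustration of the flow, not the contents of setup.sh. The GitHub URL, the adapter destination path inside the checkout, the `.patch` file extension, and the `run`/`setup_sketch` helper names are all assumptions. The sketch also assumes it is run from the directory that contains adapter/ and patches/.

```shell
#!/usr/bin/env bash
# Print each command; execute it only when DRY_RUN is not set to 1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then "$@"; fi
}

setup_sketch() {
  local repo_url="https://github.com/supermemoryai/memorybench"  # assumed host
  local dest="memorybench"
  local adapter_dest="$dest/src/providers/memsy"  # hypothetical location
  local p

  # 1. Clone MemoryBench unless a checkout already exists.
  [ -d "$dest" ] || run git clone "$repo_url" "$dest"

  # 2. Copy the Memsy adapter into the checkout.
  run mkdir -p "$adapter_dest"
  run cp -R adapter/. "$adapter_dest/"

  # 3. Apply each patch inside the checkout (absolute path, since -C changes directory).
  for p in patches/*.patch; do
    [ -e "$p" ] || continue   # glob did not match: no patches present
    run git -C "$dest" apply "$PWD/$p"
  done

  # 4. Install npm dependencies inside the checkout.
  run npm --prefix "$dest" install
}

# Preview the commands without executing them:
#   DRY_RUN=1 setup_sketch
```

Running with `DRY_RUN=1` only prints the planned commands, which is a cheap way to inspect the flow before touching the network.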
This project is licensed under the MIT License.
It depends on and patches MemoryBench by supermemory (MIT © 2025 supermemory). See NOTICE.