Add benchmark suite for memory quality evaluation #54

## Problem
The project has no published benchmarks or evaluation scores, so there is no way to demonstrate quality against alternatives. In 2026, published benchmark results are the primary credibility signal for memory systems.

## Proposed Solution
1. Implement evaluation scripts against standard benchmarks (LOCOMO, LongMemEval, AMA-Bench)
2. Create an internal eval harness that measures:
   - Retrieval precision/recall at k
   - Compression quality scores (already available via `scoreCompression`)
   - End-to-end accuracy (store → retrieve → use in context)
   - Latency percentiles (P50, P95, P99)
3. Add benchmark results to README
4. Run benchmarks in CI to catch regressions
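
As a starting point for the harness in step 2 and the CI gate in step 4, the core metrics could look roughly like the sketch below. It is written in Python purely for illustration; the function names (`precision_recall_at_k`, `percentile`, `regressed`), the nearest-rank percentile method, and the 0.02 tolerance are all assumptions, not existing project APIs or agreed thresholds.

```python
import math
from typing import Sequence, Set, Tuple


def precision_recall_at_k(
    retrieved: Sequence[str], relevant: Set[str], k: int
) -> Tuple[float, float]:
    """Precision@k and recall@k for a single query.

    retrieved: memory IDs in ranked order (best first).
    relevant:  ground-truth memory IDs for the query.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    # Recall is undefined with no relevant items; report 0.0 by convention here.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile (no interpolation), e.g. pct=95 for P95."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def regressed(current: float, baseline: float, tolerance: float = 0.02) -> bool:
    """True if a higher-is-better metric fell more than `tolerance` below baseline.

    The default tolerance is a placeholder, not a project-agreed threshold.
    """
    return current < baseline - tolerance
```

A CI job could compute these per benchmark run, compare them against a stored baseline, and fail when `regressed(...)` returns true. Latency metrics are lower-is-better, so they would need the comparison inverted.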

## Impact
Establishes credibility and enables data-driven optimization decisions.
