feat: SummaryMemory backend — rolling LLM-generated compression (closes #3) by Neal006 · Pull Request #7 · Neal006/memorylens

Neal006 · 2026-05-22T03:44:43Z

What does this PR do?

Implements SummaryMemory — a new memory backend that compresses conversation history into a rolling summary, addressing Issue #3.

The backend has two compression modes so it works in every environment:

Mode	When active	How it compresses
LLM	`GROQ_API_KEY` is set	Groq (llama-3.1-8b-instant) generates an abstractive summary, handling fact updates in natural language
Extractive	No API key (CI, offline)	Regex-based fact-pattern extraction — zero cost, fully deterministic
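The mode selection in the table above can be pictured with a minimal sketch. It assumes the backend only checks whether `GROQ_API_KEY` is present and non-empty. The helper name `resolve_mode` and its `env` parameter are hypothetical and not taken from `memory/summary.py`.

```python
import os
from typing import Mapping, Optional


def resolve_mode(use_llm: Optional[bool] = None,
                 env: Optional[Mapping[str, str]] = None) -> str:
    """Return "llm" or "extractive" (hypothetical helper, for illustration only)."""
    env = os.environ if env is None else env
    if use_llm is None:
        # Assumption: auto-detect means "GROQ_API_KEY is set and non-empty".
        use_llm = bool(env.get("GROQ_API_KEY"))
    return "llm" if use_llm else "extractive"
```

Passing `use_llm=False` explicitly would force the extractive path even when a key is present, which is one way CI runs could stay deterministic.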

Benchmark results (extractive mode, 100 turns, 8 tracked facts)

Backend	T=25	T=50	T=75	T=100	Tokens/Query
Naive	100%	100%	100%	62.5%	1,189
RAG	100%	100%	100%	100%	58
Cascading	100%	100%	87.5%	75.0%	261
SummaryMemory	100%	100%	100%	100%	318

SummaryMemory matches RAG's recall while carrying richer narrative context through its running summary, at roughly 3.7× lower token cost than naive (318 vs. 1,189 tokens/query).
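With 8 tracked facts, every recall figure in the table is a multiple of 12.5%, so 62.5% would correspond to 5 of 8 facts recovered. The sketch below shows one way such a score could be computed. It assumes recall means "the fact's value appears in the context string", matched case-insensitively; the function name `fact_recall` is hypothetical and the actual benchmark in `evaluation/benchmark.py` may score recall differently.

```python
from typing import Mapping


def fact_recall(context: str, facts: Mapping[str, str]) -> float:
    """Fraction of tracked fact values found in the context.

    Assumption: case-insensitive substring matching. This is an
    illustrative metric, not the one used by the real benchmark.
    """
    if not facts:
        return 0.0
    lowered = context.lower()
    hits = sum(1 for value in facts.values() if value.lower() in lowered)
    return hits / len(facts)
```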

Type of change

New memory backend

Related issue

Closes #3

How was this tested?

python tests/test_pipeline.py   # 14/14 tests pass
python tests/test_imports.py    # import smoke test passes

6 new tests added:

test_summary_extractive_fallback_recall_early — ≥75% recall at T=15
test_summary_compresses_overflow — recent buffer stays within window_size
test_summary_context_contains_summary_and_recent — correct context structure
test_summary_reset_clears_state — reset() wipes both buffer and summary
test_summary_token_cost_bounded — tokens < 2000 at T=100
test_summary_benchmark_registration — _make_memory("summary") resolves correctly

Checklist

All existing tests pass (python tests/test_pipeline.py)
6 new tests added and passing
Type hints on all public methods
CHANGELOG.md updated under ## [Unreleased]
No hardcoded API keys or secrets
Works with zero API key (extractive fallback)
Works with GROQ_API_KEY (LLM mode auto-detected)

Files changed

File	Change
`memory/summary.py`	New — SummaryMemory implementation
`evaluation/benchmark.py`	Register `"summary"` in `_make_memory()`
`tests/test_pipeline.py`	6 new tests (14 total)
`tests/test_imports.py`	Added SummaryMemory import
`CHANGELOG.md`	[Unreleased] section

Rolling-summary memory with two compression modes:
- LLM mode (GROQ_API_KEY set): Groq abstractive summarisation; preserves semantic meaning and handles fact updates in natural language
- Extractive fallback (zero cost): regex fact-pattern extraction; works with no API key, passes all CI tests

Benchmark results (extractive, 100 turns, 8 facts):
naive       62.5% recall @ 1,189 tokens/query
rag        100.0% recall @ 58 tokens/query
cascading   75.0% recall @ 261 tokens/query
summary    100.0% recall @ 318 tokens/query ← new

SummaryMemory matches RAG recall while carrying richer narrative context via its running summary, at roughly 3.7x lower token cost than naive (318 vs. 1,189 tokens/query).

Changes:
- memory/summary.py: SummaryMemory class + extractive + LLM helpers
- evaluation/benchmark.py: register "summary" in _make_memory()
- tests/test_pipeline.py: 6 new tests (14 total, all passing)
- tests/test_imports.py: SummaryMemory import check
- CHANGELOG.md: [Unreleased] section

Copilot

Pull request overview

Adds a new SummaryMemory backend to MemoryLens that maintains a rolling conversation summary plus a bounded recent-message window, with optional Groq LLM summarization and a deterministic extractive fallback. This fits into the existing set of memory backends (naive / RAG / cascading) used by the benchmark runner and pipeline tests.

Changes:

Introduces memory/summary.py implementing SummaryMemory with LLM + extractive compression modes.
Registers the "summary" backend in evaluation/benchmark.py and tightens unknown-backend handling.
Adds SummaryMemory coverage in tests/test_pipeline.py and import smoke coverage in tests/test_imports.py, plus changelog entry.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`memory/summary.py`	New rolling-summary backend with LLM/extractive compression and bounded recent buffer.
`evaluation/benchmark.py`	Adds `"summary"` backend to `_make_memory()` and makes unknown backends error explicitly.
`tests/test_pipeline.py`	Adds 6 integration-style tests validating SummaryMemory behavior/metrics.
`tests/test_imports.py`	Adds import smoke-test for `SummaryMemory`.
`CHANGELOG.md`	Documents the new backend and benchmark results under `[Unreleased]`.

+    def add_message(self, role: str, content: str, turn: int) -> None:
+        self.recent.append({"role": role, "content": content, "turn": turn})
+        # Compress whenever the verbatim buffer grows past the window
+        if len(self.recent) > self.window_size:
+            self._compress()
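The overflow check in `add_message` above implies a compression step that moves old messages out of the verbatim buffer. The sketch below is one possible shape for the extractive path, under stated assumptions: the class name `RollingSummarySketch`, the `facts` dictionary, the `get_context` layout, and the "my X is Y" regex are all hypothetical and are not the code from `memory/summary.py`. It also assumes `window_size` is at least 1.

```python
import re
from typing import Dict, List

# Hypothetical fact pattern: statements of the form "my X is Y".
FACT_PATTERN = re.compile(r"\bmy\s+([\w ]+?)\s+is\s+([^.!?\n]+)", re.IGNORECASE)


class RollingSummarySketch:
    """Illustrative stand-in, not the SummaryMemory from memory/summary.py."""

    def __init__(self, window_size: int = 20) -> None:
        self.window_size = window_size          # assumed to be >= 1
        self.recent: List[Dict[str, object]] = []
        self.facts: Dict[str, str] = {}         # latest value per fact key

    def add_message(self, role: str, content: str, turn: int) -> None:
        self.recent.append({"role": role, "content": content, "turn": turn})
        # Compress whenever the verbatim buffer grows past the window
        if len(self.recent) > self.window_size:
            self._compress()

    def _compress(self) -> None:
        # Split off the overflow (oldest messages) and keep the newest window.
        overflow = self.recent[:-self.window_size]
        self.recent = self.recent[-self.window_size:]
        for msg in overflow:
            for key, value in FACT_PATTERN.findall(str(msg["content"])):
                # A later statement about the same key overwrites the earlier one.
                self.facts[key.strip().lower()] = value.strip()

    def get_context(self) -> str:
        summary = "; ".join(f"{k}: {v}" for k, v in self.facts.items())
        recent = "\n".join(f"{m['role']}: {m['content']}" for m in self.recent)
        return f"[Summary]\n{summary}\n[Recent]\n{recent}"

    def reset(self) -> None:
        self.recent.clear()
        self.facts.clear()
```

Keeping only the latest value per key is a simple way to handle fact updates deterministically; an LLM-generated summary would instead rewrite the running text, which this sketch does not attempt.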


+    if name == "summary":
+        # use_llm=None → auto-detect from GROQ_API_KEY env var
+        return SummaryMemory(window_size=20, use_llm=None)
+    raise ValueError(f"Unknown backend: '{name}'. Choose from: naive, rag, cascading, summary")


Copilot AI review requested due to automatic review settings May 22, 2026 03:44

Copilot started reviewing on behalf of Neal006 May 22, 2026 03:45

Copilot AI reviewed May 22, 2026

Neal006 merged commit 0ca3007 into main May 22, 2026
6 checks passed

Neal006 deleted the feat/summary-memory branch May 22, 2026 03:50

Neal006 merged 1 commit into main from feat/summary-memory
