LCORE-1571: Extend conversation cache for compaction summaries by max-svistunov · Pull Request #1795 · lightspeed-core/lightspeed-stack

max-svistunov · 2026-05-26T07:50:49Z

Description

Extends the conversation cache with persistent storage for conversation-compaction summaries (part of the Conversation Compaction epic, LCORE-1631).

When compaction summarizes older conversation turns (LCORE-1570), each run produces a ConversationSummary chunk. This PR lets those chunks be persisted and retrieved per conversation so they survive across requests and service restarts.

Cache ABC (src/cache/cache.py): new store_summary(), get_summaries(), and replace_summaries() abstract methods.
SQLite & PostgreSQL: new conversation_summaries table keyed by (user_id, conversation_id, created_at), supporting multiple additive summary chunks per conversation and returned oldest-first. The table is created with CREATE TABLE IF NOT EXISTS in initialize_cache(), which doubles as a forward-only schema migration for existing databases. delete() cascades to the summaries table.
Recursive-fold persistence (replace_summaries()): atomically replaces a conversation's accumulated summary chunks with a single folded summary (delete + insert in one transaction). This is the persistence layer for recursive re-summarization (R3, LCORE-1572) — a fold is computed once and reused rather than recomputed per request. Ordering by created_at means a later get_summaries() returns the fold first, followed by any chunks appended after it (the active set).
In-memory & no-op backends: interface-satisfying no-op stubs, mirroring the existing set_topic_summary contract.

Reuses the ConversationSummary model from LCORE-1570. No public API or OpenAPI changes.

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: Claude (Claude Code)
Generated by: Claude Opus 4.7

Related Tickets & Documents

Related Issue # LCORE-1571
Closes # LCORE-1571

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Cache-layer change; verified with unit tests (no running stack required).

Cache backend tests: uv run python -m pytest tests/unit/cache/ -q -> 140 passed, 1 skipped. For SQLite (real on-disk DB): store/retrieve a summary, additive multi-chunk ordering, per-(user, conversation) isolation, persistence across a fresh cache instance (simulated restart), forward-only migration on a legacy DB that lacks the new table, cascade delete with the conversation, and disconnected error paths. replace_summaries() is covered for collapse, empty-conversation, fold-then-append ordering, per-conversation isolation, and disconnected/DB-error paths. PostgreSQL paths covered via mocked psycopg2 (store/replace params, row reconstruction, empty result, disconnected/DB-error). Memory and no-op backends: no-op behavior and conversation-id validation.
Full unit suite (confirms the new abstract methods broke no existing Cache users): uv run python -m pytest tests/unit -> 2373 passed, 1 skipped.

Summary by CodeRabbit

New Features
- Added conversation summary storage and retrieval capabilities across all cache backends (in-memory, SQLite, PostgreSQL).
- Conversation deletion now cascades to remove associated summaries.
Tests
- Added comprehensive test coverage for summary operations across cache implementations, including edge cases and error handling.

Add persistent storage for conversation-compaction summary chunks (ConversationSummary, introduced in LCORE-1570) to the conversation cache so summaries survive across requests and service restarts. - Cache ABC: new abstract store_summary() and get_summaries() methods. - SQLite and PostgreSQL: new conversation_summaries table keyed by (user_id, conversation_id, created_at), supporting multiple additive summary chunks per conversation and returning them oldest-first. The table is created with CREATE TABLE IF NOT EXISTS in initialize_cache(), which doubles as a forward-only schema migration for existing databases. delete() now cascades to the summaries table. - In-memory and no-op backends: interface-satisfying no-op stubs that validate the compound key but do not persist, mirroring the existing set_topic_summary contract. The return annotations use builtins.list because the Cache class shadows the list builtin with a method of the same name.

Cover the new store_summary()/get_summaries() across every cache backend. - SQLite (real on-disk store): store and retrieve a single summary, additive multi-chunk ordering, per-(user, conversation) isolation, persistence across a fresh cache instance (restart), forward-only migration on a legacy database lacking the new table, cascade delete with the conversation, and disconnected error paths. - PostgreSQL (mocked psycopg2): insert parameters in column order, reconstruction of summaries from result rows, empty result, and disconnected and database-error paths. - In-memory and no-op backends: no-op behavior and conversation-id validation.

coderabbitai · 2026-05-26T07:50:59Z

Walkthrough

This PR extends all cache implementations with conversation summary storage capabilities. It adds three new methods to the abstract Cache contract—store_summary, get_summaries, and replace_summaries—and implements them across InMemoryCache, NoopCache, PostgresCache, and SQLiteCache. PostgreSQL and SQLite backends include new table schemas, SQL statements, and transaction management, while no-op implementations provide key validation without persistence. The changes include comprehensive unit test coverage for all implementations.

Changes

Conversation Summary Storage API

Layer / File(s)	Summary
Abstract Cache Contract `src/cache/cache.py`	Three new abstract methods define the summary storage contract: `store_summary()` appends a chunk, `get_summaries()` retrieves all chunks ordered by `created_at`, and `replace_summaries()` atomically replaces chunks with a folded summary.
In-Memory and No-op Implementations `src/cache/in_memory_cache.py`, `src/cache/noop_cache.py`, `tests/unit/cache/test_in_memory_cache.py`, `tests/unit/cache/test_noop_cache.py`	InMemoryCache and NoopCache implement summary methods as no-ops: validation-only with no persistence. Tests verify no-op behavior, key validation, and consistent error handling across all three methods.
PostgreSQL Summary Storage `src/cache/postgres_cache.py`, `tests/unit/cache/test_postgres_cache.py`	PostgreSQL adds a `conversation_summaries` table keyed by `(user_id, conversation_id, created_at)`. Store/get/replace methods execute INSERT/SELECT/DELETE statements with error handling. Delete cascade removes summaries when conversations are deleted. Tests mock psycopg2 behavior to verify disconnected errors, row reconstruction, parameter values, and database error wrapping.
SQLite Summary Storage with Migration `src/cache/sqlite_cache.py`, `tests/unit/cache/test_sqlite_cache.py`	SQLite adds a `conversation_summaries` table created with CREATE TABLE IF NOT EXISTS during initialization, enabling schema migration on existing databases. Store/get/replace methods execute INSERT/SELECT/DELETE in transactions. Delete cascade removes summaries. Tests verify storage, retrieval, scoping, persistence across reconnect, cascade deletion, error handling, and schema migration of legacy databases.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

anik120
tisnik

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely summarizes the main change: extending the conversation cache to support persistence of compaction summaries.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

✨ Simplify code

Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Add a replace_summaries operation to the conversation-summary cache: it atomically removes a conversation's existing summary chunks and stores a single folded summary in their place. Recursive re-summarization (R3) folds accumulated chunks into one when they themselves grow large; this gives that fold a persistent home so it is computed once and reused, rather than recomputed per request. Implemented across the Cache ABC and all backends: SQLite and PostgreSQL persist (delete + insert in one transaction); in-memory and no-op backends are interface-satisfying stubs, mirroring store_summary. Ordering by created_at means a subsequent get_summaries returns the fold first, followed by any chunks appended after it. Adds 12 unit tests covering collapse, empty-conversation, fold-then-append ordering, per-conversation isolation, and disconnected/DB-error paths.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/cache/sqlite_cache.py (1)

207-208: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

replace_summaries can leave conversation_summaries empty under SQLite autocommit.

connect() sets self.connection.autocommit = True (~line 207), and replace_summaries() runs the DELETE (~690) and INSERT (~691-692) without an explicit BEGIN/rollback; if the INSERT fails after the DELETE, the deletion can persist even though the method ends with self.connection.commit().
The @connection decorator only ensures the cache is connected; it does not wrap the call in a transaction (so the “single transaction” docstring for replace_summaries is not actually guaranteed).

💡 Proposed fix (explicit transactional block)

     def replace_summaries(
         self,
         user_id: str,
         conversation_id: str,
         folded_summary: ConversationSummary,
         skip_user_id_check: bool = False,
     ) -> None:
@@
         if self.connection is None:
             logger.error("Cache is disconnected")
             raise CacheError("replace_summaries: cache is disconnected")

-        cursor = self.connection.cursor()
-        cursor.execute(self.DELETE_SUMMARIES_STATEMENT, (user_id, conversation_id))
-        cursor.execute(
-            self.INSERT_SUMMARY_STATEMENT,
-            (
-                user_id,
-                conversation_id,
-                folded_summary.created_at,
-                folded_summary.summarized_through_turn,
-                folded_summary.token_count,
-                folded_summary.model_used,
-                folded_summary.summary_text,
-            ),
-        )
-        cursor.close()
-        self.connection.commit()
+        cursor = self.connection.cursor()
+        try:
+            cursor.execute("BEGIN")
+            cursor.execute(self.DELETE_SUMMARIES_STATEMENT, (user_id, conversation_id))
+            cursor.execute(
+                self.INSERT_SUMMARY_STATEMENT,
+                (
+                    user_id,
+                    conversation_id,
+                    folded_summary.created_at,
+                    folded_summary.summarized_through_turn,
+                    folded_summary.token_count,
+                    folded_summary.model_used,
+                    folded_summary.summary_text,
+                ),
+            )
+            self.connection.commit()
+        except sqlite3.Error:
+            self.connection.rollback()
+            raise
+        finally:
+            cursor.close()

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cache/sqlite_cache.py` around lines 207 - 208, The connect() sets
self.connection.autocommit = True which allows replace_summaries() (the method
that runs the DELETE and subsequent INSERTs) to leave conversation_summaries
empty if an INSERT fails; modify replace_summaries() to explicitly run a
transactional block: begin a transaction (e.g., execute "BEGIN" on
self.connection or temporarily set autocommit=False), perform DELETE and all
INSERTs, then commit on success and rollback on exception, and restore
autocommit state afterwards; ensure the `@connection` decorator remains only for
connectivity and does not replace this explicit transaction handling so that
replace_summaries() truly executes as a single atomic transaction.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unit/cache/test_sqlite_cache.py`:
- Around line 624-653: Declare explicit types for the module-level summary
constants by annotating each variable with the ConversationSummary type (e.g.,
change summary_1 = ConversationSummary(...) to summary_1: ConversationSummary =
ConversationSummary(...), and do the same for summary_2, folded_summary, and
summary_after_fold) so the constants have clear, consistent type annotations at
module scope.

---

Outside diff comments:
In `@src/cache/sqlite_cache.py`:
- Around line 207-208: The connect() sets self.connection.autocommit = True
which allows replace_summaries() (the method that runs the DELETE and subsequent
INSERTs) to leave conversation_summaries empty if an INSERT fails; modify
replace_summaries() to explicitly run a transactional block: begin a transaction
(e.g., execute "BEGIN" on self.connection or temporarily set autocommit=False),
perform DELETE and all INSERTs, then commit on success and rollback on
exception, and restore autocommit state afterwards; ensure the `@connection`
decorator remains only for connectivity and does not replace this explicit
transaction handling so that replace_summaries() truly executes as a single
atomic transaction.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f197064e-6858-4635-8ff9-5b314e0253d2

📥 Commits

Reviewing files that changed from the base of the PR and between 083d1ae and c35fb3b.

📒 Files selected for processing (9)

src/cache/cache.py
src/cache/in_memory_cache.py
src/cache/noop_cache.py
src/cache/postgres_cache.py
src/cache/sqlite_cache.py
tests/unit/cache/test_in_memory_cache.py
tests/unit/cache/test_noop_cache.py
tests/unit/cache/test_postgres_cache.py
tests/unit/cache/test_sqlite_cache.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)

GitHub Check: Pylinter
GitHub Check: ruff
GitHub Check: E2E Tests for Lightspeed Evaluation job
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request

🧰 Additional context used

📓 Path-based instructions (2)

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

src/cache/postgres_cache.py
src/cache/in_memory_cache.py
src/cache/cache.py
src/cache/noop_cache.py
src/cache/sqlite_cache.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/unit/cache/test_noop_cache.py
tests/unit/cache/test_postgres_cache.py
tests/unit/cache/test_in_memory_cache.py
tests/unit/cache/test_sqlite_cache.py

🔇 Additional comments (9)

src/cache/cache.py (1)

175-241: LGTM!

src/cache/in_memory_cache.py (1)

3-10: LGTM!

Also applies to: 169-234

src/cache/noop_cache.py (1)

3-10: LGTM!

Also applies to: 148-207

tests/unit/cache/test_in_memory_cache.py (1)

1-71: LGTM!

tests/unit/cache/test_noop_cache.py (1)

9-10: LGTM!

Also applies to: 256-293

tests/unit/cache/test_postgres_cache.py (1)

22-23: LGTM!

Also applies to: 922-1130
src/cache/postgres_cache.py (1)
226-227: ⚡ Quick win

replace_summaries DELETE+INSERT atomicity depends on @connection

PostgresCache.__init__ sets self.connection.autocommit = True (src/cache/postgres_cache.py:226), while replace_summaries (decorated with @connection imported from utils.connection_decorator) executes DELETE_SUMMARIES_STATEMENT followed by INSERT_SUMMARY_STATEMENT with no commit/rollback in the method body. If utils.connection_decorator.connection doesn’t wrap the call in a single transaction (e.g., disables autocommit for the duration), a failure after the DELETE can leave summaries history partially updated.
Proposed fix: force an explicit transaction boundary around DELETE+INSERT
         previous_autocommit = self.connection.autocommit
         try:
+            self.connection.autocommit = False
             with self.connection.cursor() as cursor:
                 cursor.execute(
                     PostgresCache.DELETE_SUMMARIES_STATEMENT,
                     (user_id, conversation_id),
                 )
                 cursor.execute(
                     PostgresCache.INSERT_SUMMARY_STATEMENT,
                     (
                         user_id,
                         conversation_id,
                         folded_summary.created_at,
                         folded_summary.summarized_through_turn,
                         folded_summary.token_count,
                         folded_summary.model_used,
                         folded_summary.summary_text,
                     ),
                 )
+            self.connection.commit()
         except psycopg2.DatabaseError as e:
+            self.connection.rollback()
             logger.error("PostgresCache.replace_summaries: %s", e)
             raise CacheError("PostgresCache.replace_summaries", e) from e
         finally:
             self.connection.autocommit = previous_autocommit
tests/unit/cache/test_sqlite_cache.py (2)

19-19: LGTM!

656-821: LGTM!

coderabbitai · 2026-05-26T10:26:50Z

+summary_1 = ConversationSummary(
+    summary_text="User asked about Kubernetes pods; assistant explained kubectl.",
+    summarized_through_turn=8,
+    token_count=14,
+    created_at="2025-10-03T09:31:29Z",
+    model_used="openai/gpt-4o-mini",
+)
+summary_2 = ConversationSummary(
+    summary_text="User then moved on to Helm charts and Istio routing.",
+    summarized_through_turn=18,
+    token_count=12,
+    created_at="2025-10-03T10:05:01Z",
+    model_used="openai/gpt-4o-mini",
+)
+# A fold collapsing summary_1 + summary_2; created after both (R3, LCORE-1572).
+folded_summary = ConversationSummary(
+    summary_text="Folded: pods/kubectl, then Helm charts and Istio routing.",
+    summarized_through_turn=18,
+    token_count=11,
+    created_at="2025-10-03T11:00:00Z",
+    model_used="openai/gpt-4o-mini",
+)
+# A chunk produced after a fold; created later still, so it sorts after the fold.
+summary_after_fold = ConversationSummary(
+    summary_text="Later: user asked about ArgoCD sync waves.",
+    summarized_through_turn=26,
+    token_count=10,
+    created_at="2025-10-03T11:42:00Z",
+    model_used="openai/gpt-4o-mini",
+)


🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Consider adding explicit type annotations to module-level summary constants.

While the types are inferable, explicit annotations improve readability and consistency with project conventions.

📝 Suggested enhancement

-summary_1 = ConversationSummary( +summary_1: ConversationSummary = ConversationSummary( summary_text="User asked about Kubernetes pods; assistant explained kubectl.", summarized_through_turn=8, token_count=14, created_at="2025-10-03T09:31:29Z", model_used="openai/gpt-4o-mini", ) -summary_2 = ConversationSummary( +summary_2: ConversationSummary = ConversationSummary( summary_text="User then moved on to Helm charts and Istio routing.", summarized_through_turn=18, token_count=12, created_at="2025-10-03T10:05:01Z", model_used="openai/gpt-4o-mini", ) -folded_summary = ConversationSummary( +folded_summary: ConversationSummary = ConversationSummary( summary_text="Folded: pods/kubectl, then Helm charts and Istio routing.", summarized_through_turn=18, token_count=11, created_at="2025-10-03T11:00:00Z", model_used="openai/gpt-4o-mini", ) -summary_after_fold = ConversationSummary( +summary_after_fold: ConversationSummary = ConversationSummary( summary_text="Later: user asked about ArgoCD sync waves.", summarized_through_turn=26, token_count=10, created_at="2025-10-03T11:42:00Z", model_used="openai/gpt-4o-mini", )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/cache/test_sqlite_cache.py` around lines 624 - 653, Declare explicit types for the module-level summary constants by annotating each variable with the ConversationSummary type (e.g., change summary_1 = ConversationSummary(...) to summary_1: ConversationSummary = ConversationSummary(...), and do the same for summary_2, folded_summary, and summary_after_fold) so the constants have clear, consistent type annotations at module scope.

tisnik

LGTM

max-svistunov added 2 commits May 26, 2026 04:24

max-svistunov changed the title ~~Lcore 1571 cache conversation summaries~~ LCORE-1571: Extend conversation cache for compaction summaries May 26, 2026

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

tisnik approved these changes May 26, 2026

View reviewed changes

tisnik merged commit 6c0733b into lightspeed-core:main May 26, 2026
32 checks passed

coderabbitai Bot mentioned this pull request May 26, 2026

LCORE-1572: Integrate conversation compaction into the query flow #1796

Open

19 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LCORE-1571: Extend conversation cache for compaction summaries#1795

LCORE-1571: Extend conversation cache for compaction summaries#1795
tisnik merged 3 commits into
lightspeed-core:mainfrom
max-svistunov:lcore-1571-cache-conversation-summaries

max-svistunov commented May 26, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented May 26, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 26, 2026

Uh oh!

tisnik left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

max-svistunov commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Tools used to create PR

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

max-svistunov commented May 26, 2026 •

edited

Loading

coderabbitai Bot commented May 26, 2026 •

edited

Loading