LCORE-1571: Extend conversation cache for compaction summaries #1795

Merged
tisnik merged 3 commits into lightspeed-core:main from max-svistunov:lcore-1571-cache-conversation-summaries
May 26, 2026

Conversation


@max-svistunov max-svistunov commented May 26, 2026

Description

Extends the conversation cache with persistent storage for conversation-compaction summaries (part of the Conversation Compaction epic, LCORE-1631).

When compaction summarizes older conversation turns (LCORE-1570), each run produces a ConversationSummary chunk. This PR lets those chunks be persisted and retrieved per conversation so they survive across requests and service restarts.

  • Cache ABC (src/cache/cache.py): new store_summary(), get_summaries(), and replace_summaries() abstract methods.
  • SQLite & PostgreSQL: new conversation_summaries table keyed by (user_id, conversation_id, created_at), supporting multiple additive summary chunks per conversation and returned oldest-first. The table is created with CREATE TABLE IF NOT EXISTS in initialize_cache(), which doubles as a forward-only schema migration for existing databases. delete() cascades to the summaries table.
  • Recursive-fold persistence (replace_summaries()): atomically replaces a conversation's accumulated summary chunks with a single folded summary (delete + insert in one transaction). This is the persistence layer for recursive re-summarization (R3, LCORE-1572) — a fold is computed once and reused rather than recomputed per request. Ordering by created_at means a later get_summaries() returns the fold first, followed by any chunks appended after it (the active set).
  • In-memory & no-op backends: interface-satisfying no-op stubs, mirroring the existing set_topic_summary contract.
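The SQLite side of the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the column names follow the ConversationSummary fields that appear in this PR's tests, but the column types, the exact DDL, and the free-function helpers `initialize_cache`, `store_summary`, and `get_summaries` are assumptions made for the example.

```python
import sqlite3

# Hypothetical DDL: column names mirror the ConversationSummary fields used in
# this PR's tests; the real backend's types and statement text may differ.
CREATE_SUMMARIES_TABLE = """
    CREATE TABLE IF NOT EXISTS conversation_summaries (
        user_id                 TEXT NOT NULL,
        conversation_id         TEXT NOT NULL,
        created_at              TEXT NOT NULL,
        summarized_through_turn INTEGER,
        token_count             INTEGER,
        model_used              TEXT,
        summary_text            TEXT,
        PRIMARY KEY (user_id, conversation_id, created_at)
    )
"""


def initialize_cache(conn: sqlite3.Connection) -> None:
    """Create the table if it is missing; safe to call on every startup."""
    conn.execute(CREATE_SUMMARIES_TABLE)
    conn.commit()


def store_summary(
    conn: sqlite3.Connection,
    user_id: str,
    conversation_id: str,
    created_at: str,
    text: str,
) -> None:
    """Append one summary chunk for a conversation."""
    conn.execute(
        "INSERT INTO conversation_summaries "
        "(user_id, conversation_id, created_at, summary_text) VALUES (?, ?, ?, ?)",
        (user_id, conversation_id, created_at, text),
    )
    conn.commit()


def get_summaries(
    conn: sqlite3.Connection, user_id: str, conversation_id: str
) -> list[str]:
    """Return the conversation's chunk texts, oldest first."""
    rows = conn.execute(
        "SELECT summary_text FROM conversation_summaries "
        "WHERE user_id = ? AND conversation_id = ? ORDER BY created_at",
        (user_id, conversation_id),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
initialize_cache(conn)
initialize_cache(conn)  # second call changes nothing thanks to IF NOT EXISTS
# Insert out of chronological order to show that ordering comes from created_at.
store_summary(conn, "u1", "c1", "2025-10-03T10:05:01Z", "second chunk")
store_summary(conn, "u1", "c1", "2025-10-03T09:31:29Z", "first chunk")
store_summary(conn, "u2", "c1", "2025-10-03T09:00:00Z", "other user's chunk")
summaries = get_summaries(conn, "u1", "c1")
print(summaries)
```

Because the timestamps share one fixed-width UTC format, sorting them as text matches chronological order, which is what lets a plain ORDER BY created_at return chunks oldest-first in this sketch.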

Reuses the ConversationSummary model from LCORE-1570. No public API or OpenAPI changes.
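To make the shape of the new contract concrete, here is a hedged sketch of the three abstract methods and a no-op stub. The `replace_summaries` signature follows the one quoted in the review further down; the `store_summary` and `get_summaries` signatures, the class names `SummaryCache` and `NoopSummaryCache`, and the dataclass standing in for the LCORE-1570 model are assumptions for illustration only.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ConversationSummary:
    """Stand-in for the LCORE-1570 model; field names as used in this PR's tests."""

    summary_text: str
    summarized_through_turn: int
    token_count: int
    created_at: str
    model_used: str


class SummaryCache(ABC):
    """Hypothetical slice of the Cache ABC, limited to the three new methods."""

    @abstractmethod
    def store_summary(
        self,
        user_id: str,
        conversation_id: str,
        summary: ConversationSummary,
        skip_user_id_check: bool = False,
    ) -> None:
        """Append one summary chunk to the conversation."""

    @abstractmethod
    def get_summaries(
        self, user_id: str, conversation_id: str, skip_user_id_check: bool = False
    ) -> list[ConversationSummary]:
        """Return the conversation's chunks ordered by created_at, oldest first."""

    @abstractmethod
    def replace_summaries(
        self,
        user_id: str,
        conversation_id: str,
        folded_summary: ConversationSummary,
        skip_user_id_check: bool = False,
    ) -> None:
        """Atomically replace all chunks with a single folded summary."""


class NoopSummaryCache(SummaryCache):
    """Interface-satisfying stub: accepts every call and persists nothing.

    The PR's stubs also validate the key; that step is omitted in this sketch.
    """

    def store_summary(
        self,
        user_id: str,
        conversation_id: str,
        summary: ConversationSummary,
        skip_user_id_check: bool = False,
    ) -> None:
        return None

    def get_summaries(
        self, user_id: str, conversation_id: str, skip_user_id_check: bool = False
    ) -> list[ConversationSummary]:
        return []

    def replace_summaries(
        self,
        user_id: str,
        conversation_id: str,
        folded_summary: ConversationSummary,
        skip_user_id_check: bool = False,
    ) -> None:
        return None


stub = NoopSummaryCache()
chunk = ConversationSummary(
    "User asked about pods.", 8, 14, "2025-10-03T09:31:29Z", "openai/gpt-4o-mini"
)
stub.store_summary("u1", "c1", chunk)
result = stub.get_summaries("u1", "c1")  # nothing was persisted
```

Making the methods abstract means any backend that forgets one of them fails at instantiation time rather than at first use.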

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude (Claude Code)
  • Generated by: Claude Opus 4.7

Related Tickets & Documents

  • Related Issue # LCORE-1571
  • Closes # LCORE-1571

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

Cache-layer change; verified with unit tests (no running stack required).

  1. Cache backend tests: uv run python -m pytest tests/unit/cache/ -q -> 140 passed, 1 skipped.

     • SQLite (real on-disk DB): store/retrieve a summary, additive multi-chunk ordering, per-(user, conversation) isolation, persistence across a fresh cache instance (simulated restart), forward-only migration on a legacy DB that lacks the new table, cascade delete with the conversation, and disconnected error paths.
     • replace_summaries(): collapse, empty-conversation, fold-then-append ordering, per-conversation isolation, and disconnected/DB-error paths.
     • PostgreSQL (mocked psycopg2): store/replace params, row reconstruction, empty result, and disconnected/DB-error paths.
     • In-memory and no-op backends: no-op behavior and conversation-id validation.

  2. Full unit suite (confirms the new abstract methods broke no existing Cache users): uv run python -m pytest tests/unit -> 2373 passed, 1 skipped.
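Two of the SQLite scenarios listed above, the legacy-database migration and the simulated restart, can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the PR's test code: the simplified four-column DDL, the `open_cache` helper, and the pre-existing `cache` table name are all hypothetical.

```python
import os
import sqlite3
import tempfile

# Simplified, hypothetical DDL; the real table carries more columns.
DDL = (
    "CREATE TABLE IF NOT EXISTS conversation_summaries ("
    "user_id TEXT, conversation_id TEXT, created_at TEXT, summary_text TEXT, "
    "PRIMARY KEY (user_id, conversation_id, created_at))"
)


def open_cache(path: str) -> sqlite3.Connection:
    """Open the database and run the idempotent table creation."""
    conn = sqlite3.connect(path)
    conn.execute(DDL)
    conn.commit()
    return conn


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "cache.sqlite")

    # Legacy database: it has an older table but no conversation_summaries.
    legacy = sqlite3.connect(db_path)
    legacy.execute("CREATE TABLE cache (user_id TEXT, conversation_id TEXT)")
    legacy.commit()
    legacy_tables = [
        row[0]
        for row in legacy.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    ]
    legacy.close()

    # First "process": opening the cache adds the missing table, then stores a chunk.
    first = open_cache(db_path)
    first.execute(
        "INSERT INTO conversation_summaries VALUES (?, ?, ?, ?)",
        ("u1", "c1", "2025-10-03T09:31:29Z", "first chunk"),
    )
    first.commit()
    first.close()

    # Second "process": a fresh connection stands in for a service restart.
    second = open_cache(db_path)
    rows_after_restart = second.execute(
        "SELECT summary_text FROM conversation_summaries "
        "WHERE user_id = ? AND conversation_id = ?",
        ("u1", "c1"),
    ).fetchall()
    second.close()
```

Using a real file instead of `:memory:` matters here, since an in-memory database would vanish when the first connection closes and the restart check would prove nothing.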

Summary by CodeRabbit

  • New Features

    • Added conversation summary storage and retrieval capabilities across all cache backends (in-memory, SQLite, PostgreSQL).
    • Conversation deletion now cascades to remove associated summaries.
  • Tests

    • Added comprehensive test coverage for summary operations across cache implementations, including edge cases and error handling.


Add persistent storage for conversation-compaction summary chunks
(ConversationSummary, introduced in LCORE-1570) to the conversation
cache so summaries survive across requests and service restarts.

- Cache ABC: new abstract store_summary() and get_summaries() methods.
- SQLite and PostgreSQL: new conversation_summaries table keyed by
  (user_id, conversation_id, created_at), supporting multiple additive
  summary chunks per conversation and returning them oldest-first. The
  table is created with CREATE TABLE IF NOT EXISTS in initialize_cache(),
  which doubles as a forward-only schema migration for existing
  databases. delete() now cascades to the summaries table.
- In-memory and no-op backends: interface-satisfying no-op stubs that
  validate the compound key but do not persist, mirroring the existing
  set_topic_summary contract.

The return annotations use builtins.list because the Cache class shadows
the list builtin with a method of the same name.
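
The shadowing issue is easy to reproduce in isolation. In the toy class below (the method bodies and the `str` payloads are illustrative assumptions), a method named `list` is defined first, so any later bare `list[...]` annotation in the class body would resolve to that method rather than the builtin on interpreters that evaluate annotations eagerly, and type checkers would misread it as well. Spelling out `builtins.list` sidesteps both problems.

```python
import builtins


class Cache:
    """Toy class whose `list` method shadows the builtin inside the class body."""

    def list(self, user_id: str) -> builtins.list[str]:
        """Return conversation ids for a user (hypothetical payload)."""
        return [f"{user_id}-conv-1"]

    # From here on, the bare name `list` in this class body refers to the
    # method above, so the annotation reaches the real type via builtins.
    def get_summaries(self, user_id: str) -> builtins.list[str]:
        """Return summaries for a user (always empty in this toy)."""
        return []


cache = Cache()
conversations = cache.list("u1")
return_hint = Cache.get_summaries.__annotations__["return"]
```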

Cover the new store_summary()/get_summaries() across every cache backend.

- SQLite (real on-disk store): store and retrieve a single summary,
  additive multi-chunk ordering, per-(user, conversation) isolation,
  persistence across a fresh cache instance (restart), forward-only
  migration on a legacy database lacking the new table, cascade delete
  with the conversation, and disconnected error paths.
- PostgreSQL (mocked psycopg2): insert parameters in column order,
  reconstruction of summaries from result rows, empty result, and
  disconnected and database-error paths.
- In-memory and no-op backends: no-op behavior and conversation-id
  validation.

coderabbitai Bot commented May 26, 2026

Walkthrough

This PR extends all cache implementations with conversation summary storage capabilities. It adds three new methods to the abstract Cache contract—store_summary, get_summaries, and replace_summaries—and implements them across InMemoryCache, NoopCache, PostgresCache, and SQLiteCache. PostgreSQL and SQLite backends include new table schemas, SQL statements, and transaction management, while no-op implementations provide key validation without persistence. The changes include comprehensive unit test coverage for all implementations.

Changes

Conversation Summary Storage API

| Layer / File(s) | Summary |
| --- | --- |
| Abstract Cache Contract: src/cache/cache.py | Three new abstract methods define the summary storage contract: store_summary() appends a chunk, get_summaries() retrieves all chunks ordered by created_at, and replace_summaries() atomically replaces chunks with a folded summary. |
| In-Memory and No-op Implementations: src/cache/in_memory_cache.py, src/cache/noop_cache.py, tests/unit/cache/test_in_memory_cache.py, tests/unit/cache/test_noop_cache.py | InMemoryCache and NoopCache implement summary methods as no-ops: validation-only with no persistence. Tests verify no-op behavior, key validation, and consistent error handling across all three methods. |
| PostgreSQL Summary Storage: src/cache/postgres_cache.py, tests/unit/cache/test_postgres_cache.py | PostgreSQL adds a conversation_summaries table keyed by (user_id, conversation_id, created_at). Store/get/replace methods execute INSERT/SELECT/DELETE statements with error handling. Delete cascade removes summaries when conversations are deleted. Tests mock psycopg2 behavior to verify disconnected errors, row reconstruction, parameter values, and database error wrapping. |
| SQLite Summary Storage with Migration: src/cache/sqlite_cache.py, tests/unit/cache/test_sqlite_cache.py | SQLite adds a conversation_summaries table created with CREATE TABLE IF NOT EXISTS during initialization, enabling schema migration on existing databases. Store/get/replace methods execute INSERT/SELECT/DELETE in transactions. Delete cascade removes summaries. Tests verify storage, retrieval, scoping, persistence across reconnect, cascade deletion, error handling, and schema migration of legacy databases. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • anik120
  • tisnik
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: extending the conversation cache to support persistence of compaction summaries. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@max-svistunov max-svistunov changed the title from "Lcore 1571 cache conversation summaries" to "LCORE-1571: Extend conversation cache for compaction summaries" on May 26, 2026

Add a replace_summaries operation to the conversation-summary cache: it
atomically removes a conversation's existing summary chunks and stores a single
folded summary in their place. Recursive re-summarization (R3) folds accumulated
chunks into one when they themselves grow large; this gives that fold a
persistent home so it is computed once and reused, rather than recomputed per
request.

Implemented across the Cache ABC and all backends: SQLite and PostgreSQL persist
(delete + insert in one transaction); in-memory and no-op backends are
interface-satisfying stubs, mirroring store_summary. Ordering by created_at means
a subsequent get_summaries returns the fold first, followed by any chunks appended
after it.

Adds 12 unit tests covering collapse, empty-conversation, fold-then-append
ordering, per-conversation isolation, and disconnected/DB-error paths.
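
The fold-then-append ordering described in this commit message can be modelled in a few lines. The sketch below is a dict-backed toy that captures only the semantics; it is not one of the PR's backends (the in-memory backend in this PR is a no-op stub), and the `Chunk` and `ToySummaryStore` names are hypothetical. The timestamps and texts are borrowed from the test fixtures in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Reduced stand-in for ConversationSummary: timestamp and text only."""

    created_at: str
    summary_text: str


class ToySummaryStore:
    """Dict-backed model of the store/get/replace semantics."""

    def __init__(self) -> None:
        self._chunks: dict[tuple[str, str], list[Chunk]] = {}

    def store_summary(self, user_id: str, conversation_id: str, chunk: Chunk) -> None:
        """Append a chunk to the conversation's active set."""
        self._chunks.setdefault((user_id, conversation_id), []).append(chunk)

    def get_summaries(self, user_id: str, conversation_id: str) -> list[Chunk]:
        """Return the active set ordered by created_at, oldest first."""
        chunks = self._chunks.get((user_id, conversation_id), [])
        return sorted(chunks, key=lambda c: c.created_at)

    def replace_summaries(self, user_id: str, conversation_id: str, folded: Chunk) -> None:
        """Swap every existing chunk for the single folded one."""
        self._chunks[(user_id, conversation_id)] = [folded]


store = ToySummaryStore()
store.store_summary("u1", "c1", Chunk("2025-10-03T09:31:29Z", "pods and kubectl"))
store.store_summary("u1", "c1", Chunk("2025-10-03T10:05:01Z", "Helm and Istio"))
# Fold the two accumulated chunks into one.
store.replace_summaries("u1", "c1", Chunk("2025-10-03T11:00:00Z", "Folded: pods, Helm, Istio"))
# A chunk appended after the fold has a later created_at, so it sorts after it.
store.store_summary("u1", "c1", Chunk("2025-10-03T11:42:00Z", "Later: ArgoCD sync waves"))
texts = [c.summary_text for c in store.get_summaries("u1", "c1")]
print(texts)
```

The property this relies on is that a fold is always stamped later than the chunks it replaces and earlier than anything appended afterwards.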

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cache/sqlite_cache.py (1)

207-208: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

replace_summaries can leave conversation_summaries empty under SQLite autocommit.

  • connect() sets self.connection.autocommit = True (~line 207), and replace_summaries() runs the DELETE (~690) and INSERT (~691-692) without an explicit BEGIN/rollback; if the INSERT fails after the DELETE, the deletion can persist even though the method ends with self.connection.commit().
  • The @connection decorator only ensures the cache is connected; it does not wrap the call in a transaction (so the “single transaction” docstring for replace_summaries is not actually guaranteed).
💡 Proposed fix (explicit transactional block)
     def replace_summaries(
         self,
         user_id: str,
         conversation_id: str,
         folded_summary: ConversationSummary,
         skip_user_id_check: bool = False,
     ) -> None:
@@
         if self.connection is None:
             logger.error("Cache is disconnected")
             raise CacheError("replace_summaries: cache is disconnected")

-        cursor = self.connection.cursor()
-        cursor.execute(self.DELETE_SUMMARIES_STATEMENT, (user_id, conversation_id))
-        cursor.execute(
-            self.INSERT_SUMMARY_STATEMENT,
-            (
-                user_id,
-                conversation_id,
-                folded_summary.created_at,
-                folded_summary.summarized_through_turn,
-                folded_summary.token_count,
-                folded_summary.model_used,
-                folded_summary.summary_text,
-            ),
-        )
-        cursor.close()
-        self.connection.commit()
+        cursor = self.connection.cursor()
+        try:
+            cursor.execute("BEGIN")
+            cursor.execute(self.DELETE_SUMMARIES_STATEMENT, (user_id, conversation_id))
+            cursor.execute(
+                self.INSERT_SUMMARY_STATEMENT,
+                (
+                    user_id,
+                    conversation_id,
+                    folded_summary.created_at,
+                    folded_summary.summarized_through_turn,
+                    folded_summary.token_count,
+                    folded_summary.model_used,
+                    folded_summary.summary_text,
+                ),
+            )
+            self.connection.commit()
+        except sqlite3.Error:
+            self.connection.rollback()
+            raise
+        finally:
+            cursor.close()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cache/sqlite_cache.py` around lines 207 - 208, The connect() sets
self.connection.autocommit = True which allows replace_summaries() (the method
that runs the DELETE and subsequent INSERTs) to leave conversation_summaries
empty if an INSERT fails; modify replace_summaries() to explicitly run a
transactional block: begin a transaction (e.g., execute "BEGIN" on
self.connection or temporarily set autocommit=False), perform DELETE and all
INSERTs, then commit on success and rollback on exception, and restore
autocommit state afterwards; ensure the `@connection` decorator remains only for
connectivity and does not replace this explicit transaction handling so that
replace_summaries() truly executes as a single atomic transaction.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f197064e-6858-4635-8ff9-5b314e0253d2

📥 Commits

Reviewing files that changed from the base of the PR and between 083d1ae and c35fb3b.

📒 Files selected for processing (9)
  • src/cache/cache.py
  • src/cache/in_memory_cache.py
  • src/cache/noop_cache.py
  • src/cache/postgres_cache.py
  • src/cache/sqlite_cache.py
  • tests/unit/cache/test_in_memory_cache.py
  • tests/unit/cache/test_noop_cache.py
  • tests/unit/cache/test_postgres_cache.py
  • tests/unit/cache/test_sqlite_cache.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Pylinter
  • GitHub Check: ruff
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

  • src/cache/postgres_cache.py
  • src/cache/in_memory_cache.py
  • src/cache/cache.py
  • src/cache/noop_cache.py
  • src/cache/sqlite_cache.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

  • tests/unit/cache/test_noop_cache.py
  • tests/unit/cache/test_postgres_cache.py
  • tests/unit/cache/test_in_memory_cache.py
  • tests/unit/cache/test_sqlite_cache.py
🔇 Additional comments (9)
src/cache/cache.py (1)

175-241: LGTM!

src/cache/in_memory_cache.py (1)

3-10: LGTM!

Also applies to: 169-234

src/cache/noop_cache.py (1)

3-10: LGTM!

Also applies to: 148-207

tests/unit/cache/test_in_memory_cache.py (1)

1-71: LGTM!

tests/unit/cache/test_noop_cache.py (1)

9-10: LGTM!

Also applies to: 256-293

tests/unit/cache/test_postgres_cache.py (1)

22-23: LGTM!

Also applies to: 922-1130

src/cache/postgres_cache.py (1)

226-227: ⚡ Quick win

replace_summaries DELETE+INSERT atomicity depends on @connection

PostgresCache.__init__ sets self.connection.autocommit = True (src/cache/postgres_cache.py:226), while replace_summaries (decorated with @connection imported from utils.connection_decorator) executes DELETE_SUMMARIES_STATEMENT followed by INSERT_SUMMARY_STATEMENT with no commit/rollback in the method body. If utils.connection_decorator.connection does not wrap the call in a single transaction (e.g., by disabling autocommit for the duration), a failure after the DELETE can leave the conversation with its summary chunks deleted and no folded summary stored.

Proposed fix: force an explicit transaction boundary around DELETE+INSERT
         previous_autocommit = self.connection.autocommit
         try:
+            self.connection.autocommit = False
             with self.connection.cursor() as cursor:
                 cursor.execute(
                     PostgresCache.DELETE_SUMMARIES_STATEMENT,
                     (user_id, conversation_id),
                 )
                 cursor.execute(
                     PostgresCache.INSERT_SUMMARY_STATEMENT,
                     (
                         user_id,
                         conversation_id,
                         folded_summary.created_at,
                         folded_summary.summarized_through_turn,
                         folded_summary.token_count,
                         folded_summary.model_used,
                         folded_summary.summary_text,
                     ),
                 )
+            self.connection.commit()
         except psycopg2.DatabaseError as e:
+            self.connection.rollback()
             logger.error("PostgresCache.replace_summaries: %s", e)
             raise CacheError("PostgresCache.replace_summaries", e) from e
         finally:
             self.connection.autocommit = previous_autocommit
tests/unit/cache/test_sqlite_cache.py (2)

19-19: LGTM!


656-821: LGTM!

Comment on lines +624 to +653
summary_1 = ConversationSummary(
    summary_text="User asked about Kubernetes pods; assistant explained kubectl.",
    summarized_through_turn=8,
    token_count=14,
    created_at="2025-10-03T09:31:29Z",
    model_used="openai/gpt-4o-mini",
)
summary_2 = ConversationSummary(
    summary_text="User then moved on to Helm charts and Istio routing.",
    summarized_through_turn=18,
    token_count=12,
    created_at="2025-10-03T10:05:01Z",
    model_used="openai/gpt-4o-mini",
)
# A fold collapsing summary_1 + summary_2; created after both (R3, LCORE-1572).
folded_summary = ConversationSummary(
    summary_text="Folded: pods/kubectl, then Helm charts and Istio routing.",
    summarized_through_turn=18,
    token_count=11,
    created_at="2025-10-03T11:00:00Z",
    model_used="openai/gpt-4o-mini",
)
# A chunk produced after a fold; created later still, so it sorts after the fold.
summary_after_fold = ConversationSummary(
    summary_text="Later: user asked about ArgoCD sync waves.",
    summarized_through_turn=26,
    token_count=10,
    created_at="2025-10-03T11:42:00Z",
    model_used="openai/gpt-4o-mini",
)

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Consider adding explicit type annotations to module-level summary constants.

While the types are inferable, explicit annotations improve readability and consistency with project conventions.

📝 Suggested enhancement
-summary_1 = ConversationSummary(
+summary_1: ConversationSummary = ConversationSummary(
     summary_text="User asked about Kubernetes pods; assistant explained kubectl.",
     summarized_through_turn=8,
     token_count=14,
     created_at="2025-10-03T09:31:29Z",
     model_used="openai/gpt-4o-mini",
 )
-summary_2 = ConversationSummary(
+summary_2: ConversationSummary = ConversationSummary(
     summary_text="User then moved on to Helm charts and Istio routing.",
     summarized_through_turn=18,
     token_count=12,
     created_at="2025-10-03T10:05:01Z",
     model_used="openai/gpt-4o-mini",
 )
-folded_summary = ConversationSummary(
+folded_summary: ConversationSummary = ConversationSummary(
     summary_text="Folded: pods/kubectl, then Helm charts and Istio routing.",
     summarized_through_turn=18,
     token_count=11,
     created_at="2025-10-03T11:00:00Z",
     model_used="openai/gpt-4o-mini",
 )
-summary_after_fold = ConversationSummary(
+summary_after_fold: ConversationSummary = ConversationSummary(
     summary_text="Later: user asked about ArgoCD sync waves.",
     summarized_through_turn=26,
     token_count=10,
     created_at="2025-10-03T11:42:00Z",
     model_used="openai/gpt-4o-mini",
 )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/cache/test_sqlite_cache.py` around lines 624 - 653, Declare
explicit types for the module-level summary constants by annotating each
variable with the ConversationSummary type (e.g., change summary_1 =
ConversationSummary(...) to summary_1: ConversationSummary =
ConversationSummary(...), and do the same for summary_2, folded_summary, and
summary_after_fold) so the constants have clear, consistent type annotations at
module scope.


@tisnik tisnik left a comment


LGTM

@tisnik tisnik merged commit 6c0733b into lightspeed-core:main May 26, 2026
32 checks passed

2 participants