LCORE-1571: Extend conversation cache for compaction summaries#1795
Conversation
Add persistent storage for conversation-compaction summary chunks (ConversationSummary, introduced in LCORE-1570) to the conversation cache so summaries survive across requests and service restarts. - Cache ABC: new abstract store_summary() and get_summaries() methods. - SQLite and PostgreSQL: new conversation_summaries table keyed by (user_id, conversation_id, created_at), supporting multiple additive summary chunks per conversation and returning them oldest-first. The table is created with CREATE TABLE IF NOT EXISTS in initialize_cache(), which doubles as a forward-only schema migration for existing databases. delete() now cascades to the summaries table. - In-memory and no-op backends: interface-satisfying no-op stubs that validate the compound key but do not persist, mirroring the existing set_topic_summary contract. The return annotations use builtins.list because the Cache class shadows the list builtin with a method of the same name.
Cover the new store_summary()/get_summaries() across every cache backend. - SQLite (real on-disk store): store and retrieve a single summary, additive multi-chunk ordering, per-(user, conversation) isolation, persistence across a fresh cache instance (restart), forward-only migration on a legacy database lacking the new table, cascade delete with the conversation, and disconnected error paths. - PostgreSQL (mocked psycopg2): insert parameters in column order, reconstruction of summaries from result rows, empty result, and disconnected and database-error paths. - In-memory and no-op backends: no-op behavior and conversation-id validation.
WalkthroughThis PR extends all cache implementations with conversation summary storage capabilities. It adds three new methods to the abstract ChangesConversation Summary Storage API
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Add a replace_summaries operation to the conversation-summary cache: it atomically removes a conversation's existing summary chunks and stores a single folded summary in their place. Recursive re-summarization (R3) folds accumulated chunks into one when they themselves grow large; this gives that fold a persistent home so it is computed once and reused, rather than recomputed per request. Implemented across the Cache ABC and all backends: SQLite and PostgreSQL persist (delete + insert in one transaction); in-memory and no-op backends are interface-satisfying stubs, mirroring store_summary. Ordering by created_at means a subsequent get_summaries returns the fold first, followed by any chunks appended after it. Adds 12 unit tests covering collapse, empty-conversation, fold-then-append ordering, per-conversation isolation, and disconnected/DB-error paths.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/cache/sqlite_cache.py (1)
207-208:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
replace_summariescan leaveconversation_summariesempty under SQLite autocommit.
connect()setsself.connection.autocommit = True(~line 207), andreplace_summaries()runs theDELETE(~690) andINSERT(~691-692) without an explicitBEGIN/rollback; if theINSERTfails after theDELETE, the deletion can persist even though the method ends withself.connection.commit().- The
@connectiondecorator only ensures the cache is connected; it does not wrap the call in a transaction (so the “single transaction” docstring forreplace_summariesis not actually guaranteed).💡 Proposed fix (explicit transactional block)
def replace_summaries( self, user_id: str, conversation_id: str, folded_summary: ConversationSummary, skip_user_id_check: bool = False, ) -> None: @@ if self.connection is None: logger.error("Cache is disconnected") raise CacheError("replace_summaries: cache is disconnected") - cursor = self.connection.cursor() - cursor.execute(self.DELETE_SUMMARIES_STATEMENT, (user_id, conversation_id)) - cursor.execute( - self.INSERT_SUMMARY_STATEMENT, - ( - user_id, - conversation_id, - folded_summary.created_at, - folded_summary.summarized_through_turn, - folded_summary.token_count, - folded_summary.model_used, - folded_summary.summary_text, - ), - ) - cursor.close() - self.connection.commit() + cursor = self.connection.cursor() + try: + cursor.execute("BEGIN") + cursor.execute(self.DELETE_SUMMARIES_STATEMENT, (user_id, conversation_id)) + cursor.execute( + self.INSERT_SUMMARY_STATEMENT, + ( + user_id, + conversation_id, + folded_summary.created_at, + folded_summary.summarized_through_turn, + folded_summary.token_count, + folded_summary.model_used, + folded_summary.summary_text, + ), + ) + self.connection.commit() + except sqlite3.Error: + self.connection.rollback() + raise + finally: + cursor.close()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cache/sqlite_cache.py` around lines 207 - 208, The connect() sets self.connection.autocommit = True which allows replace_summaries() (the method that runs the DELETE and subsequent INSERTs) to leave conversation_summaries empty if an INSERT fails; modify replace_summaries() to explicitly run a transactional block: begin a transaction (e.g., execute "BEGIN" on self.connection or temporarily set autocommit=False), perform DELETE and all INSERTs, then commit on success and rollback on exception, and restore autocommit state afterwards; ensure the `@connection` decorator remains only for connectivity and does not replace this explicit transaction handling so that replace_summaries() truly executes as a single atomic transaction.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/unit/cache/test_sqlite_cache.py`:
- Around line 624-653: Declare explicit types for the module-level summary
constants by annotating each variable with the ConversationSummary type (e.g.,
change summary_1 = ConversationSummary(...) to summary_1: ConversationSummary =
ConversationSummary(...), and do the same for summary_2, folded_summary, and
summary_after_fold) so the constants have clear, consistent type annotations at
module scope.
---
Outside diff comments:
In `@src/cache/sqlite_cache.py`:
- Around line 207-208: The connect() sets self.connection.autocommit = True
which allows replace_summaries() (the method that runs the DELETE and subsequent
INSERTs) to leave conversation_summaries empty if an INSERT fails; modify
replace_summaries() to explicitly run a transactional block: begin a transaction
(e.g., execute "BEGIN" on self.connection or temporarily set autocommit=False),
perform DELETE and all INSERTs, then commit on success and rollback on
exception, and restore autocommit state afterwards; ensure the `@connection`
decorator remains only for connectivity and does not replace this explicit
transaction handling so that replace_summaries() truly executes as a single
atomic transaction.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f197064e-6858-4635-8ff9-5b314e0253d2
📒 Files selected for processing (9)
src/cache/cache.pysrc/cache/in_memory_cache.pysrc/cache/noop_cache.pysrc/cache/postgres_cache.pysrc/cache/sqlite_cache.pytests/unit/cache/test_in_memory_cache.pytests/unit/cache/test_noop_cache.pytests/unit/cache/test_postgres_cache.pytests/unit/cache/test_sqlite_cache.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Pylinter
- GitHub Check: ruff
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/cache/postgres_cache.pysrc/cache/in_memory_cache.pysrc/cache/cache.pysrc/cache/noop_cache.pysrc/cache/sqlite_cache.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/cache/test_noop_cache.pytests/unit/cache/test_postgres_cache.pytests/unit/cache/test_in_memory_cache.pytests/unit/cache/test_sqlite_cache.py
🔇 Additional comments (9)
src/cache/cache.py (1)
175-241: LGTM!src/cache/in_memory_cache.py (1)
3-10: LGTM!Also applies to: 169-234
src/cache/noop_cache.py (1)
3-10: LGTM!Also applies to: 148-207
tests/unit/cache/test_in_memory_cache.py (1)
1-71: LGTM!tests/unit/cache/test_noop_cache.py (1)
9-10: LGTM!Also applies to: 256-293
tests/unit/cache/test_postgres_cache.py (1)
22-23: LGTM!Also applies to: 922-1130
src/cache/postgres_cache.py (1)
226-227: ⚡ Quick win
replace_summariesDELETE+INSERT atomicity depends on@connection
PostgresCache.__init__setsself.connection.autocommit = True(src/cache/postgres_cache.py:226), whilereplace_summaries(decorated with@connectionimported fromutils.connection_decorator) executesDELETE_SUMMARIES_STATEMENTfollowed byINSERT_SUMMARY_STATEMENTwith no commit/rollback in the method body. Ifutils.connection_decorator.connectiondoesn’t wrap the call in a single transaction (e.g., disables autocommit for the duration), a failure after the DELETE can leave summaries history partially updated.Proposed fix: force an explicit transaction boundary around DELETE+INSERT
previous_autocommit = self.connection.autocommit try: + self.connection.autocommit = False with self.connection.cursor() as cursor: cursor.execute( PostgresCache.DELETE_SUMMARIES_STATEMENT, (user_id, conversation_id), ) cursor.execute( PostgresCache.INSERT_SUMMARY_STATEMENT, ( user_id, conversation_id, folded_summary.created_at, folded_summary.summarized_through_turn, folded_summary.token_count, folded_summary.model_used, folded_summary.summary_text, ), ) + self.connection.commit() except psycopg2.DatabaseError as e: + self.connection.rollback() logger.error("PostgresCache.replace_summaries: %s", e) raise CacheError("PostgresCache.replace_summaries", e) from e finally: self.connection.autocommit = previous_autocommittests/unit/cache/test_sqlite_cache.py (2)
19-19: LGTM!
656-821: LGTM!
| summary_1 = ConversationSummary( | ||
| summary_text="User asked about Kubernetes pods; assistant explained kubectl.", | ||
| summarized_through_turn=8, | ||
| token_count=14, | ||
| created_at="2025-10-03T09:31:29Z", | ||
| model_used="openai/gpt-4o-mini", | ||
| ) | ||
| summary_2 = ConversationSummary( | ||
| summary_text="User then moved on to Helm charts and Istio routing.", | ||
| summarized_through_turn=18, | ||
| token_count=12, | ||
| created_at="2025-10-03T10:05:01Z", | ||
| model_used="openai/gpt-4o-mini", | ||
| ) | ||
| # A fold collapsing summary_1 + summary_2; created after both (R3, LCORE-1572). | ||
| folded_summary = ConversationSummary( | ||
| summary_text="Folded: pods/kubectl, then Helm charts and Istio routing.", | ||
| summarized_through_turn=18, | ||
| token_count=11, | ||
| created_at="2025-10-03T11:00:00Z", | ||
| model_used="openai/gpt-4o-mini", | ||
| ) | ||
| # A chunk produced after a fold; created later still, so it sorts after the fold. | ||
| summary_after_fold = ConversationSummary( | ||
| summary_text="Later: user asked about ArgoCD sync waves.", | ||
| summarized_through_turn=26, | ||
| token_count=10, | ||
| created_at="2025-10-03T11:42:00Z", | ||
| model_used="openai/gpt-4o-mini", | ||
| ) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Consider adding explicit type annotations to module-level summary constants.
While the types are inferable, explicit annotations improve readability and consistency with project conventions.
📝 Suggested enhancement
-summary_1 = ConversationSummary(
+summary_1: ConversationSummary = ConversationSummary(
summary_text="User asked about Kubernetes pods; assistant explained kubectl.",
summarized_through_turn=8,
token_count=14,
created_at="2025-10-03T09:31:29Z",
model_used="openai/gpt-4o-mini",
)
-summary_2 = ConversationSummary(
+summary_2: ConversationSummary = ConversationSummary(
summary_text="User then moved on to Helm charts and Istio routing.",
summarized_through_turn=18,
token_count=12,
created_at="2025-10-03T10:05:01Z",
model_used="openai/gpt-4o-mini",
)
-folded_summary = ConversationSummary(
+folded_summary: ConversationSummary = ConversationSummary(
summary_text="Folded: pods/kubectl, then Helm charts and Istio routing.",
summarized_through_turn=18,
token_count=11,
created_at="2025-10-03T11:00:00Z",
model_used="openai/gpt-4o-mini",
)
-summary_after_fold = ConversationSummary(
+summary_after_fold: ConversationSummary = ConversationSummary(
summary_text="Later: user asked about ArgoCD sync waves.",
summarized_through_turn=26,
token_count=10,
created_at="2025-10-03T11:42:00Z",
model_used="openai/gpt-4o-mini",
)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/unit/cache/test_sqlite_cache.py` around lines 624 - 653, Declare
explicit types for the module-level summary constants by annotating each
variable with the ConversationSummary type (e.g., change summary_1 =
ConversationSummary(...) to summary_1: ConversationSummary =
ConversationSummary(...), and do the same for summary_2, folded_summary, and
summary_after_fold) so the constants have clear, consistent type annotations at
module scope.
Description
Extends the conversation cache with persistent storage for conversation-compaction summaries (part of the Conversation Compaction epic, LCORE-1631).
When compaction summarizes older conversation turns (LCORE-1570), each run produces a
ConversationSummarychunk. This PR lets those chunks be persisted and retrieved per conversation so they survive across requests and service restarts.src/cache/cache.py): newstore_summary(),get_summaries(), andreplace_summaries()abstract methods.conversation_summariestable keyed by(user_id, conversation_id, created_at), supporting multiple additive summary chunks per conversation and returned oldest-first. The table is created withCREATE TABLE IF NOT EXISTSininitialize_cache(), which doubles as a forward-only schema migration for existing databases.delete()cascades to the summaries table.replace_summaries()): atomically replaces a conversation's accumulated summary chunks with a single folded summary (delete + insert in one transaction). This is the persistence layer for recursive re-summarization (R3, LCORE-1572) — a fold is computed once and reused rather than recomputed per request. Ordering bycreated_atmeans a laterget_summaries()returns the fold first, followed by any chunks appended after it (the active set).set_topic_summarycontract.Reuses the
ConversationSummarymodel from LCORE-1570. No public API or OpenAPI changes.Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Cache-layer change; verified with unit tests (no running stack required).
Cache backend tests:
uv run python -m pytest tests/unit/cache/ -q-> 140 passed, 1 skipped. For SQLite (real on-disk DB): store/retrieve a summary, additive multi-chunk ordering, per-(user, conversation) isolation, persistence across a fresh cache instance (simulated restart), forward-only migration on a legacy DB that lacks the new table, cascade delete with the conversation, and disconnected error paths.replace_summaries()is covered for collapse, empty-conversation, fold-then-append ordering, per-conversation isolation, and disconnected/DB-error paths. PostgreSQL paths covered via mocked psycopg2 (store/replace params, row reconstruction, empty result, disconnected/DB-error). Memory and no-op backends: no-op behavior and conversation-id validation.Full unit suite (confirms the new abstract methods broke no existing Cache users):
uv run python -m pytest tests/unit-> 2373 passed, 1 skipped.Summary by CodeRabbit
New Features
Tests