fix(reflect): scope delta mental model recall to new memories only #1192
nicoloboschi merged 1 commit into main
Delta mode mental model refresh was running a full recall across ALL memories (identical to full mode), then passing all facts to a second LLM call for delta ops. This caused content bloat and duplication, and made delta strictly more expensive than full mode.

Changes:
- Add created_after/created_before time range filter to the recall pipeline (retrieval.py, link_expansion_retrieval.py, graph_retrieval.py), threaded through recall_async -> reflect_async -> tool closures
- Delta refresh passes last_refreshed_at as created_after so the agentic loop only retrieves memories created/updated since the last refresh (uses updated_at to catch consolidation updates)
- Short-circuit delta when no new facts are found (skip LLM call, preserve existing content)
- Accumulate based_on across delta refreshes (merge previous + new, deduped by ID)
- Pass context to the reflect agent during MM refresh with document name, stay-on-topic guidance, and example preservation instructions
- Rewrite delta prompt: preserve existing content from prior refreshes, merge overlapping topics, preserve concrete examples over abstract rules
- Add recall time-range unit tests (8 tests)
- Add integration test verifying delta fusion quality
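The time range filter described in the commit message can be illustrated with a small sketch. This is not the code from retrieval.py: the `Memory` class and both helper functions are hypothetical, and treating the bounds as exclusive is an assumption. It only shows the idea of comparing `updated_at` against optional `created_after`/`created_before` bounds.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Memory:
    # Hypothetical stand-in for a stored memory row (not the project's model).
    id: str
    text: str
    updated_at: datetime


def in_time_range(
    memory: Memory,
    created_after: Optional[datetime] = None,
    created_before: Optional[datetime] = None,
) -> bool:
    """Return True if updated_at lies strictly inside the optional bounds.

    Exclusive bounds are an assumption made for this sketch.
    """
    if created_after is not None and memory.updated_at <= created_after:
        return False
    if created_before is not None and memory.updated_at >= created_before:
        return False
    return True


def filter_by_time_range(
    memories: List[Memory],
    created_after: Optional[datetime] = None,
    created_before: Optional[datetime] = None,
) -> List[Memory]:
    """Keep only memories whose updated_at passes the range check."""
    return [m for m in memories if in_time_range(m, created_after, created_before)]


# Usage: a delta refresh would pass the last refresh time as created_after.
last_refreshed_at = datetime(2024, 1, 2, tzinfo=timezone.utc)
memories = [
    Memory("a", "old fact", datetime(2024, 1, 1, tzinfo=timezone.utc)),
    Memory("b", "new fact", datetime(2024, 1, 3, tzinfo=timezone.utc)),
]
new_only = filter_by_time_range(memories, created_after=last_refreshed_at)
```

Filtering on `updated_at` rather than a creation timestamp means a memory rewritten by consolidation after the last refresh is picked up as well.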
Hi, do we need to set anything regarding mental models, retain options, etc. to activate delta retain/mental models, or is it automatic?

Yesterday I unleashed the hindsight beast and had a $30 bill across 500 conversation retains (gemini-3-flash model). I was probably doing the whole thing wrong (rebuilding the mental model on every new chat message, I guess?), so I'm trying to keep it as optimized as possible now. I only need a per-user profile to pass to the agent. Any hints are appreciated!

P.S.: Loving the memory structure on this, spot on.
Yes, you have to set mode=delta. If you don't want real-time updates and prefer batching them (e.g. hourly), you can set manual and trigger the refresh via the API on your preferred schedule.
Thanks! In the Node.js SDK it seems the trigger mode is not present; only the refresh_after_consolidation option is available, heads up! For now I'll update mental models with an API shim.
Summary
Delta mental model refresh now recalls only memories created or updated since `last_refreshed_at`, using `updated_at` to also catch consolidation updates.

Changes

Recall pipeline (`retrieval.py`, `link_expansion_retrieval.py`, `graph_retrieval.py`):
- `created_after`/`created_before` time range filter on `updated_at`, threaded through all retrieval strategies (semantic, BM25, temporal, graph seeds)

Reflect + tools (`memory_engine.py`, `tools.py`):
- Filter threaded through `recall_async` → `_search_with_retries` → `reflect_async` → tool closures → `tool_recall`/`tool_search_observations`
- `_is_mental_model_stale` now uses `updated_at` (catches consolidation updates)

Mental model refresh (`memory_engine.py`):
- Delta refresh passes `created_after=last_refreshed_at` to reflect
- `based_on` accumulates across refreshes (merge previous + new, deduped by ID)

Delta prompt (`prompts.py`):
- Rewritten to preserve existing content from prior refreshes, merge overlapping topics, and preserve concrete examples over abstract rules

Test plan
- Integration test (`test_delta_editorial_fusion.py`) with real SEO specialist + brand voice documents: verifies that brand voice fuses organically with SEO guidance, there is no duplication, and `based_on` accumulates
- Lint (`ruff check`)
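The mental model refresh behavior described in this PR (short-circuit when no new facts are found, and `based_on` accumulating with dedup by ID) can be sketched roughly as follows. The function names, the dict shape of a fact, and the `apply_delta` callable are all hypothetical stand-ins for illustration, not the code in memory_engine.py.

```python
from typing import Callable, Dict, List, Tuple

Fact = Dict[str, str]  # hypothetical shape: at least an "id" key


def merge_based_on(previous: List[Fact], new: List[Fact]) -> List[Fact]:
    """Merge previous and new fact references, keeping the first entry per ID."""
    merged: Dict[str, Fact] = {}
    for fact in [*previous, *new]:
        # setdefault keeps the earliest entry, so previous refs win on ties.
        merged.setdefault(fact["id"], fact)
    return list(merged.values())


def delta_refresh(
    existing_content: str,
    previous_based_on: List[Fact],
    new_facts: List[Fact],
    apply_delta: Callable[[str, List[Fact]], str],
) -> Tuple[str, List[Fact]]:
    """Refresh content from new facts only; skip the update when none exist."""
    if not new_facts:
        # Short-circuit: nothing new since the last refresh, so the
        # (expensive) delta step is never called and content is preserved.
        return existing_content, previous_based_on
    updated = apply_delta(existing_content, new_facts)
    return updated, merge_based_on(previous_based_on, new_facts)


# Usage with a fake delta step standing in for the LLM call.
calls: List[int] = []


def fake_apply_delta(content: str, facts: List[Fact]) -> str:
    calls.append(len(facts))
    return content + " + " + ", ".join(f["id"] for f in facts)


unchanged = delta_refresh("doc", [{"id": "1"}], [], fake_apply_delta)
refreshed = delta_refresh(
    "doc", [{"id": "1"}], [{"id": "1"}, {"id": "2"}], fake_apply_delta
)
```

In this sketch the first call returns the existing content without invoking the delta step, which is the cost saving the short-circuit is meant to provide.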