# Memory System Practice

This notebook demonstrates the complete **three-tier memory system** implementation.

## Memory Architecture

The system implements:

1. **Working Memory (WM)**: Recent conversation turns (in-memory, last N messages)
2. **Episodic Memory (EM)**: Session-specific facts distilled from conversations
3. **Semantic Memory (SM)**: Long-term knowledge from documents

## Contents
1. Setup and Initialization
2. Working Memory (Session Management)
3. Intent Classification and Entity Extraction
4. Episodic Memory Operations
5. Semantic Memory Retrieval
6. Combined Memory Recall (EM + SM)
7. LangGraph State Machine Flow
8. Salience Tracking and Distillation
9. End-to-End Scenario

## 1. Setup and Initialization

In [1]:
# Add project root to path
import sys
from pathlib import Path
sys.path.insert(0, str(Path.cwd().parent / 'src'))

# Load environment variables
from dotenv import load_dotenv
load_dotenv()

# Import memory system components
from acc_llamaindex.config import config
from acc_llamaindex.application.chat_service.session_manager import session_manager
from acc_llamaindex.application.memory_service.intent_classifier import intent_classifier
from acc_llamaindex.application.memory_service.entity_extractor import entity_extractor
from acc_llamaindex.application.memory_service.service import memory_service
from acc_llamaindex.application.memory_service.salience_tracker import salience_tracker
from acc_llamaindex.application.memory_service.conversation_summarizer import conversation_summarizer
from acc_llamaindex.application.chat_service.graph import get_memory_graph, MemoryState
from acc_llamaindex.infrastructure.llm_providers.langchain_provider import get_llm, reset_llm
from acc_llamaindex.infrastructure.db.chroma_client import chroma_client

print("✓ All memory system imports successful")

2025-10-19 21:48:31.201 | INFO     | acc_llamaindex.application.reranking_service.service:__init__:18 - RerankingService initialized
2025-10-19 21:48:31.213 | INFO     | acc_llamaindex.application.evaluation_service.service:__init__:15 - EvaluationService initialized
2025-10-19 21:48:31.213 | INFO     | acc_llamaindex.application.chat_service.service:__init__:20 - ChatService initialized
2025-10-19 21:48:31.214 | INFO     | acc_llamaindex.application.chat_service.session_manager:__init__:25 - SessionManager initialized (max_sessions=100, ttl=30min)
2025-10-19 21:48:31.215 | INFO     | acc_llamaindex.application.memory_service.service:__init__:23 - MemoryService initialized

✓ All memory system imports successful


## 2. Working Memory (Session Management)

Working Memory stores recent conversation turns in-memory with:
- LRU eviction when max sessions reached
- TTL-based cleanup of expired sessions
- Thread-safe operations
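
The three properties above can be illustrated without the project code. The following is a minimal sketch, not the actual `session_manager` implementation: `TinySessionStore` and its parameters are hypothetical names, and the injectable `clock` exists only so the TTL can be exercised without waiting.

```python
import threading
import time
from collections import OrderedDict


class TinySessionStore:
    """Hypothetical stand-in for a session manager: LRU eviction plus TTL expiry."""

    def __init__(self, max_sessions=100, ttl_seconds=30 * 60, clock=time.monotonic):
        self._sessions = OrderedDict()  # session_id -> (last_access, turns)
        self._max_sessions = max_sessions
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()  # every public method takes this lock

    def add_turn(self, session_id, user_message, assistant_message):
        with self._lock:
            self._drop_expired()
            _, turns = self._sessions.pop(session_id, (None, []))
            turns.append({"user_message": user_message,
                          "assistant_message": assistant_message})
            # Re-inserting moves the session to the "most recently used" end.
            self._sessions[session_id] = (self._clock(), turns)
            while len(self._sessions) > self._max_sessions:
                self._sessions.popitem(last=False)  # evict least recently used

    def get_recent_turns(self, session_id, n):
        with self._lock:
            self._drop_expired()
            if session_id not in self._sessions:
                return []
            _, turns = self._sessions.pop(session_id)
            self._sessions[session_id] = (self._clock(), turns)
            return turns[-n:] if n > 0 else []

    def _drop_expired(self):
        now = self._clock()
        expired = [sid for sid, (seen, _) in self._sessions.items()
                   if now - seen > self._ttl]
        for sid in expired:
            del self._sessions[sid]


# Demo with a fake clock so expiry is deterministic.
fake_now = [0.0]
store = TinySessionStore(max_sessions=2, ttl_seconds=60, clock=lambda: fake_now[0])
store.add_turn("a", "hi", "hello")
store.add_turn("b", "hi", "hello")
store.add_turn("c", "hi", "hello")  # capacity is 2, so session "a" is evicted
```

An `OrderedDict` keeps sessions in access order, so LRU eviction reduces to popping from the front. Expiry is checked lazily on each call rather than by a background thread, which keeps the sketch small.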

### Pattern 1: Create and Manage Sessions

In [2]:
# Initialize core services
chroma_client.initialize()
get_llm()

print(f"\nMemory System Config:")
print(f"- WM Max Turns: {config.wm_max_turns}")
print(f"- WM Max Sessions: {config.wm_max_sessions}")
print(f"- WM Session TTL: {config.wm_session_ttl_minutes} minutes")
print(f"- EM Collection: {config.em_collection_name}")
print(f"- SM Collection: {config.chroma_collection_name}")  # SM uses main collection
print(f"- EM K Default: {config.em_k_default}")
print(f"- SM K Default: {config.sm_k_default}")
print(f"- Enable Memory System: {config.enable_memory_system}")
print(f"- Enable EM Distillation: {config.enable_em_distillation}")
print(f"- Distill Every N Turns: {config.em_distill_every_n_turns}")

2025-10-19 21:48:32.955 | INFO     | acc_llamaindex.infrastructure.db.chroma_client:initialize:34 - Initializing ChromaDB at /Users/kevinknox/coding/acc-llamaindex/data/chroma_db
2025-10-19 21:48:33.023 | INFO     | acc_llamaindex.infrastructure.db.chroma_client:initialize:42 - ChromaDB client and embeddings initialized
2025-10-19 21:48:33.030 | INFO     | acc_llamaindex.infrastructure.db.chroma_client:initialize:55 - Semantic memory collection initialized: documents
2025-10-19 21:48:33.030 | INFO     | acc_llamaindex.infrastructure.db.chroma_client:_initialize_episodic_memory:73 - Initializing episodic memory collection: episodic_memory
2025-10-19 21:48:33.031 | INFO     | acc_llamaindex.infrastructure.db.chroma_client:_initialize_episodic_memory


Memory System Config:
- WM Max Turns: 10
- WM Max Sessions: 100
- WM Session TTL: 30 minutes
- EM Collection: episodic_memory
- SM Collection: documents
- EM K Default: 5
- SM K Default: 10
- Enable Memory System: True
- Enable EM Distillation: True
- Distill Every N Turns: 5


In [3]:
# Create a new session
session_id = "demo-session-001"
session = session_manager.create_session(session_id)

print(f"Created session: {session['session_id']}")
print(f"Created at: {session['created_at']}")
print(f"Turn count: {session['turn_count']}")

2025-10-19 21:48:37.192 | INFO     | acc_llamaindex.application.chat_service.session_manager:create_session:88 - Created session demo-session-001


Created session: demo-session-001
Created at: 2025-10-19 21:48:37.192629
Turn count: 0


### Pattern 2: Add Conversation Turns

In [4]:
# Add conversation turns
session_manager.add_turn(
    session_id=session_id,
    user_message="What is the African Growth and Opportunity Act?",
    assistant_message="AGOA is a U.S. trade program providing preferential access to African countries.",
    metadata={"intent": "compliance_query", "entities": [{"text": "AGOA", "type": "program"}]}
)

session_manager.add_turn(
    session_id=session_id,
    user_message="Which countries are eligible?",
    assistant_message="Sub-Saharan African countries that meet eligibility criteria.",
    metadata={"intent": "general", "entities": []}
)

print(f"Added 2 turns to session {session_id}")

2025-10-19 21:48:39.334 | DEBUG    | acc_llamaindex.application.chat_service.session_manager:add_turn:134 - Added turn 1 to session demo-session-001
2025-10-19 21:48:39.335 | DEBUG    | acc_llamaindex.application.chat_service.session_manager:add_turn:134 - Added turn 2 to session demo-session-001


Added 2 turns to session demo-session-001


### Pattern 3: Retrieve Recent Turns

In [5]:
# Get recent turns
recent_turns = session_manager.get_recent_turns(session_id, n=2)

print(f"Retrieved {len(recent_turns)} recent turns:\n")
for turn in recent_turns:
    print(f"Turn {turn['turn_number']}:")
    print(f"  User: {turn['user_message']}")
    print(f"  Assistant: {turn['assistant_message']}")
    print(f"  Metadata: {turn['metadata']}\n")

Retrieved 2 recent turns:

Turn 1:
  User: What is the African Growth and Opportunity Act?
  Assistant: AGOA is a U.S. trade program providing preferential access to African countries.
  Metadata: {'intent': 'compliance_query', 'entities': [{'text': 'AGOA', 'type': 'program'}]}

Turn 2:
  User: Which countries are eligible?
  Assistant: Sub-Saharan African countries that meet eligibility criteria.
  Metadata: {'intent': 'general', 'entities': []}



## 3. Intent Classification and Entity Extraction

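The intent classifier assigns each query one of four intents, which the system uses to route it:
- `quote_request`: requests for price quotes
- `compliance_query`: questions about regulations
- `shipment_tracking`: shipment status and tracking
- `general`: general questions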
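
One way to act on a classification result is a dispatch table with a confidence floor. This is a minimal sketch under stated assumptions: the route names and the `min_confidence` threshold are hypothetical, and only the `intent` and `confidence` keys are taken from the classifier output shown in this notebook.

```python
# Hypothetical route names; the real system may route differently.
ROUTES = {
    "quote_request": "pricing",
    "compliance_query": "regulations",
    "shipment_tracking": "tracking",
    "general": "general",
}


def route(classification, min_confidence=0.5):
    """Pick a route, falling back to 'general' for unknown or low-confidence intents."""
    intent = classification.get("intent", "general")
    confidence = classification.get("confidence", 0.0)
    if intent not in ROUTES or confidence < min_confidence:
        intent = "general"
    return ROUTES[intent]


chosen = route({"intent": "quote_request", "confidence": 0.85})
```

Falling back to `general` keeps a misclassified query answerable instead of sending it down a specialised path on weak evidence.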

### Pattern 1: Classify User Intents

In [6]:
# Test queries with different intents
test_queries = [
    "What is the price for shipping 100 units to Lagos?",
    "What are the import regulations for textiles?",
    "Where is my shipment TRK-12345?",
    "What is AGOA?"
]

print("Intent Classification Results:\n")
for query in test_queries:
    result = intent_classifier.classify(query)
    print(f"Query: {query}")
    print(f"  Intent: {result['intent']} (confidence: {result['confidence']:.2f})")
    print(f"  Reasoning: {result.get('reasoning', 'N/A')}\n")

2025-10-19 21:48:45.572 | INFO     | acc_llamaindex.application.memory_service.intent_classifier:_classify_cached:66 - Classifying intent for query: What is the price for shipping 100 units to Lagos?


Intent Classification Results:



2025-10-19 21:48:48.392 | INFO     | acc_llamaindex.application.memory_service.intent_classifier:_classify_cached:88 - Intent classified: quote_request (confidence: 0.85)
2025-10-19 21:48:48.393 | INFO     | acc_llamaindex.application.memory_service.intent_classifier:_classify_cached:66 - Classifying intent for query: What are the import regulations for textiles?


Query: What is the price for shipping 100 units to Lagos?
  Intent: quote_request (confidence: 0.85)
  Reasoning: User asked for the price/quote for shipping 100 units to Lagos, which is a request for a shipping quote.



2025-10-19 21:48:50.211 | INFO     | acc_llamaindex.application.memory_service.intent_classifier:_classify_cached:88 - Intent classified: compliance_query (confidence: 0.80)
2025-10-19 21:48:50.212 | INFO     | acc_llamaindex.application.memory_service.intent_classifier:_classify_cached:66 - Classifying intent for query: Where is my shipment TRK-12345?


Query: What are the import regulations for textiles?
  Intent: compliance_query (confidence: 0.80)
  Reasoning: User asked about import regulations for textiles, which relates to regulatory/compliance information (customs, documentation, standards).



2025-10-19 21:48:53.508 | INFO     | acc_llamaindex.application.memory_service.intent_classifier:_classify_cached:88 - Intent classified: shipment_tracking (confidence: 0.92)
2025-10-19 21:48:53.508 | INFO     | acc_llamaindex.application.memory_service.intent_classifier:_classify_cached:66 - Classifying intent for query: What is AGOA?


Query: Where is my shipment TRK-12345?
  Intent: shipment_tracking (confidence: 0.92)
  Reasoning: User asks for the current location/status of a shipment using a tracking number TRK-12345, which is a tracking inquiry.



2025-10-19 21:48:55.905 | INFO     | acc_llamaindex.application.memory_service.intent_classifier:_classify_cached:88 - Intent classified: compliance_query (confidence: 0.78)


Query: What is AGOA?
  Intent: compliance_query (confidence: 0.78)
  Reasoning: AGOA is a trade regulation/program (African Growth and Opportunity Act) relevant to shipments and compliance requirements.



### Pattern 2: Extract Entities

In [7]:
# Extract entities from queries
query = "I need to ship 500 smartphones from China to Nigeria using FOB Incoterms"
entities = entity_extractor.extract(query)

print(f"Query: {query}\n")
print(f"Extracted Entities:")
for entity in entities:
    print(f"  - {entity['text']} ({entity['type']})")

2025-10-19 21:48:56.821 | DEBUG    | acc_llamaindex.application.memory_service.entity_extractor:_extract_with_llm:191 - Using LLM fallback for entity extraction: I need to ship 500 smartphones from China to Niger
2025-10-19 21:49:05.626 | INFO     | acc_llamaindex.application.memory_service.entity_extractor:extract:137 - Extracted 4 entities from query: I need to ship 500 smartphones from China to Niger
2025-10-19 21:49:05.626 | DEBUG    | acc_llamaindex.application.memory_service.entity_extractor:extract:139 -   - incoterm: FOB (incoterm:fob)
2025-10-19 21:49:05.627 | DEBUG    | acc_llamaindex.application.memory_service.entity_extractor:extract:139 -   - commodity: smartphones (commodity:smartphones)

Query: I need to ship 500 smartphones from China to Nigeria using FOB Incoterms

Extracted Entities:
  - FOB (incoterm)
  - smartphones (commodity)
  - China (country)
  - Nigeria (country)
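
The log message "Using LLM fallback for entity extraction" suggests that a cheaper first pass runs before the LLM is consulted. The sketch below shows one possible shape for such a two-stage extractor. It assumes nothing about the internals of `entity_extractor`: the regex rules, the `tracking_number` type, and the `llm_fallback` callable are all hypothetical, and only the `{"text", "type"}` entity shape comes from this notebook.

```python
import re

# The eleven Incoterms 2020 rules.
INCOTERMS = {"EXW", "FCA", "CPT", "CIP", "DAP", "DPU", "DDP", "FAS", "FOB", "CFR", "CIF"}
TRACKING_RE = re.compile(r"\bTRK-\d+\b")  # hypothetical tracking-number format


def extract_with_rules(query):
    """Cheap first pass: exact Incoterm codes and tracking numbers only."""
    entities = []
    for token in re.findall(r"\b[A-Z]{3}\b", query):
        if token in INCOTERMS:
            entities.append({"text": token, "type": "incoterm"})
    for match in TRACKING_RE.findall(query):
        entities.append({"text": match, "type": "tracking_number"})
    return entities


def extract(query, llm_fallback=None):
    """Use the rules first; call the (hypothetical) LLM fallback only if they find nothing."""
    entities = extract_with_rules(query)
    if not entities and llm_fallback is not None:
        entities = llm_fallback(query)
    return entities


rule_hits = extract("I need to ship 500 smartphones from China to Nigeria using FOB Incoterms")
```

In this sketch the rules catch `FOB` but not the commodity or the countries, which is the kind of gap an LLM pass would be expected to fill. A real extractor might therefore merge both passes rather than choosing one.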


## 4. Episodic Memory Operations

Episodic Memory stores session-specific facts extracted from conversations.

### Pattern 1: Add Facts to Episodic Memory

In [8]:
# Add episodic facts manually (normally done via distillation)
facts = [
    "User is interested in exporting textiles to Nigeria",
    "User prefers FOB Incoterms for shipments",
    "User mentioned working with supplier in Lagos"
]

for i, fact in enumerate(facts, 1):
    chroma_client.write_episodic(
        texts=[fact],
        metadatas=[{  # Must be a list of dicts
            "session_id": session_id,
            "fact_id": f"fact-{i}",
            "source": "manual_test",
            "salience_score": 0.7
        }]
    )
    print(f"Added EM fact {i}: {fact}")

2025-10-19 21:49:10.138 | INFO     | acc_llamaindex.infrastructure.db.chroma_client:write_episodic:202 - Wrote 1 items to episodic memory
2025-10-19 21:49:10.296 | INFO     | acc_llamaindex.infrastructure.db.chroma_client:write_episodic:202 - Wrote 1 items to episodic memory


Added EM fact 1: User is interested in exporting textiles to Nigeria
Added EM fact 2: User prefers FOB Incoterms for shipments


2025-10-19 21:49:10.451 | INFO     | acc_llamaindex.infrastructure.db.chroma_client:write_episodic:202 - Wrote 1 items to episodic memory


Added EM fact 3: User mentioned working with supplier in Lagos


### Pattern 2: Query Episodic Memory

In [9]:
# Query EM for session-specific context
query = "What shipping terms does the user prefer?"

em_results = memory_service._query_episodic(
    session_id=session_id,
    query=query,
    k=3
)

print(f"Query: {query}\n")
print(f"Episodic Memory Results ({len(em_results)} found):\n")
for i, result in enumerate(em_results, 1):
    print(f"Result {i}:")
    print(f"  Text: {result['text']}")
    print(f"  Salience: {result.get('salience', 0.0):.4f}")
    print(f"  Source: {result.get('source', 'unknown')}")
    print(f"  Metadata: {result.get('metadata', {})}\n")

2025-10-19 21:49:19.052 | DEBUG    | acc_llamaindex.application.memory_service.service:_query_episodic:110 - Retrieved 3 EM results for session demo-session-001


Query: What shipping terms does the user prefer?

Episodic Memory Results (3 found):

Result 1:
  Text: User prefers FOB Incoterms for shipments
  Salience: 0.5000
  Source: EM
  Metadata: {'source': 'manual_test', 'salience_score': 0.7, 'fact_id': 'fact-2', 'session_id': 'demo-session-001'}

Result 2:
  Text: User prefers FOB Incoterms for shipments
  Salience: 0.5000
  Source: EM
  Metadata: {'session_id': 'demo-session-001', 'fact_id': 'fact-2', 'source': 'manual_test', 'salience_score': 0.7}

Result 3:
  Text: User is interested in exporting textiles to Nigeria
  Salience: 0.5000
  Source: EM
  Metadata: {'fact_id': 'fact-1', 'salience_score': 0.7, 'source': 'manual_test', 'session_id': 'demo-session-001'}
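
Results 1 and 2 above are the same fact (`fact-2`). That would be consistent with the write cell having run more than once against a persistent collection, although the output alone does not confirm the cause. The sketch below shows one way to make writes idempotent by keying each fact on session and fact ID. `EpisodicStore` is a hypothetical in-memory stand-in, not the `chroma_client` API, and word overlap is a crude placeholder for embedding similarity.

```python
class EpisodicStore:
    """Hypothetical in-memory episodic store with idempotent, session-scoped writes."""

    def __init__(self):
        self._facts = {}  # (session_id, fact_id) -> text

    def write(self, session_id, fact_id, text):
        # Writing the same (session_id, fact_id) again overwrites instead of duplicating.
        self._facts[(session_id, fact_id)] = text

    def query(self, session_id, query, k=3):
        query_words = set(query.lower().split())
        scored = []
        for (sid, fact_id), text in self._facts.items():
            if sid != session_id:
                continue  # never leak facts across sessions
            overlap = len(query_words & set(text.lower().split()))
            scored.append((overlap, fact_id, text))
        # Highest overlap first; ties broken by fact_id for a deterministic order.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [{"fact_id": fid, "text": text} for _, fid, text in scored[:k]]


facts = [
    "User is interested in exporting textiles to Nigeria",
    "User prefers FOB Incoterms for shipments",
    "User mentioned working with supplier in Lagos",
]
em_store = EpisodicStore()
for _ in range(2):  # simulate running the write cell twice
    for i, fact in enumerate(facts, 1):
        em_store.write("demo-session-001", f"fact-{i}", fact)
em_store.write("other-session", "fact-1", "Unrelated fact")

em_hits = em_store.query("demo-session-001", "What shipping terms does the user prefer?", k=3)
```

Even after the double write, the query returns three distinct facts, and the fact stored under `other-session` never appears.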



## 5. Semantic Memory Retrieval

Semantic Memory contains long-term knowledge from documents.

### Pattern 1: Query Semantic Memory

In [10]:
# Query SM for document knowledge
query = "What are FOB Incoterms?"

sm_results = memory_service.recall(
    query=query,
    session_id="any-session",
    intent="general",
    entities=[{"text": "FOB", "type": "incoterm"}],
    k_em=0,  # Only SM
    k_sm=5
)

print(f"Query: {query}\n")
print(f"Semantic Memory Results ({len(sm_results['sm_results'])} found):\n")
for i, result in enumerate(sm_results['sm_results'], 1):
    print(f"Result {i}:")
    print(f"  Text: {result['text'][:200]}...")
    print(f"  salience: {result['salience']:.4f}")
    print(f"  Source: {result.get('source', 'unknown')}\n")

2025-10-19 21:49:26.063 | INFO     | acc_llamaindex.application.memory_service.service:recall:60 - Memory recall: query=What are FOB Incoterms?, intent=general, session=any-session, entities=1
2025-10-19 21:49:27.347 | DEBUG    | acc_llamaindex.application.memory_service.service:_query_episodic:110 - Retrieved 0 EM results for session any-session
2025-10-19 21:49:27.543 | DEBUG    | acc_llamaindex.application.memory_service.service:_query_semantic:162 - Retrieved 5 SM results for intent=general, entities=1
2025-10-19 21:49:27.543 | DEBUG    | acc_llamaindex.application.memory_service.service:_merge_results:201 - Merged results: 0 EM + 5 SM = 5 (deduped)


Query: What are FOB Incoterms?

Semantic Memory Results (5 found):

Result 1:
  Text: FAS - Free Alongside Ship
“Free Alongside Ship” means that the seller delivers when the goods are placed alongside the vessel (e.g., on a
quay or a barge) nominated by the buyer at the named port of s...
  salience: 0.0000
  Source: SM

Result 2:
  Text: If selling on FOB terms:
You will only have to cover the costs to get the goods loaded on board the vessel ready for export – so you
will cover the container trucking from your warehouse to the port p...
  salience: 0.0000
  Source: SM

Result 3:
  Text: Put simply, Incoterms® are the selling terms that the buyer and seller of goods both agrees to.  The Incoterm®
clearly states which tasks, costs and risks are associated with the buyer and the seller....
  salience: 0.0000
  Source: SM

Result 4:
  Text: Buyer’s and Seller’s Own Transport
Under Incoterms® 2010 it was assumed that all transport would be undertaken by a third party transport
provider. U

## 6. Combined Memory Recall (EM + SM)

The memory service merges episodic and semantic results, removes duplicates, and reranks them into a single context list.
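
A merge step of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not the `_merge_results` implementation: it deduplicates on normalised text and orders by the `salience` field, and the semantic texts are made-up placeholders.

```python
def merge_results(em_results, sm_results):
    """Concatenate EM and SM results, drop duplicate texts, sort by salience (stable)."""
    seen = set()
    merged = []
    for result in [*em_results, *sm_results]:
        key = " ".join(result["text"].lower().split())  # normalise case and whitespace
        if key in seen:
            continue
        seen.add(key)
        merged.append(result)
    # sorted() is stable, so items with equal salience keep their original order.
    return sorted(merged, key=lambda r: r.get("salience", 0.0), reverse=True)


em = [
    {"text": "User prefers FOB Incoterms for shipments", "source": "EM", "salience": 0.5},
    {"text": "User prefers FOB Incoterms for shipments", "source": "EM", "salience": 0.5},
    {"text": "User is interested in exporting textiles to Nigeria", "source": "EM", "salience": 0.5},
]
sm = [  # placeholder document chunks, invented for this example
    {"text": "Document chunk A", "source": "SM", "salience": 0.0},
    {"text": "Document chunk B", "source": "SM", "salience": 0.0},
    {"text": "Document chunk C", "source": "SM", "salience": 0.0},
]
combined = merge_results(em, sm)
```

With three episodic results (one duplicated) and three semantic results, the merge yields five items with the episodic facts first.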

### Pattern 1: Basic Combined Recall

In [11]:
# Combined EM + SM recall
query = "What Incoterms should I use for my Nigeria shipment?"

combined_results = memory_service.recall(
    query=query,
    session_id=session_id,
    intent="general",
    entities=[{"text": "Nigeria", "type": "location"}, {"text": "Incoterms", "type": "term"}],
    k_em=3,
    k_sm=3
)

print(f"Query: {query}\n")
print(f"Episodic Memory ({len(combined_results['em_results'])} results):")
for result in combined_results['em_results']:
    print(f"  - {result['text']}")

print(f"\nSemantic Memory ({len(combined_results['sm_results'])} results):")
for result in combined_results['sm_results']:
    print(f"  - {result['text'][:100]}...")

print(f"\nCombined & Reranked ({len(combined_results['combined_results'])} results):")
for i, result in enumerate(combined_results['combined_results'], 1):
    source = result.get('source', 'unknown')
    print(f"  {i}. [{source}] {result['text'][:80]}...")

2025-10-19 21:49:31.450 | INFO     | acc_llamaindex.application.memory_service.service:recall:60 - Memory recall: query=What Incoterms should I use for my Nigeria shipmen, intent=general, session=demo-session-001, entities=2
2025-10-19 21:49:31.647 | DEBUG    | acc_llamaindex.application.memory_service.service:_query_episodic:110 - Retrieved 3 EM results for session demo-session-001
2025-10-19 21:49:31.861 | DEBUG    | acc_llamaindex.application.memory_service.service:_query_semantic:162 - Retrieved 3 SM results for intent=general, entities=2
2025-10-19 21:49:31.861 | DEBUG    | acc_llamaindex.application.memory_service.service:_merge_results:201 - Merged results: 3 EM + 3 SM = 5 (deduped)


Query: What Incoterms should I use for my Nigeria shipment?

Episodic Memory (3 results):
  - User prefers FOB Incoterms for shipments
  - User prefers FOB Incoterms for shipments
  - User is interested in exporting textiles to Nigeria

Semantic Memory (3 results):
  - to, which may occur when it is being used to confirm complex commercial agreements.
All parties must...
  - Letter of Credit.
Therefore provisions have been made to the Incoterms® 2020 to state that the buyer...
  - Ensure that you make changes to any contracts and documents as necessary
Ensure that you are stating...

Combined & Reranked (5 results):
  1. [EM] User prefers FOB Incoterms for shipments...
  2. [EM] User is interested in exporting textiles to Nigeria...
  3. [SM] to, which may occur when it is being used to confirm complex commercial agreemen...
  4. [SM] Letter of Credit.
Therefore provisions have been made to the Incoterms® 2020 to ...
  5. [SM] Ensure that you make changes to any contracts and documents

## 7. LangGraph State Machine Flow

The memory graph orchestrates the entire memory flow.
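
Conceptually, a state machine like this passes one state dictionary through a sequence of nodes, each of which returns a partial update. The sketch below uses plain Python rather than the LangGraph API. The node bodies are stubs, and apart from `load_working_memory` and `classify_intent`, which appear in the logs of this notebook, the node names are hypothetical.

```python
def load_working_memory(state):
    # Stub: a real node would read recent turns for state["session_id"].
    return {"wm_context": ["(previous turn)"]}


def classify_intent(state):
    # Stub: a keyword check stands in for the LLM classifier.
    is_compliance = "incoterms" in state["user_query"].lower()
    return {"intent": "compliance_query" if is_compliance else "general"}


def recall_memory(state):  # hypothetical node name
    return {"reranked_context": [f"context for intent={state['intent']}"]}


def generate_response(state):  # hypothetical node name
    return {"response": f"Answer using {len(state['reranked_context'])} context item(s)"}


def run_graph(state, nodes):
    """Run nodes in order, merging each node's partial update into the state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state


demo_state = run_graph(
    {"user_query": "What Incoterms are best for shipping to Africa?"},
    [load_working_memory, classify_intent, recall_memory, generate_response],
)
```

Each node reads only the keys it needs and returns only the keys it changes, which is what lets a single state object carry working memory, intent, retrieved context, and the response through the whole flow.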

### Pattern 1: Execute Full Memory Graph

In [12]:
# Get the compiled memory graph
memory_graph = get_memory_graph()

# Create initial state
initial_state = MemoryState(
    user_query="What Incoterms are best for shipping to Africa?",
    session_id=session_id,
    wm_context=[],
    intent="",
    confidence=0.0,
    entities=[],
    em_results=[],
    sm_results=[],
    reranked_context=[],
    response="",
    citations=[],
    should_distill=False,
    retrieval_ms=0.0,
    generation_ms=0.0
)

# Execute the graph
print("Executing memory graph...\n")
final_state = memory_graph.invoke(initial_state)

print("\n=== Memory Graph Execution Complete ===\n")
print(f"User Query: {final_state['user_query']}")
print(f"\nIntent: {final_state['intent']} (confidence: {final_state['confidence']:.2f})")
print(f"Entities: {[e.get('text', '') for e in final_state['entities']]}")
print(f"\nWorking Memory: {len(final_state['wm_context'])} turns")
print(f"EM Results: {len(final_state['em_results'])}")
print(f"SM Results: {len(final_state['sm_results'])}")
print(f"Reranked Context: {len(final_state['reranked_context'])}")
print(f"\nResponse:\n{final_state['response']}")
print(f"\nCitations: {len(final_state['citations'])}")
print(f"Should Distill: {final_state['should_distill']}")

2025-10-19 21:49:38.014 | INFO     | acc_llamaindex.application.chat_service.graph:create_memory_graph:343 - Memory graph created successfully
2025-10-19 21:49:38.022 | INFO     | acc_llamaindex.application.chat_service.graph:load_working_memory:58 - Loading working memory for session: demo-session-001
2025-10-19 21:49:38.023 | DEBUG    | acc_llamaindex.application.chat_service.graph:load_working_memory:67 - Loaded 2 turns from working memory
2025-10-19 21:49:38.024 | INFO     | acc_llamaindex.application.chat_service.graph:classify_intent:82 - Classifying intent for query: What Incoterms are best for shipping to Africa?...
2025-10-19 21:49:38.024 | INFO     | acc_llamaindex.application.memory_service.intent_classifier:_classify_cached:

Executing memory graph...



2025-10-19 21:49:41.510 | INFO     | acc_llamaindex.application.memory_service.intent_classifier:_classify_cached:88 - Intent classified: compliance_query (confidence: 0.62)
2025-10-19 21:49:41.511 | DEBUG    | acc_llamaindex.application.memory_service.entity_extractor:_extract_with_llm:191 - Using LLM fallback for entity extraction: What Incoterms are best for shipping to Africa?
2025-10-19 21:49:50.360 | INFO     | acc_llamaindex.application.memory_service.entity_extractor:extract:137 - Extracted 1 entities from query: What Incoterms are best for shipping to Africa?
2025-10-19 21:49:50.360 | DEBUG    | acc_llamaindex.application.memory_service.entity_extractor:extract:139 -   - incoterm: Incoterms (incoterm:incoterms)


=== Memory Graph Execution Complete ===

User Query: What Incoterms are best for shipping to Africa?

Intent: compliance_query (confidence: 0.62)
Entities: ['Incoterms']

Working Memory: 2 turns
EM Results: 0
SM Results: 0
Reranked Context: 3

Response:
Based on your context (you prefer FOB, you’re exporting textiles to Nigeria, and you’re working with a Lagos supplier), here are the Incoterms that tend to work best for shipping to Africa, with notes on when to use them:

- FOB (Free On Board) — strongest default for sea shipments from Lagos
  - Your Lagos supplier handles export clearance and loads the goods onto the ship.
  - Risk transfers to you when the goods pass the ship’s rail; you then arrange ocean freight and insurance from the port onward.
  - This matches your stated preference for FOB [EM-1] and fits your Lagos supplier setup [EM-3].

- CFR (Cost and Freight) — if you want the seller to pay freight to the destination port, but you handle insurance
  - Seller covers cost 

## 8. Salience Tracking and Distillation

Salience tracking determines which memories are most important.
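
One simple way to turn citations into an importance score is to nudge a fact's score toward 1.0 each time a response cites it. The sketch below is a hypothetical model, not the `salience_tracker` implementation: the class name, the 0.5 starting score, and the 0.2 update rate are all assumptions, and the method names merely mirror the calls used in the cell that follows.

```python
class CitationSalience:
    """Hypothetical salience model: each citation moves the score a step toward 1.0."""

    def __init__(self, initial=0.5, rate=0.2):
        self._scores = {}
        self._initial = initial
        self._rate = rate

    def track_citations(self, cited_facts):
        for fact in cited_facts:
            score = self._scores.get(fact["id"], self._initial)
            # Close a fixed fraction of the remaining gap to 1.0.
            self._scores[fact["id"]] = score + self._rate * (1.0 - score)

    def get_salience_score(self, fact_id):
        return self._scores.get(fact_id, self._initial)


tracker = CitationSalience()
tracker.track_citations([{"text": "User prefers FOB Incoterms", "id": "fact-2"}])
tracker.track_citations([{"text": "User prefers FOB Incoterms", "id": "fact-2"}])
```

Because each update closes a fraction of the remaining gap, the score rises quickly at first and then flattens, so it never exceeds 1.0 however often a fact is cited.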

### Pattern 1: Track Citation Salience

In [None]:
# Simulate tracking citations
cited_facts = [
    {"text": "User prefers FOB Incoterms", "id": "fact-2"},
    {"text": "User is exporting textiles", "id": "fact-1"}
]

salience_tracker.track_citations(cited_facts)
print("Tracked citations for salience scoring")

# Get salience scores
for fact in cited_facts:
    score = salience_tracker.get_salience_score(fact['id'])
    print(f"  {fact['text']}: {score:.4f}")

AttributeError: 'dict' object has no attribute 'llm_provider'

### Pattern 2: Conversation Distillation

In [None]:
# Get recent turns to distill
turns_to_distill = session_manager.get_recent_turns(session_id, n=2)

print(f"Distilling {len(turns_to_distill)} conversation turns...\n")

# Distill into facts
distill_result = conversation_summarizer.distill(
    session_id=session_id,
    turns=turns_to_distill
)

print(f"Distillation Result:")
print(f"  Success: {distill_result.get('success', False)}")
print(f"  Facts Created: {distill_result.get('facts_created', 0)}")
print(f"\nExtracted Facts:")
for fact in distill_result.get('facts', []):
    print(f"  - {fact}")

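
The configuration printed earlier sets `em_distill_every_n_turns` to 5, and the graph state carries a `should_distill` flag. How the flag is actually computed is not shown in this notebook. A minimal sketch, assuming a simple modulo check on the session's turn count (the function name and signature are hypothetical):

```python
def should_distill(turn_count, every_n=5, enabled=True):
    """Return True on every `every_n`-th turn when distillation is enabled."""
    if not enabled or every_n <= 0:
        return False
    return turn_count > 0 and turn_count % every_n == 0


distill_turns = [turn for turn in range(1, 12) if should_distill(turn)]
# distill_turns == [5, 10]
```

Lowering `every_n` distills more often at the cost of more LLM calls, which is the trade-off to explore when experimenting with distillation frequency.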


## 9. End-to-End Scenario

Complete workflow demonstrating the full memory system.

### Scenario: Multi-Turn Conversation with Memory

In [None]:
# Create a new session for this scenario
scenario_session_id = "scenario-001"

print("Simulating Multi-Turn Conversation:\n")

# Turn 1
state1 = MemoryState(
    user_query="What are Incoterms?",
    session_id=scenario_session_id,
    wm_context=[], intent="", confidence=0.0, entities=[],
    em_results=[], sm_results=[], reranked_context=[],
    response="", citations=[], should_distill=False,
    retrieval_ms=0.0, generation_ms=0.0
)
result1 = memory_graph.invoke(state1)
print(f"Turn 1:")
print(f"  User: {result1['user_query']}")
print(f"  Intent: {result1['intent']}")
print(f"  Response: {result1['response'][:100]}...\n")

# Turn 2
state2 = MemoryState(
    user_query="Which one is best for sea freight?",
    session_id=scenario_session_id,
    wm_context=[], intent="", confidence=0.0, entities=[],
    em_results=[], sm_results=[], reranked_context=[],
    response="", citations=[], should_distill=False,
    retrieval_ms=0.0, generation_ms=0.0
)
result2 = memory_graph.invoke(state2)
print(f"Turn 2:")
print(f"  User: {result2['user_query']}")
print(f"  WM Context: {len(result2['wm_context'])} previous turns")
print(f"  Response: {result2['response'][:100]}...\n")

# Turn 3 - tests memory recall
state3 = MemoryState(
    user_query="What did we discuss earlier?",
    session_id=scenario_session_id,
    wm_context=[], intent="", confidence=0.0, entities=[],
    em_results=[], sm_results=[], reranked_context=[],
    response="", citations=[], should_distill=False,
    retrieval_ms=0.0, generation_ms=0.0
)
result3 = memory_graph.invoke(state3)
print(f"Turn 3:")
print(f"  User: {result3['user_query']}")
print(f"  WM Context: {len(result3['wm_context'])} previous turns")
print(f"  EM Results: {len(result3['em_results'])}")
print(f"  Response: {result3['response'][:150]}...\n")

print("\n=== Scenario Complete ===")
session_stats = session_manager.get_session(scenario_session_id)
print(f"Total turns in session: {session_stats['turn_count']}")

## Summary

This notebook demonstrated:

1. **Working Memory**: In-memory session management with LRU eviction and TTL cleanup
2. **Intent Classification**: LLM-based routing for different query types
3. **Entity Extraction**: Identifying key entities in user queries
4. **Episodic Memory**: Session-specific facts from conversation distillation
5. **Semantic Memory**: Long-term document knowledge retrieval
6. **Memory Fusion**: Combining EM + SM with deduplication and reranking
7. **LangGraph Flow**: State machine orchestration of all memory tiers
8. **Salience Tracking**: Importance scoring for memory items
9. **Distillation**: Automatic fact extraction from conversations
10. **End-to-End**: Full multi-turn conversations with memory

## Key Takeaways

- **WM** provides fast access to recent context
- **EM** stores personalized session facts
- **SM** retrieves relevant document knowledge
- **LangGraph** coordinates the flow between memory tiers
- **Salience** scores facts so that the most important ones can be prioritized
- **Distillation** automatically extracts conversation facts

## Next Steps

- Experiment with different distillation frequencies
- Test salience-based memory promotion
- Implement custom reranking strategies
- Add memory decay mechanisms
- Explore cross-session memory sharing