## **MemoSynth-Lite: LLM-Powered Memory System Demo**

This notebook demonstrates a memory system for LLM agents that combines semantic search (Qdrant), a timeline log (DuckDB), and graph relationships (Neo4j).
Features:
1. Semantic search with recency/confidence re-ranking
2. Timeline and conflict handling
3. Entity and relationship extraction
4. Cross-store data consistency
5. Insert and query timing on a small batch

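The notebook resets Neo4j, the Qdrant collection, and the DuckDB timeline table at the start of each run, so the demo starts from a clean state. The `conflict_log` table is not cleared, so it can still contain entries from earlier runs.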

### **Table of Contents**
1. Environment Setup & State Reset
2. Define Memories
3. Store Memories
4. Timeline Visualization
5. Graph Memory
6. Query Memories
7. Summarize Memories
8. Compare and Resolve
9. Testing Core Features
10. Performance Test
11. List All Memories in Qdrant


### **Environment Setup & State Reset**
This section resets all data stores (Neo4j, Qdrant, DuckDB) to ensure a clean, reproducible demo.

In [3]:
# Add the parent directory to Python's module search path so the memosynth package can be imported.
import sys, os, asyncio
sys.path.append(os.path.abspath(".."))

In [5]:
# Import the MemoSynth APIs and the packaged sample memory
from memosynth.memory_client import (
    write_and_sync_memory, query_memory, summarize_memories, diff, resolve, example_memory, update_memory
)
from memosynth.vector_store import get_memory_by_id
from memosynth.timeline_store import log_memory
from memosynth.graph_store import create_memory_node, relate_memories, find_related_memories
import duckdb
import pandas as pd


In [7]:
# Delete all existing graph data and the Qdrant collection so that earlier runs do not leak into this one
from memosynth.graph_store import driver  # This is the AsyncGraphDatabase driver

async def reset_neo4j():
    async with driver.session() as session:
        await session.run("MATCH (n) DETACH DELETE n")
    print("✅ Neo4j database reset (all nodes and relationships deleted)")

await reset_neo4j()

from memosynth.vector_store import client  # This is the AsyncQdrantClient

await client.delete_collection(collection_name="memos")
print("✅ Qdrant collection 'memos' deleted")


✅ Neo4j database reset (all nodes and relationships deleted)
✅ Qdrant collection 'memos' deleted


In [9]:
# Clear the timeline table to avoid duplicate entries in a demo
con = duckdb.connect("memory_timeline.db")
con.execute("DELETE FROM memory_log")
print("✅ Cleared timeline")

✅ Cleared timeline


### **Define Memories**
Here we define three sample memories with rich metadata.
These will be used throughout the demo for storage, search, and summarization.
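
Before records like these are written to several stores, it can help to check that they are well-formed. The following is a minimal sketch, not part of MemoSynth: `REQUIRED_FIELDS` and `validate_memory` are hypothetical names, and the set of required fields is an assumption inferred from the sample dictionaries in this notebook.

```python
from datetime import date

# Hypothetical required fields, inferred from the sample dictionaries in this notebook.
REQUIRED_FIELDS = {"id", "summary", "version", "confidence", "created_at"}

def validate_memory(memory: dict) -> list[str]:
    """Return a list of problems; an empty list means the record looks well-formed."""
    missing = sorted(REQUIRED_FIELDS - set(memory))
    problems = [f"missing field: {name}" for name in missing]

    confidence = memory.get("confidence")
    if confidence is not None:
        is_number = isinstance(confidence, (int, float))
        if not (is_number and 0.0 <= confidence <= 1.0):
            problems.append("confidence must be between 0 and 1")

    version = memory.get("version")
    if version is not None and (not isinstance(version, int) or version < 1):
        problems.append("version must be a positive integer")

    created_at = memory.get("created_at")
    if created_at is not None:
        try:
            date.fromisoformat(created_at)  # the samples use YYYY-MM-DD
        except (TypeError, ValueError):
            problems.append("created_at must be an ISO date (YYYY-MM-DD)")
    return problems

# Usage: an incomplete record with an out-of-range confidence
print(validate_memory({"id": "m-004", "confidence": 1.2}))
```

Returning a list of problems instead of raising on the first one makes it easier to report everything that is wrong with a record in a single pass.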

In [12]:
#Define three memories
memory1 = example_memory
memory2 = {
    "id": "m-002",
    "project": "demo_project",
    "agent": "doc_bot",
    "summary": "Client is expecting an up-to-date forecast of Q3 to plan ahead for pitfalls.",
    "type": "insight",
    "tags": ["finance", "Q3", "forecast", "planning", "pitfalls"],
    "source": "Finance_Forecast_Q3.pdf",
    "author": "doc_bot",
    "created_at": "2025-06-22",
    "version": 1,
    "confidence": 0.8,
    "visibility": "project",
    "sensitivity": "medium"
}
memory3 = {
    "id": "m-003",
    "project": "demo_project",
    "agent": "doc_bot",
    "summary": "Client expressed concern about rising costs in Q2.",
    "type": "insight",
    "tags": ["finance", "Q2", "costs"],
    "source": "Finance_Report_Q2.pdf",
    "author": "doc_bot",
    "created_at": "2025-06-22",
    "version": 1,
    "confidence": 0.92,
    "visibility": "project",
    "sensitivity": "medium"
}

### **Store Memories (Vector, Timeline, Graph)**
This section writes each memory to three stores:

1. Qdrant (for semantic search)
2. DuckDB (for the timeline)
3. Neo4j (for graph relationships)
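
The implementation of `write_and_sync_memory` is not shown in this notebook. As a rough illustration of the fan-out idea only, the sketch below writes one record to three in-memory stand-ins concurrently and reports which of them failed. `InMemoryStore` and `write_to_all` are hypothetical names, and concurrent writes are an assumption; the real function may write sequentially.

```python
import asyncio

class InMemoryStore:
    """Stand-in for a real backend (Qdrant, DuckDB, or Neo4j); keeps records in a dict."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.records: dict[str, dict] = {}

    async def write(self, memory: dict) -> None:
        await asyncio.sleep(0)  # placeholder for network or disk I/O
        self.records[memory["id"]] = dict(memory)

async def write_to_all(memory: dict, stores: list[InMemoryStore]) -> list[str]:
    """Write one memory to every store concurrently; return the names of stores that failed."""
    results = await asyncio.gather(
        *(store.write(memory) for store in stores), return_exceptions=True
    )
    return [s.name for s, r in zip(stores, results) if isinstance(r, Exception)]

if __name__ == "__main__":
    # In a notebook, use `await write_to_all(...)` instead of asyncio.run(...).
    stores = [InMemoryStore(n) for n in ("vector", "timeline", "graph")]
    failed = asyncio.run(write_to_all({"id": "m-002", "summary": "demo"}, stores))
    print("failed stores:", failed)
```

Collecting failures per store, instead of stopping at the first exception, makes partial writes visible so they can be retried or flagged by a consistency check.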

In [15]:
from memosynth.vector_store import initialize
await initialize()  # Ensures the collection exists before anything else
print("✅ Created new memos collection")

✅ Created new memos collection


In [17]:
#Write and log the memories
async def store_memories():
    await write_and_sync_memory(memory1)
    await write_and_sync_memory(memory2)
    await write_and_sync_memory(memory3)
await store_memories()

📝 Writing memory: m-001
✅ Memory m-001 successfully written to Qdrant
Memory m-001 written and synced across all stores.
📝 Writing memory: m-002
✅ Memory m-002 successfully written to Qdrant
Memory m-002 written and synced across all stores.
📝 Writing memory: m-003
✅ Memory m-003 successfully written to Qdrant
Memory m-003 written and synced across all stores.


### **Timeline Visualization**
The timeline table below lists all memories in reverse chronological order (newest first).
This helps track how memories evolve and makes duplicate entries easy to spot.

In [19]:
# Visualize timeline (fetchdf() already returns a pandas DataFrame)
df_timeline = con.execute("SELECT * FROM memory_log ORDER BY timestamp DESC").fetchdf()
print("Timeline:")
print(df_timeline)

Timeline:
      id                                            summary   timestamp  \
0  m-002  Client is expecting an up-to-date forecast of ...  2025-06-22   
1  m-003  Client expressed concern about rising costs in...  2025-06-22   
2  m-001              Client asked about margin drop in Q2.  2025-06-19   

   version  
0        1  
1        1  
2        1  


### **Graph Memory**
Here we create and query relationships between memories in Neo4j.
This demonstrates the system's ability to track how memories relate and influence each other.

In [22]:
#Create Additional Relationships
async def graph_demo():
    await relate_memories(memory1["id"], memory2["id"], relationship="RELATED_TO")
    await relate_memories(memory3["id"], memory1["id"], relationship="FOLLOWS")
    await relate_memories(memory3["id"], memory2["id"], relationship="CAUSES")
    # Find related memories (2 hops from memory1)
    related = await find_related_memories(memory1["id"], max_hops=2)
    print("\nRelated memories to m-001 (within 2 hops):")
    for rid, summary in related:
        print(f"{rid}: {summary}")
await graph_demo()



Related memories to m-001 (within 2 hops):
m-003: Client expressed concern about rising costs in Q2.
m-002: Client is expecting an up-to-date forecast of Q3 to plan ahead for pitfalls.
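
The query above returns m-003 even though its `FOLLOWS` edge points toward m-001, which suggests the traversal ignores edge direction. The sketch below reproduces that behavior with a breadth-first search over a plain edge list. It is an illustration only: `related_within` is a hypothetical helper, and the undirected traversal is an assumption inferred from the output, not a description of how `find_related_memories` is implemented.

```python
from collections import deque

def related_within(start: str, edges: list[tuple[str, str, str]], max_hops: int) -> set[str]:
    """Return IDs reachable from `start` in at most `max_hops` steps, ignoring edge direction."""
    neighbours: dict[str, set[str]] = {}
    for source, _relationship, target in edges:
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)

    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return seen - {start}

# The three relationships created in the cell above
edges = [
    ("m-001", "RELATED_TO", "m-002"),
    ("m-003", "FOLLOWS", "m-001"),
    ("m-003", "CAUSES", "m-002"),
]
print(sorted(related_within("m-001", edges, max_hops=2)))  # ['m-002', 'm-003']
```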


In [None]:
# Show all Memory and Entity nodes in Neo4j
async def show_graph_nodes():
    async with driver.session() as session:
        memories = await session.run("MATCH (m:Memory) RETURN m.id AS id, m.summary AS summary")
        entities = await session.run("MATCH (e:Entity) RETURN e.id AS id, e.name AS name, e.type AS type")
        print("Memories:")
        for record in await memories.data():
            print(record)
        print("\nEntities:")
        for record in await entities.data():
            print(record)
await show_graph_nodes()


### **Query Memories (Semantic Search with Recency/Confidence)**
This section demonstrates semantic search with recency and confidence re-ranking.
The results show the most relevant memories for a given query.
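
The scoring formula used by `query_memory` is not shown here. One common way to combine the three signals is a weighted sum of vector similarity, stored confidence, and a recency term that decays exponentially with age. The sketch below illustrates that idea under stated assumptions: `rerank_score`, the weights, and the 30-day half-life are all hypothetical choices, not MemoSynth's actual values.

```python
from datetime import datetime, timezone

def rerank_score(similarity: float, confidence: float, last_accessed: datetime,
                 now: datetime, half_life_days: float = 30.0,
                 weights: tuple[float, float, float] = (0.6, 0.2, 0.2)) -> float:
    """Blend similarity, confidence, and an exponentially decaying recency term."""
    age_days = max((now - last_accessed).total_seconds() / 86400.0, 0.0)
    recency = 0.5 ** (age_days / half_life_days)  # 1.0 when fresh, 0.5 after one half-life
    w_sim, w_conf, w_rec = weights
    return w_sim * similarity + w_conf * confidence + w_rec * recency

now = datetime(2025, 6, 27, tzinfo=timezone.utc)
# Two illustrative candidates that differ only in how recently they were accessed
candidates = [
    {"id": "stale", "similarity": 0.8, "confidence": 0.9,
     "last_accessed": datetime(2025, 5, 28, tzinfo=timezone.utc)},
    {"id": "fresh", "similarity": 0.8, "confidence": 0.9,
     "last_accessed": datetime(2025, 6, 27, tzinfo=timezone.utc)},
]
ranked = sorted(
    candidates,
    key=lambda c: rerank_score(c["similarity"], c["confidence"], c["last_accessed"], now),
    reverse=True,
)
print([c["id"] for c in ranked])  # ['fresh', 'stale']
```

With equal similarity and confidence, the more recently accessed record ranks first. Raising the similarity weight makes the ranking behave more like a plain vector search.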

In [25]:
#Query for Q2 issues
async def query_demo():
    results = await query_memory("What are Q2 risks?")
    print("\nQuery Results for 'What are Q2 risks?':")
    for mem in results:
        print(f"{mem['summary']} (confidence: {mem['confidence']}, last_accessed: {mem.get('last_accessed')})")
await query_demo()



Query Results for 'What are Q2 risks?':
Client expressed concern about rising costs in Q2. (confidence: 0.92, last_accessed: 2025-06-27T03:29:29.131540+00:00)
Client is expecting an up-to-date forecast of Q3 to plan ahead for pitfalls. (confidence: 0.8, last_accessed: 2025-06-27T03:29:26.649279+00:00)
Client asked about margin drop in Q2. (confidence: 0.9, last_accessed: 2025-06-27T03:29:24.350402+00:00)


### **Summarize Memories (Using LLM)**

We use an LLM to summarize all memories in a single, concise paragraph.

LLM output is automatically cleaned and repaired using the json-repair library to handle malformed JSON.
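
To illustrate why a repair step is useful, the sketch below handles one frequent failure mode with the standard library only: a valid JSON object wrapped in extra prose. It is not the json-repair algorithm and covers far fewer cases; `parse_llm_json` is a hypothetical helper, and the sample string is made up for the example.

```python
import json

def parse_llm_json(text: str, default=None):
    """Parse JSON from LLM output, tolerating prose before and after the object."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span, if there is one
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(text[start:end + 1])
        except json.JSONDecodeError:
            pass
    return default

raw = 'Sure! Here is the result:\n{"nodes": [{"name": "Microsoft"}], "edges": []}\nHope that helps.'
print(parse_llm_json(raw, default={"nodes": [], "edges": []}))
```

Returning a caller-supplied default keeps downstream code working when nothing parseable is found, at the cost of hiding the failure unless it is logged separately.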

In [28]:
#Summarize all memories
async def summarize_demo():
    all_memories = [memory1, memory2, memory3]
    summary = await summarize_memories(all_memories)
    print("\nSummary of all memories:")
    print(summary)
await summarize_demo()


Summary of all memories:
According to the given text, the client mentioned that they expect a potential downfall in Q3 due to rising costs in Q2. They also expressed concerns related to margin drop in Q2 and were requesting an up-to-date forecast of Q3 to plan ahead for potential pitfalls.


### **Compare and Resolve**
This section compares two memories for differences and uses the LLM to resolve any contradictions.
This is critical for reconciling conflicting information in long-term memory systems.
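
The diff step reports a cosine similarity between the two summaries. For reference, the sketch below computes cosine similarity for two plain vectors. The embedding model that turns summaries into vectors is not shown in this notebook, so the vectors here are toy values and `cosine_similarity` is a stand-alone illustration.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction, 0.0 means orthogonal."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for zero vectors, which have no direction
    return dot / (norm_a * norm_b)

print(round(cosine_similarity([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]), 3))  # 0.714
```

A low score, such as the 0.365 reported for memory1 and memory2, indicates that the summaries are about different things. It does not by itself indicate a contradiction.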

In [31]:
async def compare_and_resolve_demo():
    print("\nDiff between memory1 and memory2:")
    print(await diff(memory1, memory2))                # No model parameter needed
    print("\nResolution between memory1 and memory2:")
    print(await resolve(memory1, memory2))
await compare_and_resolve_demo()


Diff between memory1 and memory2:
⚠️ Difference in summaries (cosine similarity: 0.365):
1: Client asked about margin drop in Q2.
2: Client is expecting an up-to-date forecast of Q3 to plan ahead for pitfalls.

Resolution between memory1 and memory2:
After researching and analyzing multiple sources, I have come to a reconcile conclusion that while there has been a slight margin decline in Q2 (i.e., the second quarter), it is expected to stabilize by Q3 due to various factors, such as an anticipated uptick in sales volumes and improved market conditions. Therefore, the client can plan ahead accordingly for any potential pitfalls that may arise during this period. The reconcile conclusion is based on multiple sources' perspectives and data analysis, and it has been reached after thorough research and evaluation of conflicting insights from various angles.


### **Testing Core Features**
Here we test and validate each core feature:
1. Qdrant collection status
2. Timeline consistency and duplicate detection
3. Neo4j graph structure and relationships
4. Entity extraction quality
5. Update and conflict handling
6. Semantic search effectiveness
7. Cross-store data consistency

In [34]:
import httpx
response = httpx.get("http://localhost:6333/collections")
print(response.text)


{"result":{"collections":[{"name":"memos"}]},"status":"ok","time":0.000165833}


In [36]:
from qdrant_client import AsyncQdrantClient

client = AsyncQdrantClient("http://localhost:6333")
collections = await client.get_collections()
print(collections)


collections=[CollectionDescription(name='memos')]


In [38]:
# Test 1a: Qdrant Collection Status
print("=== Test 1a: Qdrant Collection Status ===")
from memosynth.vector_store import client  # Reuse the shared async client, not the ad-hoc one created above
async def test_qdrant_properly():
    collections = await client.get_collections()
    print(f"Available collections: {collections}")
    
    result = await client.scroll(
        collection_name="memos",
        limit=10,
        with_payload=True,
        with_vectors=False
    )
    
    # scroll() returns (points, next_page_offset); the points are in result[0]
    if result and result[0]:
        print(f"✅ Found {len(result[0])} points:")
        for pt in result[0]:
            print(f"  Point ID: {pt.id[:8]}... | Custom ID: {pt.payload['id']}")
    else:
        print("❌ No points returned by scroll")

await test_qdrant_properly()


=== Test 1a: Qdrant Collection Status ===
Available collections: collections=[CollectionDescription(name='memos')]
✅ Found 3 points:
  Point ID: 053a97d4... | Custom ID: m-001
  Point ID: 3fe8a291... | Custom ID: m-002
  Point ID: 6b05b2dd... | Custom ID: m-003


In [40]:
# Test 2: Timeline Consistency
print("\n=== Test 2: Timeline Integrity ===")
import duckdb
con = duckdb.connect("memory_timeline.db")

# Check all timeline entries
timeline_df = con.execute("SELECT * FROM memory_log ORDER BY timestamp DESC").fetchdf()
print(f"Timeline entries: {len(timeline_df)}")
print(timeline_df)

# Check for duplicates
duplicates = con.execute("""
    SELECT id, COUNT(*) as count 
    FROM memory_log 
    GROUP BY id 
    HAVING count > 1
""").fetchdf()

if duplicates.empty:
    print("✅ No duplicate entries in timeline")
else:
    print(f"⚠️ Found duplicates: {duplicates}")



=== Test 2: Timeline Integrity ===
Timeline entries: 3
      id                                            summary   timestamp  \
0  m-002  Client is expecting an up-to-date forecast of ...  2025-06-22   
1  m-003  Client expressed concern about rising costs in...  2025-06-22   
2  m-001              Client asked about margin drop in Q2.  2025-06-19   

   version  
0        1  
1        1  
2        1  
✅ No duplicate entries in timeline


In [42]:
# Test 3: Neo4j Graph Structure
print("\n=== Test 3: Graph Relationships ===")
from memosynth.graph_store import driver

async def test_graph_structure():
    async with driver.session() as session:
        # Count all nodes
        result = await session.run("MATCH (n) RETURN labels(n) as label, count(n) as count")
        records = await result.data()
        print("Node counts by label:")
        for record in records:
            print(f"  {record['label']}: {record['count']}")
        
        # Show all relationships
        result = await session.run("MATCH (a)-[r]->(b) RETURN a.id, type(r), b.id LIMIT 10")
        records = await result.data()
        print("\nRelationships:")
        for record in records:
            print(f"  {record['a.id']} --{record['type(r)']}-> {record['b.id']}")

await test_graph_structure()



=== Test 3: Graph Relationships ===
Node counts by label:
  ['Memory']: 3
  ['Entity']: 5

Relationships:
  m-001 --MENTIONS-> organization1
  entity1 --RELATIONSHIP_TYPE-> entity2
  m-002 --MENTIONS-> entity1
  m-002 --MENTIONS-> entity2
  m-003 --MENTIONS-> organization1
  m-003 --MENTIONS-> organization1
  m-003 --MENTIONS-> organization2
  m-001 --RELATED_TO-> m-002
  m-003 --FOLLOWS-> m-001
  m-003 --CAUSES-> m-002


In [44]:
# Test 4: Entity Extraction Testing
print("\n=== Test 4: Entity Extraction Quality ===")
from memosynth.graph_store import extract_entities_and_relationships

test_summaries = [
    "Microsoft announced partnership with OpenAI for Q3 2024",
    "The CEO mentioned budget concerns during the board meeting",
    "Client expressed satisfaction with the new product launch"
]

for i, summary in enumerate(test_summaries):
    print(f"\nTest {i+1}: {summary}")
    result = await extract_entities_and_relationships(summary)
    print(f"  Entities: {len(result['nodes'])}")
    for node in result['nodes']:
        print(f"    - {node.get('name', 'N/A')} ({node.get('type', 'unknown')})")
    print(f"  Relationships: {len(result['edges'])}")
    for edge in result['edges']:
        print(f"    - {edge.get('source', 'N/A')} --{edge.get('type', 'RELATED')}-> {edge.get('target', 'N/A')}")



=== Test 4: Entity Extraction Quality ===

Test 1: Microsoft announced partnership with OpenAI for Q3 2024
  Entities: 1
    - Microsoft (organization)
  Relationships: 1
    - entity1 --RELATIONSHIP_TYPE-> entity2

Test 2: The CEO mentioned budget concerns during the board meeting
  Entities: 1
    - CEO Mentioned Budget Concerns During Board Meeting (organization)
  Relationships: 1
    - entity1 --RELATIONSHIP_TYPE-> entity2

Test 3: Client expressed satisfaction with the new product launch
  Entities: 1
    - Client expressed satisfaction with the new product launch (organization)
  Relationships: 1
    - entity1 --RELATIONSHIP_TYPE-> entity2


In [46]:
# Test 5: Update and Conflict Handling
print("\n=== Test 5: Memory Update & Conflict Testing ===")
from memosynth.memory_client import update_memory

# Test 5a: Update existing memory (should increment version)
memory1_update = memory1.copy()
memory1_update["summary"] = "UPDATED: Client asked about margin drop in Q2 and requested action plan"
memory1_update["version"] = 1  # Same version, should increment

print("Original memory version:", memory1["version"])
await update_memory(memory1_update)

# Verify update
updated = await get_memory_by_id("m-001")
print("Updated memory version:", updated["version"] if updated else "None")

# Test 5b: Create conflict (old version)
memory1_conflict = memory1.copy()
memory1_conflict["summary"] = "CONFLICT: Different interpretation of Q2 margin issue"
memory1_conflict["version"] = 1  # Deliberately old version

await update_memory(memory1_conflict)

# Check conflict log
conflict_df = con.execute("SELECT * FROM conflict_log ORDER BY timestamp DESC LIMIT 5").fetchdf()
print("Recent conflicts:")
print(conflict_df)



=== Test 5: Memory Update & Conflict Testing ===
Original memory version: 1
📝 Writing memory: m-001
✅ Memory m-001 successfully written to Qdrant
Memory updated successfully.
Updated memory version: 2
Version conflict detected!
Current version: 2, Your version: 1
Recent conflicts:
                          timestamp conflict_type new_memory_id  \
0  2025-06-27T03:30:51.808397+00:00       version         m-001   
1  2025-06-27T01:14:07.049066+00:00       version         m-001   
2  2025-06-27T01:14:07.030776+00:00       version         m-001   
3  2025-06-27T01:13:48.023943+00:00       version         m-001   
4  2025-06-27T01:13:47.961855+00:00       version         m-001   

  current_memory_id                                        new_summary  \
0             m-001  CONFLICT: Different interpretation of Q2 margi...   
1             m-001  CONFLICT: Different interpretation of Q2 margi...   
2             m-001  UPDATED: Client asked about margin drop in Q2 ...   
3             m-00
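

The behavior seen in Test 5 (the first update moves version 1 to 2, the second is rejected as stale) matches a standard optimistic version check. The sketch below shows that pattern on plain dictionaries. It is an assumption about the approach, not the actual `update_memory` code, and `apply_update` and the log field names are hypothetical.

```python
def apply_update(current: dict, incoming: dict, conflict_log: list[dict]) -> dict:
    """Accept the update only if it was based on the stored version; otherwise log a conflict."""
    if incoming["version"] != current["version"]:
        conflict_log.append({
            "conflict_type": "version",
            "memory_id": current["id"],
            "current_version": current["version"],
            "incoming_version": incoming["version"],
        })
        return current  # the stored record is left unchanged
    return {**incoming, "version": current["version"] + 1}

log: list[dict] = []
stored = {"id": "m-001", "summary": "original", "version": 1}

# First writer read version 1 and updates it: accepted, version becomes 2
stored = apply_update(stored, {"id": "m-001", "summary": "updated", "version": 1}, log)
print(stored["version"], len(log))  # 2 0

# Second writer also read version 1, which is now stale: rejected and logged
stored = apply_update(stored, {"id": "m-001", "summary": "stale", "version": 1}, log)
print(stored["version"], len(log))  # 2 1
```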

In [48]:
# Test 6: Semantic Search Effectiveness
print("\n=== Test 6: Semantic Search Quality ===")
from memosynth.memory_client import query_memory

test_queries = [
    "financial concerns Q2",
    "future planning Q3",
    "client satisfaction issues",
    "budget problems cost overruns"
]

for query in test_queries:
    print(f"\nQuery: '{query}'")
    results = await query_memory(query, top_k=2)
    for i, mem in enumerate(results):
        print(f"  {i+1}. {mem['id']}: {mem['summary'][:60]}...")
        print(f"     Confidence: {mem.get('confidence', 'N/A')}")



=== Test 6: Semantic Search Quality ===

Query: 'financial concerns Q2'
  1. m-003: Client expressed concern about rising costs in Q2....
     Confidence: 0.92
  2. m-001: UPDATED: Client asked about margin drop in Q2 and requested ...
     Confidence: 0.9

Query: 'future planning Q3'
  1. m-002: Client is expecting an up-to-date forecast of Q3 to plan ahe...
     Confidence: 0.8
  2. m-003: Client expressed concern about rising costs in Q2....
     Confidence: 0.92

Query: 'client satisfaction issues'
  1. m-003: Client expressed concern about rising costs in Q2....
     Confidence: 0.92
  2. m-001: UPDATED: Client asked about margin drop in Q2 and requested ...
     Confidence: 0.9

Query: 'budget problems cost overruns'
  1. m-003: Client expressed concern about rising costs in Q2....
     Confidence: 0.92
  2. m-002: Client is expecting an up-to-date forecast of Q3 to plan ahe...
     Confidence: 0.8


In [None]:
# Test 7: Cross-Store Data Consistency
print("\n=== Test 7: Cross-Store Consistency Check ===")

# Qdrant
qdrant_ids = set()
points_info = await client.scroll(collection_name="memos", limit=100)
if points_info and points_info[0]:
    for pt in points_info[0]:
        qdrant_ids.add(pt.payload['id'])

# DuckDB
timeline_ids = set(con.execute("SELECT DISTINCT id FROM memory_log").fetchdf()['id'].tolist())

# Neo4j
async def get_neo4j_memory_ids():
    async with driver.session() as session:
        result = await session.run("MATCH (m:Memory) RETURN m.id as id")
        records = await result.data()
        return set(record['id'] for record in records)
neo4j_ids = await get_neo4j_memory_ids()

# Compare: for each store, report the IDs that exist in another store but are missing there
all_ids = qdrant_ids | timeline_ids | neo4j_ids
if qdrant_ids == timeline_ids == neo4j_ids:
    print("✅ All stores have consistent memory IDs")
else:
    print("⚠️ Inconsistency detected:")
    print(f"  Missing from Timeline: {all_ids - timeline_ids}")
    print(f"  Missing from Neo4j: {all_ids - neo4j_ids}")
    print(f"  Missing from Qdrant: {all_ids - qdrant_ids}")


### **Performance Test**
We time the insertion of five test memories and one follow-up query.
The output logs each write, then reports the total and average insert time and the query latency.
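
The test below inserts memories one at a time. If the stores tolerate concurrent writes, issuing the writes together may reduce wall-clock time. The sketch below compares both approaches using a simulated write, so it runs without any backend. `fake_write`, `insert_sequential`, `insert_concurrent`, and `timed` are hypothetical names, and the 10 ms delay is an arbitrary stand-in for real I/O.

```python
import asyncio
import time

async def fake_write(memory: dict) -> str:
    """Simulated store write; the sleep stands in for network and disk latency."""
    await asyncio.sleep(0.01)
    return memory["id"]

async def insert_sequential(memories: list[dict]) -> list[str]:
    return [await fake_write(m) for m in memories]

async def insert_concurrent(memories: list[dict]) -> list[str]:
    # gather() preserves input order in its result list
    return list(await asyncio.gather(*(fake_write(m) for m in memories)))

async def timed(coro):
    """Await a coroutine and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = await coro
    return result, time.perf_counter() - start

async def main() -> None:
    memories = [{"id": f"perf-test-{i:03d}"} for i in range(5)]
    _, seq = await timed(insert_sequential(memories))
    _, con = await timed(insert_concurrent(memories))
    print(f"sequential: {seq:.3f}s, concurrent: {con:.3f}s")

if __name__ == "__main__":
    # In a notebook, use `await main()` instead of asyncio.run(main()).
    asyncio.run(main())
```

Whether concurrency helps in practice depends on what dominates the per-memory cost. If most of the time goes to an LLM call for entity extraction, the gain depends on how many parallel requests that model endpoint accepts.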

In [52]:
# Performance Test with Multiple Memories
print("\n=== Test 8: Performance & Scale Test ===")
import time
from memosynth.memory_client import write_and_sync_memory

# Create batch of test memories
test_memories = []
for i in range(5):
    test_mem = {
        "id": f"perf-test-{i:03d}",
        "summary": f"Performance test memory {i} with various content and details",
        "project": "performance_test",
        "agent": "test_bot",
        "type": "test",
        "version": 1,
        "confidence": 0.7 + (i * 0.05),
        "created_at": "2025-06-26"
    }
    test_memories.append(test_mem)

# Time the batch insertion (perf_counter is a monotonic clock intended for measuring intervals)
start_time = time.perf_counter()
for mem in test_memories:
    await write_and_sync_memory(mem)
end_time = time.perf_counter()

print(f"Inserted {len(test_memories)} memories in {end_time - start_time:.2f} seconds")
print(f"Average: {(end_time - start_time) / len(test_memories):.2f} seconds per memory")

# Test query performance
start_time = time.perf_counter()
results = await query_memory("performance test content", top_k=3)
end_time = time.perf_counter()
print(f"Query completed in {end_time - start_time:.3f} seconds")
print(f"Found {len(results)} results")



=== Test 8: Performance & Scale Test ===
📝 Writing memory: perf-test-000
✅ Memory perf-test-000 successfully written to Qdrant
Memory perf-test-000 written and synced across all stores.
📝 Writing memory: perf-test-001
✅ Memory perf-test-001 successfully written to Qdrant
Memory perf-test-001 written and synced across all stores.
📝 Writing memory: perf-test-002
✅ Memory perf-test-002 successfully written to Qdrant
Memory perf-test-002 written and synced across all stores.
📝 Writing memory: perf-test-003
✅ Memory perf-test-003 successfully written to Qdrant
Memory perf-test-003 written and synced across all stores.
📝 Writing memory: perf-test-004
✅ Memory perf-test-004 successfully written to Qdrant
Memory perf-test-004 written and synced across all stores.
Inserted 5 memories in 8.73 seconds
Average: 1.75 seconds per memory
Query completed in 0.063 seconds
Found 3 results


### **List All Memories in Qdrant**
This cell lists all memories currently stored in Qdrant, for manual inspection and debugging.

In [54]:
# List all memories in Qdrant
points = await client.scroll(collection_name="memos", limit=20)
for pt in points[0]:
    print(pt.payload)

{'id': 'm-001', 'project': 'demo_project', 'agent': 'doc_bot', 'summary': 'UPDATED: Client asked about margin drop in Q2 and requested action plan', 'type': 'insight', 'tags': ['finance', 'Q2', 'risk'], 'source': 'Earnings_Report_Q2.pdf', 'author': 'doc_bot', 'created_at': '2025-06-19', 'version': 2, 'confidence': 0.9, 'visibility': 'project', 'sensitivity': 'medium', 'last_accessed': '2025-06-27T03:30:49.998383+00:00', 'qdrant_id': '053a97d4-b7ca-4c6f-9408-39f27db7ce3d'}
{'id': 'perf-test-002', 'summary': 'Performance test memory 2 with various content and details', 'project': 'performance_test', 'agent': 'test_bot', 'type': 'test', 'version': 1, 'confidence': 0.7999999999999999, 'created_at': '2025-06-26', 'qdrant_id': '2e0d8d96-88c4-435a-bc5b-2e6a02b3a35b', 'last_accessed': '2025-06-27T03:31:42.931769+00:00'}
{'id': 'm-002', 'project': 'demo_project', 'agent': 'doc_bot', 'summary': 'Client is expecting an up-to-date forecast of Q3 to plan ahead for pitfalls.', 'type': 'insight', 'ta