# Memory Systems in LangChain - Interactive Notebook

This notebook implements hands-on examples for:
- Conversation buffer memory
- Summary memory
- Vector memory with Chroma
- Lightweight knowledge graph memory (triple store)
- Persistence patterns (JSON + vector store persistence)
- Capstone: Memory-enabled assistant

Note: Cells that call an LLM or compute embeddings use OpenAI. Set `OPENAI_API_KEY` in your `.env` file before running them. Cells that cannot reach OpenAI print a message and skip the call.

## 1) Setup and Imports

In [None]:
import os
import re
import json
import time
import hashlib
from datetime import datetime
from typing import List, Dict, Any, Optional, Tuple

import numpy as np
from dotenv import load_dotenv
load_dotenv()

from langchain_core.prompts import PromptTemplate
from langchain_core.documents import Document
# LLMChain and ConversationBufferMemory are provided by the `langchain` package, not `langchain_core`.
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

print("Environment and imports set up.")

## 2) Conversation Buffer Memory
`ConversationBufferMemory` keeps the chat history verbatim and injects it into the prompt through the `chat_history` variable.

In [None]:
try:
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.4)
except Exception as e:
    llm = None
    print("ChatOpenAI unavailable (check OPENAI_API_KEY):", e)

prompt_text = """You are a helpful assistant.
{chat_history}
Human: {input}
Assistant:"""

prompt = PromptTemplate(
    template=prompt_text,
    input_variables=["input", "chat_history"]
)

memory = ConversationBufferMemory(memory_key="chat_history", input_key="input")

if llm:
    chat_chain = LLMChain(llm=llm, prompt=prompt, memory=memory, verbose=True)
    try:
        print(chat_chain.run(input="Hi, I'm learning memory systems."))
        print(chat_chain.run(input="What did I say earlier?"))
    except Exception as e:
        print("Chain run failed:", e)
else:
    print("Skipping LLM calls; no API key configured.")

Exercise A:
- Modify `prompt_text` to include a system rule like "Always answer concisely".
- Observe differences across several turns.
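
To make the mechanics concrete, here is a minimal stdlib-only sketch of what a verbatim buffer does before the LLM is called. `SimpleBuffer` and `render_prompt` are hypothetical names used for illustration, not LangChain APIs. The template mirrors `prompt_text` with the concise-answer rule from Exercise A added.

```python
from typing import List, Tuple

# Hypothetical stand-in for a verbatim buffer; not part of LangChain.
class SimpleBuffer:
    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []  # (speaker, text) pairs in arrival order

    def save_turn(self, human: str, assistant: str) -> None:
        self.turns.append(("Human", human))
        self.turns.append(("Assistant", assistant))

    def as_text(self) -> str:
        # Every stored turn is replayed unchanged, so the prompt grows with each exchange.
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)

TEMPLATE = """You are a helpful assistant. Always answer concisely.
{chat_history}
Human: {input}
Assistant:"""

def render_prompt(buffer: SimpleBuffer, user_input: str) -> str:
    return TEMPLATE.format(chat_history=buffer.as_text(), input=user_input)

buf = SimpleBuffer()
buf.save_turn("Hi, I'm learning memory systems.", "Great, ask me anything.")
rendered = render_prompt(buf, "What did I say earlier?")
print(rendered)
```

Printing the rendered prompt after each turn shows how quickly a verbatim buffer grows, which is the motivation for the summary strategy in the next section.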

## 3) Summary Memory Strategy
Summarize previous turns to compress context and stay within token limits.

In [None]:
try:
    summarizer_llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
except Exception as e:
    summarizer_llm = None
    print("Summarizer model unavailable:", e)

summarizer_prompt = PromptTemplate(
    template="""Summarize the following conversation in 2-3 sentences.
{conversation}
Summary:""",
    input_variables=["conversation"]
)

conversation_log: List[str] = []

def append_and_summarize(user_text: str, ai_text: Optional[str] = None) -> str:
    conversation_log.append(f"Human: {user_text}")
    if ai_text is not None:
        conversation_log.append(f"Assistant: {ai_text}")
    joined = "\n".join(conversation_log[-20:])
    if summarizer_llm:
        chain = LLMChain(llm=summarizer_llm, prompt=summarizer_prompt)
        try:
            return chain.run(conversation=joined)
        except Exception as e:
            return f"Summarization failed: {e}"
    return "Summarizer not available (no API key)."

# Demo: the assistant replies are mocked, but each call still invokes the summarizer LLM (and uses tokens) when one is configured.
s1 = append_and_summarize("Plan a 3-day trip to Goa.", "Day-by-day itinerary...")
print("Summary 1:\n", s1)
s2 = append_and_summarize("Make it budget-friendly.", "Updated itinerary with budget tips...")
print("\nSummary 2:\n", s2)

Exercise B:
- Constrain summary length to ~100 tokens.
- Compare recall quality vs. brevity across 10+ turns.
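
One possible way to approach Exercise B is a rolling summary: keep the last few turns verbatim and fold everything older into a length-capped summary. The sketch below is an assumption-laden illustration, not the notebook's method. `compact_history`, `approx_tokens`, and `stub_summarizer` are hypothetical helpers, and tokens are approximated as whitespace-separated words. In practice `summarize_fn` would wrap the summarizer chain.

```python
from typing import Callable, List, Tuple

def approx_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())

def compact_history(
    turns: List[str],
    summary: str,
    summarize_fn: Callable[[str], str],
    keep_last: int = 4,
    max_summary_tokens: int = 100,
) -> Tuple[List[str], str]:
    """Fold all but the last `keep_last` turns into the running summary."""
    if len(turns) <= keep_last:
        return turns, summary
    old, recent = turns[:-keep_last], turns[-keep_last:]
    material = (summary + "\n" if summary else "") + "\n".join(old)
    new_summary = summarize_fn(material)
    # Enforce the budget even if the summarizer overshoots.
    words = new_summary.split()
    if len(words) > max_summary_tokens:
        new_summary = " ".join(words[:max_summary_tokens])
    return recent, new_summary

def stub_summarizer(text: str) -> str:
    # Offline stand-in for an LLM: keeps the first 12 words of each non-empty line.
    return " / ".join(" ".join(line.split()[:12]) for line in text.splitlines() if line.strip())

turns = [f"Human: message {i}" for i in range(10)]
recent, summary = compact_history(turns, "", stub_summarizer, keep_last=4, max_summary_tokens=100)
print(len(recent), approx_tokens(summary))
```

The hard truncation is a safety net only. A summarizer prompt that states the length limit will usually produce better summaries than cutting words off the end.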

## 4) Vector Memory with Chroma
Store semantic memories (preferences, facts) as embeddings and retrieve when needed.

In [None]:
persist_dir = "./chroma_memory"
os.makedirs(persist_dir, exist_ok=True)

try:
    embeddings = OpenAIEmbeddings()
    vs = Chroma(collection_name="memories", embedding_function=embeddings, persist_directory=persist_dir)
    print("Chroma vector store ready.")
except Exception as e:
    embeddings = None
    vs = None
    print("Chroma/Embeddings unavailable:", e)

def remember(text: str, metadata: Optional[Dict[str, Any]] = None):
    if vs is None:
        print("Vector store not available; skipping.")
        return
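    doc = Document(page_content=text, metadata=metadata or {})
    vs.add_documents([doc])
    # Chroma 0.4+ persists automatically and deprecates persist(); call it only where it still exists.
    if hasattr(vs, "persist"):
        vs.persist()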

def recall(query: str, k: int = 3) -> List[Document]:
    if vs is None:
        print("Vector store not available; returning empty list.")
        return []
    retriever = vs.as_retriever(search_kwargs={"k": k})
    # invoke() is the current retriever entry point; get_relevant_documents() is deprecated.
    return retriever.invoke(query)

# Demo
remember("User prefers concise answers", {"type": "preference"})
remember("Project is about LangChain memory", {"type": "context"})

docs = recall("What does the user prefer?")
for d in docs:
    print("Retrieved:", d.metadata, "|", d.page_content)

Exercise C:
- Insert 5+ conversation facts and build a function `build_context(query)` that retrieves and concatenates top-k facts for prompts.
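
The retrieve-then-concatenate shape of `build_context` can be tried offline before wiring it to Chroma. The sketch below assumes a toy in-memory store with bag-of-words cosine similarity in place of real embeddings. `ToyMemory`, `bow`, and `cosine` are hypothetical helpers, and token overlap is only a rough proxy for semantic similarity.

```python
import math
import re
from collections import Counter
from typing import List, Tuple

def bow(text: str) -> Counter:
    # Bag-of-words stand-in for an embedding: it matches tokens, not meaning.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class ToyMemory:
    def __init__(self) -> None:
        self.items: List[Tuple[str, Counter]] = []

    def remember(self, text: str) -> None:
        self.items.append((text, bow(text)))

    def recall(self, query: str, k: int = 3) -> List[str]:
        q = bow(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

def build_context(memory: ToyMemory, query: str, k: int = 3) -> str:
    # Render the top-k facts as a bullet list that can be dropped into a prompt.
    return "\n".join(f"- {fact}" for fact in memory.recall(query, k=k))

mem = ToyMemory()
for fact in [
    "User prefers concise answers",
    "Project is about LangChain memory",
    "User codes in Python",
    "Deadline for the project is Friday",
    "User is based in Pune",
]:
    mem.remember(fact)

context = build_context(mem, "Does the user prefer concise answers?", k=2)
print(context)
```

Swapping `ToyMemory.recall` for the notebook's `recall(query, k)` and reading `page_content` from each returned `Document` should give the Chroma-backed version.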

## 5) Knowledge Graph Memory (Triple Store)
Lightweight structured memory to store (subject, predicate, object) triples for precise recall.

In [None]:
from collections import defaultdict

class TripleStore:
    def __init__(self):
        self.forward = defaultdict(set)  # (s, p) -> {o}
        self.reverse = defaultdict(set)  # (o, p) -> {s}
    def add(self, s: str, p: str, o: str):
        self.forward[(s, p)].add(o)
        self.reverse[(o, p)].add(s)
    def objects(self, s: str, p: str) -> List[str]:
        return list(self.forward.get((s, p), []))
    def subjects(self, o: str, p: str) -> List[str]:
        return list(self.reverse.get((o, p), []))

kg = TripleStore()
kg.add("Alice", "likes", "RAG")
kg.add("Alice", "role", "Engineer")
kg.add("ProjectX", "uses", "LangChain")

print("Alice likes:", kg.objects("Alice", "likes"))
print("Who uses LangChain:", kg.subjects("LangChain", "uses"))

def extract_triples_stub(text: str) -> List[Tuple[str, str, str]]:
    """A simple stub that finds 'X likes Y' patterns; replace with LLM extraction for robustness."""
    triples: List[Tuple[str, str, str]] = []
    for m in re.finditer(r"(\b[A-Z][a-zA-Z0-9_]+)\s+likes\s+(\b[A-Z][a-zA-Z0-9_]+)", text):
        s, o = m.group(1), m.group(2)
        triples.append((s, "likes", o))
    return triples

found = extract_triples_stub("Bob likes LangChain. Carol likes RAG.")
for s, p, o in found:
    kg.add(s, p, o)
print("KG after extraction - Bob likes:", kg.objects("Bob", "likes"))
print("KG after extraction - Carol likes:", kg.objects("Carol", "likes"))

Exercise D:
- Implement `extract_triples(text)` using an LLM prompt, then parse its structured output (e.g., a fenced JSON block) to populate the `TripleStore`.
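
The parsing half of Exercise D can be developed without an LLM. The following sketch assumes the model is asked to reply with a JSON list of objects that have `subject`, `predicate`, and `object` keys. That schema and the `parse_triples` helper are assumptions for illustration, not something the notebook prescribes.

```python
import json
import re
from typing import List, Tuple

FENCE = "`" * 3  # built programmatically so this example can sit inside a fenced block

def parse_triples(llm_output: str) -> List[Tuple[str, str, str]]:
    """Extract (subject, predicate, object) triples from a fenced JSON block or bare JSON."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, llm_output, flags=re.DOTALL)
    payload = match.group(1) if match else llm_output
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return []  # malformed output: store nothing rather than guess
    if not isinstance(data, list):
        return []
    triples: List[Tuple[str, str, str]] = []
    for item in data:
        # Skip entries that are missing a field or carry non-string values.
        if isinstance(item, dict) and all(
            isinstance(item.get(key), str) for key in ("subject", "predicate", "object")
        ):
            triples.append(
                (item["subject"].strip(), item["predicate"].strip(), item["object"].strip())
            )
    return triples

# Mocked model reply; a real reply would come from the LLM call.
sample_reply = (
    "Here are the triples:\n"
    + FENCE + "json\n"
    + '[{"subject": "Bob", "predicate": "likes", "object": "LangChain"},\n'
    + ' {"subject": "Carol", "predicate": "works_on", "object": "ProjectX"},\n'
    + ' {"subject": "Dave"}]\n'
    + FENCE
)
triples = parse_triples(sample_reply)
print(triples)
```

Each returned tuple can be passed straight to `kg.add(s, p, o)`. Returning an empty list on malformed output keeps a bad generation from polluting the store.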

## 6) Persistence Patterns
Combine multiple memory forms and save to disk (JSON for buffer/summary; Chroma for vector memory).

In [None]:
state = {
    "buffer": [],
    "summary": ""
}

def save_state(path: str = "memory_state.json"):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(state, f, ensure_ascii=False, indent=2)

def load_state(path: str = "memory_state.json") -> Dict[str, Any]:
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {"buffer": [], "summary": ""}

# Persist an example state
state["buffer"].append({"role": "user", "content": "Hello again"})
state["summary"] = "User greeted; topic pending."
save_state()
print("Saved state:", load_state())

# Note: Chroma vector store persistence already handled via 'persist_directory' and vs.persist() calls.

Exercise E:
- Add simple encryption-at-rest for JSON memory using a symmetric key (e.g., Fernet).
- Discuss trade-offs in key management and performance overhead.
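
Separately from encryption, a JSON memory file can be corrupted if the process dies in the middle of a write. One common mitigation, sketched here under the assumption that the temp file and the target share a filesystem, is to write to a temporary file and swap it into place with `os.replace`. `save_state_atomic` and `load_state_or_default` are hypothetical variants of the notebook's `save_state` and `load_state`.

```python
import json
import os
import tempfile
from typing import Any, Dict

def save_state_atomic(state: Dict[str, Any], path: str) -> None:
    """Write JSON to a temp file in the target directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f, ensure_ascii=False, indent=2)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the rename
        # The rename is atomic on POSIX when both paths are on the same filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

def load_state_or_default(path: str) -> Dict[str, Any]:
    # Fall back to an empty state if the file is missing or unreadable as JSON.
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {"buffer": [], "summary": ""}

demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "memory_state.json")
save_state_atomic(
    {"buffer": [{"role": "user", "content": "Hello again"}], "summary": "User greeted."},
    demo_path,
)
restored = load_state_or_default(demo_path)
print(restored)
```

Readers then see either the old file or the new one, never a half-written mix. An encryption layer for Exercise E could wrap the serialized bytes before they are written.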

## 7) Capstone: Memory-Enabled Assistant
Combine buffer, summary, KG, and vector memories into a single assistant wrapper.

In [None]:
class MemoryAssistant:
    def __init__(self):
        try:
            self.llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.3)
        except Exception as e:
            self.llm = None
            print("LLM unavailable:", e)

        self.buffer = ConversationBufferMemory(memory_key="chat_history", input_key="input")
        self.summary = ""
        self.kg = TripleStore()

        try:
            self.vs = Chroma(
                collection_name="assistant_mem",
                embedding_function=OpenAIEmbeddings(),
                persist_directory="./assistant_mem"
            )
        except Exception as e:
            self.vs = None
            print("Vector store unavailable:", e)

        # Summarizer
        try:
            self.summarizer = LLMChain(
                llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
                prompt=PromptTemplate(
                    template="""Summarize:
{conversation}
Summary:""",
                    input_variables=["conversation"]
                )
            )
        except Exception as e:
            self.summarizer = None
            print("Summarizer chain unavailable:", e)

        self.chat_prompt = PromptTemplate(
            template="""System: Use prior context when helpful.
{chat_history}
Facts: {facts}
Retrieved: {retrieved}
Human: {input}
Assistant:""",
            input_variables=["chat_history", "facts", "retrieved", "input"]
        )

        if self.llm:
            self.chat_chain = LLMChain(llm=self.llm, prompt=self.chat_prompt, memory=self.buffer)
        else:
            self.chat_chain = None

    def add_fact(self, s: str, p: str, o: str):
        self.kg.add(s, p, o)
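        if self.vs:
            self.vs.add_texts([f"{s} {p} {o}"])
            # Chroma 0.4+ persists automatically; persist() is called only where it still exists.
            if hasattr(self.vs, "persist"):
                self.vs.persist()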

    def summarize(self) -> str:
        if not self.summarizer:
            return "Summarizer unavailable"
        convo = self.buffer.buffer
        try:
            self.summary = self.summarizer.run(conversation=convo)
            return self.summary
        except Exception as e:
            return f"Summarize failed: {e}"

    def ask(self, text: str) -> str:
        # Retrieve
        if self.vs:
            try:
                retrieved_docs = self.vs.as_retriever(search_kwargs={"k": 3}).invoke(text)
            except Exception as e:
                print("Retrieval failed:", e)
                retrieved_docs = []
        else:
            retrieved_docs = []
        retrieved = "\n".join(d.page_content for d in retrieved_docs)

        # Facts (example: list what Alice likes)
        facts = ", ".join(self.kg.objects("Alice", "likes"))

        if not self.chat_chain:
            return "Chat chain unavailable (no LLM)."
        try:
            return self.chat_chain.run(input=text, facts=facts, retrieved=retrieved)
        except Exception as e:
            return f"Ask failed: {e}"

# Demo
assistant = MemoryAssistant()
assistant.add_fact("Alice", "likes", "RAG")
resp = assistant.ask("What do we know about Alice?")
print("Assistant:", resp if len(resp) < 400 else resp[:400] + " ...")

Exercise F (Capstone):
- Drop old turns when updating the summary and clear buffer accordingly.
- Add per-user namespaces (e.g., separate Chroma collections per user).
- Implement `export_state()` that saves buffer, summary, KG triples, and vector store metadata to disk.
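
For the KG part of `export_state()`, note that the `TripleStore` indexes hold Python sets keyed by tuples, which `json.dump` cannot serialize directly. A minimal sketch of one workaround is to flatten the forward index into sorted `[s, p, o]` rows and rebuild both indexes on load. `triples_to_json` and `triples_from_json` are hypothetical helper names, and the `{"triples": [...]}` layout is an assumption.

```python
import json
from collections import defaultdict
from typing import Dict, Set, Tuple

Key = Tuple[str, str]

def triples_to_json(forward: Dict[Key, Set[str]]) -> str:
    """Flatten a (s, p) -> {o} index into a sorted list of [s, p, o] rows."""
    rows = sorted([s, p, o] for (s, p), objs in forward.items() for o in objs)
    return json.dumps({"triples": rows}, ensure_ascii=False, indent=2)

def triples_from_json(text: str) -> Tuple[Dict[Key, Set[str]], Dict[Key, Set[str]]]:
    """Rebuild the forward and reverse indexes from exported rows."""
    forward: Dict[Key, Set[str]] = defaultdict(set)
    reverse: Dict[Key, Set[str]] = defaultdict(set)
    for s, p, o in json.loads(text)["triples"]:
        forward[(s, p)].add(o)
        reverse[(o, p)].add(s)
    return forward, reverse

# Same shape as TripleStore.forward in the notebook.
forward: Dict[Key, Set[str]] = defaultdict(set)
forward[("Alice", "likes")].update({"RAG", "LangChain"})
forward[("ProjectX", "uses")].add("LangChain")

exported = triples_to_json(forward)
fwd2, rev2 = triples_from_json(exported)
print(exported)
```

Sorting the rows makes the exported file stable across runs, which keeps diffs readable if the state is committed or compared between sessions.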

## Summary
You implemented buffer, summary, vector, and knowledge graph memory, persisted them to disk, and combined them into a single assistant. Next, build on this by integrating tools and agents, and later move to LangGraph for workflow orchestration.