<a href="https://colab.research.google.com/github/crystalloide/RAG/blob/main/LAB23_M%C3%A9moires_Short_Term_et_Long_Term.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LAB23: Building Agents with Short-Term and Long-Term Memory

## Objectives:
Implement two types of memory for AI agents:
- **Short-term memory** → context window (recent messages)
- **Long-term memory** → persistent knowledge base (vector DB)

## Estimated time:
- 20–30 minutes

**Deliverables:**
- A notebook showing how an agent recalls recent conversation turns and retrieves facts stored in long-term memory.

## Step 1: Setup (5 min)

Install the required prerequisites and dependencies.

In [None]:
# Install the dependencies
!pip install -q openai langchain langchain-community chromadb python-dotenv
!pip install -q -U langchain-chroma langchain-huggingface langchain-core langchain-openai
# langchain-classic provides the legacy chains (RetrievalQA) imported in Step 4
!pip install -q langchain-classic
!pip install -q datasets transformers faiss-cpu

print("✓ Dependencies installed successfully")

### OpenAI API configuration

In Google Colab, you can set your API key in two ways:
1. **Secure method (recommended)**: use `google.colab.userdata`
2. **Alternative method**: set it directly as an environment variable
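
The two methods can be folded into a single lookup. The helper below is a minimal sketch and not part of the original lab: `resolve_api_key` and its parameters are hypothetical names, and the Colab secret reader is passed in as a plain callable so the function can also be exercised outside Colab.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_api_key(
    name: str = "OPENAI_API_KEY",
    secret_getter: Optional[Callable[[str], Optional[str]]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the first non-empty key found: secret store first, then environment.

    `secret_getter` stands in for a reader such as `userdata.get` (assumed wiring).
    """
    environ = os.environ if environ is None else environ
    if secret_getter is not None:
        try:
            value = secret_getter(name)
        except Exception:
            # Secret missing or access refused: fall through to the environment.
            value = None
        if value:
            return value
    return environ.get(name) or None


# Usage with stand-in sources (no real key involved).
fake_secrets = {"OPENAI_API_KEY": "sk-from-secrets"}
print(resolve_api_key(secret_getter=fake_secrets.get, environ={}))
print(resolve_api_key(environ={"OPENAI_API_KEY": "sk-from-env"}))
print(resolve_api_key(environ={}))
```

Trying the secret store first keeps the key out of the notebook itself, and the environment variable remains a fallback for local runs.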

In [None]:
import os
from google.colab import userdata

# Read the API key from the Colab secrets
# To add one: click the 🔑 icon in the left panel
try:
    openai_api_key = userdata.get('OPENAI_API_KEY')
    os.environ['OPENAI_API_KEY'] = openai_api_key
    print("✓ OpenAI API key loaded from Colab secrets")
except Exception:
    # Reached when the secret is missing or the notebook has no access to it
    print("⚠ Colab secrets not configured. Please add OPENAI_API_KEY.")
    print("Instructions: click 🔑 in the left panel > Add new secret")

## Step 2: Short-Term Memory (Context Window) (5 min)

Short-term memory means keeping the conversation history in the prompt.

### Key concept:
- Without context → the model "forgets"
- With history → it can recall information given earlier
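
Because the history lives in the prompt, it cannot grow forever. One common way to bound it is a sliding window. The sketch below is an illustration under assumptions: `trim_history` and `max_turns` are hypothetical names, and counting messages (rather than tokens) is a simplification.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message], max_turns: int = 4) -> List[Message]:
    """Keep every system message plus the last `max_turns` non-system messages."""
    if max_turns < 0:
        raise ValueError("max_turns must be non-negative")
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    # others[-0:] would return the whole list, so handle zero explicitly.
    kept = others[-max_turns:] if max_turns > 0 else []
    return system + kept


history = [
    {"role": "system", "content": "You are a helpful tutor."},
    {"role": "user", "content": "My name is Stephane."},
    {"role": "assistant", "content": "Nice to meet you, Stephane!"},
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "4"},
    {"role": "user", "content": "What is my name?"},
]
short = trim_history(history, max_turns=2)
print([m["content"] for m in short])
```

With `max_turns=2` the message that contained the name falls out of the window, which is exactly the kind of forgetting that long-term memory is meant to compensate for.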

In [None]:
from openai import OpenAI

client = OpenAI(api_key=os.environ.get('OPENAI_API_KEY'))

def chat_with_memory(messages):
    """
    Send the conversation to the model, history included.

    Args:
        messages: List of dictionaries with 'role' and 'content' keys

    Returns:
        str: The model's reply
    """
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        temperature=0.7
    )
    return response.choices[0].message.content

print("✓ chat_with_memory function defined")

### Experiment 1: Conversation with memory

Simulate a conversation in which the agent has to remember the user's name.

In [None]:
# Initialize a conversation
conversation = [
    {"role": "system", "content": "You are a helpful tutor. Be concise and friendly."},
    {"role": "user", "content": "My name is Stephane."}
]

# First exchange
print("🗣️ User: My name is Stephane.")
reply1 = chat_with_memory(conversation)
print(f"🤖 Assistant: {reply1}")

# Append the reply to the history
conversation.append({"role": "assistant", "content": reply1})

# Second message: short-term memory test
conversation.append({"role": "user", "content": "What is my name?"})
print("\n🗣️ User: What is my name?")
reply2 = chat_with_memory(conversation)
print(f"🤖 Assistant: {reply2}")

conversation.append({"role": "assistant", "content": reply2})

### Experiment 2: Without context (forgetting)

Now ask for the user's name **WITHOUT the history** to confirm that the model does forget.

In [None]:
# Start a new conversation WITHOUT the previous context
conversation_without_context = [
    {"role": "system", "content": "You are a helpful tutor."},
    {"role": "user", "content": "What is my name?"}
]

print("🗣️ User: What is my name?")
print("\n(Note: no previous context in this conversation)")
reply_without_context = chat_with_memory(conversation_without_context)
print(f"🤖 Assistant: {reply_without_context}")

print("\n" + "="*60)
print("ANALYSIS:")
print("✓ WITH context: the agent remembers the name (Stephane)")
print("✗ WITHOUT context: the agent cannot know the name")
print("="*60)

## Step 3: Long-Term Memory (Vector Store) (10 min)

Information persists even after a conversation session has ended.

### Key concept:
- A vector DB (Chroma) stores the document embeddings and persists the information
- During an exchange, a semantic search is run: "agent memory" → retrieves the relevant documents
- Result → the information persists across sessions
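
To make the retrieval step concrete, here is a toy similarity search. It is a sketch only: word counts stand in for a real embedding model (so only shared words match, not meaning), and `embed`, `cosine`, and `top_k` are hypothetical helper names rather than Chroma or LangChain APIs.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, docs: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Score every document against the query and return the k best."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


docs = [
    "Short-term memory stores recent conversation context in the prompt window.",
    "Long-term memory uses vector databases to store and retrieve knowledge.",
    "Apache Kafka is a distributed streaming platform.",
]
for score, text in top_k("How does agent memory work?", docs, k=2):
    print(f"{score:.2f}  {text}")
```

A vector database follows the same pattern (embed, compare, rank), but with learned dense embeddings and an index that avoids scoring every document.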

In [None]:
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
#from langchain_community.vectorstores import Chroma
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_core.documents import Document

# Initialize the OpenAI embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

print("✓ OpenAIEmbeddings initialized")

### Create a knowledge base

We will store several facts about AI agents and LangChain.

In [None]:
# The 6 documents in our knowledge base
knowledge_base = [
    Document(
        page_content="Agentic AI agents use tools and memory to accomplish complex tasks autonomously.",
        metadata={"source": "agentic_ai_basics", "type": "definition"}
    ),
    Document(
        page_content="LangChain is a framework that helps build autonomous agents with memory, tools, and chains.",
        metadata={"source": "langchain_intro", "type": "framework"}
    ),
    Document(
        page_content="RAG (Retrieval-Augmented Generation) improves accuracy by retrieving relevant knowledge before generating responses.",
        metadata={"source": "rag_concept", "type": "technique"}
    ),
    Document(
        page_content="Short-term memory in agents stores recent conversation context in the prompt window.",
        metadata={"source": "agent_memory", "type": "memory_type"}
    ),
    Document(
        page_content="Long-term memory uses vector databases to persistently store and retrieve semantic knowledge.",
        metadata={"source": "agent_memory", "type": "memory_type"}
    ),
    Document(
        page_content="CrewAI orchestrates multiple AI agents to collaborate on complex workflows.",
        metadata={"source": "crewai_framework", "type": "framework"}
    )
]

# Explicitly name the collection 'hanlab_long_term_memory'
# Note: re-running this cell adds the same documents to the persisted collection again
db = Chroma.from_documents(
    documents=knowledge_base,
    embedding=embeddings,
    collection_name="hanlab_long_term_memory",  # Collection name
    persist_directory="./chroma_db"  # Optional: persists the data to disk
)

print(f"✓ Vector store created with {len(knowledge_base)} documents")
print("✓ Collection: 'hanlab_long_term_memory'")

### Experiment 3: Searching long-term memory

Try a semantic search.

In [None]:
def recall_from_long_term_memory(query, k=2):
    """
    Retrieve the most relevant documents from long-term memory.

    Args:
        query: Question or search text
        k: Number of documents to return

    Returns:
        list: Relevant documents
    """
    results = db.similarity_search(query, k=k)
    return results

# Search test
query = "What is agent memory?"
print(f"🔍 Query: {query}\n")

results = recall_from_long_term_memory(query, k=2)

print("Results retrieved from long-term memory:")
for i, doc in enumerate(results, 1):
    print(f"\n  [{i}] {doc.page_content}")
    print(f"      (Source: {doc.metadata['source']})")

### Experiment 4: Adding new facts dynamically

Agents can keep learning and extend their knowledge base over time.
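
When facts are added repeatedly (for example by re-running a cell), the same text can end up stored several times. One possible guard is to derive a deterministic ID from the content. The sketch below uses a plain dictionary as the store; `stable_id` and `add_unique` are hypothetical helpers, not part of Chroma or LangChain.

```python
import hashlib
from typing import Dict, Iterable, List


def stable_id(text: str) -> str:
    """Derive a deterministic ID from normalized content (first 16 hex chars of SHA-256)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


def add_unique(store: Dict[str, str], facts: Iterable[str]) -> List[str]:
    """Insert facts keyed by stable_id; return the IDs that were actually new."""
    added = []
    for fact in facts:
        doc_id = stable_id(fact)
        if doc_id not in store:
            store[doc_id] = fact
            added.append(doc_id)
    return added


memory: Dict[str, str] = {}
first = add_unique(memory, ["Apache Kafka is a distributed streaming platform."])
second = add_unique(memory, ["apache kafka is a  distributed streaming platform."])
print(len(first), len(second), len(memory))  # prints: 1 0 1
```

The same IDs could be supplied to a vector store that accepts caller-provided IDs (LangChain's `add_documents` takes an `ids` argument). How a repeated ID is handled depends on the backend, so that behavior is an assumption to verify.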

In [None]:
# Check which facts can be retrieved BEFORE adding anything new
query_kafka_avant = "Tell me about data streaming platforms"
print(f"🔍 Query: {query_kafka_avant}")
print("\nResults (before adding the new facts):")
results_kafka = recall_from_long_term_memory(query_kafka_avant, k=2)
for i, doc in enumerate(results_kafka, 1):
    print(f"\n  [{i}] {doc.page_content}")
    print(f"      (Source: {doc.metadata['source']})")

In [None]:
# New facts to add dynamically
new_facts = [
    Document(
        page_content="Apache Kafka is a distributed streaming platform for building real-time data pipelines.",
        metadata={"source": "kafka_platform", "type": "technology"}
    ),
    Document(
        page_content="Prompt engineering involves carefully crafting instructions to optimize LLM outputs.",
        metadata={"source": "prompt_engineering", "type": "technique"}
    )
]

# Add the new documents to the store
db.add_documents(new_facts)

print(f"✓ {len(new_facts)} new facts added to long-term memory")

In [None]:
# Check that the new facts can be retrieved
query_kafka = "Tell me about data streaming platforms"
print(f"🔍 Query: {query_kafka}")
print("\nResults (including the new facts):")
results_kafka = recall_from_long_term_memory(query_kafka, k=2)
for i, doc in enumerate(results_kafka, 1):
    print(f"\n  [{i}] {doc.page_content}")
    print(f"      (Source: {doc.metadata['source']})")

### Watch out for deprecations:

| Aspect       | RetrievalQA (legacy)           | create_retrieval_chain() (modern) |
| ------------ | ------------------------------ | --------------------------------- |
| Status       | Deprecated since the v0.1 series | ✅ Current standard approach     |
| Architecture | Legacy wrapper                 | Native LCEL (more flexible)       |
| Control      | Limited                        | Full and composable               |
| Maintenance  | Discontinued                   | Active and improving              |
| Performance  | OK                             | Optimized                         |

## Step 4: Combining Short-Term and Long-Term Memory (10 min)

This step combines the two memory systems: conversational context plus persistent knowledge.
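
One way to picture the combination is a single message list that carries both the history and the recalled facts. The sketch below is an assumption about how this could be wired, not the lab's own method: `build_hybrid_messages` is a hypothetical helper, and the retrieved texts are passed in as plain strings.

```python
from typing import Dict, List

Message = Dict[str, str]


def build_hybrid_messages(
    history: List[Message],
    retrieved: List[str],
    question: str,
) -> List[Message]:
    """Merge conversation history (short-term) with retrieved facts (long-term)."""
    if retrieved:
        facts = "\n".join(f"- {text}" for text in retrieved)
    else:
        facts = "(no stored facts found)"
    knowledge = {
        "role": "system",
        "content": "Facts recalled from long-term memory:\n" + facts,
    }
    # History first (it carries the original system prompt), then the recalled
    # facts, then the new question; the input list is left unmodified.
    return history + [knowledge, {"role": "user", "content": question}]


history = [
    {"role": "system", "content": "You are a teaching AI agent."},
    {"role": "user", "content": "Remember my favorite framework is LangChain."},
    {"role": "assistant", "content": "Got it, your favorite framework is LangChain."},
]
retrieved = [
    "Short-term memory in agents stores recent conversation context in the prompt window.",
]
messages = build_hybrid_messages(
    history, retrieved, "What's my favorite framework and how do agents use memory?"
)
for m in messages:
    print(m["role"], "|", m["content"][:60])
```

In this lab, `retrieved` could be filled with `[d.page_content for d in recall_from_long_term_memory(question)]` and the resulting list passed to `chat_with_memory`.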

In [None]:
from langchain_classic.chains import create_retrieval_chain  # modern alternative, not used below
from langchain_classic.chains import RetrievalQA
from langchain_openai import ChatOpenAI

# Initialize the LLM and the retriever
llm = ChatOpenAI(model="gpt-4o-mini", temperature=1.0)

retriever = db.as_retriever(search_kwargs={"k": 3})

# Build the RAG (Retrieval-Augmented Generation) chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

print("✓ RetrievalQA chain created")
print("✓ Short-term (context) and long-term (retriever) memory are both available")
conversation = [
    {"role":"system","content":"You are a teaching AI agent."},
    {"role":"user","content":"Remember my favorite framework is LangChain."},
    {"role":"assistant","content":"Got it, your favorite framework is LangChain."}
]
print("\nConversation:", conversation)
# User asks later
conversation.append({"role":"user","content":"What's my favorite framework and how do agents use memory?"})
print("_______________________________________")

# The 1st answer uses short-term memory (the context):
short_term_ans = chat_with_memory(conversation)
print("\nShort-term:", short_term_ans)
print("_______________________________________")
# The 2nd answer is enriched by long-term memory (vector DB search):
long_term_response = qa_chain.invoke({"query": "How do agents use memory?"})
print("\nLong-term: How do agents use memory?\n")
print(long_term_response["result"])
print("_______________________________________")
# The 3rd answer repeats the initial question about the favorite framework,
# but relies only on the vector DB search (the preference is not stored there yet):
long_term_response = qa_chain.invoke({"query": "What's my favorite framework?"})
print("\nLong-term: What's my favorite framework?\n")
print(long_term_response["result"])
print("_______________________________________")

### Add the information to the vector store

In [None]:
user_preference = [
    Document(
        page_content="The user's favorite framework is LangChain because it helps build autonomous agents with memory, tools, and chains.",
        metadata={"source": "user_preferences", "type": "user_preferences"}
    ),
    Document(
        page_content="The best framework is Hadoop.",
        metadata={"source": "prompt_engineering", "type": "Fake"}
    )
]

# Add the new documents to the store
db.add_documents(user_preference)

print(f"✓ {len(user_preference)} new facts added to long-term memory")


### Then ask the question again:

In [None]:
# The 4th answer repeats the initial question about the favorite framework, still relying only on the vector DB search:
long_term_response = qa_chain.invoke({"query": "What's my favorite framework?"})
print("\nLong-term: What's my favorite framework?\n")
print(long_term_response["result"])
print("_______________________________________")

# The 5th answer asks which framework is the best:
long_term_response = qa_chain.invoke({"query": "What is the best framework?"})
print("\nLong-term: What is the best framework?\n")
print(long_term_response["result"])
print("_______________________________________")

### Experiment 5: Conversation with hybrid memory

**Scenario:** The user states their favorite framework (short-term), then asks a question that also requires long-term knowledge.

In [None]:
# Initialize the conversation
hybrid_conversation = [
    {
        "role": "system",
        "content": "You are an expert teaching AI agent. You remember user preferences and use knowledge about AI frameworks and techniques."
    },
    {
        "role": "user",
        "content": "My favorite framework is LangChain because it helps build autonomous agents."
    }
]

# First reply (short-term memory)
print("🗣️ User: My favorite framework is LangChain because it helps build autonomous agents.")
reply_acknowledge = chat_with_memory(hybrid_conversation)
print(f"🤖 Assistant: {reply_acknowledge}")

hybrid_conversation.append({"role": "assistant", "content": reply_acknowledge})

In [None]:
# A question that needs both short-term and long-term memory
user_query = "What's my favorite framework and how do agents use memory?"
hybrid_conversation.append({"role": "user", "content": user_query})

print(f"\n🗣️ User: {user_query}")
print("\n" + "="*60)
print("RESPONSE 1: Using SHORT-TERM MEMORY ONLY")
print("="*60)

# Reply without long-term memory (context only)
short_term_reply = chat_with_memory(hybrid_conversation)
print(f"\n🤖 Assistant:\n{short_term_reply}")

### Experiment 6: Short-term forgets, long-term remembers

**Scenario:** After the conversation is reset, short-term memory forgets but long-term memory persists.

In [None]:
print("SCENARIO: Conversation reset\n")
print("="*60)

# Conversation 1: the agent learns a piece of information
conv_session1 = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "I love Apache Kafka for real-time data processing."}
]
reply_session1 = chat_with_memory(conv_session1)
print("Session 1 - User: I love Apache Kafka for real-time data processing.")
print(f"Session 1 - Assistant: {reply_session1}\n")

# Conversation 2: NEW CONVERSATION (history reset)
conv_session2 = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What do I love for data processing?"}
]
reply_session2 = chat_with_memory(conv_session2)
print("Session 2 - User: What do I love for data processing?")
print(f"Session 2 - Assistant: {reply_session2}")
print("\n❌ SHORT-TERM MEMORY: the agent FORGOT (conversation was reset)")
print("="*60)

### Experiment 7: Persistence with long-term memory

Even after a reset, long-term memory persists.

In [None]:
user_preference = [
    Document(
        page_content="I love Apache Kafka for real-time data processing.",
        metadata={"source": "user_preferences", "type": "user_preferences"}
    ),
    Document(
        page_content="Another top framework is Trunk Data Platform.",
        metadata={"source": "prompt_engineering", "type": "Tech"}
    )
]

# Add the new documents to the store
db.add_documents(user_preference)

print(f"✓ {len(user_preference)} new facts added to long-term memory")

In [None]:
print("SCENARIO: Query after reset, using LONG-TERM MEMORY\n")
print("="*60)

# A related question, this time answered through RAG
query_persistent = "What technologies are mentioned for real-time processing?"
print(f"Query: {query_persistent}")

# Build the RAG (Retrieval-Augmented Generation) chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)

qa_persistent = qa_chain.invoke({"query": query_persistent})
result_persistent = qa_persistent["result"]

print(f"\n🤖 Assistant (with long-term memory):\n{result_persistent}")
print("\n✓ LONG-TERM MEMORY: the knowledge PERSISTS even after a reset")
print("="*60)

### Experiment 8: Testing hallucination vs retrieval

Compare the answers with and without retrieval context.
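
Retrieval only grounds an answer when the retrieved documents are actually relevant. A possible safeguard is to drop weak matches and abstain when nothing is left. The sketch below assumes each result is a `(text, distance)` pair where a lower distance means a closer match; the `0.8` cutoff, the helper names, and the sample values are illustrative assumptions, not measured results.

```python
from typing import List, Tuple


def select_grounded(
    scored_docs: List[Tuple[str, float]],
    max_distance: float = 0.8,
) -> List[str]:
    """Keep documents whose distance is at or below the cutoff (lower = closer)."""
    return [text for text, distance in scored_docs if distance <= max_distance]


def answer_or_abstain(
    scored_docs: List[Tuple[str, float]],
    max_distance: float = 0.8,
) -> str:
    """Return the context to send to the model, or abstain if nothing is close enough."""
    context = select_grounded(scored_docs, max_distance)
    if not context:
        return "I don't know based on the stored knowledge."
    return "Context to send to the model:\n" + "\n".join(context)


# Made-up distances, for illustration only.
sample = [
    ("Apache Kafka is a distributed streaming platform.", 0.35),
    ("CrewAI orchestrates multiple AI agents.", 1.20),
]
print(answer_or_abstain(sample))
print(answer_or_abstain([("Unrelated fact.", 1.50)]))
```

Score semantics differ between vector stores (distance versus similarity), so the comparison direction and the threshold would need to be checked against the store in use.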

In [None]:
# Imports for the modern LangChain API
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Prompt template for RAG
template = """You are an AI expert assistant. Use the following context to answer the question.
If you don't know the answer based on the context, say so.

Context: {context}

Question: {question}

Answer:"""

prompt = ChatPromptTemplate.from_template(template)

# Join the retrieved documents into a single context string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Build the RAG chain with LCEL (LangChain Expression Language)
qa_chain_lcel = (
    {
        "context": retriever | format_docs,
        "question": RunnablePassthrough()
    }
    | prompt
    | llm
    | StrOutputParser()
)

# A question that could trigger a hallucination
hallucination_query = "What is the relationship between multi-agent systems and real-time data processing?"

print(f"🔍 Query: {hallucination_query}\n")
print("="*60)
print("RESPONSE 1: Without retrieval (hallucination risk)")
print("="*60)

# Direct answer (no RAG)
conv_no_rag = [
    {"role": "system", "content": "You are an AI expert."},
    {"role": "user", "content": hallucination_query}
]
reply_no_rag = chat_with_memory(conv_no_rag)
print(f"\n{reply_no_rag}")

print("\n" + "="*60)
print("RESPONSE 2: With retrieval (grounded in stored knowledge)")
print("="*60)

# Run the LCEL chain with invoke()
reply_rag = qa_chain_lcel.invoke(hallucination_query)
print(f"\n{reply_rag}")

# Also fetch the source documents
docs = retriever.invoke(hallucination_query)
print("\nDocuments used as sources:")
for i, doc in enumerate(docs, 1):
    print(f"  [{i}] {doc.page_content[:200]}...")


In [None]:
# Trace each step of the chain
question = "What is the relationship between multi-agent systems and real-time data processing?"

# Step 1: retriever + formatting
step1_context = retriever.invoke(question)
print("STEP 1 - Retrieved documents:\n")
print(f"  Number of docs: {len(step1_context)}")

step1_formatted = "\n\n".join(doc.page_content for doc in step1_context)
print(f"  Formatted context:\n\n{step1_formatted[:200]}...\n")

# Step 2: prompt formatting
step2_input = {
    "context": step1_formatted,
    "question": question
}
step2_prompt = prompt.format(**step2_input)
print("STEP 2 - Formatted prompt:\n")
print(f"{step2_prompt}\n")

# Step 3: LLM generation
step3_llm_output = llm.invoke(step2_prompt)
print("STEP 3 - LLM response (AIMessage):")
print(f"  Type: {type(step3_llm_output)}")
print(f"  Content: {step3_llm_output.content[:200]}...\n")

# Step 4: output parser
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()
step4_final = parser.invoke(step3_llm_output)  # Use invoke() rather than parse()
print("STEP 4 - Final answer (str):")
print(f"  Type: {type(step4_final)}")
print("  Content:")
# Indent each line of the answer
for line in step4_final.split('\n'):
    print(f"    {line}")


### Experiment 9: Adding facts dynamically and testing

Simulate an agent that learns continuously.

In [None]:
print("SCENARIO: Continuous learning (agent learns new facts)\n")
print("="*60)

# Add an important new fact
fact_to_learn = [
    Document(
        page_content="Function calling enables AI agents to interact with external APIs and tools to perform real-world actions.",
        metadata={"source": "agent_capabilities", "type": "technique"}
    )
]

db.add_documents(fact_to_learn)
print("✓ New fact added: 'Function calling enables AI agents...'")

# Test retrieval
test_query = "How can agents interact with external systems?"
print(f"\n🔍 Query: {test_query}")

# Invoke the LCEL chain directly
qa_updated = qa_chain_lcel.invoke(test_query)
print("\n🤖 Assistant (with updated knowledge):")
print(qa_updated)

# Also show the source documents
docs = retriever.invoke(test_query)
print(f"\n📚 Documents used ({len(docs)} found):")
for i, doc in enumerate(docs, 1):
    print(f"  [{i}] {doc.page_content[:150]}...")

print("\n✓ The agent has access to the newly learned fact")

## Summary & Key Insights

### Recap of what you have learned:

In [None]:
summary = """\n╔═══════════════════════════════════════════════════════════════════╗
║         SHORT-TERM vs LONG-TERM MEMORY IN AGENTIC AI              ║
╚═══════════════════════════════════════════════════════════════════╝

📌 SHORT-TERM MEMORY (Context Window)
─────────────────────────────────────
✓ Stores the recent conversation history
✓ Lets the model refer back to earlier messages
✓ Limited by the context window size (e.g. 4K, 8K, 128K tokens)
✗ Resets with every new conversation
✗ Does not persist across sessions

Use cases:
• Interactive conversations
• Immediate context and references
• Natural dialogue

---

💾 LONG-TERM MEMORY (Vector Database)
──────────────────────────────────────
✓ Stores knowledge persistently
✓ Semantic retrieval (similarity search)
✓ Scales to large knowledge corpora
✓ Persists across sessions
✓ Improves accuracy and reduces hallucination (RAG)
✗ Requires additional setup
✗ Incurs a cost for embedding calls

Use cases:
• Persistent knowledge bases
• Fact retrieval
• Context augmentation (RAG)
• Long-running autonomous agents

---

🔗 HYBRID APPROACH (Short-term + Long-term)
────────────────────────────────────────────
✓ Combines conversational context + persistent knowledge
✓ Better understanding of the user's context
✓ Access to precise stored facts
✓ Fewer hallucinations
✓ Smarter, more context-aware agents

Flow:
1. Short-term context: read the conversation history
2. Long-term retrieval: look up relevant docs via RAG
3. Fusion: combine context + docs for a better answer

---

🚀 NEXT STEPS
─────────────
1. Try it with your own documents (PDFs, web pages)
2. Tune the embedding model and the retriever (top_k, distance threshold)
3. Implement a complete agent loop (ReAct pattern)
4. Explore multi-agent orchestration (CrewAI)
5. Compare different vector DBs: Chroma vs Pinecone vs Weaviate
"""
print(summary)

## 🎓 Challenges & Questions

Try these variations to deepen your understanding:

1. **Challenge 1:** Add 5 new facts to the knowledge base. Test complex queries that combine them.

2. **Challenge 2:** Create a multi-turn conversation in which the agent must:
   - Remember the user's name (short-term)
   - Retrieve facts about frameworks (long-term)
   - Generate a personalized answer

3. **Challenge 3:** Compare 2 different queries:
   - One that asks for information stored in the knowledge base (retrievable)
   - One that asks for information outside the knowledge base (hallucination risk)

4. **Challenge 4:** Implement a feedback loop:
   - The agent generates an answer
   - You assess how trustworthy it is (hallucination or not)
   - You add documents to improve future queries

5. **Challenge 5:** Explore `similarity_search_with_score()` to inspect the embedding distances

## 📖 Additional resources

- [LangChain Documentation](https://python.langchain.com/)
- [Chroma Vector Database](https://www.trychroma.com/)
- [OpenAI Embeddings](https://platform.openai.com/docs/models/embeddings)
- [RAG Pattern](https://python.langchain.com/docs/use_cases/question_answering/)
- [CrewAI for Multi-Agent](https://crewai.io/)

In [None]:
print("\n🎉 Lab complete!")
print("✅ You have learned:")
print("   • How to build short-term memory (context window)")
print("   • How to build long-term memory (vector DB)")
print("   • How to combine the two for smarter agents")
print("   • How to reduce hallucinations with RAG")
print("\n📝 Save this notebook for future reference!")