# AI4Industry - Mensaflow © - RAG with Ollama + Qdrant
## Architecture: Mistral + Vector DB + Agents (Math + Wikipedia)

**Corrected by Claude - January 2025**

## 📦 Installing dependencies

In [None]:
%%capture
# Install Ollama
!sudo apt-get install -y zstd
!curl -fsSL https://ollama.com/install.sh | sh

# Install the Python packages (0.3.x series; pip resolves mutually compatible releases)
!pip install -q "langchain>=0.3,<0.4" "langchain-community>=0.3,<0.4" "langchain-core>=0.3,<0.4"
!pip install -q qdrant-client sentence-transformers wikipedia
!pip install -q numexpr sympy

## 🚀 Starting the Ollama server

In [None]:
import os
import subprocess
import time

# Start Ollama in the background
print("🔄 Starting the Ollama server...")
subprocess.Popen(["ollama", "serve"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
time.sleep(5)  # give the server a few seconds to start listening

# Download Mistral 7B
print("📥 Downloading Mistral 7B...")
!ollama pull mistral
print("✅ Mistral ready!")
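
A fixed five-second pause can be too short on a slow machine and wasteful on a fast one. The sketch below polls the server instead of sleeping blindly. It is an illustration, not part of the original notebook: `is_up` and `wait_until` are hypothetical helper names, and it assumes the Ollama root URL answers with HTTP 200 once the server is ready.

```python
import time
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 1.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error: treat all as "not ready"
        return False


def wait_until(probe, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Call `probe()` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval)
    return False


# Possible use in place of time.sleep(5) (assumed default Ollama address):
# ready = wait_until(lambda: is_up("http://localhost:11434"))
```

Separating the probe from the waiting loop keeps the loop reusable and easy to check without a running server.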

## 🧠 Initializing the LLM

In [None]:
from langchain_community.llms import Ollama

llm = Ollama(
    model="mistral",
    base_url="http://localhost:11434",
    temperature=0.7
)

# Quick test (the prompt stays in French, the language this notebook targets)
response = llm.invoke("Bonjour, qui es-tu ?")
print("🤖 Mistral:", response[:200])

## 🗃️ Configuring Qdrant (Vector Database)

In [None]:
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Qdrant
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

# Embedding model
print("📊 Loading the embedding model...")
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={'device': 'cpu'}
)

# Qdrant client (in-memory, suited to Colab)
print("🗄️ Initializing Qdrant...")
qdrant_client = QdrantClient(":memory:")

collection_name = "ai4industry_docs"

# Create the collection (all-MiniLM-L6-v2 produces 384-dimensional vectors)
qdrant_client.create_collection(
    collection_name=collection_name,
    vectors_config=VectorParams(size=384, distance=Distance.COSINE)
)

print("✅ Qdrant ready!")
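
The collection is created with `Distance.COSINE`, so retrieval ranks chunks by the angle between embedding vectors rather than by their length. As a minimal pure-Python illustration of that metric (the function name is hypothetical, and Qdrant's own implementation is of course optimized differently):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 3.0]))  # orthogonal -> 0.0
```

Scaling a vector does not change the score, which is why the first pair scores 1.0 even though the lengths differ.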

## 📚 Populating the vector store (Wikipedia)

In [None]:
import wikipedia
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.schema import Document

# Use French Wikipedia
wikipedia.set_lang("fr")

# Industry-related topics (French Wikipedia page titles, kept in French)
topics = [
    "Intelligence artificielle",
    "Machine learning",
    "Industrie 4.0",
    "Maintenance prédictive",
    "Internet des objets"
]

print("📖 Fetching Wikipedia articles...")
documents = []

for topic in topics:
    try:
        # auto_suggest=False looks up the exact title instead of a search suggestion
        page = wikipedia.page(topic, auto_suggest=False)
        doc = Document(
            page_content=page.content,
            metadata={"source": topic, "url": page.url}
        )
        documents.append(doc)
        print(f"  ✓ {topic}")
    except Exception as e:
        print(f"  ✗ {topic}: {e}")

# Split into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50
)
chunks = text_splitter.split_documents(documents)
print(f"\n📝 {len(chunks)} chunks created")

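# Index into Qdrant, wrapping the existing client and collection.
# (Qdrant.from_documents builds its own client and does not accept an existing one.)
print("🔄 Indexing into Qdrant...")
vectorstore = Qdrant(
    client=qdrant_client,
    collection_name=collection_name,
    embeddings=embeddings
)
vectorstore.add_documents(chunks)
print("✅ Vector store ready!")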
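
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sliding-window splitter. It is only a sketch: `sliding_chunks` is a hypothetical helper, and `RecursiveCharacterTextSplitter` additionally tries to cut on separators such as paragraph and line breaks, which this version ignores.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Split `text` into windows of `chunk_size` characters that share
    `chunk_overlap` characters with the previous window."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks


print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk starts with the last character of the previous one, so a sentence cut at a boundary still appears partly in both chunks. With the notebook's settings (500 and 50), consecutive windows would share 50 characters.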

## 🛠️ Creating the tools

In [None]:
from langchain.tools import Tool
from langchain_community.utilities import WikipediaAPIWrapper
import numexpr

# 1. Math calculation tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    try:
        # Empty dicts keep numexpr from resolving names in the calling scope
        result = numexpr.evaluate(expression.strip(), global_dict={}, local_dict={}).item()
        return f"Résultat: {result}"
    except Exception as e:
        return f"Erreur de calcul: {str(e)}"

# Tool descriptions are read by the LLM and stay in French, like the prompt
math_tool = Tool(
    name="Calculator",
    func=calculate,
    description="Utile pour les calculs mathématiques. Exemple: '2+2' ou '10*5'"
)

# 2. Wikipedia tool
wikipedia_wrapper = WikipediaAPIWrapper(lang="fr")
wikipedia_tool = Tool(
    name="Wikipedia",
    func=wikipedia_wrapper.run,
    description="Recherche des informations sur Wikipedia en français"
)

# 3. RAG tool (retrieval)
def rag_search(query: str) -> str:
    """Search the vector store and return the top 3 chunks as one string."""
    docs = vectorstore.similarity_search(query, k=3)
    context = "\n\n".join([doc.page_content for doc in docs])
    return context

rag_tool = Tool(
    name="RAG_Search",
    func=rag_search,
    description="Recherche dans la base de connaissances vectorielle (documents indexés)"
)

tools = [math_tool, wikipedia_tool, rag_tool]
print("✅ 3 tools created: Calculator, Wikipedia, RAG_Search")
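
The calculator receives text produced by the LLM, so it is worth restricting what that text can do. As an alternative to `numexpr` (not the notebook's approach), the sketch below walks the Python syntax tree and accepts only numbers and the four basic operators. `safe_eval` is a hypothetical name and the operator whitelist is an assumption to adjust as needed.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str):
    """Evaluate an arithmetic expression made of numbers, + - * / and parentheses."""

    def _eval(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    tree = ast.parse(expression.strip(), mode="eval")
    return _eval(tree.body)


print(safe_eval("147 * 23 + 456"))  # 3837
```

Function calls, attribute access and names are rejected with `ValueError`, since none of them appear in the whitelist.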

## 🤖 Creating the agent (CORRECTED)

In [None]:
# CORRECTION: import from langchain.agents (new API)
from langchain.agents import create_react_agent, AgentExecutor
from langchain_core.prompts import PromptTemplate

# ReAct agent template (kept in French so the agent answers in the language of the knowledge base)
template = """Tu es un assistant IA spécialisé dans l'industrie et la technologie.

Tu as accès aux outils suivants:
{tools}

Noms des outils: {tool_names}

Utilise ce format:
Question: la question posée
Thought: je dois réfléchir à quel outil utiliser
Action: le nom de l'outil (un parmi [{tool_names}])
Action Input: l'entrée pour l'outil
Observation: le résultat de l'outil
... (répète Thought/Action/Action Input/Observation si nécessaire)
Thought: je connais maintenant la réponse finale
Final Answer: la réponse finale à la question

Question: {input}
Thought:{agent_scratchpad}"""

prompt = PromptTemplate.from_template(template)

# Create the agent
agent = create_react_agent(
    llm=llm,
    tools=tools,
    prompt=prompt
)

# Create the executor
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True,  # send malformed LLM output back to the model instead of raising
    max_iterations=5             # stop after 5 Thought/Action rounds
)

print("✅ RAG agent created successfully!")
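
The ReAct format only works if each model reply can be split into either a tool call or a final answer. The sketch below shows that idea with a regular expression. It is a simplified illustration, not LangChain's actual output parser, and `parse_react_step` is a hypothetical name.

```python
import re

# "Action: <tool>" on one line, followed by "Action Input: <input>"
_ACTION_RE = re.compile(
    r"Action\s*:\s*(?P<tool>[^\n]+)\n\s*Action Input\s*:\s*(?P<tool_input>.*)",
    re.DOTALL,
)


def parse_react_step(text: str) -> dict:
    """Classify one model reply as a final answer or a tool call."""
    if "Final Answer:" in text:
        answer = text.split("Final Answer:", 1)[1].strip()
        return {"type": "final", "output": answer}
    match = _ACTION_RE.search(text)
    if match is None:
        raise ValueError("Reply contains neither a tool call nor a final answer")
    return {
        "type": "action",
        "tool": match.group("tool").strip(),
        "tool_input": match.group("tool_input").strip(),
    }


reply = "Thought: je dois calculer\nAction: Calculator\nAction Input: 147 * 23 + 456"
print(parse_react_step(reply))
# {'type': 'action', 'tool': 'Calculator', 'tool_input': '147 * 23 + 456'}
```

A reply that fits neither form raises `ValueError` here. In the executor, `handle_parsing_errors=True` covers that situation by feeding the error back to the model so it can retry in the expected format.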

## 🧪 Agent tests

In [None]:
# Test 1: math calculation (queries stay in French, like the knowledge base)
print("="*60)
print("TEST 1: Math calculation")
print("="*60)
response = agent_executor.invoke({"input": "Calcule 147 * 23 + 456"})
print("\n📊 Answer:", response['output'])

In [None]:
# Test 2: Wikipedia search
print("\n" + "="*60)
print("TEST 2: Wikipedia search")
print("="*60)
response = agent_executor.invoke({"input": "Qu'est-ce que l'Industrie 4.0 ?"})
print("\n📚 Answer:", response['output'])

In [None]:
# Test 3: RAG (vector search)
print("\n" + "="*60)
print("TEST 3: RAG Search")
print("="*60)
response = agent_executor.invoke({"input": "Parle-moi du machine learning dans l'industrie"})
print("\n🔍 Answer:", response['output'])

In [None]:
# Test 4: complex question that may combine several tools
print("\n" + "="*60)
print("TEST 4: Complex question")
print("="*60)
response = agent_executor.invoke({
    "input": "Si une usine utilise 3 capteurs IoT qui génèrent chacun 500 MB de données par jour, combien de GB cela représente en une semaine ?"
})
print("\n🏭 Answer:", response['output'])
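
To judge the agent's answer to Test 4, it helps to have the expected value at hand. The arithmetic is 3 sensors × 500 MB × 7 days. The snippet below works it out under both unit conventions, since the question does not say whether 1 GB means 1000 MB or 1024 MB (variable names are illustrative).

```python
sensors = 3
mb_per_sensor_per_day = 500
days = 7

total_mb = sensors * mb_per_sensor_per_day * days

total_gb_decimal = total_mb / 1000  # convention: 1 GB = 1000 MB
total_gb_binary = total_mb / 1024   # convention: 1 GB = 1024 MB

print(total_mb)          # 10500
print(total_gb_decimal)  # 10.5
print(total_gb_binary)   # 10.25390625
```

An answer of about 10.5 GB (or roughly 10.25 GB with the binary convention) indicates that the agent used the calculator correctly.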

## 💬 Interactive interface

In [None]:
def chat():
    """Simple read-eval-print loop around the agent."""
    print("\n🤖 RAG Agent AI4Industry - Mensaflow")
    print("Type 'exit' to quit\n")

    while True:
        user_input = input("You: ")
        if user_input.lower() in ['exit', 'quit', 'sortir']:
            print("Goodbye!")
            break

        try:
            response = agent_executor.invoke({"input": user_input})
            print(f"\nAgent: {response['output']}\n")
        except Exception as e:
            print(f"Error: {str(e)}\n")

# Start the chat
chat()

## 📊 Statistics and diagnostics

In [None]:
# Check how many points (one per chunk) are stored in Qdrant
collection_info = qdrant_client.get_collection(collection_name)
print(f"📈 Number of points in Qdrant: {collection_info.points_count}")

# Similarity test (French query, matching the indexed articles)
query = "maintenance prédictive"
results = vectorstore.similarity_search(query, k=3)
print(f"\n🔍 Top 3 results for '{query}':")
for i, doc in enumerate(results, 1):
    print(f"\n{i}. Source: {doc.metadata.get('source', 'N/A')}")
    print(f"   Excerpt: {doc.page_content[:150]}...")

---
## 🎯 Architecture summary

**Technical stack:**
- **LLM**: Mistral 7B (via Ollama)
- **Vector DB**: Qdrant (in-memory)
- **Embeddings**: sentence-transformers/all-MiniLM-L6-v2
- **Agent tools**: Calculator, Wikipedia, RAG_Search
- **Framework**: LangChain 0.3.x

**Corrections applied:**
1. ✅ Correct import of `AgentExecutor` (from `langchain.agents`)
2. ✅ Use of the current `create_react_agent` API
3. ✅ Compatible ReAct template
4. ✅ Improved error handling

**Mensaflow © 2025 - AI4Industry training, CNAM**