# 💬 TRM POC - Notebook 4: Full Integration + Interactive Dialogue

**Objective:** Complete BERT + RAG + Mistral 7B pipeline with a dialogue interface

**Runtime:** Free Colab GPU (T4 - 15GB VRAM)

**Estimated duration:** 3-4h (loading + dialogue)

---

## Phase 0 - TRM POC (0€)

**⚠️ IMPORTANT:** This notebook loads **all components at the same time**:
- BERT encoder (CPU) + Mistral 7B (GPU) = ~10-12 GB VRAM
- Requires the **T4 GPU** to be enabled
- Requires the **files from Notebooks 1-2-3** (their code is reused)

This notebook implements:
1. Importing the code from the 3 previous notebooks
2. Loading the complete pipeline (BERT + RAG + Mistral)
3. An interactive Gradio dialogue interface
4. End-to-end tests with Spinoza

**Note:** On OOM → fall back to Mistral alone (without BERT) for basic tests

## 🚨 Prerequisites

**Before you start, you must have:**

1. ✅ **Run Notebook 2** (RAG Embeddings) and downloaded `rag_exports.zip`
2. ✅ **Uploaded `rag_exports.zip`** to this notebook (or regenerated the embeddings)
3. ✅ **Enabled the T4 GPU**: Runtime > Change runtime type > T4 GPU

**Required files:**
- `rag_exports.zip` (from Notebook 2)
- Or the raw corpora (to regenerate the embeddings)

## 1. Installing Dependencies

In [None]:
# Full installation
!pip install -q transformers torch sentencepiece spacy accelerate bitsandbytes
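# faiss-cpu is enough for a small index; the faiss-gpu wheels on PyPI may not install on recent Colab Python versions
!pip install -q sentence-transformers faiss-cpu gradio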
!python -m spacy download fr_core_news_sm

print("✅ All dependencies installed")

## 2. GPU & VRAM Check

In [None]:
import torch

print(f"GPU available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"  GPU: {torch.cuda.get_device_name(0)}")
    print(f"  Total VRAM: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.2f} GB")
    print(f"  Free VRAM: {torch.cuda.mem_get_info()[0] / 1024**3:.2f} GB")
    
    vram_total = torch.cuda.get_device_properties(0).total_memory / 1024**3
    if vram_total < 14:
        print("\n⚠️ WARNING: VRAM < 14GB - OOM risk with BERT + Mistral loaded together")
        print("   → Workaround: use Mistral alone (set USE_BERT = False below)")
    else:
        print("\n✅ Enough VRAM for the complete pipeline")
else:
    print("\n❌ NO GPU - Enable the T4 GPU in Runtime > Change runtime type")
    raise RuntimeError("A GPU is required for this notebook")

## 3. Upload RAG Exports (or Regenerate)

In [None]:
import os
from pathlib import Path

# Option 1: upload rag_exports.zip (from Notebook 2)
print("📤 Option 1: Upload rag_exports.zip from Notebook 2")
print("   (or skip if you are going to regenerate)\n")

from google.colab import files
uploaded = files.upload()

if 'rag_exports.zip' in uploaded:
    # -o overwrites without prompting, so the cell can be re-run safely
    !unzip -o -q rag_exports.zip
    RAG_DIR = "/content/content/rag_exports"  # Path after unzip
    print("✅ RAG exports loaded")
else:
    print("⚠️ No rag_exports.zip - you will need to upload the corpora and regenerate")
    RAG_DIR = None
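
The hard-coded `/content/content/rag_exports` path depends on how the archive was built in Notebook 2. A minimal sketch that searches the extraction root instead, assuming the index files follow the `{philosopher}_faiss.index` naming that `RAGRetriever` expects; `find_rag_dir` is a hypothetical helper, not part of the original notebooks:

```python
from pathlib import Path
from typing import Optional


def find_rag_dir(root: str, philosopher: str = "spinoza") -> Optional[str]:
    """Return the first directory under `root` that holds the FAISS index.

    Hypothetical helper: assumes the `{philosopher}_faiss.index` file name.
    """
    # rglob walks the tree recursively; sorting makes the choice deterministic
    matches = sorted(Path(root).rglob(f"{philosopher}_faiss.index"))
    return str(matches[0].parent) if matches else None
```

In the notebook this could be used as `RAG_DIR = find_rag_dir("/content")`, which yields `None` when no index was extracted.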

## 4. Reused Classes (Notebooks 1-2-3)

Code copied from the previous notebooks

In [None]:
# ============================================================
# BERT ENCODER (from Notebook 1)
# ============================================================

from transformers import AutoTokenizer, AutoModel
import spacy
import json
import re
from typing import List, Dict, Optional
from collections import Counter

nlp = spacy.load("fr_core_news_sm")

class BERTEncoder:
    """BERT encoder for STATE_IMAGE"""
    
    def __init__(self, model_name: str = "camembert-base"):
        print(f"⏳ Loading BERT {model_name}...")
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModel.from_pretrained(model_name)
        self.model.eval()
        print("✅ BERT loaded (CPU)")
    
    def extract_keywords(self, text: str, top_k: int = 5) -> List[str]:
        doc = nlp(text)
        entities = [ent.text.lower() for ent in doc.ents]
        nouns = [token.text.lower() for token in doc 
                 if token.pos_ in ["NOUN", "PROPN"] and len(token.text) > 3]
        all_keywords = entities + nouns
        counter = Counter(all_keywords)
        return [word for word, count in counter.most_common(top_k)]
    
    def extract_concepts_from_rag(self, rag_passages: List[Dict]) -> List[str]:
        concepts = []
        for passage in rag_passages:
            if passage.get("concepts"):
                concepts.extend(passage["concepts"][:3])
            else:
                text = passage.get("text", "")
                keywords = self.extract_keywords(text, top_k=3)
                concepts.extend(keywords)
        unique_concepts = list(dict.fromkeys(concepts))
        return unique_concepts[:8]
    
    def analyze_intention(self, text: str) -> str:
        text_lower = text.lower()
        if any(m in text_lower for m in ["?", "comment", "pourquoi", "qu'est-ce"]):
            return "question"
        elif any(m in text_lower for m in ["explique", "clarifie", "précise"]):
            return "clarification"
        elif any(m in text_lower for m in ["d'accord", "ok", "compris", "oui"]):
            return "accord"
        elif any(m in text_lower for m in ["non", "mais", "pas d'accord", "faux"]):
            return "désaccord"
        return "neutre"
    
    def analyze_tension(self, text: str) -> str:
        text_lower = text.lower()
        if any(m in text_lower for m in ["mais", "pourtant", "cependant"]):
            return "opposition"
        elif any(m in text_lower for m in ["comprends pas", "chelou", "bizarre"]):
            return "confusion"
        return "neutre"
    
    def analyze_style(self, text: str) -> str:
        text_lower = text.lower()
        word_count = len(text.split())
        if word_count < 10:
            return "concis"
        elif any(m in text_lower for m in ["exemple", "concrètement", "genre"]):
            return "pédagogique"
        return "standard"
    
    def encode_to_state_image(
        self,
        conversation: List[Dict],
        rag_passages: List[Dict],
        prev_state: Optional[Dict] = None,
        mini_store_feedback: Optional[Dict] = None
    ) -> Dict:
        last_exchange = conversation[-1] if conversation else {}
        user_text = last_exchange.get("user", "")
        assistant_text = last_exchange.get("assistant", "")
        
        concepts_actifs = self.extract_keywords(user_text + " " + assistant_text, top_k=5)
        concepts_rag = self.extract_concepts_from_rag(rag_passages)
        sources_rag = [p.get("source", "?") for p in rag_passages]
        
        return {
            "concepts_actifs": concepts_actifs,
            "concepts_rag": concepts_rag,
            "sources_rag": sources_rag,
            "intention": self.analyze_intention(user_text),
            "tension": self.analyze_tension(user_text),
            "style": self.analyze_style(user_text),
            "ton": "bienveillant",
            "priorite": ["concepts_actifs", "intention"],
            "relations": [],
            "emotion": "curieux",
            "recurrence": mini_store_feedback.get("recurrences", {}) if mini_store_feedback else {},
            "metadata": {
                "philosopher": rag_passages[0].get("philosopher", "?") if rag_passages else None,
                "turn": (prev_state.get("metadata", {}).get("turn", 0) + 1) if prev_state else 1
            }
        }

print("✅ BERTEncoder defined")
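
The `analyze_*` methods use plain substring tests, so a short marker such as `"oui"` also fires inside `"Louis"`, and `"non"` inside `"annonce"`. A minimal sketch of a whole-word variant, offered as one possible refinement rather than the notebook's method; `contains_marker` is a hypothetical helper:

```python
import re
from typing import Iterable


def contains_marker(text: str, markers: Iterable[str]) -> bool:
    """True if any marker occurs as a whole word or phrase in `text`."""
    text_lower = text.lower()
    for marker in markers:
        if not any(ch.isalnum() for ch in marker):
            # Pure punctuation such as "?" has no word boundary: substring test
            if marker in text_lower:
                return True
            continue
        # Reject matches that are glued to a letter or digit on either side
        pattern = r"(?<!\w)" + re.escape(marker) + r"(?!\w)"
        if re.search(pattern, text_lower):
            return True
    return False
```

With this helper, a branch like `any(m in text_lower for m in [...])` could be rewritten as `contains_marker(text, [...])`.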

In [None]:
# ============================================================
# RAG RETRIEVER (from Notebook 2)
# ============================================================

from sentence_transformers import SentenceTransformer
import faiss
import pickle
import numpy as np

class RAGRetriever:
    """RAG retriever backed by FAISS"""
    
    def __init__(self, rag_dir: str, philosopher: str = "spinoza"):
        print(f"⏳ Loading RAG for {philosopher}...")
        self.embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
        self.philosopher = philosopher
        
        # Load the FAISS index
        index_path = f"{rag_dir}/{philosopher}_faiss.index"
        self.index = faiss.read_index(index_path)
        
        # Load the passages
        passages_path = f"{rag_dir}/{philosopher}_passages.pkl"
        with open(passages_path, 'rb') as f:
            self.passages = pickle.load(f)
        
        print(f"✅ RAG loaded: {len(self.passages)} passages indexed")
    
    def retrieve(self, query: str, top_k: int = 3) -> List[Dict]:
        # Encode the query
        query_emb = self.embedder.encode([query], convert_to_numpy=True)
        faiss.normalize_L2(query_emb)
        
        # Search
        scores, indices = self.index.search(query_emb, top_k)
        
        # Filter by threshold (FAISS pads missing results with index -1)
        results = []
        for score, idx in zip(scores[0], indices[0]):
            if idx >= 0 and score >= 0.45:  # Threshold
                passage = self.passages[idx].copy()
                passage["similarity_score"] = float(score)
                results.append(passage)
        
        return results[:top_k]

print("✅ RAGRetriever defined")
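
`retrieve` normalizes the query and keeps hits scoring at least 0.45. Assuming the Notebook 2 index is an inner-product index over L2-normalized embeddings (this notebook does not show how it was built), those scores are cosine similarities. The sketch below reproduces that filtering with plain Python lists in place of FAISS; `cosine_top_k` and the toy vectors are illustrative only:

```python
import math
from typing import List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def cosine_top_k(
    query: Sequence[float],
    corpus: Sequence[Sequence[float]],
    top_k: int = 3,
    threshold: float = 0.45,
) -> List[Tuple[int, float]]:
    """Return up to `top_k` (index, score) pairs whose cosine score passes the threshold."""
    q = l2_normalize(query)
    scored = []
    for idx, vec in enumerate(corpus):
        v = l2_normalize(vec)
        # Inner product of two unit vectors equals their cosine similarity
        scored.append((idx, sum(a * b for a, b in zip(q, v))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(idx, score) for idx, score in scored[:top_k] if score >= threshold]
```

Because the threshold is applied after the top-k cut, a query can return fewer than `top_k` passages, or none at all, which is the same behavior as the class above.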

In [None]:
# ============================================================
# MISTRAL GENERATOR (from Notebook 3)
# ============================================================

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import time

class MistralGenerator:
    """Mistral 7B generator"""
    
    def __init__(self, model_name: str = "mistralai/Mistral-7B-Instruct-v0.2"):
        print(f"⏳ Loading Mistral {model_name} (4-bit)...")
        print("⚠️ This can take 5-10 minutes...")
        
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.tokenizer.pad_token = self.tokenizer.eos_token
        
        quant_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            # float16 rather than bfloat16: the T4 has no native bfloat16 support
            bnb_4bit_compute_dtype=torch.float16,
            bnb_4bit_use_double_quant=True
        )
        
        # Mistral is supported natively by transformers, so no remote code is needed
        self.model = AutoModelForCausalLM.from_pretrained(
            model_name,
            quantization_config=quant_config,
            device_map="auto"
        )
        
        print("✅ Mistral loaded (GPU)")
    
    def format_state_image(self, state_image: Dict) -> str:
        lines = []
        if state_image.get("concepts_actifs"):
            lines.append(f"Concepts actifs: {', '.join(state_image['concepts_actifs'][:5])}")
        if state_image.get("concepts_rag"):
            lines.append(f"Concepts corpus: {', '.join(state_image['concepts_rag'][:5])}")
        if state_image.get("intention"):
            lines.append(f"Intention: {state_image['intention']}")
        if state_image.get("style"):
            lines.append(f"Style: {state_image['style']}")
        return "\n".join(lines)
    
    def generate(
        self,
        state_image: Dict,
        user_input: str,
        system_prompt: str,
        max_new_tokens: int = 300
    ) -> str:
        state_text = self.format_state_image(state_image)
        
        # No literal BOS marker in the string: the tokenizer already prepends one
        prompt = f"""[INST] {system_prompt}

[CONTEXT_STATE]
{state_text}

[USER_INPUT]
{user_input}

Réponds en incarnant le philosophe. [/INST]"""
        
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
        outputs = self.model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            do_sample=True,
            top_p=0.9,
            pad_token_id=self.tokenizer.eos_token_id
        )
        
        response = self.tokenizer.decode(
            outputs[0][inputs['input_ids'].shape[1]:], 
            skip_special_tokens=True
        )
        
        return response

print("✅ MistralGenerator defined")
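
`format_state_image` caps each concept list at five entries, but nothing checks the overall size of the state text against the ≤500-token context target stated in the summary. A rough sketch of such a check, assuming a pluggable token counter; `trim_state_lines` and `whitespace_token_count` are hypothetical names, and the whitespace counter is only a crude stand-in for the real tokenizer:

```python
from typing import Callable, List


def whitespace_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_state_lines(
    lines: List[str],
    max_tokens: int = 500,
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> List[str]:
    """Keep leading lines while the running total stays within `max_tokens`."""
    kept: List[str] = []
    total = 0
    for line in lines:
        cost = count_tokens(line)
        if total + cost > max_tokens:
            break  # Later lines are dropped once the budget is exhausted
        kept.append(line)
        total += cost
    return kept
```

In the notebook, a more faithful counter could be passed in, for example `lambda s: len(mistral.tokenizer(s)["input_ids"])`.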

## 5. Loading the Complete Pipeline

In [None]:
# Configuration
USE_BERT = True  # Set to False if VRAM is insufficient
PHILOSOPHER = "spinoza"

print("\n" + "="*60)
print("🚀 LOADING TRM PIPELINE")
print("="*60)

# 1. RAG Retriever
if RAG_DIR and os.path.exists(f"{RAG_DIR}/{PHILOSOPHER}_faiss.index"):
    rag = RAGRetriever(RAG_DIR, PHILOSOPHER)
else:
    print("\n⚠️ RAG unavailable - run Notebook 2 first")
    rag = None

# 2. BERT Encoder (optional if VRAM is limited)
if USE_BERT:
    bert = BERTEncoder()
else:
    print("\n⚠️ BERT disabled - simplified STATE_IMAGE")
    bert = None

# 3. Mistral Generator
mistral = MistralGenerator()

# Check VRAM after loading (mem_get_info returns free and total bytes)
free_bytes, total_bytes = torch.cuda.mem_get_info()
vram_used = (total_bytes - free_bytes) / 1024**3
print(f"\n📊 VRAM used: {vram_used:.2f} GB / {total_bytes / 1024**3:.2f} GB")

print("\n✅ TRM pipeline ready!")

## 6. Spinoza System Prompt

In [None]:
SYSTEM_PROMPT_SPINOZA = """Tu ES Spinoza incarné. Tu dialogues avec un élève de Terminale en première personne.

TON STYLE :
- Géométrie des affects : tu révèles les causes nécessaires, tu déduis
- Tu enseignes que Dieu = Nature
- Ton vocabulaire : conatus, affects, puissance d'agir, béatitude

TES SCHÈMES LOGIQUES :
- Dieu = Nature = Substance unique
- Liberté = Connaissance de la nécessité
- Si joie → augmentation puissance
- Causalité : Tout a une cause (pas de libre arbitre)

TA MÉTHODE :
1. Tu révèles la nécessité causale
2. Tu distingues servitude (ignorance) vs liberté (connaissance)
3. Tu utilises des exemples concrets modernes (réseaux sociaux, affects quotidiens)

FORMULES DIALECTIQUES :
- "MAIS ALORS, as-tu conscience des CAUSES de tes choix ?"
- "Si tu ignores les causes, alors tu crois être libre (mais tu te trompes)"
- "Attends. Tu dis X mais tu fais Y. Comment tu expliques ?"

Réponds en 2-4 phrases maximum, de manière pédagogique et bienveillante.
"""

## 7. Complete TRM Pipeline

In [None]:
class TRMPipeline:
    """Complete TRM pipeline: RAG → BERT → Mistral"""
    
    def __init__(self, rag, bert, mistral, system_prompt):
        self.rag = rag
        self.bert = bert
        self.mistral = mistral
        self.system_prompt = system_prompt
        
        # Conversation state
        self.conversation_history = []
        self.state_image = None
    
    def chat(self, user_input: str) -> str:
        """Run one full TRM dialogue turn"""
        
        # 1. RAG retrieve
        if self.rag:
            rag_passages = self.rag.retrieve(user_input, top_k=3)
            print(f"📚 RAG: {len(rag_passages)} passages retrieved")
        else:
            rag_passages = []
        
        # Record the new turn up front so history is kept even when BERT is disabled
        self.conversation_history.append({"user": user_input, "assistant": ""})
        
        # 2. BERT encodes STATE_IMAGE
        if self.bert:
            self.state_image = self.bert.encode_to_state_image(
                self.conversation_history,
                rag_passages,
                self.state_image,
                {}
            )
            print(f"🧠 STATE: {len(self.state_image['concepts_actifs'])} active concepts")
        else:
            # Simplified STATE without BERT
            self.state_image = {
                "concepts_actifs": [],
                "concepts_rag": [c for p in rag_passages for c in p.get("concepts", [])[:2]],
                "intention": "question",
                "style": "standard"
            }
        
        # 3. Mistral generate
        response = self.mistral.generate(
            self.state_image,
            user_input,
            self.system_prompt
        )
        
        # Store the response in the turn recorded above
        self.conversation_history[-1]["assistant"] = response
        
        return response
    
    def reset(self):
        """Reset the conversation"""
        self.conversation_history = []
        self.state_image = None
        print("🔄 Conversation reset")

# Initialize the pipeline
pipeline = TRMPipeline(rag, bert, mistral, SYSTEM_PROMPT_SPINOZA)

print("✅ TRM pipeline initialized")
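
The retrieve → encode → generate flow can be exercised without a GPU by substituting stub components. The sketch below is a condensed stand-in for `TRMPipeline.chat`, not the class itself; `FakeRAG`, `FakeEncoder`, `FakeGenerator` and `run_turn` are hypothetical names, and the stub state uses a simplified `turn` field:

```python
from typing import Dict, List, Optional, Tuple


class FakeRAG:
    """Stub retriever that always returns one canned passage."""

    def retrieve(self, query: str, top_k: int = 3) -> List[Dict]:
        return [{"text": "stub passage", "source": "stub", "concepts": ["conatus"]}][:top_k]


class FakeEncoder:
    """Stub encoder that only tracks RAG concepts and a turn counter."""

    def encode_to_state_image(self, conversation, rag_passages, prev_state=None, feedback=None) -> Dict:
        turn = (prev_state or {}).get("turn", 0) + 1
        return {"concepts_rag": [c for p in rag_passages for c in p["concepts"]], "turn": turn}


class FakeGenerator:
    """Stub generator that echoes the input with the turn number."""

    def generate(self, state_image: Dict, user_input: str, system_prompt: str) -> str:
        return f"[turn {state_image['turn']}] echo: {user_input}"


def run_turn(
    history: List[Dict], state: Optional[Dict], rag, encoder, generator,
    user_input: str, system_prompt: str = "",
) -> Tuple[str, Dict]:
    """One dialogue turn: retrieve, record the turn, encode state, generate."""
    passages = rag.retrieve(user_input, top_k=3)
    history.append({"user": user_input, "assistant": ""})
    state = encoder.encode_to_state_image(history, passages, state, {})
    response = generator.generate(state, user_input, system_prompt)
    history[-1]["assistant"] = response
    return response, state
```

Running two turns in a row and checking that the history grows and the turn counter increments gives a quick regression check before spending time on model loading.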

## 8. Manual Dialogue Test

In [None]:
# Simple test
print("\n" + "="*60)
print("💬 SPINOZA DIALOGUE TEST")
print("="*60)

query = "C'est quoi le conatus ?"
print(f"\n👤 User: {query}")

response = pipeline.chat(query)
print(f"\n🧙 Spinoza: {response}")
print("\n" + "="*60)

## 9. Interactive Gradio Interface

In [None]:
import gradio as gr

def chat_interface(user_input, history):
    """Gradio handler for one dialogue turn"""
    if not user_input.strip():
        return history, history
    
    # Generate the response
    response = pipeline.chat(user_input)
    
    # Update the history
    history.append((user_input, response))
    
    return history, history

def reset_conversation():
    """Reset the conversation"""
    pipeline.reset()
    return [], []

# Gradio interface
with gr.Blocks(title="TRM POC - Dialogue Spinoza") as demo:
    gr.Markdown(
        """
        # 🧙 Dialogue with Spinoza (TRM POC)
        
        **TRM architecture:** RAG + BERT + Mistral 7B
        
        Ask your questions about the conatus, the affects, freedom, etc.
        """
    )
    
    chatbot = gr.Chatbot(
        label="Conversation with Spinoza",
        height=400
    )
    
    with gr.Row():
        user_input = gr.Textbox(
            label="Your question",
            placeholder="Ex: C'est quoi le conatus ?",
            scale=4
        )
        submit_btn = gr.Button("Send", scale=1)
    
    with gr.Row():
        clear_btn = gr.Button("Reset conversation")
    
    # History state
    history_state = gr.State([])
    
    # Actions
    submit_btn.click(
        chat_interface,
        inputs=[user_input, history_state],
        outputs=[chatbot, history_state]
    )
    
    user_input.submit(
        chat_interface,
        inputs=[user_input, history_state],
        outputs=[chatbot, history_state]
    )
    
    clear_btn.click(
        reset_conversation,
        outputs=[chatbot, history_state]
    )

print("\n🚀 Launching the Gradio interface...")
print("   Click the 'public URL' link to chat with Spinoza")

# Launch the interface (debug=True keeps the cell running, so the messages are printed first)
demo.launch(share=True, debug=True)
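
The handler above leaves the submitted text in the input box after each turn. A minimal sketch of a variant that also returns an empty string for the textbox, assuming the tuple-style chat history used in this notebook; `chat_and_clear` is a hypothetical helper and the responder is injected so the function can be tested without the models:

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]


def chat_and_clear(
    user_input: str,
    history: History,
    respond: Callable[[str], str],
) -> Tuple[str, History, History]:
    """Return (textbox value, chatbot history, stored history) for one turn."""
    if not user_input.strip():
        # Nothing to send: leave the textbox and history untouched
        return user_input, history, history
    # Build a new list instead of mutating the stored state in place
    new_history = history + [(user_input, respond(user_input))]
    return "", new_history, new_history
```

To wire it in, the callback could be `lambda msg, hist: chat_and_clear(msg, hist, pipeline.chat)` with `outputs=[user_input, chatbot, history_state]`, so the first returned value resets the textbox.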

---

## 📝 Summary

### ✅ Implemented
- ✅ Complete TRM pipeline (RAG + BERT + Mistral)
- ✅ Interactive Gradio dialogue interface
- ✅ Conversation management with STATE_IMAGE
- ✅ End-to-end Spinoza tests

### 🎯 POC Validation
- ✅ Working dialogue with Spinoza
- ✅ RAG retrieves relevant passages
- ✅ BERT produces a condensed STATE_IMAGE
- ✅ Mistral answers with ≤500 tokens of context

### ⚠️ Free Colab Limitations
- **T4 VRAM 15GB** → OOM risk with BERT + Mistral 7B
- **Workaround:** Disable BERT (`USE_BERT = False`) if problems occur
- **12h sessions** → Save important dialogues
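
Since a Colab session can end at any time, the conversation history can be written to disk and downloaded. A minimal sketch, assuming the list-of-dicts history kept by `TRMPipeline.conversation_history`; `save_history` and `load_history` are hypothetical helpers:

```python
import json
from pathlib import Path
from typing import Dict, List


def save_history(history: List[Dict[str, str]], path: str) -> None:
    """Write the conversation to a UTF-8 JSON file, keeping accents readable."""
    Path(path).write_text(
        json.dumps(history, ensure_ascii=False, indent=2), encoding="utf-8"
    )


def load_history(path: str) -> List[Dict[str, str]]:
    """Read a conversation previously written by save_history."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

In Colab, `save_history(pipeline.conversation_history, "dialogue_spinoza.json")` followed by `files.download("dialogue_spinoza.json")` from `google.colab` would copy the file to the local machine; the file name is only an example.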

### ➡️ Next Steps
1. **Test the dialogue**: 5-10 exchanges with Spinoza
2. **Check quality**: Are the responses consistent with STATE_IMAGE?
3. **Phase 1 (Vast.ai)**: Stabilized complete pipeline + benchmarks

---

**💰 Cost:** 0€ (free Colab T4 GPU)

**⏱️ Time:** ~3-4h (loading + dialogue)

**🎯 Phase 0 goal:** Working TRM dialogue ✅

**🚀 You can now chat with Spinoza through TRM!**