# 🎓 Fine-Tuning Mistral 7B - Philosophers (LoRA)

**Goal:** Fine-tune Mistral 7B on logical schemes and embodied (first-person) dialogues

**Recommended GPU:** A100 40GB (30-45 min) | V100 16GB (1h-1h30) | **L4 15GB (2-3h)**

**Config:**
- Model: `mistralai/Mistral-7B-Instruct-v0.3`
- Method: QLoRA (4-bit quantized base model) with LoRA adapters (r=64, alpha=128)
- Initial dataset: 1200 logical-scheme examples
- Re-fine-tuning: 933 examples (80% schemes + 20% embodiment)
- Epochs: 3 (initial) + 3 (re-fine-tuning)

---

**⚠️ FILES TO UPLOAD TO COLAB:**
1. `schemes_levelA_base.jsonl`
2. `schemes_levelA_augmented.jsonl`
3. `enriched_correction_dataset.jsonl`

Upload them to `/content/` (Files panel on the left).
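
Before training, it can help to confirm that each uploaded file is valid JSONL with the chat structure the later cells rely on. The sketch below assumes every line is a JSON object with a `messages` list; the notebook itself only reads the `messages` field, so the `role`/`content` layout, the allowed role names, and the helper name `validate_chat_jsonl` are assumptions made for illustration.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}  # assumed role vocabulary


def validate_chat_jsonl(lines):
    """Return a list of (line_number, problem) tuples for an iterable of JSONL lines."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((number, "missing or empty 'messages' list"))
            continue
        for message in messages:
            if not isinstance(message, dict) or message.get("role") not in ALLOWED_ROLES:
                problems.append((number, "message without a recognised 'role'"))
                break
            if not isinstance(message.get("content"), str):
                problems.append((number, "message without string 'content'"))
                break
    return problems


# Usage on an uploaded file (path taken from the list above):
# with open("/content/schemes_levelA_base.jsonl", encoding="utf-8") as handle:
#     print(validate_chat_jsonl(handle))
```

An empty result means every non-blank line passed these structural checks; it says nothing about the quality of the examples themselves.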

## 1️⃣ Setup - Installation & Checks

In [None]:
# Install the dependencies
print("📦 Installing packages (2-3 minutes)...\n")

# Version specifiers are quoted so the shell does not treat ">" as an output redirect.
!pip install -U \
    torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 \
    "transformers>=4.40.0" \
    "peft>=0.10.0" \
    "bitsandbytes>=0.43.0" \
    "accelerate>=0.28.0" \
    "trl>=0.8.0" \
    "datasets>=2.18.0" \
    "huggingface_hub>=0.22.0"

print("\n✅ Installation complete!")

In [None]:
# Check which GPU is available
import torch

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"GPU: {gpu_name}")
    print(f"VRAM: {gpu_memory:.1f} GB")

    # Pick batch settings for this GPU (effective batch size stays at 32)
    if "A100" in gpu_name and gpu_memory > 35:
        print("\n🚀 OPTIMAL GPU: A100 40GB")
        BATCH_SIZE = 8
        GRADIENT_ACCUM = 4
    elif "V100" in gpu_name or ("A100" in gpu_name and gpu_memory < 20):
        print("\n✅ GOOD GPU: V100/A100-16GB")
        BATCH_SIZE = 4
        GRADIENT_ACCUM = 8
    else:
        print(f"\n🟡 STANDARD GPU: {gpu_name}")
        BATCH_SIZE = 2
        GRADIENT_ACCUM = 16

    print(f"Batch size: {BATCH_SIZE}")
    print(f"Gradient accumulation: {GRADIENT_ACCUM}")
    print(f"Effective batch size: {BATCH_SIZE * GRADIENT_ACCUM}")
else:
    print("⚠️ CUDA not available - training is not possible")
    BATCH_SIZE = 1
    GRADIENT_ACCUM = 32
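
The GPU profiles above trade per-step memory for gradient-accumulation steps while keeping the same effective batch size. A minimal check of that arithmetic, where the profile labels are illustrative names rather than anything the notebook defines:

```python
# (batch_size, gradient_accumulation) pairs copied from the GPU check above
GPU_PROFILES = {
    "A100 40GB": (8, 4),
    "V100 / A100-16GB": (4, 8),
    "other GPU": (2, 16),
}


def effective_batch_size(batch_size, gradient_accum):
    """Number of examples contributing to one optimizer step on a single GPU."""
    return batch_size * gradient_accum


for name, (batch_size, gradient_accum) in GPU_PROFILES.items():
    print(f"{name}: {effective_batch_size(batch_size, gradient_accum)}")
```

Because every profile yields 32, the learning rate and schedule can stay the same across GPUs; only throughput and memory use change.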

## 2️⃣ Configuration - HF Authentication

In [None]:
# Hugging Face authentication
from huggingface_hub import login
from google.colab import userdata

# Option 1: Colab Secrets (RECOMMENDED)
try:
    HF_TOKEN = userdata.get('HF_TOKEN')
    print("✅ HF token retrieved from Colab Secrets")
except Exception:
    # Option 2: manual entry
    print("⚠️ HF_TOKEN not found in Colab Secrets")
    print("📝 Colab Secrets setup:")
    print("   1. 🔑 icon (left sidebar)")
    print("   2. Add: Name=HF_TOKEN, Value=your_token")
    print("   3. Enable notebook access + restart\n")

    from getpass import getpass
    HF_TOKEN = getpass("HF token (https://huggingface.co/settings/tokens): ")

login(token=HF_TOKEN)
print("✅ HF authentication succeeded")

## 3️⃣ Initial Dataset - Logical Schemes (1200 examples)

In [None]:
# Load the logical-scheme datasets
from datasets import load_dataset, concatenate_datasets

print("📥 Loading logical-scheme datasets...")

dataset_base = load_dataset('json', data_files='schemes_levelA_base.jsonl', split='train')
dataset_augmented = load_dataset('json', data_files='schemes_levelA_augmented.jsonl', split='train')

dataset_full = concatenate_datasets([dataset_base, dataset_augmented])

print(f"✅ Dataset loaded: {len(dataset_full)} examples")
print(f"   - Base: {len(dataset_base)}")
print(f"   - Augmented: {len(dataset_augmented)}")

print("\n📝 Example:")
print(dataset_full[0]['messages'])

In [None]:
# Split train/validation (95/5)
dataset_split = dataset_full.train_test_split(test_size=0.05, seed=42)
train_dataset = dataset_split['train']
eval_dataset = dataset_split['test']

print("✅ Split:")
print(f"   - Train: {len(train_dataset)}")
print(f"   - Validation: {len(eval_dataset)}")

## 4️⃣ Model - QLoRA + LoRA Configuration

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, prepare_model_for_kbit_training, get_peft_model
import torch

MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.3"

# 4-bit quantization config (NF4 with double quantization, bf16 compute)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

print("📥 Loading Mistral 7B (4-bit)...")
print("⏳ This takes 5-10 min (download + quantization)\n")

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
# Reuse EOS as the padding token and pad on the right for training
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

print("✅ Model loaded in 4-bit")

model = prepare_model_for_kbit_training(model)
print("✅ Prepared for k-bit training")
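
To see why 4-bit loading makes a 7B model fit on a 15-16 GB GPU, here is a rough back-of-the-envelope estimate of weight memory. The parameter count of about 7.25 billion is an approximate figure assumed here, and the estimate is a lower bound: it ignores quantization constants, layers that stay unquantized, activations, and optimizer state.

```python
APPROX_PARAMS = 7.25e9  # assumed approximate parameter count for Mistral 7B


def weight_memory_gib(num_params, bits_per_param):
    """Memory needed to store the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


print(f"bf16 : {weight_memory_gib(APPROX_PARAMS, 16):.1f} GiB")
print(f"4-bit: {weight_memory_gib(APPROX_PARAMS, 4):.1f} GiB")
```

The 4-bit figure is a quarter of the bf16 one, which leaves room for the LoRA adapters, their gradients, and activations during training.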

In [None]:
# LoRA config (r=64 for higher adapter capacity)
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
)

model = get_peft_model(model, lora_config)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())

print(f"✅ LoRA applied (r={lora_config.r}, alpha={lora_config.lora_alpha})")
print(f"   Trainable parameters: {trainable:,} ({100*trainable/total:.2f}%)")
print(f"   Total parameters: {total:,}")
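
The trainable-parameter count printed above can be cross-checked by hand: LoRA adds two small matrices, A (rank x in_features) and B (out_features x rank), to each targeted linear layer. The sketch below assumes the usual Mistral 7B shapes (hidden size 4096, key/value projection width 1024, MLP width 14336, 32 layers); these are assumptions to verify against `model.config`, not values read from the notebook.

```python
# Assumed Mistral 7B projection shapes (in_features, out_features); verify against model.config
HIDDEN, KV_DIM, INTERMEDIATE, NUM_LAYERS = 4096, 1024, 14336, 32
PROJECTION_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_parameter_count(rank, shapes=PROJECTION_SHAPES, num_layers=NUM_LAYERS):
    """Each adapted layer gains rank * (in_features + out_features) parameters."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())
    return per_layer * num_layers


print(f"r=64: {lora_parameter_count(64):,} trainable parameters")
print(f"LoRA scaling alpha/r: {128 / 64}")
```

Under these assumptions r=64 gives roughly 168 million adapter parameters, and the count scales linearly with the rank, so halving r halves the adapter size.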

## 5️⃣ Initial Training - Logical Schemes (3 epochs)

In [None]:
# Render each conversation into a single "text" field using the model's chat template
def format_chat_template(example):
    messages = example['messages']
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
    return {"text": text}

train_dataset_formatted = train_dataset.map(format_chat_template, remove_columns=train_dataset.column_names)
eval_dataset_formatted = eval_dataset.map(format_chat_template, remove_columns=eval_dataset.column_names)

print("✅ Datasets formatted")
print(f"   Train: {len(train_dataset_formatted)}")
print(f"   Eval: {len(eval_dataset_formatted)}")

In [None]:
from transformers import TrainingArguments
from trl import SFTTrainer

training_args = TrainingArguments(
    output_dir="./mistral-7b-philosophes-lora",
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=BATCH_SIZE,
    gradient_accumulation_steps=GRADIENT_ACCUM,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=3,
    optim="paged_adamw_8bit",
    bf16=True,
    fp16=False,
    max_grad_norm=0.3,
    logging_steps=10,
    eval_strategy="steps",
    eval_steps=50,
    save_steps=100,
    save_total_limit=3,
    report_to="none",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset_formatted,
    eval_dataset=eval_dataset_formatted,
    processing_class=tokenizer,
)

print("✅ Trainer created")
print(f"   Effective batch: {BATCH_SIZE * GRADIENT_ACCUM}")
print(f"   Estimated steps: ~{len(train_dataset_formatted) * 3 // (BATCH_SIZE * GRADIENT_ACCUM)}")
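
The step estimate printed above also tells you how many evaluations and checkpoints to expect from `eval_steps=50` and `save_steps=100`. The sketch below repeats the same arithmetic; the train size of 1,140 is a hypothetical value (95% of the 1,200 examples quoted in the introduction), and the trainer's real step count may differ by a few steps because of per-epoch rounding.

```python
def estimated_optimizer_steps(num_examples, epochs, batch_size, gradient_accum):
    """Same arithmetic as the notebook's estimate: examples seen / effective batch size."""
    return num_examples * epochs // (batch_size * gradient_accum)


# Hypothetical train size: 95% of 1,200 examples
steps = estimated_optimizer_steps(1140, epochs=3, batch_size=2, gradient_accum=16)

print(f"Estimated optimizer steps: {steps}")
print(f"Evaluations (every 50 steps): {steps // 50}")
print(f"Checkpoints (every 100 steps): {steps // 100}")
```

With so few evaluation points, `load_best_model_at_end` chooses between only a handful of checkpoints; lowering `eval_steps` and `save_steps` together would give a finer-grained choice.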

In [None]:
# 🚀 START THE INITIAL TRAINING
print("🚀 Initial training (logical schemes)\n")
print("="*60)

if "A100" in torch.cuda.get_device_name(0) and torch.cuda.get_device_properties(0).total_memory > 35e9:
    print("⏱️ Estimated time: 30-45 min (A100 40GB)")
elif "V100" in torch.cuda.get_device_name(0):
    print("⏱️ Estimated time: 1h-1h30 (V100)")
else:
    print("⏱️ Estimated time: 2-3h (L4/T4)")

print("="*60 + "\n")

trainer.train()

print("\n" + "="*60)
print("✅ Initial training complete!")
print("="*60)

## 6️⃣ Saving the Initial Checkpoint

In [None]:
# Save locally
output_dir = "./mistral-7b-philosophes-lora-final"
trainer.model.save_pretrained(output_dir)
tokenizer.save_pretrained(output_dir)

print(f"✅ Model saved: {output_dir}")

import os
# Total size of the saved files (adapter weights plus tokenizer files)
lora_size = sum(os.path.getsize(os.path.join(output_dir, f)) for f in os.listdir(output_dir)) / 1024**2
print(f"   LoRA size: {lora_size:.1f} MB")

## 7️⃣ Initial Test - Checking the Schemes

In [None]:
# Quick test of scheme application
import torch

print("📝 Test: Spinozist Modus Ponens\n")

# The prompts stay in French to match the training data
test_prompt = """Schème : Modus Ponens
Contexte : Si l'homme ignore les causes de ses passions, il est en servitude. Or l'élève ignore les causes de ses passions.
Applique le schème :"""

messages = [
    {"role": "system", "content": "Tu es un tuteur philosophique maîtrisant les schèmes logiques."},
    {"role": "user", "content": test_prompt}
]

inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)

model.eval()  # disable LoRA dropout while generating

with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    with torch.no_grad():
        outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7, pad_token_id=tokenizer.eos_token_id)

# Decode only the newly generated tokens, not the prompt
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(f"Response: {response}\n")
print("="*60)

## 8️⃣ RE-FINE-TUNING - Combined Dataset (80% schemes + 20% embodiment)

**Goal:** Add first-person embodiment WITHOUT forgetting the logical schemes

**Strategy:** Combine 80% of the scheme examples with 100% of the embodiment examples → ~933 examples
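
Note that keeping 80% of the scheme file is not the same thing as ending up with an 80/20 mix: the final shares depend on how large the two files are. The sketch below makes that arithmetic explicit. The file sizes are hypothetical values chosen only so the total lands on 933, and both helper names are illustrative.

```python
def combined_mix(n_schemes, n_embodiment, keep_percent=80):
    """Keep keep_percent of the scheme set plus the whole embodiment set."""
    kept = n_schemes * keep_percent // 100
    total = kept + n_embodiment
    return kept, total, round(100 * kept / total, 1)


def schemes_needed_for_share(n_embodiment, scheme_percent=80):
    """Scheme examples required so that schemes make up scheme_percent of the final mix."""
    return n_embodiment * scheme_percent // (100 - scheme_percent)


# Hypothetical sizes: 1,000 scheme examples and 133 embodiment examples
kept, total, scheme_share = combined_mix(1000, 133)
print(f"{kept} schemes + 133 embodiment = {total} examples ({scheme_share}% schemes)")
print(f"Schemes needed for a true 80/20 mix: {schemes_needed_for_share(133)}")
```

With these made-up sizes the mix would be closer to 86/14 than 80/20, which is why it is worth reading the ratio the notebook prints after combining the datasets.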

In [None]:
# Load the combined dataset
print("🔄 Preparing combined dataset (80/20)")
print("="*60)

from datasets import load_dataset, concatenate_datasets

# Load datasets
print("\n📥 Loading...")
dataset_schemes = load_dataset('json', data_files='schemes_levelA_augmented.jsonl', split='train')
dataset_incarnation = load_dataset('json', data_files='enriched_correction_dataset.jsonl', split='train')

print("✅ Loaded:")
print(f"   - Schemes: {len(dataset_schemes)}")
print(f"   - Embodiment: {len(dataset_incarnation)}")

# Keep a random 80% of the scheme examples
num_schemes = int(len(dataset_schemes) * 0.8)
dataset_schemes_sample = dataset_schemes.shuffle(seed=42).select(range(num_schemes))

# Combine with the full embodiment set and shuffle
dataset_combined = concatenate_datasets([dataset_schemes_sample, dataset_incarnation])
dataset_combined = dataset_combined.shuffle(seed=42)

print(f"\n✅ Combined dataset: {len(dataset_combined)} examples")
print(f"   Ratio: {100*num_schemes/len(dataset_combined):.1f}% schemes / {100*len(dataset_incarnation)/len(dataset_combined):.1f}% embodiment")

# Split train/validation (95/5)
split = dataset_combined.train_test_split(test_size=0.05, seed=42)
train_combined = split['train']
eval_combined = split['test']

print(f"   Train: {len(train_combined)}")
print(f"   Eval: {len(eval_combined)}")

# Format with the chat template
train_combined = train_combined.map(format_chat_template, remove_columns=train_combined.column_names)
eval_combined = eval_combined.map(format_chat_template, remove_columns=eval_combined.column_names)

print("\n✅ Datasets formatted")

In [None]:
# Training config for the combined dataset
training_args_combined = TrainingArguments(
    output_dir="./mistral-combined",
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=BATCH_SIZE,
    gradient_accumulation_steps=GRADIENT_ACCUM,
    learning_rate=2e-4,  # standard LR, same as the initial run
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=3,  # 3 epochs
    optim="paged_adamw_8bit",
    bf16=True,
    fp16=False,
    max_grad_norm=0.3,
    logging_steps=5,
    eval_strategy="steps",
    eval_steps=20,
    save_steps=40,  # must be a round multiple of eval_steps when load_best_model_at_end=True
    save_total_limit=3,
    report_to="none",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer_combined = SFTTrainer(
    model=model,
    args=training_args_combined,
    train_dataset=train_combined,
    eval_dataset=eval_combined,
    processing_class=tokenizer,
)

print("✅ Combined trainer created")
print("   LR: 2e-4 (standard)")
print("   Epochs: 3")
print("   Monitoring: eval_loss every 20 steps")

In [None]:
# 🚀 START RE-FINE-TUNING
print("="*60)
print("🚀 RE-FINE-TUNING (combined dataset)")
print("="*60)
print("\n⏱️ Estimated time: ~1-1.5h (L4)")
print("\n💡 MONITORING:")
print("   - eval_loss should decrease")
print("   - If it rises → overfitting")
print("   - Target: eval_loss < 0.6")
print("\n" + "="*60 + "\n")

trainer_combined.train()

print("\n" + "="*60)
print("✅ RE-FINE-TUNING complete!")
print("="*60)

# Final metrics: the last log entry is the end-of-training summary, so search
# backwards for the most recent entries that actually contain each loss.
history = trainer_combined.state.log_history
last_train = next((e for e in reversed(history) if 'loss' in e), None)
last_eval = next((e for e in reversed(history) if 'eval_loss' in e), None)
print("\n📊 METRICS:")
if last_train:
    print(f"   Train loss: {last_train['loss']:.4f}")
if last_eval:
    print(f"   Eval loss: {last_eval['eval_loss']:.4f}")
In [None]:
# Final save
output_final = "./mistral-combined-final"
trainer_combined.model.save_pretrained(output_final)
tokenizer.save_pretrained(output_final)

print(f"✅ Final model saved: {output_final}")

# Push to the Hugging Face Space
from huggingface_hub import HfApi

SPACE_REPO = "FJDaz/3_PHI"

print(f"\n📤 Pushing to {SPACE_REPO}/Spinoza_Secours...")

# push_to_hub() has no subfolder option and targets model repos, so the saved
# folder (adapter + tokenizer) is uploaded into the Space subfolder instead.
HfApi().upload_folder(
    folder_path=output_final,
    repo_id=SPACE_REPO,
    repo_type="space",
    path_in_repo="Spinoza_Secours",
    token=HF_TOKEN,
    commit_message="Mistral 7B LoRA (80% schemas + 20% incarnation)",
)

print(f"✅ Pushed: https://huggingface.co/spaces/{SPACE_REPO}/tree/main/Spinoza_Secours")
print("\n💡 Structure:")
print("   FJDaz/3_PHI/")
print("   ├── qwen-spinoza-niveau-b/  ← SNB (unchanged)")
print("   └── Spinoza_Secours/        ← Mistral 7B (new)")

## 9️⃣ Final Test - Interactive Dialogue

**Checks:**
- ✅ Speaks in the first person ("Je montre", not "Spinoza montre")
- ✅ Applies logical schemes
- ✅ No repetition
- ✅ Tags: `[DÉTECTION: ACCORD]` (no space before the colon)
- ✅ Responds to the context
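
Two of the checks above, tag format and first-person voice, lend themselves to simple automated heuristics. The sketch below is one possible approach. It assumes tag labels are uppercase words (only `ACCORD` appears in this notebook) and treats "Pour Spinoza" or "Spinoza montre" as signs of third-person speech; both function names are illustrative.

```python
import re

# Well-formed: "[DÉTECTION: LABEL]" with no space before the colon and one after it.
# The uppercase-label pattern is an assumption; only ACCORD is documented here.
WELL_FORMED_TAG = re.compile(r"\[DÉTECTION: [A-ZÉÈÊÀÂÇ_]+\]")
ANY_TAG = re.compile(r"\[\s*DÉTECTION\s*:[^\]]*\]")
THIRD_PERSON = re.compile(r"\bPour Spinoza\b|\bSpinoza\s+montre\b")


def malformed_tags(response):
    """Return DÉTECTION tags that do not match the exact '[DÉTECTION: LABEL]' form."""
    return [tag for tag in ANY_TAG.findall(response) if not WELL_FORMED_TAG.fullmatch(tag)]


def speaks_in_third_person(response):
    """Rough heuristic for replies that talk about Spinoza instead of as Spinoza."""
    return THIRD_PERSON.search(response) is not None


print(malformed_tags("Tu as raison. [DÉTECTION : ACCORD]"))
print(speaks_in_third_person("Spinoza montre que la joie augmente notre puissance."))
```

These are string heuristics, not a linguistic analysis: they will miss other third-person phrasings, so a manual read of the dialogue is still needed.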

In [None]:
# Interactive multi-turn dialogue
import torch

model.eval()  # disable LoRA dropout while generating

# The system prompt stays in French, the language of the dialogue
SYSTEM_PROMPT = """Tu ES Spinoza incarné. Tu dialogues avec un élève de Terminale en première personne.

RÈGLES STRICTES:
- Tutoie toujours l'élève (tu/ton/ta)
- Reste concis (2-3 phrases MAX)
- Questionne au lieu d'affirmer
- Varie tes formulations
- Ne parle JAMAIS de toi à la 3ème personne. Tu ES Spinoza.
- Réponds à la question posée, pas à une question précédente.
- Adapte ta réponse au contexte immédiat de la conversation."""

def generate_response(conversation):
    inputs = tokenizer.apply_chat_template(conversation, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
    
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        with torch.no_grad():
            outputs = model.generate(
                inputs,
                max_new_tokens=150,
                do_sample=True,
                temperature=0.7,
                top_p=0.9,
                pad_token_id=tokenizer.eos_token_id,
            )
    
    return tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True).strip()

# Init conversation
conversation = [{"role": "system", "content": SYSTEM_PROMPT}]

print("💬 DIALOGUE WITH SPINOZA")
print("="*60)
print("Commands: 'exit' (quit), 'reset' (start over)\n")

# Test questions stay in French to match the system prompt
test_questions = [
    "La liberté, c'est faire ce qu'on veut ?",
    "Ben, c'est pas toi Spinoza ?",
    "Tu veux dire que si je comprends pourquoi je suis frustré, je me sens plus libre ?",
]

print("💡 TEST QUESTIONS:")
for i, q in enumerate(test_questions, 1):
    print(f"   {i}. {q}")
print("\n" + "="*60 + "\n")

turn = 0  # counts completed exchanges only
while True:
    user_input = input(f"[{turn + 1}] 👤 YOU: ")

    if user_input.lower() == 'exit':
        print("\n👋 Goodbye!")
        break

    if user_input.lower() == 'reset':
        conversation = [{"role": "system", "content": SYSTEM_PROMPT}]
        turn = 0
        print("\n🔄 Conversation reset\n")
        continue

    if not user_input.strip():
        continue

    turn += 1
    conversation.append({"role": "user", "content": user_input})

    print("🤔 Spinoza is thinking...", end="", flush=True)
    response = generate_response(conversation)
    print(f"\r💬 SPINOZA: {response}\n")

    conversation.append({"role": "assistant", "content": response})

    # Automatic checks on the reply
    issues = []
    if "Spinoza" in response and ("Pour Spinoza" in response or "montre que" in response.split("Spinoza")[1][:50]):
        issues.append("⚠️ Third person")
    if "[DÉTECTION : " in response:
        issues.append("⚠️ Malformed tag")

    if issues:
        print(f"   {'  '.join(issues)}\n")

print("\n" + "="*60)
print(f"✅ Dialogue finished ({turn} turns)")
print("\n💡 To see the history: print(conversation)")
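
The dialogue loop flags third-person speech and malformed tags, but it does not check the "no repetition" item. The sketch below is one possible heuristic, assuming repetition shows up either as a repeated word n-gram inside a reply or as a reply identical to an earlier one; the function names and the n-gram length are illustrative choices.

```python
def repeated_ngrams(text, n=4):
    """Return word n-grams that occur more than once in text (case-insensitive)."""
    words = text.lower().split()
    seen, repeated = set(), []
    for start in range(len(words) - n + 1):
        gram = tuple(words[start:start + n])
        if gram in seen and gram not in repeated:
            repeated.append(gram)
        seen.add(gram)
    return [" ".join(gram) for gram in repeated]


def repeats_previous_reply(conversation, response):
    """True if the assistant already gave exactly this reply earlier in the conversation."""
    earlier = [m["content"].strip() for m in conversation if m["role"] == "assistant"]
    return response.strip() in earlier


print(repeated_ngrams("la joie est la joie est la joie", n=3))
```

`repeats_previous_reply` should be called before the new reply is appended to `conversation`; otherwise every reply would match itself.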

---

## ✅ Final Checklist

- [ ] Initial training complete (eval_loss < 0.6)
- [ ] Combined re-fine-tuning complete (eval_loss < 0.6)
- [ ] Dialogue test: speaks in the first person
- [ ] Dialogue test: applies schemes correctly
- [ ] Dialogue test: no repetition
- [ ] Model pushed to HF: `FJDaz/3_PHI/Spinoza_Secours`

---

**Created:** 20 November 2025  
**Version:** CLEAN v2 (combined 80/20 dataset)  
**Model:** Mistral 7B Instruct v0.3 + LoRA r=64