<a href="https://colab.research.google.com/github/saida-chalouach/incident_detection-llm-fine_tuning/blob/main/Fine_Tuning_Mistral_Incidents.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🤖 Fine-tuning Mistral for Web Incident Detection

This notebook walks you through every step of the fine-tuning process in Google Colab.

## ⚙️ Requirements
- **GPU**: T4 (free tier) or A100 (Colab Pro)
- **Runtime**: GPU enabled

## 📋 Steps
1. Environment setup
2. Data upload and preparation
3. Model fine-tuning
4. Evaluation
5. Testing and usage

## 🔧 STEP 1: Environment setup

In [None]:
# Check which GPU is available
!nvidia-smi

In [1]:
# Install dependencies with versions compatible with Colab
print("📦 Installing dependencies...")

# First, install a torch build compatible with the Colab environment
!pip install -q torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1

# Then install the remaining packages
!pip install -q -U transformers datasets accelerate peft bitsandbytes
!pip install pandas==2.2.2 numpy==2.1.0 --break-system-packages

print("✅ Installation complete!")

📦 Installing dependencies...
✅ Installation complete!


In [2]:
# Imports
import torch
import pandas as pd
import json
import random
import numpy as np
from pathlib import Path
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_recall_fscore_support, confusion_matrix

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    Trainer,
    DataCollatorForLanguageModeling
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training, PeftModel
from datasets import load_dataset

print("✅ Imports successful")
print(f"🔥 PyTorch version: {torch.__version__}")
print(f"🎮 CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"🎯 GPU: {torch.cuda.get_device_name(0)}")

✅ Imports successful
🔥 PyTorch version: 2.5.1+cu124
🎮 CUDA available: True
🎯 GPU: Tesla T4


## 📤 STEP 2: Data upload and preparation

**Instructions:**
1. Run the cell below
2. Click "Choose Files"
3. Select your `data_stage_2A.txt` file

In [3]:
from google.colab import files

print("📂 Upload your data file...")
uploaded = files.upload()

# Get the name of the uploaded file
data_file = list(uploaded.keys())[0]
print(f"\n✅ File uploaded: {data_file}")

📂 Upload your data file...


Saving data_stage_2A.txt to data_stage_2A.txt

✅ File uploaded: data_stage_2A.txt


In [4]:
# Data parsing function
def parse_data_file(file_path):
    """Parse the text file and extract structured records."""
    data = []

    print(f"📂 Reading file: {file_path}")
    with open(file_path, 'r', encoding='utf-8') as f:
        lines = f.readlines()

    for raw_line in lines:
        line = raw_line.strip()
        # Only lines prefixed with 'A: ' carry heartbeat records
        if line.startswith('A: '):
            values = line[3:].split(';')
            if len(values) >= 9:
                record = {
                    'heartbeat_id': values[0],
                    'is_important': values[1],
                    'monitor_name': values[2],
                    'monitor_id': values[3],
                    'heartbeat_status': values[4],
                    'heartbeat_msg': values[5],
                    'heartbeat_time': values[6],
                    'heartbeat_ping': values[7],
                    'duration': values[8]
                }
                data.append(record)

    print(f"✅ {len(data)} records extracted")
    return pd.DataFrame(data)

# Parse the data
df = parse_data_file(data_file)
print(f"\n📊 DataFrame created: {df.shape[0]} rows, {df.shape[1]} columns")
df.head()

📂 Reading file: data_stage_2A.txt
✅ 1666 records extracted

📊 DataFrame created: 1666 rows, 9 columns


| | heartbeat_id | is_important | monitor_name | monitor_id | heartbeat_status | heartbeat_msg | heartbeat_time | heartbeat_ping | duration |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 41837805 | 0 | Frontend PiTransfer Standalone | 67 | 1 | 200 - OK | 2024-07-05 02:37:28.238 | 49 | 60 |
| 1 | 41837806 | 0 | Frontend PiDF Standalone | 69 | 1 | 200 - OK | 2024-07-05 02:37:28.307 | 40 | 60 |
| 2 | 41837807 | 0 | Socket Chat | 35 | 1 | 200 - OK | 2024-07-05 02:37:28.426 | 43 | 60 |
| 3 | 41837808 | 0 | New Media Server 9 | 59 | 1 | 200 - OK | 2024-07-05 02:37:30.179 | 15 | 60 |
| 4 | 41837809 | 0 | S3 Minio Store | 63 | 1 | 200 - OK | 2024-07-05 02:37:30.358 | 40 | 60 |
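
The parser above keeps only lines prefixed with `A: ` and splits them on semicolons into nine named fields. The following is a minimal, standalone sketch of that per-line logic. The raw sample line is an assumption: it is reconstructed from the first row of the preview, since the input file itself is not shown.

```python
# Field order taken from parse_data_file above.
FIELDS = [
    "heartbeat_id", "is_important", "monitor_name", "monitor_id",
    "heartbeat_status", "heartbeat_msg", "heartbeat_time",
    "heartbeat_ping", "duration",
]

def parse_line(line):
    """Return a record dict for an 'A: ' line, or None if the line does not qualify."""
    line = line.strip()
    if not line.startswith("A: "):
        return None
    values = line[3:].split(";")
    if len(values) < len(FIELDS):
        return None
    # zip stops at the shorter sequence, so extra trailing fields are ignored.
    return dict(zip(FIELDS, values))

# Hypothetical raw line, reconstructed from the first row of the preview.
sample = "A: 41837805;0;Frontend PiTransfer Standalone;67;1;200 - OK;2024-07-05 02:37:28.238;49;60"
record = parse_line(sample)
print(record["monitor_name"], record["heartbeat_status"], record["heartbeat_ping"])
```

Every value stays a string after parsing, which is why later cells compare `heartbeat_status` to `'0'` and convert `heartbeat_ping` with `float()`.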


In [5]:
# Incident detection function
def detect_incident(row):
    """Decide whether a heartbeat indicates an incident."""
    # Status 0 = down (values are strings because they come straight from the parser)
    if row['heartbeat_status'] == '0':
        return True, 'Service down - heartbeat status is 0'

    # Empty message
    msg = row['heartbeat_msg'].strip()
    if not msg:
        return True, 'No response message received'

    # HTTP error codes (substring match: any message containing these digits is flagged)
    error_codes = ['400', '401', '403', '404', '500', '502', '503', '504']
    if any(code in msg for code in error_codes):
        return True, f'HTTP error detected: {msg}'

    # Abnormally high ping (> 250 ms)
    try:
        ping_str = row['heartbeat_ping'].strip()
        if ping_str:
            ping = float(ping_str)
            if ping > 250:
                return True, f'High latency detected: {ping}ms'
    except (ValueError, AttributeError):
        pass

    return False, 'Normal operation - all metrics within acceptable range'

# Apply the detection
print("🔍 Detecting incidents...")
df['is_incident'], df['incident_reason'] = zip(*df.apply(detect_incident, axis=1))

print(f"\n📊 Statistics:")
print(f"  Total: {len(df)} records")
print(f"  Incidents: {df['is_incident'].sum()} ({df['is_incident'].sum()/len(df)*100:.1f}%)")
print(f"  Normal: {(~df['is_incident']).sum()} ({(~df['is_incident']).sum()/len(df)*100:.1f}%)")

🔍 Detecting incidents...

📊 Statistics:
  Total: 1666 records
  Incidents: 35 (2.1%)
  Normal: 1631 (97.9%)
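
The HTTP rule in `detect_incident` checks whether an error code appears anywhere in the message, so a message that merely contains those digits would also be flagged. One possible alternative is sketched below. It assumes that messages begin with the status code, as in `200 - OK`, and it treats every 4xx and 5xx code as an error, which is broader than the eight codes listed in the notebook. The helper names are hypothetical.

```python
import re

# Assumption: messages start with the HTTP status code, as in "200 - OK".
STATUS_RE = re.compile(r"^\s*(\d{3})\b")

def http_status(msg):
    """Return the leading three-digit status code as an int, or None if absent."""
    match = STATUS_RE.match(msg)
    return int(match.group(1)) if match else None

def is_http_error(msg):
    """Flag 4xx and 5xx codes; other messages are left to the remaining rules."""
    code = http_status(msg)
    return code is not None and code >= 400

print(is_http_error("200 - OK"))
print(is_http_error("503 - Service Unavailable"))
print(is_http_error("200 - OK (1500 bytes)"))
```

The last message is the interesting case: the substring check would flag it because `1500` contains `500`, while the leading-code check reads it as a 200.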


In [6]:
# Create the training samples
def create_training_samples(df, num_variations=2):
    """Create varied training samples from the labelled heartbeats."""
    samples = []

    print(f"🔄 Generating training samples...")

    for idx, row in df.iterrows():
        monitor = row['monitor_name']
        status = row['heartbeat_status']
        msg = row['heartbeat_msg']
        ping = row['heartbeat_ping']
        time = row['heartbeat_time']
        is_incident = row['is_incident']
        reason = row['incident_reason']

        # Question templates (kept in French: this is the language the model is trained on)
        question_templates = [
            f"Analyser le statut du service {monitor}: status={status}, message={msg}, ping={ping}ms",
            f"Le service {monitor} fonctionne-t-il correctement? Status: {status}, Response: {msg}",
            f"Vérifier l'état de {monitor}. Status={status}, Latency={ping}ms",
            f"Diagnostic pour {monitor}: {msg}, latence {ping}ms",
        ]

        # Answer templates (also in French)
        if is_incident:
            answer_templates = [
                f"🚨 INCIDENT DÉTECTÉ sur {monitor}. Raison: {reason}. Action requise immédiatement.",
                f"⚠️ Alerte: {monitor} rencontre un problème - {reason}. Intervention nécessaire.",
                f"Problème identifié sur {monitor}: {reason}. Statut critique.",
            ]
        else:
            answer_templates = [
                f"✅ {monitor} fonctionne normalement. Status: {status}, {msg}. Tous les indicateurs sont au vert.",
                f"Aucun incident détecté. {monitor} est opérationnel. Latence: {ping}ms.",
                f"Service {monitor} en bon état de fonctionnement. {msg}",
            ]

        # Generate variations
        for _ in range(num_variations):
            question = random.choice(question_templates)
            answer = random.choice(answer_templates)

            samples.append({
                'instruction': question,
                'output': answer,
                'input': '',
                'is_incident': is_incident,
            })

    print(f"✅ {len(samples)} samples generated")
    return samples

# Generate the samples
samples = create_training_samples(df, num_variations=2)

🔄 Generating training samples...
✅ 3332 samples generated
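
The template choices above come from Python's global `random` generator, which the notebook never seeds, so the generated samples differ from run to run. A minimal sketch of one way to make the draws repeatable follows. The `pick_variations` helper and the placeholder templates are hypothetical and only illustrate the seeding pattern.

```python
import random

def pick_variations(templates, num_variations, seed):
    """Draw template choices from a dedicated, seeded generator."""
    rng = random.Random(seed)  # independent of the global random state
    return [rng.choice(templates) for _ in range(num_variations)]

templates = ["question A", "question B", "question C", "question D"]  # placeholders
first = pick_variations(templates, 5, seed=42)
second = pick_variations(templates, 5, seed=42)
print(first == second)
```

A simpler option with the same intent is a single `random.seed(42)` call before the samples are generated; the dedicated generator just avoids depending on call order elsewhere in the notebook.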


In [7]:
# Balance and split the dataset
print("⚖️ Balancing the dataset...")

incident_samples = [s for s in samples if s['is_incident']]
normal_samples = [s for s in samples if not s['is_incident']]

print(f"  Incidents: {len(incident_samples)}")
print(f"  Normal: {len(normal_samples)}")

# Balance by downsampling the majority class to the size of the minority class
min_count = min(len(incident_samples), len(normal_samples))
incident_samples = random.sample(incident_samples, min_count)
normal_samples = random.sample(normal_samples, min_count)

all_samples = incident_samples + normal_samples
random.shuffle(all_samples)

print(f"\n📊 Balanced dataset: {len(all_samples)} samples")

# Train/val/test split (80/10/10)
train_data, temp_data = train_test_split(
    all_samples,
    test_size=0.2,
    random_state=42,
    stratify=[s['is_incident'] for s in all_samples]
)

val_data, test_data = train_test_split(
    temp_data,
    test_size=0.5,
    random_state=42,
    stratify=[s['is_incident'] for s in temp_data]
)

print(f"\n📈 Split:")
print(f"  Train: {len(train_data)} ({len(train_data)/len(all_samples)*100:.1f}%)")
print(f"  Validation: {len(val_data)} ({len(val_data)/len(all_samples)*100:.1f}%)")
print(f"  Test: {len(test_data)} ({len(test_data)/len(all_samples)*100:.1f}%)")

⚖️ Balancing the dataset...
  Incidents: 70
  Normal: 3262

📊 Balanced dataset: 140 samples

📈 Split:
  Train: 112 (80.0%)
  Validation: 14 (10.0%)
  Test: 14 (10.0%)
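
Each heartbeat produces two prompt variations, and the split above works on individual samples, so two prompts derived from the same heartbeat can end up on both sides of the train/test boundary. A rough sketch of a group-aware split is shown below. It assumes each sample carries a `source_id` key identifying its heartbeat, which the notebook's samples do not currently have, and unlike the notebook's split it is not stratified by label.

```python
import random

def split_by_group(samples, key, test_fraction, seed):
    """Split samples so that all items sharing `key` end up on the same side."""
    groups = sorted({s[key] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [s for s in samples if s[key] not in test_groups]
    test = [s for s in samples if s[key] in test_groups]
    return train, test

# Toy data: 10 heartbeats, 2 prompt variations each ("source_id" is a hypothetical key).
samples = [
    {"source_id": row, "instruction": f"prompt {row}-{v}"}
    for row in range(10) for v in range(2)
]
train, test = split_by_group(samples, "source_id", test_fraction=0.2, seed=42)
overlap = {s["source_id"] for s in train} & {s["source_id"] for s in test}
print(len(train), len(test), len(overlap))
```

With this layout no heartbeat contributes to both sets, so test accuracy is less likely to reflect memorised monitor names and messages.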


In [8]:
# Save the datasets
Path('data').mkdir(exist_ok=True)

for split_name, split_data in [('train', train_data), ('val', val_data), ('test', test_data)]:
    with open(f'data/{split_name}_data.jsonl', 'w', encoding='utf-8') as f:
        for sample in split_data:
            f.write(json.dumps(sample, ensure_ascii=False) + '\n')
    print(f"✅ data/{split_name}_data.jsonl saved")

print("\n🎉 Data preparation complete!")

✅ data/train_data.jsonl saved
✅ data/val_data.jsonl saved
✅ data/test_data.jsonl saved

🎉 Data preparation complete!


## 🏋️ STEP 3: Model fine-tuning

This step can take **2-4 hours** on a free T4, or **30-60 minutes** on an A100.

In [9]:
# Model configuration
MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.2"
OUTPUT_DIR = "./mistral-incident-detector"

print("="*60)
print("📦 Loading Mistral-7B...")
print("="*60)
print("⏰ This can take 5-10 minutes...\n")

📦 Loading Mistral-7B...
⏰ This can take 5-10 minutes...



In [10]:
# Install bitsandbytes for 4-bit quantization
print("📦 Installing bitsandbytes...")

# Quote the requirement so the shell does not treat ">" as an output redirection
!pip install -q "bitsandbytes>=0.46.1"

print("✅ Installation complete!")
print("\n⚠️ IMPORTANT: Restart the runtime now")
print("Runtime → Restart runtime")

📦 Installing bitsandbytes...
✅ Installation complete!

⚠️ IMPORTANT: Restart the runtime now
Runtime → Restart runtime


In [11]:
# 4-bit quantization configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # float16: the T4 has no native bfloat16 support
    bnb_4bit_use_double_quant=True,
)

# Load the tokenizer
print("🔤 Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # the tokenizer is given EOS as its padding token
tokenizer.padding_side = "right"

# Load the model
print("🧠 Loading model...")
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# The KV cache is not useful during training and conflicts with gradient checkpointing
model.config.use_cache = False
model = prepare_model_for_kbit_training(model)

print("\n✅ Model loaded successfully!")

🔤 Loading tokenizer...


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.

🧠 Loading model...

✅ Model loaded successfully!


In [12]:
# LoRA configuration
print("⚙️ Configuring LoRA...")

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, lora_config)

# Report parameter counts.
# The total is below 7B, most likely because 4-bit weights are stored packed,
# so numel() undercounts the quantized layers.
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
total_params = sum(p.numel() for p in model.parameters())

print(f"\n📊 Model parameters:")
print(f"  Total: {total_params:,}")
print(f"  Trainable: {trainable_params:,} ({100 * trainable_params / total_params:.2f}%)")

⚙️ Configuring LoRA...

📊 Model parameters:
  Total: 3,794,014,208
  Trainable: 41,943,040 (1.11%)


In [13]:
# Prepare the datasets
print("📚 Loading datasets...")

train_dataset = load_dataset('json', data_files='data/train_data.jsonl', split='train')
val_dataset = load_dataset('json', data_files='data/val_data.jsonl', split='train')

print(f"  Train: {len(train_dataset)} samples")
print(f"  Validation: {len(val_dataset)} samples")

# Formatting function (Mistral instruction format).
# Note: the tokenizer usually prepends its own BOS token as well, so the literal
# "<s>" may produce a duplicate; predict_incident uses the same convention.
def format_instruction(sample):
    return f"<s>[INST] {sample['instruction']} [/INST] {sample['output']}</s>"

# Tokenization function
def tokenize_function(examples):
    texts = [format_instruction({
        'instruction': inst,
        'output': out
    }) for inst, out in zip(examples['instruction'], examples['output'])]

    tokenized = tokenizer(
        texts,
        truncation=True,
        max_length=512,
        padding="max_length",  # every example is padded to 512 tokens
        return_tensors=None
    )

    # Causal LM training: the labels are the input tokens themselves
    tokenized["labels"] = tokenized["input_ids"].copy()
    return tokenized

# Tokenize
print("\n🔄 Tokenization...")
train_dataset = train_dataset.map(
    tokenize_function,
    batched=True,
    remove_columns=train_dataset.column_names
)

val_dataset = val_dataset.map(
    tokenize_function,
    batched=True,
    remove_columns=val_dataset.column_names
)

print("✅ Tokenization complete!")

📚 Loading datasets...

  Train: 112 samples
  Validation: 14 samples

🔄 Tokenization...

✅ Tokenization complete!
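
The tokenization cell pads every example to 512 tokens and copies `input_ids` into `labels`. Padded positions are normally excluded from the loss by setting their label to `-100`, the value PyTorch's cross-entropy loss ignores by default. The sketch below shows that masking on a toy sequence. The token IDs are made up, and `mask_padding` is a hypothetical helper rather than part of the notebook.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default

def mask_padding(input_ids, attention_mask):
    """Copy input_ids into labels, replacing padded positions with IGNORE_INDEX."""
    return [
        token if keep == 1 else IGNORE_INDEX
        for token, keep in zip(input_ids, attention_mask)
    ]

# Toy example: token IDs are made up; 2 plays the role of the shared EOS/pad ID.
input_ids      = [1, 733, 16289, 28793, 415, 2, 2, 2]
attention_mask = [1, 1,   1,     1,     1,   1, 0, 0]
labels = mask_padding(input_ids, attention_mask)
print(labels)
```

Masking through the attention mask keeps the first `2` (the genuine end-of-sequence token) as a training target. When the pad token shares its ID with EOS, as in this notebook, masking by token ID instead would hide that end-of-sequence token too.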


In [14]:
# Training configuration
print("🏋️ Configuring training...\n")

# Note: 112 training samples with an effective batch of 16 give 7 optimizer
# steps per epoch, 21 in total (the saved checkpoint is checkpoint-21), so the
# 50- and 100-step intervals below are never reached in this run.
training_args = TrainingArguments(
    output_dir=OUTPUT_DIR,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    warmup_steps=100,            # takes precedence over warmup_ratio when both are set
    logging_steps=10,
    eval_strategy="steps",
    eval_steps=50,
    save_strategy="steps",
    save_steps=100,
    save_total_limit=3,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    fp16=True,
    optim="paged_adamw_8bit",
    logging_dir=f"{OUTPUT_DIR}/logs",
    report_to="none",
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
)

print(f"📋 Parameters:")
print(f"  Epochs: {training_args.num_train_epochs}")
print(f"  Batch size: {training_args.per_device_train_batch_size}")
print(f"  Learning rate: {training_args.learning_rate}")
print(f"  Effective batch: {training_args.per_device_train_batch_size * training_args.gradient_accumulation_steps}")

warmup_ratio is deprecated and will be removed in v5.2. Use `warmup_steps` instead.
`logging_dir` is deprecated and will be removed in v5.2. Please set `TENSORBOARD_LOGGING_DIR` instead.


🏋️ Configuring training...

📋 Parameters:
  Epochs: 3
  Batch size: 4
  Learning rate: 0.0002
  Effective batch: 16
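
The step-based settings above (`warmup_steps=100`, `eval_steps=50`, `save_steps=100`) only take effect if training runs that many optimizer steps. The arithmetic below estimates the step count from the values printed in this notebook: 112 training samples, batch size 4, gradient accumulation 4, 3 epochs. It is a single-GPU sketch; the exact rounding used by `Trainer` may vary between library versions, although here every division is exact.

```python
import math

def optimizer_steps(num_samples, per_device_batch, grad_accum, epochs, num_devices=1):
    """Estimate optimizer updates per epoch and in total (single-process sketch)."""
    batches_per_epoch = math.ceil(num_samples / (per_device_batch * num_devices))
    updates_per_epoch = max(1, math.ceil(batches_per_epoch / grad_accum))
    return updates_per_epoch, updates_per_epoch * epochs

per_epoch, total = optimizer_steps(num_samples=112, per_device_batch=4, grad_accum=4, epochs=3)
print(per_epoch, total)
for name, value in [("warmup_steps", 100), ("eval_steps", 50), ("save_steps", 100)]:
    print(f"{name}={value} reached during training: {value <= total}")
```

The estimate of 21 total steps agrees with the `checkpoint-21` directory in the saved model. None of the three intervals is reached, so the learning rate is still warming up when training ends and no intermediate evaluation is logged.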


In [15]:
# Create the Trainer
# mlm=False selects causal language modeling: the collator derives the labels
# from input_ids at batch time.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    data_collator=data_collator,
)

print("✅ Trainer created!")

✅ Trainer created!


In [16]:
# TRAINING
print("="*60)
print("🚀 STARTING TRAINING")
print("="*60)
print("\n⏰ Estimated duration: 2-4 hours on T4")
print("📊 Watch the logs below\n")

# Start training
trainer.train()

print("\n" + "="*60)
print("✅ TRAINING COMPLETE!")
print("="*60)

🚀 STARTING TRAINING

⏰ Estimated duration: 2-4 hours on T4
📊 Watch the logs below



Step,Training Loss,Validation Loss



✅ TRAINING COMPLETE!


In [17]:
# Save the model (with PEFT this writes the LoRA adapter, not the full base model)
print("💾 Saving the model...")

trainer.save_model(OUTPUT_DIR)
tokenizer.save_pretrained(OUTPUT_DIR)

print(f"✅ Model saved to {OUTPUT_DIR}/")

# Final evaluation on the validation set
print("\n📊 Final evaluation...")
eval_results = trainer.evaluate()

print("\nResults:")
for key, value in eval_results.items():
    print(f"  {key}: {value:.4f}")

💾 Saving the model...
✅ Model saved to ./mistral-incident-detector/

📊 Final evaluation...



Results:
  eval_loss: 0.8402
  eval_runtime: 8.2076
  eval_samples_per_second: 1.7060
  eval_steps_per_second: 0.4870
  epoch: 3.0000


## 📊 STEP 4: Model evaluation

In [18]:
# Prediction function
def predict_incident(instruction, max_new_tokens=150):
    """Generate the model's response for an instruction."""
    prompt = f"<s>[INST] {instruction} [/INST]"

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Sampling is enabled, so the same prompt can produce different answers
    # from one call to the next.
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            do_sample=True,
            top_p=0.9,
            repetition_penalty=1.1,
        )

    full_response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Keep only the text generated after the instruction block
    if "[/INST]" in full_response:
        response = full_response.split("[/INST]")[-1].strip()
    else:
        response = full_response

    return response

def is_incident_prediction(response):
    """Classify a response as incident (True) or normal (False) by keyword count."""
    response_lower = response.lower()

    # Keywords are in French because the training answers are in French
    incident_keywords = [
        'incident', 'alerte', 'problème', 'erreur', 'anomalie',
        '🚨', '⚠️', 'down', 'critique', 'panne'
    ]

    normal_keywords = [
        'normal', 'opérationnel', 'aucun incident', 'fonctionne',
        '✅', 'ok', 'bon état', 'nominal'
    ]

    # Substring matches: each keyword counts at most once; ties resolve to "normal"
    incident_score = sum(1 for kw in incident_keywords if kw in response_lower)
    normal_score = sum(1 for kw in normal_keywords if kw in response_lower)

    return incident_score > normal_score

print("✅ Prediction functions ready")

✅ Prediction functions ready
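
`is_incident_prediction` counts keywords by substring, so a short keyword can match inside an unrelated word. The sketch below compares substring matching with whole-word matching on a made-up French sentence. `count_keywords` is a hypothetical helper, and only a subset of the keyword list is used.

```python
import re

def count_keywords(text, keywords):
    """Count keywords that occur as whole words or phrases (case-insensitive)."""
    text = text.lower()
    return sum(
        1 for kw in keywords
        if re.search(rf"(?<!\w){re.escape(kw)}(?!\w)", text)
    )

normal_keywords = ["normal", "ok", "opérationnel"]  # subset of the list above
text = "Latence anormalement élevée sur le broker"

substring_hits = sum(1 for kw in normal_keywords if kw in text.lower())
whole_word_hits = count_keywords(text, normal_keywords)
print(substring_hits, whole_word_hits)
```

In this sentence `anormalement` ("abnormally") contains `normal` and `broker` contains `ok`, so substring matching scores two "normal" hits for a sentence describing a problem. Whole-word matching is stricter in the other direction as well: `fonctionne normalement` would no longer match `normal`, so the keyword list would need `normalement` added.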


In [None]:
# Test on a few examples
print("="*60)
print("🧪 MODEL TESTS")
print("="*60)

# Prompts stay in French to match the training templates
test_cases = [
    "Analyser le statut du service MySQL: status=0, message=, ping=ms",
    "Le service WebSite fonctionne-t-il correctement? Status: 1, Response: 200 - OK",
    "Vérifier l'état de Backend API. Status=1, message=500 - Error, Latency=120ms",
    "Service Sentry: status=1, 200 - OK, ping=350ms",
]

for i, test in enumerate(test_cases, 1):
    print(f"\n{'─'*60}")
    print(f"Test {i}")
    print(f"{'─'*60}")
    print(f"\n📝 Question:\n   {test}")

    response = predict_incident(test)
    is_incident = is_incident_prediction(response)

    print(f"\n🤖 Response:\n   {response}")
    print(f"\n🏷️ Classification: {'🚨 INCIDENT' if is_incident else '✅ NORMAL'}")

In [19]:
# Evaluation on the test set
print("="*60)
print("📈 TEST SET EVALUATION")
print("="*60)

with open('data/test_data.jsonl', 'r', encoding='utf-8') as f:
    test_data = [json.loads(line) for line in f]

# Cap at 100 samples for speed (remove [:100] to evaluate everything)
test_sample = test_data[:100]

predictions = []
ground_truth = []

print(f"\n🔄 Evaluating {len(test_sample)} samples...\n")

for idx, sample in enumerate(test_sample):
    instruction = sample['instruction']
    true_label = sample['is_incident']

    response = predict_incident(instruction)
    pred_label = is_incident_prediction(response)

    predictions.append(pred_label)
    ground_truth.append(true_label)

    if (idx + 1) % 20 == 0:
        print(f"  Progress: {idx + 1}/{len(test_sample)}")

print("\n✅ Evaluation complete!")

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


📈 TEST SET EVALUATION

🔄 Evaluating 14 samples...



✅ Evaluation complete!


In [20]:
# Compute the metrics
accuracy = accuracy_score(ground_truth, predictions)
precision, recall, f1, _ = precision_recall_fscore_support(
    ground_truth, predictions, average='binary', zero_division=0
)

# Fix the label order so ravel() always yields four values,
# even if one class is missing from the predictions
cm = confusion_matrix(ground_truth, predictions, labels=[False, True])
tn, fp, fn, tp = cm.ravel()

print("="*60)
print("📊 FINAL RESULTS")
print("="*60)

print(f"\n📈 Metrics:")
print(f"  Accuracy:  {accuracy:.3f} ({accuracy*100:.1f}%)")
print(f"  Precision: {precision:.3f}")
print(f"  Recall:    {recall:.3f}")
print(f"  F1-Score:  {f1:.3f}")

print(f"\n🎯 Confusion matrix:")
print(f"                 Pred. Normal   Pred. Incident")
print(f"  True Normal         {tn:4d}          {fp:4d}")
print(f"  True Incident       {fn:4d}          {tp:4d}")

print(f"\n📝 Interpretation:")
print(f"  ✅ True positives: {tp} - Incidents correctly detected")
print(f"  ✅ True negatives: {tn} - Normal cases correctly identified")
print(f"  ❌ False positives: {fp} - False alarms")
print(f"  ❌ False negatives: {fn} - Missed incidents")

if accuracy >= 0.90:
    print("\n🎉 EXCELLENT! The model is ready for production!")
elif accuracy >= 0.80:
    print("\n✅ GOOD! Acceptable performance, some improvements possible.")
else:
    print("\n⚠️ Insufficient performance. Train longer or add more data.")

📊 FINAL RESULTS

📈 Metrics:
  Accuracy:  0.929 (92.9%)
  Precision: 1.000
  Recall:    0.857
  F1-Score:  0.923

🎯 Confusion matrix:
                 Pred. Normal   Pred. Incident
  True Normal            7             0
  True Incident          1             6

📝 Interpretation:
  ✅ True positives: 6 - Incidents correctly detected
  ✅ True negatives: 7 - Normal cases correctly identified
  ❌ False positives: 0 - False alarms
  ❌ False negatives: 1 - Missed incidents

🎉 EXCELLENT! The model is ready for production!
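
The metrics above can be recomputed by hand from the four confusion-matrix counts, which is a quick way to check them. The sketch below does that, and also estimates how much uncertainty 14 test samples leave around the accuracy. It uses a Wilson score interval, a standard choice for small samples that the notebook itself does not compute; both function names are illustrative.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, recall, f1

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95% coverage)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

acc, prec, rec, f1 = binary_metrics(tp=6, fp=0, fn=1, tn=7)
low, high = wilson_interval(successes=13, n=14)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
print(f"95% interval for accuracy: [{low:.2f}, {high:.2f}]")
```

The point values reproduce the figures reported above. The interval is wide, roughly 0.7 to 0.99, so the 0.90 threshold is better read as a rough indication than as a firm guarantee at this sample size.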


## 💾 STEP 5: Download the model

Download your fine-tuned model to use it later.

In [21]:
# Compress the model
print("📦 Compressing the model...")
!zip -r mistral-incident-detector.zip mistral-incident-detector/
print("✅ Compression complete!")

📦 Compressing the model...
  adding: mistral-incident-detector/ (stored 0%)
  adding: mistral-incident-detector/adapter_config.json (deflated 58%)
  adding: mistral-incident-detector/README.md (deflated 66%)
  adding: mistral-incident-detector/chat_template.jinja (deflated 64%)
  adding: mistral-incident-detector/checkpoint-21/ (stored 0%)
  adding: mistral-incident-detector/checkpoint-21/adapter_config.json (deflated 58%)
  adding: mistral-incident-detector/checkpoint-21/README.md (deflated 66%)
  adding: mistral-incident-detector/checkpoint-21/chat_template.jinja (deflated 64%)
  adding: mistral-incident-detector/checkpoint-21/trainer_state.json (deflated 56%)
  adding: mistral-incident-detector/checkpoint-21/scheduler.pt (deflated 56%)
  adding: mistral-incident-detector/checkpoint-21/tokenizer_config.json (deflated 48%)
  adding: mistral-incident-detector/checkpoint-21/adapter_model.safetensors (deflated 8%)
  adding: mistral-incident-detector/checkpoint-21/training_args.bin (d

In [22]:
# Download
from google.colab import files

print("⬇️ Downloading the model...")
files.download('mistral-incident-detector.zip')
print("\n✅ Download started! Check your downloads folder.")

⬇️ Downloading the model...

✅ Download started! Check your downloads folder.


## 🎉 CONGRATULATIONS!

You have completed the fine-tuning of Mistral for incident detection!

### Next steps:
1. ✅ Download your model (see the cell above)
2. ✅ Test it with your own data
3. ✅ Deploy to production (Flask API, FastAPI, etc.)

### To save to Google Drive:
```python
from google.colab import drive
drive.mount('/content/drive')
!cp -r mistral-incident-detector /content/drive/MyDrive/
```