# Notebook 2: Training the CNN-MFCC Model

---

## 📋 Table of Contents

1. [Introduction and Context](#1-introduction)
2. [CNN-MFCC Model Architecture](#2-architecture)
3. [Training Configuration](#3-configuration)
4. [Training Process](#4-entrainement)
5. [Results and Metrics](#5-resultats)
6. [Visualizations and Analysis](#6-visualisations)
7. [Per-Class Performance Analysis](#7-analyse-classe)
8. [Conclusion](#8-conclusion)

---

## 1. Introduction and Context {#1-introduction}

### Objective

This notebook documents the training of the **CNN-MFCC** model (a Convolutional Neural Network on MFCC features), which serves as the **baseline** for the SereneSense project.

### The CNN-MFCC Model

CNN-MFCC is an audio classification model that:
- Uses **MFCCs (Mel-Frequency Cepstral Coefficients)** as input features
- Applies a **CNN with 3 convolutional layers**
- Classifies sounds into **7 categories** (vehicles and aircraft, plus footsteps, speech, and background)

### Expected Results

According to the project analysis:
- **Best Validation Accuracy**: 66.88% (epoch 29)
- **Final Validation Accuracy**: 57.95% (epoch 150)
- **Number of parameters**: ≈242,000 (242K)
- **Training time**: 2-3 hours on GPU

In [None]:
# Import the required libraries
import os
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import json
import yaml
import warnings
warnings.filterwarnings('ignore')

# PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader

# Configuration
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")
%matplotlib inline

# Project paths
PROJECT_ROOT = Path(r'c:\Users\MDN\Desktop\SereneSense')
sys.path.insert(0, str(PROJECT_ROOT / 'src'))

CONFIG_PATH = PROJECT_ROOT / 'configs' / 'models' / 'legacy_cnn_mfcc.yaml'
HISTORY_PATH = PROJECT_ROOT / 'outputs' / 'history' / 'cnn_baseline.json'
OUTPUT_DIR = PROJECT_ROOT / 'outputs' / 'training_cnn'
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

print("✅ Libraries imported successfully")
print(f"📁 Project: {PROJECT_ROOT}")
print(f"🔧 PyTorch version: {torch.__version__}")
print(f"🎮 CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"   GPU : {torch.cuda.get_device_name(0)}")

---

## 2. CNN-MFCC Model Architecture {#2-architecture}

### Model Structure

The CNN-MFCC model is structured as follows:

```
Input: (batch, 3, 40, 92)
  ↓
Conv2D(48, kernel=3×3) → BatchNorm → ReLU → MaxPool(2×2) → Dropout(0.25)
  ↓
Conv2D(96, kernel=3×3) → BatchNorm → ReLU → MaxPool(2×2) → Dropout(0.30)
  ↓
Conv2D(192, kernel=3×3) → BatchNorm → ReLU → MaxPool(2×2) → Dropout(0.30)
  ↓
GlobalAveragePooling2D → (192 features)
  ↓
Dense(160) → ReLU → Dropout(0.35)
  ↓
Dense(7) → logits (softmax is applied inside CrossEntropyLoss)
```
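
The spatial dimensions at each stage can be traced by hand. The sketch below assumes `padding=1` on every 3×3 convolution, as in the model definition in this notebook, so only the 2×2 max pooling changes the height and width. The helper name `conv_block_shape` is illustrative.

```python
def conv_block_shape(h, w):
    """A 3x3 conv with padding=1 keeps (h, w); 2x2 max pooling halves both, rounding down."""
    return h // 2, w // 2

shape = (40, 92)            # (MFCC coefficients, time frames)
channels = [48, 96, 192]    # output channels of the three conv blocks

trace = []
for c in channels:
    shape = conv_block_shape(*shape)
    trace.append((c, *shape))

for c, h, w in trace:
    print(f"after the {c:3d}-channel block: ({c}, {h}, {w})")
print(f"global average pooling -> {channels[-1]} features")
```

The last block leaves a small 5×11 map per channel, which global average pooling collapses to one value per channel. This is why the first dense layer takes 192 inputs regardless of the clip length.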

### Input Features: MFCC

- **Shape**: (3, 40, 92)
- **Channels**: 3 (MFCC + Delta + Delta-Delta)
- **MFCC coefficients**: 40
- **Time frames**: 92 (for 3 seconds of audio)
- **Audio duration**: 3.0 seconds
- **Hop length**: 512 samples (32 ms at 16 kHz)
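
The frame count follows from the audio length and the hop. The arithmetic below uses the 16 kHz sample rate from the training configuration. The window size `n_fft = 1024` is an assumption, since this notebook does not state it. Under that assumption, 92 frames corresponds to framing without edge padding.

```python
sample_rate = 16_000   # Hz, from the training configuration
duration_s = 3.0
hop_length = 512
n_fft = 1024           # ASSUMPTION: window size is not stated in this notebook

n_samples = int(sample_rate * duration_s)
hop_ms = 1000 * hop_length / sample_rate

# Without edge padding, only windows that fit entirely inside the signal are kept.
frames_no_padding = 1 + (n_samples - n_fft) // hop_length
# With centered (padded) framing, every hop position gets a window.
frames_centered = 1 + n_samples // hop_length

print(f"hop duration        : {hop_ms:.2f} ms")
print(f"frames (no padding) : {frames_no_padding}")
print(f"frames (centered)   : {frames_centered}")
```

If the feature extractor pads the signal instead, the same clip would give 94 frames. The input shape is therefore worth checking against the actual preprocessing code.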

### Total Parameters: ≈242,000
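
The total can be reproduced layer by layer from the layer sizes in the model definition, without instantiating the network. The helper function names are illustrative. Batch-norm running statistics are buffers, not parameters, so they are not counted.

```python
def conv2d_params(c_in, c_out, k=3):
    return c_in * c_out * k * k + c_out   # kernel weights + biases

def batchnorm_params(c):
    return 2 * c                          # learnable scale and shift per channel

def linear_params(n_in, n_out):
    return n_in * n_out + n_out           # weights + biases

layers = {
    "conv1": conv2d_params(3, 48),
    "bn1": batchnorm_params(48),
    "conv2": conv2d_params(48, 96),
    "bn2": batchnorm_params(96),
    "conv3": conv2d_params(96, 192),
    "bn3": batchnorm_params(192),
    "fc1": linear_params(192, 160),
    "fc2": linear_params(160, 7),
}

total = sum(layers.values())
for name, n in layers.items():
    print(f"{name:6s}: {n:>8,}")
print(f"total : {total:>8,}  ({total * 4 / 1024 / 1024:.2f} MB in FP32)")
```

About two thirds of the parameters sit in the third convolution (96 to 192 channels).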

In [None]:
# Load the configuration
print("📄 Loading the CNN-MFCC configuration...\n")

if CONFIG_PATH.exists():
    with open(CONFIG_PATH, 'r', encoding='utf-8') as f:
        cnn_config = yaml.safe_load(f)

    print("🔧 MFCC configuration:")
    mfcc_cfg = cnn_config.get('mfcc', {})
    for key, value in mfcc_cfg.items():
        print(f"   • {key:20s} : {value}")

    print("\n🏗️ CNN architecture:")
    cnn_arch = cnn_config.get('cnn', {})
    for key, value in cnn_arch.items():
        print(f"   • {key:20s} : {value}")

    print("\n🎨 SpecAugment:")
    spec_aug = cnn_config.get('spec_augment', {})
    for key, value in spec_aug.items():
        print(f"   • {key:20s} : {value}")
else:
    print(f"⚠️ Configuration not found: {CONFIG_PATH}")
    cnn_config = None

In [None]:
# Definition of the CNN-MFCC model (exact project architecture)
class CNNMFCCModel(nn.Module):
    """CNN-MFCC model for audio classification."""
    
    def __init__(self, num_classes=7, input_channels=3):
        super(CNNMFCCModel, self).__init__()
        
        # Conv block 1
        self.conv1 = nn.Conv2d(input_channels, 48, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(48)
        self.pool1 = nn.MaxPool2d(2, 2)
        self.dropout1 = nn.Dropout(0.25)
        
        # Conv block 2
        self.conv2 = nn.Conv2d(48, 96, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(96)
        self.pool2 = nn.MaxPool2d(2, 2)
        self.dropout2 = nn.Dropout(0.30)
        
        # Conv block 3
        self.conv3 = nn.Conv2d(96, 192, kernel_size=3, padding=1)
        self.bn3 = nn.BatchNorm2d(192)
        self.pool3 = nn.MaxPool2d(2, 2)
        self.dropout3 = nn.Dropout(0.30)
        
        # Global Average Pooling
        self.global_pool = nn.AdaptiveAvgPool2d(1)
        
        # Fully Connected
        self.fc1 = nn.Linear(192, 160)
        self.dropout4 = nn.Dropout(0.35)
        self.fc2 = nn.Linear(160, num_classes)
        
    def forward(self, x):
        # Conv Block 1
        x = self.conv1(x)
        x = self.bn1(x)
        x = torch.relu(x)
        x = self.pool1(x)
        x = self.dropout1(x)
        
        # Conv Block 2
        x = self.conv2(x)
        x = self.bn2(x)
        x = torch.relu(x)
        x = self.pool2(x)
        x = self.dropout2(x)
        
        # Conv Block 3
        x = self.conv3(x)
        x = self.bn3(x)
        x = torch.relu(x)
        x = self.pool3(x)
        x = self.dropout3(x)
        
        # Global Average Pooling
        x = self.global_pool(x)
        x = x.view(x.size(0), -1)
        
        # Fully Connected
        x = self.fc1(x)
        x = torch.relu(x)
        x = self.dropout4(x)
        x = self.fc2(x)
        
        return x

# Instantiate the model
model = CNNMFCCModel(num_classes=7, input_channels=3)

# Count the parameters
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print("\n🎯 CNN-MFCC model instantiated:")
print(f"   • Total parameters     : {total_params:,}")
print(f"   • Trainable parameters : {trainable_params:,}")
print(f"   • Model size           : {total_params * 4 / 1024 / 1024:.2f} MB (FP32)")

# Display the architecture
print("\n📐 Model architecture:\n")
print(model)

In [None]:
# Check the model's output shape
print("🧪 Checking the model's output shape:\n")

# Create a dummy input batch
batch_size = 4
dummy_input = torch.randn(batch_size, 3, 40, 92)

print(f"   Input shape  : {dummy_input.shape}")

# Forward pass
model.eval()
with torch.no_grad():
    output = model(dummy_input)

print(f"   Output shape : {output.shape}")
print(f"   Expected     : (batch_size={batch_size}, num_classes=7)")

# Fail loudly if the shape is wrong instead of always reporting success
assert output.shape == (batch_size, 7), f"Unexpected output shape: {tuple(output.shape)}"
print(f"\n✅ Test passed!")

---

## 3. Training Configuration {#3-configuration}

### Training Hyperparameters

Based on `legacy_cnn_mfcc.yaml` and the results obtained:

**Optimizer**:
- Type: Adam
- Learning Rate: 1e-3 (0.001)
- Weight Decay: 0.0
- Betas: (0.9, 0.999)

**Training**:
- Batch Size: 32
- Epochs: 150
- Loss Function: CrossEntropyLoss (with class weights)
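
This notebook does not say how the class weights are derived. One common choice, shown here purely as an assumption, is inverse-frequency weighting, where a class with fewer training examples gets a proportionally larger weight. The function name and the label counts below are hypothetical.

```python
from collections import Counter

def inverse_frequency_weights(labels, num_classes):
    """Weight for class c = N / (K * n_c), so a perfectly balanced dataset gives 1.0 everywhere."""
    counts = Counter(labels)
    total = len(labels)
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]

# Hypothetical, made-up label distribution for three classes (not project data)
labels = [0] * 50 + [1] * 10 + [2] * 40
weights = inverse_frequency_weights(labels, num_classes=3)

for c, w in enumerate(weights):
    print(f"class {c}: weight = {w:.3f}")
```

In PyTorch, such a list would be converted to a float tensor and passed as the `weight` argument of `nn.CrossEntropyLoss`.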

**Learning Rate Schedule**:
- Type: ReduceLROnPlateau
- Factor: 0.5 (halves the learning rate)
- Patience: 10 epochs
- Min LR: 1e-7

**Data Augmentation (SpecAugment)**:
- Frequency Masking: 15% (2 masks)
- Time Masking: 10% (2 masks)
- Probability: 0.8
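
The SpecAugment settings can be read as follows: with probability 0.8, up to two bands of MFCC coefficients and up to two bands of time frames are masked, each at most 15% or 10% of its axis. The sketch below is one plausible interpretation, not the project's implementation. The function name, the zero fill value, and the uniform sampling of mask widths are all assumptions.

```python
import numpy as np

def spec_augment(features, rng, freq_frac=0.15, time_frac=0.10, n_masks=2, prob=0.8):
    """Return a masked copy of a (channels, n_mfcc, n_frames) array; the input is left untouched."""
    out = features.copy()
    if rng.random() >= prob:               # leave the sample as-is with probability 1 - prob
        return out

    _, n_mfcc, n_frames = out.shape
    max_f = int(round(freq_frac * n_mfcc))     # widest frequency mask, in coefficients
    max_t = int(round(time_frac * n_frames))   # widest time mask, in frames

    for _ in range(n_masks):                   # frequency masks
        width = int(rng.integers(0, max_f + 1))
        start = int(rng.integers(0, n_mfcc - width + 1))
        out[:, start:start + width, :] = 0.0   # ASSUMPTION: masked cells are set to zero

    for _ in range(n_masks):                   # time masks
        width = int(rng.integers(0, max_t + 1))
        start = int(rng.integers(0, n_frames - width + 1))
        out[:, :, start:start + width] = 0.0

    return out

rng = np.random.default_rng(0)
clean = np.ones((3, 40, 92), dtype=np.float32)
augmented = spec_augment(clean, rng, prob=1.0)   # prob=1.0 forces masking for the demo
print("masked fraction:", float((augmented == 0).mean()))
```

The same mask is applied to all three channels, so the MFCC, delta, and delta-delta planes stay aligned.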

### Training Command

```bash
python scripts/train_legacy_model.py \
    --model cnn \
    --config configs/models/legacy_cnn_mfcc.yaml \
    --epochs 150 \
    --batch-size 32 \
    --learning-rate 1e-3 \
    --checkpoint outputs/phase1/cnn_baseline.pth
```

In [None]:
# Training configuration (for reference)
training_config = {
    'model': 'CNN-MFCC',
    'optimizer': 'Adam',
    'learning_rate': 1e-3,
    'weight_decay': 0.0,
    'batch_size': 32,
    'epochs': 150,
    'loss_function': 'CrossEntropyLoss',
    'lr_scheduler': 'ReduceLROnPlateau',
    'lr_factor': 0.5,
    'lr_patience': 10,
    'num_classes': 7,
    'input_shape': (3, 40, 92),
    'audio_duration': 3.0,
    'sample_rate': 16000,
}

print("⚙️ CNN-MFCC training configuration:\n")
for key, value in training_config.items():
    print(f"   • {key:20s} : {value}")

---

## 4. Training Process {#4-entrainement}

### Training Details

Training ran for **150 epochs**, with the following observations:

**Convergence**:
- **Best epoch**: 29
- **Best Val Accuracy**: 66.88%
- **Best Val Loss**: ~1.0

**Problems observed**:
- **Overfitting** after epoch 29
- Val Accuracy drops to 57.95% (epoch 150)
- Val Loss rises to 1.3161

**Learning Rate Schedule**:
- Epochs 1-26: LR = 1e-3
- Epochs 27-40: LR = 5e-4
- Epochs 41-51: LR = 2.5e-4
- Halving continues until LR ≈ 4.88e-7
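
The final learning rate can be related to the schedule settings with a short calculation. Starting from 1e-3 with a factor of 0.5, the reported value of about 4.88e-7 corresponds to a whole number of halvings. The variable names are illustrative.

```python
import math

initial_lr = 1e-3
factor = 0.5
min_lr = 1e-7
reported_final_lr = 4.88e-7

# Number of halvings needed to go from the initial LR to the reported final LR
n_reductions = round(math.log(reported_final_lr / initial_lr, factor))
final_lr = initial_lr * factor ** n_reductions

print(f"halvings      : {n_reductions}")
print(f"resulting LR  : {final_lr:.3e}")
print(f"above min LR? : {final_lr > min_lr}")
```

The result is still above the configured minimum of 1e-7, so the floor was not what stopped the decay in this run.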

### Training Time

- **Total**: 2-3 hours on GPU
- **Per epoch**: ~1-1.5 minutes

---

## 5. Results and Metrics {#5-resultats}

### Loading the Training History

The training results are saved in `outputs/history/cnn_baseline.json`.

In [None]:
# Load the training history
print("📊 Loading the training history...\n")

if HISTORY_PATH.exists():
    with open(HISTORY_PATH, 'r', encoding='utf-8') as f:
        history = json.load(f)

    print("✅ History loaded successfully!\n")
    print(f"📈 Results summary:")
    print(f"   • Model               : {history.get('model')}")
    print(f"   • Epochs requested    : {history.get('epochs_requested')}")
    print(f"   • Epochs completed    : {len(history.get('train_loss', []))}")
    print(f"   • Best Val Accuracy   : {history.get('best_accuracy', 0)*100:.2f}%")
    print(f"   • Best Epoch          : {history.get('best_epoch')}")
    print(f"   • Final Train Loss    : {history['train_loss'][-1]:.4f}")
    print(f"   • Final Val Loss      : {history['val_loss'][-1]:.4f}")
    print(f"   • Final Val Accuracy  : {history['val_accuracy'][-1]*100:.2f}%")
else:
    print(f"⚠️ History not found: {HISTORY_PATH}")
    print("   Falling back to simulated data based on the known results...\n")

    # Simulation anchored on the reported results
    history = {
        'model': 'CNN-MFCC',
        'epochs_requested': 150,
        'best_accuracy': 0.6688,
        'best_epoch': 29,
        'train_loss': [],
        'val_loss': [],
        'val_accuracy': []
    }

    # Simulate the curves with 1-indexed epochs so the peak lands exactly on epoch 29
    for epoch in range(1, 151):
        if epoch <= 29:
            # Improvement phase: linear progress up to the best epoch
            progress = epoch / 29
            train_loss = 2.0 - progress * 1.2
            val_loss = 1.8 - progress * 0.8
            val_acc = 0.2 + progress * 0.4688
        else:
            # Overfitting phase: linear drift to the reported final values at epoch 150
            progress = (epoch - 29) / 121
            train_loss = 0.8 - progress * 0.0348
            val_loss = 1.0 + progress * 0.3161
            val_acc = 0.6688 - progress * 0.0893

        history['train_loss'].append(train_loss)
        history['val_loss'].append(val_loss)
        history['val_accuracy'].append(val_acc)

    print("✅ Simulated data created!")

In [None]:
# Extract the metrics
train_loss = np.array(history['train_loss'])
val_loss = np.array(history['val_loss'])
val_accuracy = np.array(history['val_accuracy'])
epochs_range = np.arange(1, len(train_loss) + 1)

best_epoch = history['best_epoch']
best_acc = history['best_accuracy']

print(f"\n📊 Detailed statistics:\n")
print(f"   Epoch {best_epoch:3d} (best):")
print(f"      Train Loss : {train_loss[best_epoch-1]:.4f}")
print(f"      Val Loss   : {val_loss[best_epoch-1]:.4f}")
print(f"      Val Acc    : {val_accuracy[best_epoch-1]*100:.2f}%")
print(f"\n   Epoch {len(train_loss)} (final):")
print(f"      Train Loss : {train_loss[-1]:.4f}")
print(f"      Val Loss   : {val_loss[-1]:.4f}")
print(f"      Val Acc    : {val_accuracy[-1]*100:.2f}%")
print(f"\n   📉 Drop after the best epoch: {(best_acc - val_accuracy[-1])*100:.2f} percentage points")

---

## 6. Visualizations and Analysis {#6-visualisations}

### Training Curves

In [None]:
# Plot the training curves
fig, axes = plt.subplots(2, 2, figsize=(16, 10))

# 1. Loss curves
axes[0, 0].plot(epochs_range, train_loss, label='Train Loss', color='steelblue', linewidth=2)
axes[0, 0].plot(epochs_range, val_loss, label='Val Loss', color='darkorange', linewidth=2)
axes[0, 0].axvline(x=best_epoch, color='red', linestyle='--', linewidth=1.5, 
                   label=f'Best Epoch ({best_epoch})')
axes[0, 0].set_xlabel('Epoch', fontsize=12)
axes[0, 0].set_ylabel('Loss', fontsize=12)
axes[0, 0].set_title('Loss Curves (Train vs Validation)', fontsize=14, fontweight='bold')
axes[0, 0].legend(fontsize=10)
axes[0, 0].grid(True, alpha=0.3)

# 2. Validation Accuracy
axes[0, 1].plot(epochs_range, val_accuracy * 100, color='forestgreen', linewidth=2)
axes[0, 1].axvline(x=best_epoch, color='red', linestyle='--', linewidth=1.5,
                   label=f'Best: {best_acc*100:.2f}%')
axes[0, 1].axhline(y=best_acc*100, color='red', linestyle=':', linewidth=1, alpha=0.5)
axes[0, 1].set_xlabel('Epoch', fontsize=12)
axes[0, 1].set_ylabel('Accuracy (%)', fontsize=12)
axes[0, 1].set_title('Validation Accuracy', fontsize=14, fontweight='bold')
axes[0, 1].legend(fontsize=10)
axes[0, 1].grid(True, alpha=0.3)

# 3. Loss zoom (first 50 epochs)
axes[1, 0].plot(epochs_range[:50], train_loss[:50], label='Train Loss', 
                color='steelblue', linewidth=2)
axes[1, 0].plot(epochs_range[:50], val_loss[:50], label='Val Loss', 
                color='darkorange', linewidth=2)
axes[1, 0].axvline(x=best_epoch, color='red', linestyle='--', linewidth=1.5)
axes[1, 0].set_xlabel('Epoch', fontsize=12)
axes[1, 0].set_ylabel('Loss', fontsize=12)
axes[1, 0].set_title('Zoom: First 50 Epochs', fontsize=14, fontweight='bold')
axes[1, 0].legend(fontsize=10)
axes[1, 0].grid(True, alpha=0.3)

# 4. Overfitting analysis
gap = val_loss - train_loss
axes[1, 1].plot(epochs_range, gap, color='purple', linewidth=2)
axes[1, 1].axhline(y=0, color='black', linestyle='-', linewidth=1, alpha=0.3)
axes[1, 1].axvline(x=best_epoch, color='red', linestyle='--', linewidth=1.5,
                   label='Overfitting onset')
axes[1, 1].set_xlabel('Epoch', fontsize=12)
axes[1, 1].set_ylabel('Val Loss - Train Loss', fontsize=12)
axes[1, 1].set_title('Overfitting Analysis', fontsize=14, fontweight='bold')
axes[1, 1].legend(fontsize=10)
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig(OUTPUT_DIR / 'cnn_training_curves.png', dpi=300, bbox_inches='tight')
plt.show()

print("💾 Figure saved: cnn_training_curves.png")

In [None]:
# Learning rate schedule (simulation)
print("\n📉 Learning rate evolution (ReduceLROnPlateau):\n")

# Simulate the schedule from the validation loss (factor=0.5, patience=10, min_lr=1e-7)
lr_schedule = []
current_lr = 1e-3
plateau_counter = 0
best_val_loss = float('inf')

# Iterate over the recorded epochs so the schedule always matches epochs_range
for epoch in range(len(val_loss)):
    lr_schedule.append(current_lr)

    # Check for improvement
    if val_loss[epoch] < best_val_loss:
        best_val_loss = val_loss[epoch]
        plateau_counter = 0
    else:
        plateau_counter += 1

    # PyTorch reduces the LR once the number of bad epochs exceeds the patience
    if plateau_counter > 10:
        current_lr = max(current_lr * 0.5, 1e-7)
        plateau_counter = 0
        print(f"   Epoch {epoch+1:3d}: LR reduced to {current_lr:.2e}")

# Visualisation
fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(epochs_range, lr_schedule, color='crimson', linewidth=2)
ax.set_xlabel('Epoch', fontsize=12)
ax.set_ylabel('Learning Rate', fontsize=12)
ax.set_title('Learning Rate Evolution (ReduceLROnPlateau)', 
             fontsize=14, fontweight='bold')
ax.set_yscale('log')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig(OUTPUT_DIR / 'cnn_lr_schedule.png', dpi=300, bbox_inches='tight')
plt.show()

print("\n💾 Figure saved: cnn_lr_schedule.png")

---

## 7. Per-Class Performance Analysis {#7-analyse-classe}

### Per-Class Results (Best Model - Epoch 29)

According to the project analysis, the per-class performance is:

| Classe | Precision | Recall | F1-Score | Support |
|--------|-----------|--------|----------|---------|
| **Helicopter** | 0.82 | 0.93 | 0.87 | - |
| **Fighter Aircraft** | 1.00 | 0.30 | 0.46 | - |
| **Military Vehicle** | 0.52 | 0.50 | 0.51 | - |
| **Truck** | 0.68 | 0.46 | 0.55 | - |
| **Footsteps** | 0.60 | 0.35 | 0.45 | - |
| **Speech** | 0.61 | 0.73 | 0.66 | - |
| **Background** | 0.37 | 0.95 | 0.53 | - |

**Weighted averages**:
- Precision: 0.69
- Recall: 0.58
- F1-Score: 0.57
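
The F1 column can be cross-checked against the precision and recall columns, since F1 is their harmonic mean. The sketch below recomputes it from the rounded table values. Small gaps are expected because the inputs are rounded to two decimals. The helper name `f1_score` is illustrative.

```python
# (precision, recall, reported F1) copied from the table above
reported = {
    "Helicopter":       (0.82, 0.93, 0.87),
    "Fighter Aircraft": (1.00, 0.30, 0.46),
    "Military Vehicle": (0.52, 0.50, 0.51),
    "Truck":            (0.68, 0.46, 0.55),
    "Footsteps":        (0.60, 0.35, 0.45),
    "Speech":           (0.61, 0.73, 0.66),
    "Background":       (0.37, 0.95, 0.53),
}

def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

for name, (p, r, f1) in reported.items():
    print(f"{name:17s} recomputed F1 = {f1_score(p, r):.3f}  (reported {f1:.2f})")

max_gap = max(abs(f1_score(p, r) - f1) for p, r, f1 in reported.values())
print(f"largest gap: {max_gap:.4f}")
```

Fighter Aircraft shows how the harmonic mean penalises imbalance: perfect precision with 0.30 recall still gives an F1 below 0.5.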

In [None]:
# Class definitions
CLASS_NAMES = [
    'Helicopter',
    'Fighter Aircraft',
    'Military Vehicle',
    'Truck',
    'Footsteps',
    'Speech',
    'Background'
]

# Per-class metrics (actual project results)
class_metrics = {
    'Class': CLASS_NAMES,
    'Precision': [0.82, 1.00, 0.52, 0.68, 0.60, 0.61, 0.37],
    'Recall': [0.93, 0.30, 0.50, 0.46, 0.35, 0.73, 0.95],
    'F1-Score': [0.87, 0.46, 0.51, 0.55, 0.45, 0.66, 0.53]
}

df_metrics = pd.DataFrame(class_metrics)

print("📊 Per-class performance (Best Model - Epoch 29):\n")
print(df_metrics.to_string(index=False))

print(f"\n📈 Weighted averages:")
print(f"   • Precision : 0.69")
print(f"   • Recall    : 0.58")
print(f"   • F1-Score  : 0.57")

In [None]:
# Plot the per-class metrics
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# 1. Bar plot of the metrics
x = np.arange(len(CLASS_NAMES))
width = 0.25

axes[0].bar(x - width, df_metrics['Precision'], width, 
            label='Precision', color='steelblue', edgecolor='black')
axes[0].bar(x, df_metrics['Recall'], width, 
            label='Recall', color='darkorange', edgecolor='black')
axes[0].bar(x + width, df_metrics['F1-Score'], width, 
            label='F1-Score', color='forestgreen', edgecolor='black')

axes[0].set_xlabel('Class', fontsize=12, fontweight='bold')
axes[0].set_ylabel('Score', fontsize=12, fontweight='bold')
axes[0].set_title('Per-Class Metrics (CNN-MFCC)', fontsize=14, fontweight='bold')
axes[0].set_xticks(x)
axes[0].set_xticklabels(CLASS_NAMES, rotation=45, ha='right')
axes[0].legend(fontsize=11)
axes[0].grid(True, alpha=0.3, axis='y')
axes[0].set_ylim([0, 1.05])

# 2. Heatmap
metrics_matrix = df_metrics[['Precision', 'Recall', 'F1-Score']].values.T
im = axes[1].imshow(metrics_matrix, aspect='auto', cmap='RdYlGn', vmin=0, vmax=1)
axes[1].set_xticks(x)
axes[1].set_xticklabels(CLASS_NAMES, rotation=45, ha='right')
axes[1].set_yticks([0, 1, 2])
axes[1].set_yticklabels(['Precision', 'Recall', 'F1-Score'])
axes[1].set_title('Metrics Heatmap', fontsize=14, fontweight='bold')

# Annotate the heatmap with the values
for i in range(3):
    for j in range(len(CLASS_NAMES)):
        text = axes[1].text(j, i, f'{metrics_matrix[i, j]:.2f}',
                           ha='center', va='center', color='black', fontweight='bold')

fig.colorbar(im, ax=axes[1])

plt.tight_layout()
plt.savefig(OUTPUT_DIR / 'cnn_class_metrics.png', dpi=300, bbox_inches='tight')
plt.show()

print("💾 Figure saved: cnn_class_metrics.png")

In [None]:
# Qualitative per-class analysis (derived from the metrics above; no confusion matrix is computed here)
print("\n🎯 Class analysis:")
print("\n✅ Well-classified classes:")
print("   • Helicopter (F1=0.87): best performance")
print("   • Speech (F1=0.66): good recognition")
print("\n⚠️ Difficult classes:")
print("   • Fighter Aircraft (F1=0.46): high precision but low recall")
print("   • Footsteps (F1=0.45): hard to detect")
print("   • Background (F1=0.53): confused with other classes")
print("\n🔄 Likely confusions:")
print("   • Truck ↔ Military Vehicle")
print("   • Background ↔ other sounds")
print("   • Fighter Aircraft confused with other airborne noise")

---

## 8. Conclusion {#8-conclusion}

### Summary of CNN-MFCC Results

**✅ Strengths**:
1. **Lightweight architecture**: 242K parameters (~1 MB in FP32)
2. **Fast convergence**: best performance at epoch 29
3. **Respectable accuracy**: 66.88% on validation
4. **Well-recognised classes**: Helicopter (F1 0.87), Speech (F1 0.66)
5. **Fast training**: 2-3 hours on GPU

**⚠️ Limitations**:
1. **Severe overfitting**: validation accuracy drops from 66.88% to 57.95%
2. **Difficult classes**: Fighter Aircraft (F1 0.46), Footsteps (F1 0.45)
3. **Short audio duration**: only 3 seconds
4. **Limited features**: MFCCs alone, with no long-range temporal context

### Final Metrics

| Metric | Value |
|----------|--------|
| **Best Val Accuracy** | 66.88% (epoch 29) |
| **Final Val Accuracy** | 57.95% (epoch 150) |
| **Train Loss (final)** | 0.7652 |
| **Val Loss (final)** | 1.3161 |
| **Parameters** | ≈242,000 |
| **Training time** | 2-3 hours |
| **Weighted F1-Score** | 0.57 |

### Possible Improvements

1. **Early Stopping**: stop around epoch 29 to avoid overfitting
2. **More Regularization**: increase dropout and weight decay
3. **Data Augmentation**: more variety in SpecAugment
4. **Audio duration**: increase to 4-5 seconds
5. **Architecture**: test a CRNN for temporal context
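
Early stopping, the first item above, can be sketched in a few lines without any framework. The `EarlyStopping` class, its `patience` value, and the toy accuracy curve below are illustrative assumptions, not part of the project code.

```python
class EarlyStopping:
    """Stop training once validation accuracy has not improved for `patience` epochs."""

    def __init__(self, patience=20, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_score = float("-inf")
        self.best_epoch = 0
        self.bad_epochs = 0

    def step(self, epoch, val_accuracy):
        """Record one epoch and return True when training should stop."""
        if val_accuracy > self.best_score + self.min_delta:
            self.best_score = val_accuracy
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Toy, made-up validation curve: rises to a peak, then degrades
val_curve = [0.30, 0.45, 0.60, 0.6688, 0.66, 0.65, 0.64, 0.63]

stopper = EarlyStopping(patience=3)
for epoch, acc in enumerate(val_curve, start=1):
    if stopper.step(epoch, acc):
        break

print(f"stopped at epoch {epoch}, best epoch {stopper.best_epoch} "
      f"({stopper.best_score * 100:.2f}%)")
```

In a real training loop, the checkpoint saved at `best_epoch` would be the one kept. The patience should be longer than the scheduler's patience, so the learning rate has a chance to drop before training stops.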

### Next Steps

**Notebook 3** explores the **CRNN-MFCC** model, which improves accuracy to **73.21%** through:
- Temporal modelling with a BiLSTM
- An audio duration of 4 seconds
- A deeper architecture (1.5M parameters)

---

<div style="text-align: center; padding: 20px; background-color: #e8f4f8; border-radius: 10px;">
    <h3>🎉 Notebook 2 Complete!</h3>
    <p><b>CNN-MFCC: Baseline of the SereneSense Project</b></p>
    <p>Accuracy: 66.88% | Parameters: 242K | Duration: 3s</p>
</div>