# 🏗️ Build Your First Complete Network

Now that we understand all the concepts, let's build a complete neural network from scratch!

## 🎯 Objectives

1. **Create a `NeuralNetwork` class** that is modular and reusable
2. **Train on the full MNIST dataset** using mini-batch gradient descent and He initialization
3. **Track metrics** during training
4. **Visualize the results** and analyze performance
5. **Save and load** the trained model

---

In [None]:
# Imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import sys
import pickle
from time import time

sys.path.append(str(Path.cwd().parent))

plt.style.use('seaborn-v0_8-darkgrid')
plt.rcParams['figure.figsize'] = (14, 8)

np.random.seed(42)

print("✅ Ready to build!")

## 1️⃣ The Complete NeuralNetwork Class

Let's create a class that encapsulates everything we've learned!

In [None]:
class NeuralNetwork:
    """
    Multi-layer neural network built from scratch
    """

    def __init__(self, layer_dims, learning_rate=0.01):
        """
        Initialize the network

        Args:
            layer_dims: list of layer sizes [input, hidden1, hidden2, ..., output]
            learning_rate: step size used by gradient descent
        """
        self.layer_dims = layer_dims
        self.learning_rate = learning_rate
        self.parameters = self._initialize_parameters()
        self.history = {'loss': [], 'train_acc': [], 'val_acc': []}

    def _initialize_parameters(self):
        """Initialize the weights with He initialization and the biases with zeros"""
        np.random.seed(42)  # fixed seed so every new network starts from the same weights
        parameters = {}
        L = len(self.layer_dims)
        
        for l in range(1, L):
            parameters[f'W{l}'] = np.random.randn(self.layer_dims[l-1], self.layer_dims[l]) * \
                                 np.sqrt(2 / self.layer_dims[l-1])
            parameters[f'b{l}'] = np.zeros((1, self.layer_dims[l]))
        
        return parameters
    
    def relu(self, Z):
        """ReLU activation"""
        return np.maximum(0, Z)
    
    def softmax(self, Z):
        """Softmax activation"""
        exp_Z = np.exp(Z - np.max(Z, axis=1, keepdims=True))
        return exp_Z / np.sum(exp_Z, axis=1, keepdims=True)
    
    def forward(self, X):
        """
        Forward propagation: ReLU on every hidden layer, softmax on the output layer
        """
        n_layers = len(self.layer_dims) - 1
        cache = {'A0': X}  # the input is treated as the activation of "layer 0"
        A = X

        for l in range(1, n_layers + 1):
            W, b = self.parameters[f'W{l}'], self.parameters[f'b{l}']
            Z = np.dot(A, W) + b
            A = self.softmax(Z) if l == n_layers else self.relu(Z)
            cache[f'Z{l}'] = Z
            cache[f'A{l}'] = A

        return A, cache

    def compute_loss(self, Y_true, Y_pred):
        """
        Cross-entropy loss
        """
        n_samples = Y_true.shape[0]
        epsilon = 1e-7
        Y_pred = np.clip(Y_pred, epsilon, 1 - epsilon)  # avoid log(0)
        loss = -np.sum(Y_true * np.log(Y_pred)) / n_samples
        return loss

    def backward(self, Y, cache):
        """
        Backpropagation
        """
        n_layers = len(self.layer_dims) - 1
        n_samples = Y.shape[0]
        gradients = {}

        # Output layer: softmax combined with cross-entropy gives dZ = A - Y
        dZ = cache[f'A{n_layers}'] - Y

        for l in range(n_layers, 0, -1):
            A_prev = cache[f'A{l-1}']
            gradients[f'dW{l}'] = np.dot(A_prev.T, dZ) / n_samples
            gradients[f'db{l}'] = np.sum(dZ, axis=0, keepdims=True) / n_samples

            if l > 1:
                # Propagate through the weights, then through the ReLU of layer l-1
                dA_prev = np.dot(dZ, self.parameters[f'W{l}'].T)
                dZ = dA_prev * (cache[f'Z{l-1}'] > 0)

        return gradients
    
    def update_parameters(self, gradients):
        """
        Update the weights and biases with one gradient descent step
        """
        for key in self.parameters.keys():
            self.parameters[key] -= self.learning_rate * gradients['d' + key]

    def predict(self, X):
        """
        Return the predicted class for each sample
        """
        A2, _ = self.forward(X)
        return np.argmax(A2, axis=1)

    def accuracy(self, X, y):
        """
        Compute the accuracy
        """
        predictions = self.predict(X)
        return np.mean(predictions == y)
    
    def one_hot_encode(self, y, n_classes=10):
        """
        One-hot encoding
        """
        one_hot = np.zeros((y.shape[0], n_classes))
        one_hot[np.arange(y.shape[0]), y] = 1
        return one_hot
    
    def train(self, X_train, y_train, X_val, y_val, epochs=10, batch_size=128, verbose=True):
        """
        Train the network with mini-batch gradient descent
        """
        n_samples = X_train.shape[0]
        n_batches = int(np.ceil(n_samples / batch_size))  # the last batch may be smaller

        # One-hot encode labels
        Y_train = self.one_hot_encode(y_train)

        if verbose:
            print("\n" + "="*70)
            print("🎓 TRAINING STARTED")
            print("="*70)
            print("\nConfiguration:")
            print(f"  • Architecture: {' → '.join(map(str, self.layer_dims))}")
            print(f"  • Learning rate: {self.learning_rate}")
            print(f"  • Batch size: {batch_size}")
            print(f"  • Epochs: {epochs}")
            print(f"  • Training examples: {n_samples:,}")
            print(f"  • Batches per epoch: {n_batches}")
            print("\n" + "="*70 + "\n")

        for epoch in range(epochs):
            epoch_start = time()

            # Shuffle the data
            indices = np.random.permutation(n_samples)
            X_shuffled = X_train[indices]
            Y_shuffled = Y_train[indices]

            epoch_loss = 0.0

            # Mini-batch training (stepping by batch_size keeps the final partial batch)
            for start in range(0, n_samples, batch_size):
                end = start + batch_size

                X_batch = X_shuffled[start:end]
                Y_batch = Y_shuffled[start:end]

                # Forward
                A2, cache = self.forward(X_batch)

                # Loss, weighted by batch size so a smaller final batch is averaged correctly
                loss = self.compute_loss(Y_batch, A2)
                epoch_loss += loss * X_batch.shape[0]

                # Backward
                gradients = self.backward(Y_batch, cache)

                # Update
                self.update_parameters(gradients)

            # Metrics
            avg_loss = epoch_loss / n_samples
            train_acc = self.accuracy(X_train, y_train)
            val_acc = self.accuracy(X_val, y_val)

            # Record the history
            self.history['loss'].append(avg_loss)
            self.history['train_acc'].append(train_acc)
            self.history['val_acc'].append(val_acc)

            epoch_time = time() - epoch_start

            if verbose:
                print(f"Epoch {epoch+1:2d}/{epochs} - "
                      f"Loss: {avg_loss:.4f} - "
                      f"Train Acc: {train_acc:.4f} - "
                      f"Val Acc: {val_acc:.4f} - "
                      f"Time: {epoch_time:.2f}s")

        if verbose:
            print("\n" + "="*70)
            print("✅ TRAINING COMPLETE")
            print("="*70)
    
    def save(self, filepath):
        """
        Save the model to disk
        """
        with open(filepath, 'wb') as f:
            pickle.dump({
                'parameters': self.parameters,
                'layer_dims': self.layer_dims,
                'learning_rate': self.learning_rate,
                'history': self.history
            }, f)
        print(f"\n💾 Model saved to: {filepath}")

    @staticmethod
    def load(filepath):
        """
        Load a saved model (pickle can run arbitrary code, so only open files you trust)
        """
        with open(filepath, 'rb') as f:
            data = pickle.load(f)

        model = NeuralNetwork(data['layer_dims'], data['learning_rate'])
        model.parameters = data['parameters']
        model.history = data['history']
        print(f"\n📂 Model loaded from: {filepath}")
        return model

print("✅ NeuralNetwork class created!")
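
Backpropagation bugs are easy to miss because a network with slightly wrong gradients may still learn, only more slowly. One common sanity check is to compare the analytic gradients with finite-difference estimates on a tiny network. The sketch below re-implements, as standalone functions, the gradient formulas that `backward` uses for a network with one hidden layer. The names `forward`, `loss_fn`, `analytic_grads` and `numeric_grads`, the toy layer sizes, and `eps=1e-5` are illustrative choices, not part of the class above.

```python
import numpy as np

def forward(params, X):
    """Forward pass with one ReLU hidden layer and a softmax output."""
    Z1 = X @ params['W1'] + params['b1']
    A1 = np.maximum(0, Z1)
    Z2 = A1 @ params['W2'] + params['b2']
    exp_Z = np.exp(Z2 - Z2.max(axis=1, keepdims=True))
    A2 = exp_Z / exp_Z.sum(axis=1, keepdims=True)
    return A2, {'Z1': Z1, 'A1': A1}

def loss_fn(params, X, Y):
    """Mean cross-entropy over the batch."""
    A2, _ = forward(params, X)
    return -np.sum(Y * np.log(A2)) / X.shape[0]

def analytic_grads(params, X, Y):
    """Gradients from the backpropagation formulas."""
    A2, cache = forward(params, X)
    n = X.shape[0]
    dZ2 = A2 - Y
    dZ1 = (dZ2 @ params['W2'].T) * (cache['Z1'] > 0)
    return {
        'W1': X.T @ dZ1 / n, 'b1': dZ1.sum(axis=0, keepdims=True) / n,
        'W2': cache['A1'].T @ dZ2 / n, 'b2': dZ2.sum(axis=0, keepdims=True) / n,
    }

def numeric_grads(params, X, Y, eps=1e-5):
    """Central finite differences, perturbing one parameter entry at a time."""
    grads = {}
    for key, value in params.items():
        grad = np.zeros_like(value)
        for idx in np.ndindex(value.shape):
            original = value[idx]
            value[idx] = original + eps
            loss_plus = loss_fn(params, X, Y)
            value[idx] = original - eps
            loss_minus = loss_fn(params, X, Y)
            value[idx] = original  # restore the entry before moving on
            grad[idx] = (loss_plus - loss_minus) / (2 * eps)
        grads[key] = grad
    return grads

# Tiny random problem: 5 samples, 4 features, 6 hidden units, 3 classes
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
Y = np.eye(3)[rng.integers(0, 3, size=5)]
params = {
    'W1': rng.normal(size=(4, 6)) * np.sqrt(2 / 4), 'b1': np.zeros((1, 6)),
    'W2': rng.normal(size=(6, 3)) * np.sqrt(2 / 6), 'b2': np.zeros((1, 3)),
}

analytic = analytic_grads(params, X, Y)
numeric = numeric_grads(params, X, Y)
max_abs_diff = max(np.max(np.abs(analytic[k] - numeric[k])) for k in params)
print(f"Largest gradient discrepancy: {max_abs_diff:.2e}")
```

If the formulas are right, the printed discrepancy should be many orders of magnitude smaller than the gradients themselves. A large value usually points to a sign error, a missing `1 / n` factor, or a transposed matrix.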

## 2️⃣ Loading the Full MNIST Dataset

In [None]:
from src.utils import load_mnist_data

print("⏳ Loading MNIST...")
X_train, y_train, X_test, y_test = load_mnist_data()

print("\n✅ Data loaded!")
print(f"\nTrain set: {X_train.shape[0]:,} examples")
print(f"Test set:  {X_test.shape[0]:,} examples")
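
`load_mnist_data` lives in `src.utils` and isn't shown in this notebook, so it can be worth checking that its output matches what the class expects: flattened images of shape `(n, 784)` and integer labels from 0 to 9 (required by `one_hot_encode`). Here is a minimal sketch, where `summarize_dataset` is a hypothetical helper and the random arrays merely stand in for the real data:

```python
import numpy as np

def summarize_dataset(X, y, n_features=784, n_classes=10):
    """Hypothetical helper: validate shapes and labels, then return basic statistics."""
    X, y = np.asarray(X), np.asarray(y)
    if X.ndim != 2 or X.shape[1] != n_features:
        raise ValueError(f"expected X of shape (n, {n_features}), got {X.shape}")
    if y.shape != (X.shape[0],):
        raise ValueError(f"expected y of shape ({X.shape[0]},), got {y.shape}")
    if not np.issubdtype(y.dtype, np.integer):
        raise ValueError(f"labels must be integers, got dtype {y.dtype}")
    if y.size and (y.min() < 0 or y.max() >= n_classes):
        raise ValueError(f"labels must lie in [0, {n_classes - 1}]")
    return {
        'n_samples': X.shape[0],
        'pixel_min': float(X.min()),
        'pixel_max': float(X.max()),
        'class_counts': np.bincount(y, minlength=n_classes),
    }

# Toy arrays standing in for the real MNIST data
rng = np.random.default_rng(0)
X_demo = rng.random((50, 784))
y_demo = rng.integers(0, 10, size=50)

summary = summarize_dataset(X_demo, y_demo)
print(f"{summary['n_samples']} samples, "
      f"pixels in [{summary['pixel_min']:.3f}, {summary['pixel_max']:.3f}]")
print("Examples per class:", summary['class_counts'])
```

On the real data the call would be `summarize_dataset(X_train, y_train)`. Pixel values far outside `[0, 1]` may indicate that the images still need scaling before training, and a very uneven class count would make plain accuracy harder to interpret.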

## 3️⃣ Training the Network

The moment of truth! Let's train our network on the full MNIST dataset!

In [None]:
# Create the network
model = NeuralNetwork(
    layer_dims=[784, 128, 10],
    learning_rate=0.1
)

# Train
# Note: the test set is passed as validation data here, so "Val Acc" below is test accuracy
model.train(
    X_train, y_train,
    X_test, y_test,
    epochs=10,
    batch_size=128,
    verbose=True
)
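
The training call above monitors progress on the test set. That is fine for a first run, but any choice tuned against those numbers (learning rate, number of epochs) makes the final test score optimistic. A common alternative is to hold out part of the training set for validation. The sketch below shows one way to do it. `train_val_split`, the 20% fraction, and the toy arrays are illustrative assumptions, not part of the notebook's code.

```python
import numpy as np

def train_val_split(X, y, val_fraction=0.1, seed=42):
    """Shuffle once, then hold out `val_fraction` of the samples for validation."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    rng = np.random.default_rng(seed)
    indices = rng.permutation(X.shape[0])
    n_val = int(round(X.shape[0] * val_fraction))
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]

# Toy data standing in for MNIST: 100 samples with 4 features each
X_demo = np.arange(100 * 4, dtype=float).reshape(100, 4)
y_demo = np.arange(100) % 10

X_tr, y_tr, X_val, y_val = train_val_split(X_demo, y_demo, val_fraction=0.2)
print(X_tr.shape, X_val.shape)
```

With the real data, the split arrays would be passed to `model.train(X_tr, y_tr, X_val, y_val, ...)`, and `X_test` would be used only once at the end via `model.accuracy(X_test, y_test)`.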

## 4️⃣ Visualizing the Results

In [None]:
def plot_training_results(model):
    """
    Plot the learning curves
    """
    history = model.history
    epochs = range(1, len(history['loss']) + 1)

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))
    fig.suptitle('📊 Training Results', fontsize=18, fontweight='bold')

    # Loss
    ax1.plot(epochs, history['loss'], 'o-', linewidth=3, markersize=8,
            color='#e74c3c', label='Loss')
    ax1.set_xlabel('Epoch', fontsize=13, fontweight='bold')
    ax1.set_ylabel('Loss', fontsize=13, fontweight='bold')
    ax1.set_title('📉 Loss over Epochs', fontsize=15, fontweight='bold')
    ax1.grid(True, alpha=0.3)
    ax1.legend(fontsize=12)

    # Accuracy
    ax2.plot(epochs, [acc*100 for acc in history['train_acc']],
            'o-', linewidth=3, markersize=8, color='#2ecc71', label='Train')
    ax2.plot(epochs, [acc*100 for acc in history['val_acc']],
            's-', linewidth=3, markersize=8, color='#3498db', label='Validation')
    ax2.set_xlabel('Epoch', fontsize=13, fontweight='bold')
    ax2.set_ylabel('Accuracy (%)', fontsize=13, fontweight='bold')
    ax2.set_title('📈 Accuracy over Epochs', fontsize=15, fontweight='bold')
    ax2.grid(True, alpha=0.3)
    ax2.legend(fontsize=12)
    ax2.set_ylim(0, 100)

    plt.tight_layout()
    plt.show()

    # Statistics
    print("\n" + "="*70)
    print("🏆 FINAL RESULTS")
    print("="*70)
    print(f"\n📊 Loss: {history['loss'][0]:.4f} → {history['loss'][-1]:.4f}")
    print(f"   Reduction: {(1-history['loss'][-1]/history['loss'][0])*100:.1f}%")
    print(f"\n🎯 Train Accuracy: {history['train_acc'][-1]:.2%}")
    print(f"🎯 Val Accuracy: {history['val_acc'][-1]:.2%}")

    # Comparison with random guessing (1 chance in 10)
    baseline = 0.10  # Random guessing
    improvement = (history['val_acc'][-1] - baseline) / baseline * 100
    print(f"\n💪 Improvement over random guessing: +{improvement:.0f}%")

plot_training_results(model)

## 5️⃣ Confusion Matrix

In [None]:
from sklearn.metrics import confusion_matrix

def plot_confusion_matrix(model, X_test, y_test):
    """
    Plot the row-normalized confusion matrix
    """
    # Predictions
    y_pred = model.predict(X_test)

    # Confusion matrix (labels fixed to 0-9 so the matrix is always 10x10)
    cm = confusion_matrix(y_test, y_pred, labels=np.arange(10))

    # Normalize each row by the number of true examples of that class
    cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

    # Plot
    plt.figure(figsize=(12, 10))
    sns.heatmap(cm_normalized, annot=True, fmt='.2f', cmap='Blues',
               xticklabels=range(10), yticklabels=range(10),
               cbar_kws={'label': 'Fraction of true class'})
    plt.xlabel('Predicted Label', fontsize=14, fontweight='bold')
    plt.ylabel('True Label', fontsize=14, fontweight='bold')
    plt.title('🎯 Confusion Matrix (Normalized)', fontsize=16, fontweight='bold', pad=20)
    plt.tight_layout()
    plt.show()

    # Analysis: the diagonal holds the per-class accuracy (recall)
    print("\n📊 Accuracy per class:")
    for i in range(10):
        acc = cm_normalized[i, i]
        print(f"   Digit {i}: {acc:.2%}")

    # Most frequent errors
    cm_errors = cm_normalized.copy()
    np.fill_diagonal(cm_errors, 0)

    print("\n❌ Most frequent confusions:")
    for _ in range(5):
        i, j = np.unravel_index(cm_errors.argmax(), cm_errors.shape)
        if cm_errors[i, j] <= 0.01:
            break  # every remaining confusion is at or below 1%
        print(f"   {i} confused with {j}: {cm_errors[i,j]:.1%}")
        cm_errors[i, j] = 0

plot_confusion_matrix(model, X_test, y_test)
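
To make the row normalization concrete, here is a small worked example with 3 classes and 9 hand-written labels. It uses plain NumPy rather than scikit-learn so it runs on its own. `confusion_counts` is a hypothetical helper written for this illustration.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    """Count matrix where rows are true labels and columns are predicted labels."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # add 1 at each (true, predicted) pair
    return cm

y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 1, 2, 2, 0, 2])

cm = confusion_counts(y_true, y_pred, n_classes=3)

# Divide each row by the number of true examples of that class
cm_normalized = cm / cm.sum(axis=1, keepdims=True)

# The diagonal is the fraction of each true class that was predicted correctly
recall_per_class = np.diag(cm_normalized)

print(cm)
print(recall_per_class)
```

Class 0 has three true examples, two of them predicted correctly, so its diagonal entry is 2/3. Class 1 is always right (1.0) and class 2 gets three out of four (0.75). Every row of `cm_normalized` sums to 1, which is why off-diagonal entries can be read directly as "share of true class i mistaken for class j".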

## 6️⃣ Visualizing Predictions

In [None]:
def show_predictions(model, X, y, n_samples=20):
    """
    Display model predictions on randomly chosen samples
    """
    indices = np.random.choice(len(X), n_samples, replace=False)

    # Predictions (a single forward pass gives both the probabilities and the labels)
    probs, _ = model.forward(X[indices])
    preds = np.argmax(probs, axis=1)

    # Size the grid from n_samples instead of assuming exactly 20 images
    n_cols = 5
    n_rows = int(np.ceil(n_samples / n_cols))
    fig, axes = plt.subplots(n_rows, n_cols, figsize=(16, 3.25 * n_rows), squeeze=False)
    fig.suptitle('🎨 Predictions of the Trained Model', fontsize=18, fontweight='bold')

    for idx, ax in enumerate(axes.flat):
        if idx >= n_samples:
            ax.axis('off')  # hide unused cells in the grid
            continue

        image = X[indices[idx]].reshape(28, 28)
        true_label = y[indices[idx]]
        pred_label = preds[idx]
        confidence = probs[idx, pred_label]

        # Show the image
        ax.imshow(image, cmap='gray_r')

        # Colored title: green if correct, red otherwise
        color = 'green' if pred_label == true_label else 'red'
        title = f'True: {true_label} | Pred: {pred_label}\nConf: {confidence:.1%}'
        ax.set_title(title, fontsize=11, fontweight='bold', color=color)

        # Hide the ticks but keep the axes on, otherwise the colored border is not drawn
        ax.set_xticks([])
        ax.set_yticks([])

        # Border
        for spine in ax.spines.values():
            spine.set_edgecolor(color)
            spine.set_linewidth(3)
            spine.set_visible(True)

    plt.tight_layout()
    plt.show()

    # Stats
    correct = np.sum(preds == y[indices])
    print(f"\n✅ Correct: {correct}/{n_samples} ({correct/n_samples:.1%})")

show_predictions(model, X_test, y_test)

## 7️⃣ Saving the Model

In [None]:
# Create the models directory if it doesn't exist
models_dir = Path('../models')
models_dir.mkdir(exist_ok=True)

# Save
model_path = models_dir / 'mnist_network.pkl'
model.save(model_path)

print("\n✅ Model ready to use!")
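
A quick way to gain confidence in the save/load logic is a round trip: write the state to disk, read it back, and check that the weights are unchanged. The sketch below does this with a small stand-in dictionary that has the same keys as the one written by `save`. The tiny arrays, the history values, and the file name `demo_network.pkl` are made up for the illustration.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np

# Stand-in for the dictionary that NeuralNetwork.save writes to disk
rng = np.random.default_rng(0)
state = {
    'parameters': {'W1': rng.normal(size=(4, 3)), 'b1': np.zeros((1, 3))},
    'layer_dims': [4, 3],
    'learning_rate': 0.1,
    'history': {'loss': [0.9, 0.5], 'train_acc': [0.7, 0.8], 'val_acc': [0.6, 0.75]},
}

# Write to a temporary directory so the demo leaves no file behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / 'demo_network.pkl'
    with open(path, 'wb') as f:
        pickle.dump(state, f)
    with open(path, 'rb') as f:
        restored = pickle.load(f)

weights_match = all(
    np.array_equal(state['parameters'][key], restored['parameters'][key])
    for key in state['parameters']
)
print(f"Weights identical after reload: {weights_match}")
```

With the real model, the equivalent check would be to call `NeuralNetwork.load(model_path)` and compare its `accuracy` on the test set with the original model's. Keep in mind that `pickle.load` can execute arbitrary code, so it should only be used on files you created or trust.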

## 🎯 Summary

**Congratulations! You built and trained your first neural network from scratch! 🎉**

### ✅ What we accomplished

1. **Created a `NeuralNetwork` class** that is complete and modular
2. **Trained on MNIST** with ~95%+ accuracy
3. **Visualized performance** with clear plots
4. **Analyzed the errors** with the confusion matrix
5. **Saved the model** for future use

### 🏆 Typical expected results

With this simple network (784 → 128 → 10):
- **Train accuracy**: ~96-98%
- **Test accuracy**: ~95-97%
- **Training time**: 2-5 minutes on CPU
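
For a sense of scale, the number of trainable parameters in this architecture can be worked out by hand: each fully connected layer has one weight per input-output pair plus one bias per output. A short sketch of that arithmetic (`count_parameters` is a hypothetical helper, not a method of the class):

```python
def count_parameters(layer_dims):
    """Weights plus biases of a fully connected network."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_dims[:-1], layer_dims[1:])
    )

layer_dims = [784, 128, 10]
n_params = count_parameters(layer_dims)

# 784*128 + 128 (hidden layer) + 128*10 + 10 (output layer) = 101,770
print(f"{' -> '.join(map(str, layer_dims))}: {n_params:,} trainable parameters")
```

Almost all of these parameters (100,480 of 101,770) sit in the first layer, because that is where the 784 input pixels connect to the hidden units.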

### 💡 Key points

- 🏗️ **Modular architecture**: easy to reuse and extend
- 📊 **Metric tracking**: important for understanding how training is going
- 🎯 **Good performance**: ~95% with a simple network!
- 💾 **Save/Load**: handy for reusing the model

### 🚀 Next Step

Now, how can we improve performance even further?

**➡️ Last notebook: `05_improvements_optimization.ipynb`**

We will explore:
- Different architectures
- Advanced optimization techniques
- Data augmentation
- Regularization
- And reaching 98%+ accuracy!

---

**You now have a working neural network! Awesome! 🌟**