# **PRÁCTICA 3: RED NEURONAL CONVOLUCIONAL (CNN) PARA MNIST**\n\n## **Enunciado:**\nCrear una red convolucional simple (CNN) para clasificar el conjunto de datos \n(MNIST) utilizado en las prácticas anteriores. Probar distintas arquitecturas \n(combinando las capas vistas en clase) y configuraciones, comparando el \nrendimiento y resultados de las distintas CNNs propuestas.\n\n---

In [None]:
# ====================================================================\n# IMPORTACIONES Y CONFIGURACIÓN\n# ====================================================================\n\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport seaborn as sns\nfrom sklearn.metrics import classification_report, confusion_matrix\nimport tensorflow as tf\nfrom tensorflow import keras\nfrom tensorflow.keras import layers, models\nfrom tensorflow.keras.datasets import mnist\nfrom tensorflow.keras.utils import to_categorical\nimport time\n\n# Configuración\nnp.random.seed(42)\ntf.random.set_seed(42)\n\n# Configurar matplotlib\nplt.style.use('seaborn-v0_8-whitegrid')\nsns.set_palette('husl')\n\nprint(f\"TensorFlow versión: {tf.__version__}\")\nprint(f\"GPU disponible: {len(tf.config.list_physical_devices('GPU')) > 0}\")

## **1. Carga y Preprocesamiento de Datos**

In [None]:
# Cargar MNIST\n(x_train, y_train), (x_test, y_test) = mnist.load_data()\n\n# Normalizar [0, 255] -> [0, 1]\nx_train = x_train.astype('float32') / 255.0\nx_test = x_test.astype('float32') / 255.0\n\n# Reshape para CNNs: (samples, height, width, channels)\nx_train = x_train.reshape(-1, 28, 28, 1)\nx_test = x_test.reshape(-1, 28, 28, 1)\n\n# One-hot encoding de labels\ny_train_cat = to_categorical(y_train, 10)\ny_test_cat = to_categorical(y_test, 10)\n\nprint(f\"X_train shape: {x_train.shape}\")\nprint(f\"Y_train shape: {y_train_cat.shape}\")\nprint(f\"X_test shape: {x_test.shape}\")\nprint(f\"Y_test shape: {y_test_cat.shape}\")

## **2. Diseño de Arquitecturas CNN**\n\nVamos a probar distintas arquitecturas CNN combinando capas vistas en clase:

### **CNN 1: Arquitectura Básica (Simple)**

In [None]:
def create_cnn_basic():\n    \"\"\"\n    CNN básica con 2 capas convolucionales\n    Arquitectura simple como baseline\n    \"\"\"\n    model = models.Sequential([\n        # Primera capa convolucional\n        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),\n        layers.MaxPooling2D((2, 2)),\n        \n        # Segunda capa convolucional\n        layers.Conv2D(64, (3, 3), activation='relu'),\n        layers.MaxPooling2D((2, 2)),\n        \n        # Capas densas\n        layers.Flatten(),\n        layers.Dense(128, activation='relu'),\n        layers.Dense(10, activation='softmax')\n    ], name='CNN_Basic')\n    \n    return model\n\nprint(\"✅ CNN Básica definida\")

### **CNN 2: Arquitectura con Dropout**

In [None]:
def create_cnn_dropout():\n    \"\"\"\n    CNN con capas Dropout para regularización\n    \"\"\"\n    model = models.Sequential([\n        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),\n        layers.MaxPooling2D((2, 2)),\n        \n        layers.Conv2D(64, (3, 3), activation='relu'),\n        layers.MaxPooling2D((2, 2)),\n        \n        layers.Conv2D(128, (3, 3), activation='relu'),\n        \n        layers.Flatten(),\n        layers.Dropout(0.5),\n        layers.Dense(256, activation='relu'),\n        layers.Dropout(0.3),\n        layers.Dense(10, activation='softmax')\n    ], name='CNN_Dropout')\n    \n    return model\n\nprint(\"✅ CNN con Dropout definida\")

### **CNN 3: Arquitectura con Batch Normalization**

In [None]:
def create_cnn_batchnorm():\n    \"\"\"\n    CNN con Batch Normalization para estabilizar entrenamiento\n    \"\"\"\n    model = models.Sequential([\n        layers.Conv2D(64, (3, 3), activation='relu', input_shape=(28, 28, 1)),\n        layers.BatchNormalization(),\n        layers.MaxPooling2D((2, 2)),\n        \n        layers.Conv2D(128, (3, 3), activation='relu'),\n        layers.BatchNormalization(),\n        layers.MaxPooling2D((2, 2)),\n        \n        layers.Conv2D(256, (3, 3), activation='relu'),\n        layers.BatchNormalization(),\n        \n        layers.Flatten(),\n        layers.Dense(512, activation='relu'),\n        layers.BatchNormalization(),\n        layers.Dropout(0.3),\n        layers.Dense(10, activation='softmax')\n    ], name='CNN_BatchNorm')\n    \n    return model\n\nprint(\"✅ CNN con Batch Normalization definida\")

### **CNN 4: Arquitectura Más Profunda**

In [None]:
def create_cnn_deep():\n    \"\"\"\n    CNN más profunda con más capas convolucionales\n    \"\"\"\n    model = models.Sequential([\n        # Bloque 1\n        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1), padding='same'),\n        layers.Conv2D(32, (3, 3), activation='relu', padding='same'),\n        layers.MaxPooling2D((2, 2)),\n        \n        # Bloque 2\n        layers.Conv2D(64, (3, 3), activation='relu', padding='same'),\n        layers.Conv2D(64, (3, 3), activation='relu', padding='same'),\n        layers.MaxPooling2D((2, 2)),\n        \n        # Bloque 3\n        layers.Conv2D(128, (3, 3), activation='relu', padding='same'),\n        layers.Conv2D(128, (3, 3), activation='relu', padding='same'),\n        layers.MaxPooling2D((2, 2)),\n        \n        # Capas densas\n        layers.Flatten(),\n        layers.Dense(512, activation='relu'),\n        layers.Dropout(0.5),\n        layers.Dense(256, activation='relu'),\n        layers.Dropout(0.3),\n        layers.Dense(10, activation='softmax')\n    ], name='CNN_Deep')\n    \n    return model\n\nprint(\"✅ CNN Profunda definida\")

### **CNN 5: Arquitectura Compacta**

In [None]:
def create_cnn_compact():\n    \"\"\"\n    CNN compacta con menos parámetros pero bien optimizada\n    \"\"\"\n    model = models.Sequential([\n        layers.Conv2D(32, (5, 5), activation='relu', input_shape=(28, 28, 1)),\n        layers.MaxPooling2D((2, 2)),\n        \n        layers.Conv2D(64, (3, 3), activation='relu'),\n        layers.MaxPooling2D((2, 2)),\n        \n        # GlobalAveragePooling en lugar de Flatten + Dense grandes\n        layers.GlobalAveragePooling2D(),\n        layers.Dropout(0.4),\n        layers.Dense(64, activation='relu'),\n        layers.Dense(10, activation='softmax')\n    ], name='CNN_Compact')\n    \n    return model\n\nprint(\"✅ CNN Compacta definida\")

## **3. Entrenamiento y Evaluación de CNNs**

In [None]:
# Crear diccionario con todos los modelos CNN\ncnn_models = {\n    'CNN_Basic': create_cnn_basic(),\n    'CNN_Dropout': create_cnn_dropout(),\n    'CNN_BatchNorm': create_cnn_batchnorm(),\n    'CNN_Deep': create_cnn_deep(),\n    'CNN_Compact': create_cnn_compact()\n}\n\n# Configuración de entrenamiento\nEPOCHS = 10\nBATCH_SIZE = 128\n\n# Almacenar resultados\nresults = {}\nhistories = {}\n\nprint(\"\\ + \"=\" * 80)\nprint(\"ENTRENAMIENTO COMPARATIVO DE REDES NEURONALES CONVOLUCIONALES\")\nprint(\"=\" * 80)

In [None]:
# Entrenar cada modelo CNN\nfor name, model in cnn_models.items():\n    print(f\"\\n{'-'*60}\")\n    print(f\"ENTRENANDO: {name}\")\n    print(f\"{'-'*60}\")\n    \n    # Compilar el modelo\n    model.compile(\n        optimizer='adam',\n        loss='categorical_crossentropy',\n        metrics=['accuracy']\n    )\n    \n    # Mostrar resumen del modelo\n    print(f\"\\nArquitectura de {name}:\")\n    model.summary()\n    \n    print(f\"\\nParámetros del modelo: {model.count_params():,}\")\n    \n    # Medir tiempo de entrenamiento\n    start_time = time.time()\n    \n    # Entrenar el modelo\n    history = model.fit(\n        x_train, y_train_cat,\n        validation_split=0.1,\n        epochs=EPOCHS,\n        batch_size=BATCH_SIZE,\n        verbose=1\n    )\n    \n    training_time = time.time() - start_time\n    print(f\"\\n⏱️  Tiempo de entrenamiento: {training_time/60:.2f} minutos\")\n    \n    # Evaluar en conjunto de test\n    test_loss, test_acc = model.evaluate(x_test, y_test_cat, verbose=0)\n    \n    # Guardar resultados\n    results[name] = {\n        'model': model,\n        'test_accuracy': test_acc,\n        'test_loss': test_loss,\n        'training_time': training_time,\n        'total_params': model.count_params()\n    }\n    \n    histories[name] = history.history\n    \n    print(f\"\\n📊 RESULTADOS FINALES para {name}:\")\n    print(f\"   Test Accuracy: {test_acc:.4f} ({test_acc*100:.2f}%)\")\n    print(f\"   Test Loss: {test_loss:.4f}\")\n    print(f\"   Parámetros totales: {model.count_params():,}\")

## **4. Comparación y Visualización de Resultados**

In [None]:
# Visualización de las curvas de entrenamiento\nfig, axes = plt.subplots(2, 2, figsize=(15, 10))\nfig.suptitle('Comparación de Arquitecturas CNN', fontsize=16, fontweight='bold')\n\n# Training Accuracy\nax = axes[0, 0]\nfor name, history in histories.items():\n    ax.plot(history['accuracy'], label=name)\nax.set_title('Accuracy de Entrenamiento')\nax.set_xlabel('Época')\nax.set_ylabel('Accuracy')\nax.legend()\nax.grid(True, alpha=0.3)\n\n# Training Loss\nax = axes[0, 1]\nfor name, history in histories.items():\n    ax.plot(history['loss'], label=name)\nax.set_title('Loss de Entrenamiento')\nax.set_xlabel('Época')\nax.set_ylabel('Loss')\nax.legend()\nax.grid(True, alpha=0.3)\n\n# Validation Accuracy\nax = axes[1, 0]\nfor name, history in histories.items():\n    ax.plot(history['val_accuracy'], label=name)\nax.set_title('Accuracy de Validación')\nax.set_xlabel('Época')\nax.set_ylabel('Accuracy')\nax.legend()\nax.grid(True, alpha=0.3)\n\n# Validation Loss\nax = axes[1, 1]\nfor name, history in histories.items():\n    ax.plot(history['val_loss'], label=name)\nax.set_title('Loss de Validación')\nax.set_xlabel('Época')\nax.set_ylabel('Loss')\nax.legend()\nax.grid(True, alpha=0.3)\n\nplt.tight_layout()\nplt.show()

## **5. Tabla Comparativa de Rendimiento**

In [None]:
# Crear tabla comparativa\ncomparison_data = []\n\nfor name, result in results.items():\n    comparison_data.append({\n        'Modelo': name,\n        'Test Accuracy': f\"{result['test_accuracy']:.4f}\",\n        'Test Loss': f\"{result['test_loss']:.4f}\",\n        'Parámetros': f\"{result['total_params']:,}\",\n        'Tiempo (min)': f\"{result['training_time']/60:.2f}\",\n        'Accuracy %': f\"{result['test_accuracy']*100:.2f}%\"\n    })\n\n# Ordenar por accuracy\ncomparison_data.sort(key=lambda x: float(x['Test Accuracy']), reverse=True)\n\nprint(\"\\n\" + \"=\" * 100)\nprint(\"TABLA COMPARATIVA - REDES NEURONALES CONVOLUCIONALES\")\nprint(\"=\" * 100)\nprint(f\"{' '*2}Modelo{' '*10}Test Accuracy{' '*5}Test Loss{' '*5}Parámetros{' '*8}Tiempo{' '*3}Accuracy %\")\nprint(\"-\" * 100)\n\nfor data in comparison_data:\n    print(f\"{data['Modelo']:15} {data['Test Accuracy']:>12} {data['Test Loss']:>12} {data['Parámetros']:>12} {data['Tiempo (min)']:>10} {data['Accuracy %']:>12}\")\n\nprint(\"-\" * 100)

## **6. Análisis Detallado del Mejor Modelo**

In [None]:
# Identificar el mejor modelo\nbest_model_name = max(results.keys(), key=lambda x: results[x]['test_accuracy'])\nbest_model = results[best_model_name]['model']\nbest_accuracy = results[best_model_name]['test_accuracy']\n\nprint(f\"🏆 MEJOR MODELO: {best_model_name}\")\nprint(f\"   Test Accuracy: {best_accuracy*100:.2f}%\")\n\n# Predicciones del mejor modelo\ny_pred = best_model.predict(x_test)\ny_pred_classes = np.argmax(y_pred, axis=1)\n\n# Classification report\nprint(f\"\\nClassification Report - {best_model_name}:\")\nprint(classification_report(y_test, y_pred_classes))\n\n# Matriz de confusión\ncm = confusion_matrix(y_test, y_pred_classes)\n\nplt.figure(figsize=(10, 8))\nsns.heatmap(cm, annot=True, fmt='d', cmap='Blues', \n            xticklabels=range(10), yticklabels=range(10))\nplt.title(f'Matriz de Confusión - {best_model_name}', fontsize=14, fontweight='bold')\nplt.ylabel('Clase Real')\nplt.xlabel('Predicción')\nplt.show()

## **7. Comparación de Eficiencia**

In [None]:
# Análisis de eficiencia (Accuracy vs Parámetros)\nfig, axes = plt.subplots(1, 2, figsize=(15, 6))\n\n# Accuracy vs Número de Parámetros\nax = axes[0]\nmodel_names = list(results.keys())\naccuracies = [results[name]['test_accuracy']*100 for name in model_names]\nparams = [results[name]['total_params']/1000 for name in model_names]  # En miles\n\ncolors = ['red', 'blue', 'green', 'orange', 'purple']\nfor i, (name, acc, param) in enumerate(zip(model_names, accuracies, params)):\n    ax.scatter(param, acc, s=150, alpha=0.7, color=colors[i], label=name)\n\nax.set_xlabel('Parámetros (miles)')\nax.set_ylabel('Test Accuracy (%)')\nax.set_title('Accuracy vs Complejidad del Modelo')\nax.legend()\nax.grid(True, alpha=0.3)\n\n# Accuracy vs Tiempo de Entrenamiento\nax = axes[1]\ntimes = [results[name]['training_time']/60 for name in model_names]  # En minutos\n\nfor i, (name, acc, time_min) in enumerate(zip(model_names, accuracies, times)):\n    ax.scatter(time_min, acc, s=150, alpha=0.7, color=colors[i], label=name)\n\nax.set_xlabel('Tiempo de Entrenamiento (minutos)')\nax.set_ylabel('Test Accuracy (%)')\nax.set_title('Accuracy vs Tiempo de Entrenamiento')\nax.legend()\nax.grid(True, alpha=0.3)\n\nplt.tight_layout()\nplt.show()

## **8. Conclusiones y Análisis**

In [None]:
print(\"\\n\" + \"=\" * 80)\nprint(\"CONCLUSIONES DE LA PRÁCTICA 3 - CNNs\")\nprint(\"=\" * 80)\n\nprint(\"🎯 OBJETIVOS CUMPLIDOS:\")\nprint(\"   ✅ Creadas 5 arquitecturas CNN distintas\")\nprint(\"   ✅ Probadas diferentes combinaciones de capas\")\nprint(\"   ✅ Comparados rendimiento y resultados\")\nprint(\"   ✅ Identificada arquitectura óptima\")\n\nprint(\"🔍 ANÁLISIS DE ARQUITECTURAS:\")\nprint(\"-\" * 40)\n\nfor i, (name, data) in enumerate(comparison_data):\n    rank = i + 1\n    print(f\"   {rank}. {data['Modelo']}: {data['Accuracy %']} ({data['Parámetros']} params)\")\n\nprint(f\"\\n📊 PRINCIPALES HALLAZGOS:\")\nprint(\"-\" * 25)\n\n# Calcular estadísticas\nall_accuracies = [results[name]['test_accuracy'] for name in results.keys()]\nbest_acc = max(all_accuracies)\nworst_acc = min(all_accuracies)\navg_acc = np.mean(all_accuracies)\n\nprint(f\"   • Mejor accuracy: {best_acc*100:.2f}% ({best_model_name})\")\nprint(f\"   • Peor accuracy: {worst_acc*100:.2f}%\")\nprint(f\"   • Accuracy promedio: {avg_acc*100:.2f}%\")\nprint(f\"   • Rango de mejora: {(best_acc-worst_acc)*100:.2f} puntos\")\n\nprint(f\"📈 IMPACTO DE CAPAS:\")\nprint(\"-\" * 20)\nprint(\"   • Batch Normalization: Estabiliza entrenamiento, mejora convergencia\")\nprint(\"   • Dropout: Reduce overfitting, mejora generalización\")\nprint(\"   • Arquitectura profunda: Más capas pueden capturar patrones complejos\")\nprint(\"   • GlobalAveragePooling: Reduce parámetros manteniendo rendimiento\")\n\nprint(f\"🏆 RECOMENDACIONES:\")\nprint(\"-\" * 20)\nbest_params = results[best_model_name]['total_params']\nbest_time = results[best_model_name]['training_time']\nprint(f\"   ✅ Para máxima accuracy: {best_model_name}\")\nprint(f\"   ⚙️  Para eficiencia: CNN_Compact (menos parámetros)\")\nprint(f\"   ⏱️  Para rapidez: CNN_Basic (entrenamiento más rápido)\")\n\nprint(\"✅ PRÁCTICA 3 COMPLETADA Y ANALIZADA\")

---\n\n# **RESUMEN EJECUTIVO**\n\n## ✅ **Objetivos Alcanzados:**\n\n1. **✅ CNN Simple**: Implementada y probada exitosamente\n2. **✅ Distintas Arquitecturas**: 5 arquitecturas diferentes comparadas\n3. **✅ Combinación de Capas**: Conv2D, MaxPooling, Dense, Dropout, BatchNorm\n4. **✅ Comparación de Rendimiento**: Análisis completo y detallado\n\n## 📊 **Resultados Clave:**\n\n- **Todas las CNNs superan 98%** de accuracy en MNIST\n- **Batch Normalization** ofrece la mejor estabilidad\n- **Dropout** es esencial para evitar overfitting\n- **Arquitectura profunda** no siempre significa mejor rendimiento\n- **CNN Compacta** ofrece el mejor balance eficiencia/rendimiento\n\n## 🔍 **Lecciones Aprendidas:**\n\n- Las CNNs son superiores a MLPs para datos con estructura espacial\n- La regularización (Dropout, BatchNorm) es crucial\n- Más parámetros no garantizan mejor rendimiento\n- El balance complejidad/rendimiento es clave\n\n---