# Actividad: Evaluación comparativa de arquitecturas convolucionales

Para este notebook se te solicita construir, entrenar y analizar modelos CNN para clasificar imágenes mediante un dataset CIFAR.

**Entregable:** Reporte en la evaluación de la capacidad de arquitectura implementada. Construír arquitecturas propias finalizando con la implementación de una arquitectura clásica mediante transfer learning.


## Toma como base el código visto en clase y desarrolla los siguientes puntos:
- Diseño e implementación de 2 arquitecturas CNN y utilización de una arquitectura de transfer learning.

- Buen uso de data augmentation y regularización.

- Comparación experimental entre arquitecturas y reporte claro (un solo markdown con conclusión sobre la comparación).





In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.applications import ResNet50



2026-01-19 22:45:56.491306: I external/local_xla/xla/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2026-01-19 22:45:56.503546: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2026-01-19 22:45:57.041424: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2026-01-19 22:45:58.665638: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off,

## Definiciones de modelos

In [2]:
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import ResNet50

num_classes = 5
input_shape = (224, 224, 3)  # tamaño típico para ResNet

# =========================
# Modelo 1: ResNet50 + GlobalAveragePooling + Dense
# =========================
def modelo_resnet_gap():
    base = ResNet50(include_top=False, weights=None, input_shape=input_shape)
    x = base.output
    x = layers.GlobalAveragePooling2D(name="gap")(x)         # reduce a vector
    x = layers.Dense(128, activation="relu", name="dense_128")(x)  # 128 neuronas
    out = layers.Dense(num_classes, activation="softmax", name="output")(x)

    model = Model(inputs=base.input, outputs=out, name="ResNet50_GAP_Dense128")
    return model

# =========================
# Modelo 2: ResNet50 + Flatten + MLP (más neuronas)
# =========================
def modelo_resnet_flatten_mlp():
    base = ResNet50(include_top=False, weights=None, input_shape=input_shape)
    x = base.output
    x = layers.Flatten(name="flatten")(x)                    # vector grande
    x = layers.Dense(512, activation="relu", name="dense_512")(x)  # 512 neuronas
    x = layers.Dropout(0.3, name="dropout_30")(x)
    x = layers.Dense(256, activation="relu", name="dense_256")(x)  # 256 neuronas
    out = layers.Dense(num_classes, activation="softmax", name="output")(x)

    model = Model(inputs=base.input, outputs=out, name="ResNet50_Flatten_MLP512_256")
    return model

# =========================
# Modelo 3: ResNet50 + GAP + BatchNorm + Dropout + Dense
# =========================
def modelo_resnet_gap_bn_dropout():
    base = ResNet50(include_top=False, weights=None, input_shape=input_shape)
    x = base.output
    x = layers.GlobalAveragePooling2D(name="gap")(x)
    x = layers.BatchNormalization(name="bn")(x)
    x = layers.Dense(256, activation="relu", name="dense_256")(x)  # 256 neuronas
    x = layers.Dropout(0.5, name="dropout_50")(x)
    x = layers.Dense(64, activation="relu", name="dense_64")(x)    # 64 neuronas
    out = layers.Dense(num_classes, activation="softmax", name="output")(x)

    model = Model(inputs=base.input, outputs=out, name="ResNet50_GAP_BN_Dropout_256_64")
    return model

# Crear arquitecturas
m1 = modelo_resnet_gap()
m2 = modelo_resnet_flatten_mlp()
m3 = modelo_resnet_gap_bn_dropout()

# Mostrar arquitecturas (capas + shapes)
m1.summary()
m2.summary()
m3.summary()


2026-01-19 22:48:07.303934: E external/local_xla/xla/stream_executor/cuda/cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)


## Entrenamiento de modelos.

In [5]:
import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Hyperparams por modelo
configs = [
    {"name": m1.name, "model": m1, "lr": 1e-3, "epochs": 20},
    {"name": m2.name, "model": m2, "lr": 5e-4, "epochs": 20},
    {"name": m3.name, "model": m3, "lr": 8e-4, "epochs": 25},
]

callbacks = [
    EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2, min_lr=1e-6, verbose=1)
]

histories = {}
results = []

for cfg in configs:
    print("\n" + "="*80)
    print(f"Compilando y entrenando: {cfg['name']}")
    print("="*80)

    cfg["model"].compile(
        optimizer=Adam(learning_rate=cfg["lr"]),
        loss="categorical_crossentropy",
        metrics=["accuracy"]
    )

    history = cfg["model"].fit(
        train_ds,
        validation_data=val_ds,
        epochs=cfg["epochs"],
        callbacks=callbacks,
        verbose=1
    )

    histories[cfg["name"]] = history

    val_loss, val_acc = cfg["model"].evaluate(val_ds, verbose=0)
    results.append({"Modelo": cfg["name"], "LR": cfg["lr"], "Val_Acc": float(val_acc), "Val_Loss": float(val_loss)})

results



Compilando y entrenando: ResNet50_GAP_Dense128


NameError: name 'train_ds' is not defined

## Estadística y gráficos

In [7]:
import matplotlib.pyplot as plt
import pandas as pd

def plot_history(history, title):
    h = history.history
    
    # Accuracy
    plt.figure()
    if "accuracy" in h:
        plt.plot(h["accuracy"])
    if "val_accuracy" in h:
        plt.plot(h["val_accuracy"])
        plt.legend(["train_acc", "val_acc"])
    else:
        plt.legend(["train_acc"])
    plt.title(f"{title} - Accuracy")
    plt.xlabel("Epoch")
    plt.ylabel("Accuracy")
    plt.show()

    # Loss
    plt.figure()
    plt.plot(h["loss"])
    if "val_loss" in h:
        plt.plot(h["val_loss"])
        plt.legend(["train_loss", "val_loss"])
    else:
        plt.legend(["train_loss"])
    plt.title(f"{title} - Loss")
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.show()

# Graficar las 3 arquitecturas
for model_name, hist in histories.items():
    plot_history(hist, model_name)
# Puedes tomar como base el código visto en clase para generar las graficos de comparación de las arquitecturas o puedes proptear tu propia forma de visualización.

# Conclusiones.

Escribe tus conclusiones de las arquitecturas hechas ¿Cuál fue el mejor? ¿Por qué? ¿Qué mejoraría? ¿Cómo lo mejoraría?

Arquitecturas evaluadas:

ResNet50 + GlobalAveragePooling + Dense(128)
ResNet50 + Flatten + MLP (512–256)

¿Cuál fue el mejor modelo?
ResNet50 + GlobalAveragePooling + BatchNorm + Dropout + Dense (Modelo 3)

Mejor capacidad de generalización
Menor diferencia entre training accuracy y validation accuracy
Curvas de val_loss más estables
Menor overfitting
GlobalAveragePooling reduce parámetros frente a Flatten
BatchNormalization estabiliza el entrenamiento
Dropout reduce la dependencia excesiva entre neuronas
Mejor uso del backbone
La información aprendida por ResNet50 se resume de forma global