# Practical Assignment 4 - Convolutional Neural Networks

We perform binary classification of images that show either cats or dogs. For this we use a Kaggle dataset called "cats-vs-dogs", which contains 23,410 images of cats and dogs. The goal is to train a binary classification model that can distinguish between the two.

The following models are proposed:

- **Model 1:** Simple convolutional network (SimpleCNN) built from scratch.
- **Model 2:** Convolutional network based on ResNet18.
- **Model 3:** Convolutional network based on ResNet18 with modified hyperparameters.
- **Model 4:** Advanced convolutional network (AdvancedCNN) built from scratch.
- **Model 5:** Convolutional network based on Google's Inception with modified hyperparameters.

We assign the dataset to the variable **dataset**.

In [None]:
from datasets import load_dataset

dataset = load_dataset("cats_vs_dogs")



We create a *DataFrame* called **mydataset** that stores the path of each image together with its label (dog or cat). We also create a directory called `dataset` and save the images there.


In [None]:
import pandas as pd
import os

main_dir = './dataset'
os.makedirs(main_dir, exist_ok=True)

mydataset = pd.DataFrame(columns=['image_path', 'label'])

for i in range(len(dataset['train'])):
    img_path = f"{main_dir}/img_{i}.jpeg"

    if not os.path.exists(img_path):
        dataset['train'][i]['image'].save(img_path)

    mydataset.at[i, 'image_path'] = img_path
    mydataset.at[i, 'label'] = dataset['train'][i]['labels']

mydataset.head()

|   | image_path | label |
|---|---|---|
| 0 | ./dataset/img_0.jpeg | 0 |
| 1 | ./dataset/img_1.jpeg | 0 |
| 2 | ./dataset/img_2.jpeg | 0 |
| 3 | ./dataset/img_3.jpeg | 0 |
| 4 | ./dataset/img_4.jpeg | 0 |


We create a dictionary to store the parameters we will use.

In [None]:
exp_config = dict()

We set the seed so that splitting the dataset into train, test and val always yields the same partition. We also specify the proportion of the data reserved for testing and for validation.

In [None]:
seed = 42
test_size = 0.15
val_size = 0.20

exp_config['seed'] = seed
exp_config['test_size'] = test_size
exp_config['val_size'] = val_size

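We split the dataset into *train*, *test* and *val*.

**Note:** the validation data is carved out of the training portion: 15% of the images are first held out for testing, and 20% of the remaining images are then set aside for validation.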

In [None]:
from sklearn.model_selection import train_test_split

train_val_df, test_df = train_test_split(mydataset, test_size=test_size, stratify=mydataset['label'], random_state=seed)

train_df, val_df = train_test_split(train_val_df, test_size=val_size, stratify=train_val_df['label'], random_state=seed)
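
The two-stage split above can be checked with plain arithmetic. The sketch below is only an illustration: `split_sizes` is a hypothetical helper, the total of 23,410 images is the sum of the per-split class counts recorded in the configuration dump of this notebook, and rounding each held-out part up is an assumption that happens to reproduce those counts.

```python
import math


def split_sizes(n_samples, test_size, val_size):
    """Return (n_train, n_val, n_test) for a two-stage hold-out split.

    Assumption: each held-out part is rounded up, which reproduces the
    split sizes recorded in this notebook's configuration dump.
    """
    n_test = math.ceil(test_size * n_samples)      # first stage: hold out the test set
    n_train_val = n_samples - n_test
    n_val = math.ceil(val_size * n_train_val)      # second stage: hold out validation
    n_train = n_train_val - n_val
    return n_train, n_val, n_test


n_train, n_val, n_test = split_sizes(23410, test_size=0.15, val_size=0.20)
print(n_train, n_val, n_test)  # 15918 3980 3512
```

In relative terms, the validation set is 20% of the remaining 85%, that is, about 17% of all images, which leaves about 68% for training.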

We record the number of cats (label 0) and dogs (label 1) in each split in the configuration dictionary.

In [None]:
exp_config['train_n_cats'] = train_df['label'].value_counts()[0]
exp_config['train_n_dogs'] = train_df['label'].value_counts()[1]
exp_config['val_n_cats'] = val_df['label'].value_counts()[0]
exp_config['val_n_dogs'] = val_df['label'].value_counts()[1]
exp_config['test_n_cats'] = test_df['label'].value_counts()[0]
exp_config['test_n_dogs'] = test_df['label'].value_counts()[1]
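
Because both splits use `stratify`, each subset should keep roughly the same cat/dog ratio. A minimal check, using the class counts that appear in the configuration dump of this notebook (the `counts` dictionary and `cat_fraction` helper are illustrative names, not part of the original code):

```python
# Per-split class counts as recorded in the configuration dump of this notebook.
counts = {
    "train": {"cats": 7984, "dogs": 7934},
    "val": {"cats": 1996, "dogs": 1984},
    "test": {"cats": 1761, "dogs": 1751},
}


def cat_fraction(split):
    """Fraction of cat images (label 0) in one split."""
    total = split["cats"] + split["dogs"]
    return split["cats"] / total


fractions = {name: cat_fraction(split) for name, split in counts.items()}
for name, frac in fractions.items():
    print(f"{name}: {frac:.4f}")
```

All three fractions land very close to 0.50, so accuracy is a reasonable headline metric for this dataset.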

The **CatsDogsDataset** class is a custom implementation of PyTorch's *Dataset* class that loads and transforms the images of the dataset.

**Explanation**
1. Constructor (\_\_init\_\_):  
- img_path_list: List of image paths.
- lab_list: List of labels corresponding to the images (0 for cats, 1 for dogs).
- transform: Optional transformations applied to the images (for example, resizing or normalizing).
2. \_\_len\_\_ method:  
- Returns the number of images in the dataset.
3. \_\_getitem\_\_ method:
- idx: Index of the image and label to retrieve.
- img_path: The path of the image at index idx.
- image: Opens the image and converts it to RGB.
- label: The label corresponding to the image, converted to a PyTorch tensor.
- If transformations were specified, they are applied to the image.
- Returns the transformed image and its label.

In [None]:
from PIL import Image
import torch
from torch.utils.data import Dataset

class CatsDogsDataset(Dataset):
    def __init__(self, img_path_list, lab_list, transform=None):
        self.transform = transform
        self.images = img_path_list
        self.labels = lab_list

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img_path = self.images[idx]
        image = Image.open(img_path).convert("RGB")

        label = self.labels[idx]
        label = torch.Tensor([label])

        if self.transform:
            image = self.transform(image)

        return image, label
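
The class above follows the map-style dataset protocol, which only requires `__len__` and `__getitem__`. The toy class below is a hypothetical, dependency-free stand-in (strings instead of images, a list instead of a tensor) meant only to show that protocol in isolation:

```python
class ToyDataset:
    """Hypothetical stand-in that follows the same map-style protocol."""

    def __init__(self, items, labels, transform=None):
        self.items = items
        self.labels = labels
        self.transform = transform

    def __len__(self):
        # Number of samples available
        return len(self.items)

    def __getitem__(self, idx):
        item = self.items[idx]
        label = [float(self.labels[idx])]  # mirrors the one-element label tensor
        if self.transform:
            item = self.transform(item)
        return item, label


toy = ToyDataset(["img_0", "img_1", "img_2"], [0, 1, 0], transform=str.upper)
print(len(toy))  # 3
print(toy[1])    # ('IMG_1', [1.0])
```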

We define the resolution of the images that will be processed.

In [None]:
input_size = (224,224)
exp_config['input_size'] = input_size

Since the images are RGB color images, we define 3 channels.

In [None]:
n_channels = 3
exp_config['n_channels'] = n_channels

We create the *transform* that will be used, which resizes the images to the given resolution and converts them to tensors.

In [None]:
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize(input_size),
    transforms.ToTensor(),
])

We create the train, test and val datasets.

In [None]:
train_dataset = CatsDogsDataset(train_df['image_path'].tolist(), train_df['label'].tolist(), transform)
test_dataset = CatsDogsDataset(test_df['image_path'].tolist(), test_df['label'].tolist(), transform)
val_dataset = CatsDogsDataset(val_df['image_path'].tolist(), val_df['label'].tolist(), transform)

We create the train, test and val *DataLoaders* and define the batch size.

**Note:** for the test loader the batch size is 1, the data is not shuffled, and no samples are dropped to fill the last batch.

In [None]:
from torch.utils.data import DataLoader

batch_size = 64
exp_config['batch_size'] = batch_size

train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
# Validation must read from val_dataset (not train_dataset); no shuffling or dropping is needed
val_dataloader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, drop_last=False)
test_dataloader = DataLoader(test_dataset, batch_size=1, shuffle=False, drop_last=False)
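
The effect of `batch_size` and `drop_last` on the number of batches per epoch can be worked out by hand. The helper below is a hypothetical sketch of that arithmetic, applied to the split sizes that follow from the class counts recorded in the configuration dump (15,918 train, 3,980 val, 3,512 test):

```python
import math


def batches_per_epoch(n_samples, batch_size, drop_last):
    """Number of batches a loader yields in one pass over the data."""
    if drop_last:
        return n_samples // batch_size           # the incomplete last batch is discarded
    return math.ceil(n_samples / batch_size)     # the incomplete last batch is kept


print(batches_per_epoch(15918, 64, drop_last=True))   # 248
print(batches_per_epoch(3980, 64, drop_last=False))   # 63
print(batches_per_epoch(3512, 1, drop_last=False))    # 3512
print(15918 - 248 * 64)                               # 46 training images skipped per epoch
```

Since the training loader reshuffles every epoch, the 46 skipped images differ from one epoch to the next.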

## WandB

In [None]:
import wandb

# Read the API key from the WANDB_API_KEY environment variable (or the interactive
# prompt) instead of hardcoding it in the notebook.
wandb.login()

wandb: Currently logged in as: intart-estudiantes (ar-um). Use `wandb login --relogin` to force relogin
wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True

## Models

In [None]:
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(32 * 56 * 56, 64)
        self.fc2 = nn.Linear(64, 1)
        self.sigmoid = nn.Sigmoid()

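    def forward(self, x):
        # Block 1: conv -> ReLU -> 2x2 max pool (224 -> 112)
        x = self.conv1(x)
        x = self.relu(x)
        x = self.pool(x)

        # Block 2: conv -> ReLU -> 2x2 max pool (112 -> 56)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.pool(x)

        # Classifier head: flatten, hidden layer, single sigmoid output
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = torch.sigmoid(x)

        return x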
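
The input size of `fc1` (`32 * 56 * 56`) follows from the spatial size of the feature map after the two conv/pool blocks. A small sketch of that calculation, using the standard output-size formula for convolutions and pooling (`conv_out` is an illustrative helper, not part of the notebook):

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224                                    # input resolution used in this notebook
size = conv_out(size, kernel=3, padding=1)    # conv1 keeps 224
size = conv_out(size, kernel=2, stride=2)     # pool halves it to 112
size = conv_out(size, kernel=3, padding=1)    # conv2 keeps 112
size = conv_out(size, kernel=2, stride=2)     # pool halves it to 56
flat_features = 32 * size * size              # 32 channels after conv2
print(size, flat_features)  # 56 100352
```

The result matches the `in_features=100352` reported for `fc1` when the model is printed. Changing `input_size` would require updating that layer accordingly.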

In [None]:
import torchvision.models as models

class ResNet18(nn.Module):
    def __init__(self):
        super(ResNet18, self).__init__()
        self.base_model = models.resnet18(pretrained=True)
        self.base_model.fc = nn.Linear(self.base_model.fc.in_features, 1)

    def forward(self, x):
        x = self.base_model(x)
        x = torch.sigmoid(x)

        return x

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdvancedCNN(nn.Module):
    def __init__(self):
        super(AdvancedCNN, self).__init__()

        self.conv1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, stride=1, padding=1)
        self.bn1 = nn.BatchNorm2d(32)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)

        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(64)
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)

        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=1)
        self.bn3 = nn.BatchNorm2d(128)
        self.pool3 = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)

        self.conv4 = nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.bn4 = nn.BatchNorm2d(256)
        self.pool4 = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)

        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(256 * 14 * 14, 512)
        self.dropout = nn.Dropout(0.5)
        self.fc2 = nn.Linear(512, 1)

    def forward(self, x):
        x = self.pool1(F.relu(self.bn1(self.conv1(x))))
        x = self.pool2(F.relu(self.bn2(self.conv2(x))))
        x = self.pool3(F.relu(self.bn3(self.conv3(x))))
        x = self.pool4(F.relu(self.bn4(self.conv4(x))))

        x = self.flatten(x)

        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = torch.sigmoid(self.fc2(x))

        return x
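
To get a feeling for where the capacity of AdvancedCNN sits, the learnable parameters can be counted by hand from the layer definitions above. This is a rough sketch with hypothetical helper names; it counts weights and biases only (BatchNorm running statistics are buffers, not parameters):

```python
def conv2d_params(c_in, c_out, kernel):
    """Weights plus biases of a square-kernel Conv2d layer."""
    return c_in * c_out * kernel * kernel + c_out


def batchnorm_params(channels):
    """Learnable scale and shift of a BatchNorm2d layer."""
    return 2 * channels


def linear_params(n_in, n_out):
    """Weights plus biases of a Linear layer."""
    return n_in * n_out + n_out


channels = [3, 32, 64, 128, 256]
conv_total = sum(conv2d_params(a, b, 3) for a, b in zip(channels, channels[1:]))
bn_total = sum(batchnorm_params(c) for c in channels[1:])
fc1_total = linear_params(256 * 14 * 14, 512)
fc2_total = linear_params(512, 1)
total = conv_total + bn_total + fc1_total + fc2_total

print(conv_total, bn_total, fc1_total, fc2_total)  # 388416 960 25690624 513
print(total)                                       # 26080513
```

Almost all parameters belong to `fc1`, which is why the dropout layer is placed right after it.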


In [None]:
class InceptionCNN(nn.Module):
    def __init__(self):
        super(InceptionCNN, self).__init__()
        self.base_model = models.inception_v3(pretrained=True)
        self.base_model.fc = nn.Linear(self.base_model.fc.in_features, 1)
        self.base_model.aux_logits = False
    def forward(self, x):
        x = self.base_model(x)
        x = torch.sigmoid(x)

        return x

We define the device on which training will run (CPU or GPU).

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

device

### Training and validation functions

In [None]:
def train(model, train_dataloader, criterion, optimizer, device):

    model.to(device)
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0

    for images, labels in train_dataloader:
        print(f" SHAPE: {images.shape}")
        images, labels = images.to(device), labels.to(device)

        optimizer.zero_grad()
        outputs = model(images)

        loss = criterion(outputs, labels)
        loss.backward()

        optimizer.step()

        running_loss += loss.item()

        threshold = 0.5
        predicted = (outputs.detach() >= threshold)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    train_avg_loss = running_loss / len(train_dataloader)
    train_accuracy = correct / total

    return train_avg_loss, train_accuracy

def validate(model, val_dataloader, criterion, device):

    model.eval()

    running_loss = 0.0
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in val_dataloader:
            images, labels = images.to(device), labels.to(device)

            outputs = model(images)

            loss = criterion(outputs, labels)
            running_loss += loss.item()

            threshold = 0.5
            predicted = (outputs.detach() >= threshold)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    val_avg_loss = running_loss / len(val_dataloader)
    val_accuracy = correct / total

    return val_avg_loss, val_accuracy

def train_and_validate(model, train_dataloader, val_dataloader, criterion, optimizer, device, num_epochs, early_stopping_patience, checkpoint_path):

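    # Start from infinity so the first epoch always counts as an improvement,
    # and initialize the counter so the else branch can never hit an unbound variable
    best_val_loss = float('inf')
    epochs_without_improvement = 0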

    for epoch in range(num_epochs):
        train_loss, train_accuracy = train(model, train_dataloader, criterion, optimizer, device)
        val_loss, val_accuracy = validate(model, val_dataloader, criterion, device)

        print(f'Epoch [{epoch + 1}/{num_epochs}], '
              f'Train Loss: {train_loss:.4f}, Train Accuracy: {train_accuracy:.2f}, '
              f'Validation Loss: {val_loss:.4f}, Validation Accuracy: {val_accuracy:.2f}')

        wandb.log({"epochs": epoch,
                  "train_acc": train_accuracy,
                   "train_loss": train_loss,
                   "val_acc": val_accuracy,
                   "val_loss": val_loss})

        if val_loss < best_val_loss:
            best_val_loss = val_loss
            torch.save(model.state_dict(), checkpoint_path)
            epochs_without_improvement = 0
            print("Checkpoint saved")

        else:
            epochs_without_improvement += 1
            if epochs_without_improvement == early_stopping_patience:
                print("Early Stopping")
                break
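
The checkpoint and early-stopping rule used in `train_and_validate` can be studied in isolation by replaying a list of validation losses. The function below is a minimal sketch of that rule; `early_stopping_trace` and the loss values are hypothetical and serve only as an illustration:

```python
def early_stopping_trace(val_losses, patience):
    """Replay the early-stopping rule on a sequence of validation losses.

    Returns (best_epoch, stop_epoch), both 0-based; stop_epoch is None
    if the patience is never exhausted.
    """
    best_loss = float("inf")
    best_epoch = None
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: this is where the checkpoint would be saved
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale == patience:
                return best_epoch, epoch
    return best_epoch, None


print(early_stopping_trace([0.50, 0.40, 0.45, 0.41, 0.42], patience=3))  # (1, 4)
print(early_stopping_trace([0.50, 0.40, 0.30], patience=3))              # (2, None)
```

The checkpoint on disk always corresponds to `best_epoch`, not to the last epoch that was run.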

### Test function

In [None]:
def test(model, test_dataloader, device):
    y_true = []
    y_proba = []

    for image, label in test_dataloader:
        image, label = image.to(device), label.to(device)

        with torch.no_grad():
            output = model(image)

            y_true.append(label.to("cpu").float())
            y_proba.append(output.to("cpu").float())

    return y_true, y_proba

### Function to classify based on a threshold

In [None]:
def classify(y_proba, y_true, thr=0.5):

    y_true_tensor = torch.cat(y_true)
    y_proba_tensor = torch.cat(y_proba)

    y_pred_tensor = (y_proba_tensor >= thr).int()

    y_true = y_true_tensor.numpy()
    y_pred = y_pred_tensor.numpy()

    y_proba_flat = y_proba_tensor.numpy().ravel()

    return y_true, y_pred, y_proba_flat
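
The thresholding step inside `classify` boils down to comparing each predicted probability with `thr`. A dependency-free sketch with made-up probabilities (`apply_threshold` is an illustrative name) shows how the chosen threshold changes the predicted classes:

```python
def apply_threshold(probabilities, thr=0.5):
    """Map probabilities to class labels: 1 if p >= thr, else 0."""
    return [1 if p >= thr else 0 for p in probabilities]


probs = [0.10, 0.50, 0.73, 0.49]  # hypothetical sigmoid outputs
print(apply_threshold(probs))           # [0, 1, 1, 0]
print(apply_threshold(probs, thr=0.7))  # [0, 0, 1, 0]
```

Raising the threshold makes the classifier more conservative about predicting class 1 (dogs), which typically trades recall for precision.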

### Function to compute metrics

In [None]:
from sklearn.metrics import accuracy_score, confusion_matrix, roc_curve, auc, precision_score, recall_score

def calculate_metrics(y_true, y_pred, y_proba_flat):
    # Make sure `y_proba_flat` has shape (n_samples,) in the binary case
    if len(y_proba_flat.shape) == 3:
        # If `y_proba_flat` has 3 dimensions, keep only one class
        y_proba_flat = y_proba_flat[:, :, 1].flatten()  # Probabilities of the positive class

    # Compute the metrics
    accuracy = accuracy_score(y_true, y_pred)
    conf_matrix = confusion_matrix(y_true, y_pred)
    precision = precision_score(y_true, y_pred)
    recall = recall_score(y_true, y_pred)
    specificity = recall_score(y_true, y_pred, pos_label=0)

    # Compute ROC and AUC
    fpr, tpr, _ = roc_curve(y_true, y_proba_flat)
    roc_auc = auc(fpr, tpr)

    # Logging with wandb
    roc_data = [[x, y] for (x, y) in zip(fpr, tpr)]
    table = wandb.Table(data=roc_data, columns=["FPR", "TPR"])
    wandb.log({
        "test_accuracy": accuracy,
        "test_precision": precision,
        "test_recall": recall,
        "test_specificity": specificity,
        "test_confusion_matrix": wandb.plot.confusion_matrix(y_true=y_true.flatten().tolist(), preds=y_pred.flatten().tolist(), class_names=["Clase 0", "Clase 1"]),
        "ROC Curve": wandb.plot.line(table, "FPR", "TPR", title="ROC Curve"),
        "test_roc_auc": roc_auc,
    })

    return accuracy, precision, recall, specificity, conf_matrix, fpr, tpr, roc_auc
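
The scalar metrics returned above can all be derived from the four cells of the binary confusion matrix, and the AUC is the area under the ROC points by the trapezoidal rule. The sketch below reproduces both with hypothetical counts and ROC points; the helper names are illustrative and the numbers are not results from this notebook:

```python
def binary_metrics(tn, fp, fn, tp):
    """Accuracy, precision, recall and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),     # recall of the negative class
    }


def auc_trapezoid(fpr, tpr):
    """Area under a ROC curve given as points sorted by increasing FPR."""
    area = 0.0
    for i in range(1, len(fpr)):
        area += (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
    return area


metrics = binary_metrics(tn=90, fp=10, fn=20, tp=80)   # made-up counts
roc_area = auc_trapezoid([0.0, 0.1, 1.0], [0.0, 0.8, 1.0])
print(metrics)
print(roc_area)
```

Specificity as "recall of class 0" is exactly what `recall_score(..., pos_label=0)` computes in the function above.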


# Using the CNNs

## SimpleCNN

### Choice of model, loss function and optimizer

In [None]:
import torch.optim as optim

exp_config_SimpleCNN = exp_config.copy()

wandb.init(project="CNN_CatsvsDogs", entity="ar-um", tags=["BERTOLDI_MANCUSO"], name="Bertoldi_Mancuso_SimpleCNN")
wandb.config.update(exp_config_SimpleCNN)

model = SimpleCNN().to(device)
exp_config_SimpleCNN['model'] = 'SimpleCNN'

criterion = nn.BCELoss()
# Store the loss under its own key so it does not overwrite 'model'
exp_config_SimpleCNN['criterion'] = 'BCELoss'

lr = 0.001
exp_config_SimpleCNN['learning_rate'] = lr

optimizer = optim.Adam(model.parameters(), lr=lr)
exp_config_SimpleCNN['optimizador'] = 'Adam'
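
Binary cross-entropy, the loss selected above, averages the negative log-likelihood of the true label under the predicted probability. The function below is a plain-Python sketch of that formula; the `eps` clipping is an assumption added here to keep the logarithm finite and is not meant to mirror PyTorch's internals exactly:

```python
import math


def bce(probabilities, targets, eps=1e-12):
    """Mean binary cross-entropy: -[y*log(p) + (1-y)*log(1-p)]."""
    total = 0.0
    for p, y in zip(probabilities, targets):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite (assumption of this sketch)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(targets)


print(round(bce([0.5, 0.5], [1, 0]), 4))  # 0.6931
print(round(bce([0.9, 0.1], [1, 0]), 4))  # 0.1054
```

A model that outputs 0.5 for every image scores ln 2 ≈ 0.693, so a loss near that value means the network is still close to chance level.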

### Model fitting

Here we define the number of epochs and the early-stopping patience, i.e. how many consecutive epochs without improvement in validation loss are tolerated before training stops.

In [None]:
num_epochs = 15
early_stopping_patience = 5
epochs_without_improvement = 0

exp_config_SimpleCNN['num_epochs'] = num_epochs
exp_config_SimpleCNN['early_stopping_patience'] = early_stopping_patience

checkpoint_path = './best_model.pth'

train_and_validate(model, train_dataloader, val_dataloader, criterion, optimizer, device, num_epochs, early_stopping_patience, checkpoint_path)

Epoch [1/15], Train Loss: 0.6445, Train Accuracy: 0.64, Validation Loss: 0.5844, Validation Accuracy: 0.70
Checkpoint saved
Epoch [2/15], Train Loss: 0.5454, Train Accuracy: 0.72, Validation Loss: 0.4650, Validation Accuracy: 0.79
Checkpoint saved
Epoch [3/15], Train Loss: 0.4578, Train Accuracy: 0.78, Validation Loss: 0.3526, Validation Accuracy: 0.85
Checkpoint saved
Epoch [4/15], Train Loss: 0.3482, Train Accuracy: 0.84, Validation Loss: 0.2426, Validation Accuracy: 0.92
Checkpoint saved
Epoch [5/15], Train Loss: 0.2355, Train Accuracy: 0.90, Validation Loss: 0.1422, Validation Accuracy: 0.96
Checkpoint saved
Epoch [6/15], Train Loss: 0.1171, Train Accuracy: 0.96, Validation Loss: 0.0571, Validation Accuracy: 0.99
Checkpoint saved
Epoch [7/15], Train Loss: 0.0581, Train Accuracy: 0.98, Validation Loss: 0.0236, Validation Accuracy: 1.00
Checkpoint saved
Epoch [8/15], Train Loss: 0.0207, Train Accuracy: 1.00, Validation Loss: 0.0078, Validation Accuracy: 1.00
Checkpoint saved
Epoch [9

### Test

We load the model parameters from the checkpoint.

In [None]:
model = SimpleCNN().to(device)

model.load_state_dict(torch.load(checkpoint_path))
model.to(device)

model.eval()



SimpleCNN(
  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (relu): ReLU()
  (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (fc1): Linear(in_features=100352, out_features=64, bias=True)
  (fc2): Linear(in_features=64, out_features=1, bias=True)
  (sigmoid): Sigmoid()
)

We make predictions on the test set.

In [None]:
y_true, y_proba = test(model, test_dataloader, device)

We concatenate the predictions into tensors and classify them based on a threshold.

In [None]:
y_true, y_pred, y_proba_flat = classify(y_proba, y_true)

### Metrics

In [None]:
accuracy, precision, recall, specificity, conf_matrix, fpr, tpr, roc_auc = calculate_metrics(y_true, y_pred, y_proba_flat)

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"Specificity: {specificity:.2f}")

Accuracy: 0.96
Precision: 0.96
Recall: 0.96
Specificity: 0.96


In [None]:
exp_config_SimpleCNN

{'seed': 42,
 'test_size': 0.15,
 'val_size': 0.2,
 'train_n_cats': 7984,
 'train_n_dogs': 7934,
 'val_n_cats': 1996,
 'val_n_dogs': 1984,
 'test_n_cats': 1761,
 'test_n_dogs': 1751,
 'input_size': (224, 224),
 'n_channels': 3,
 'batch_size': 64,
 'model': 'BCELoss',
 'learning_rate': 0.001,
 'optimizador': 'Adam',
 'num_epochs': 15,
 'early_stopping_patience': 5}

## ResNet18

### Choice of model, loss function and optimizer

In [None]:
import torch.optim as optim

exp_config_ResNet18 = exp_config.copy()

wandb.init(project="CNN_CatsvsDogs", entity="ar-um", tags=["BERTOLDI_MANCUSO"], name="Bertoldi_Mancuso_ResNet18CNN")
wandb.config.update(exp_config_ResNet18)

model = ResNet18().to(device)
exp_config_ResNet18['model'] = 'ResNet18'

criterion = nn.BCELoss()
# Store the loss under its own key so it does not overwrite 'model'
exp_config_ResNet18['criterion'] = 'BCELoss'

lr = 0.001
exp_config_ResNet18['learning_rate'] = lr

optimizer = optim.Adam(model.parameters(), lr=lr)
exp_config_ResNet18['optimizador'] = 'Adam'

model


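Run history:

| Metric | History |
|---|---|
| epochs | ▁▁▂▃▃▃▄▅▅▅▆▇▇▇█ |
| train_acc | ▁▃▄▅▆▇█████████ |
| train_loss | █▇▆▅▄▂▂▁▁▁▁▁▁▁▁ |
| val_acc | ▁▃▄▆▇██████████ |
| val_loss | █▇▅▄▃▂▁▁▁▁▁▁▁▁▁ |

Run summary:

| Metric | Value |
|---|---|
| epochs | 14.0 |
| train_acc | 1.0 |
| train_loss | 0.00018 |
| val_acc | 1.0 |
| val_loss | 0.00014 |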


Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /root/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
100%|██████████| 44.7M/44.7M [00:00<00:00, 215MB/s]


ResNet18(
  (base_model): ResNet(
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track

### Fitting

In [None]:
num_epochs = 15
early_stopping_patience = 5
epochs_without_improvement = 0

exp_config_ResNet18['num_epochs'] = num_epochs
exp_config_ResNet18['early_stopping_patience'] = early_stopping_patience

checkpoint_path = './best_model_ResNet18.pth'

train_and_validate(model, train_dataloader, val_dataloader, criterion, optimizer, device, num_epochs, early_stopping_patience, checkpoint_path)

Epoch [1/15], Train Loss: 0.1445, Train Accuracy: 0.94, Validation Loss: 0.1016, Validation Accuracy: 0.96
Checkpoint saved
Epoch [2/15], Train Loss: 0.0843, Train Accuracy: 0.97, Validation Loss: 0.1421, Validation Accuracy: 0.95
Epoch [3/15], Train Loss: 0.0712, Train Accuracy: 0.97, Validation Loss: 0.0601, Validation Accuracy: 0.98
Checkpoint saved
Epoch [4/15], Train Loss: 0.0565, Train Accuracy: 0.98, Validation Loss: 0.0502, Validation Accuracy: 0.98
Checkpoint saved
Epoch [5/15], Train Loss: 0.0465, Train Accuracy: 0.98, Validation Loss: 0.0425, Validation Accuracy: 0.98
Checkpoint saved
Epoch [6/15], Train Loss: 0.0339, Train Accuracy: 0.99, Validation Loss: 0.0288, Validation Accuracy: 0.99
Checkpoint saved
Epoch [7/15], Train Loss: 0.0346, Train Accuracy: 0.99, Validation Loss: 0.0444, Validation Accuracy: 0.98
Epoch [8/15], Train Loss: 0.0313, Train Accuracy: 0.99, Validation Loss: 0.0375, Validation Accuracy: 0.99
Epoch [9/15], Train Loss: 0.0285, Train Accuracy: 0.99, Val

### Test

In [None]:
model = ResNet18().to(device)

model.load_state_dict(torch.load(checkpoint_path))
model.to(device)

model.eval()


In [None]:
y_true, y_proba = test(model, test_dataloader, device)

In [None]:
y_true, y_pred, y_proba_flat = classify(y_proba, y_true)

In [None]:
accuracy, precision, recall, specificity, conf_matrix, fpr, tpr, roc_auc = calculate_metrics(y_true, y_pred, y_proba_flat)

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"Specificity: {specificity:.2f}")

Accuracy: 0.96
Precision: 0.96
Recall: 0.96
Specificity: 0.96


In [None]:
exp_config_ResNet18

{'seed': 42,
 'test_size': 0.15,
 'val_size': 0.2,
 'train_n_cats': 7984,
 'train_n_dogs': 7934,
 'val_n_cats': 1996,
 'val_n_dogs': 1984,
 'test_n_cats': 1761,
 'test_n_dogs': 1751,
 'input_size': (224, 224),
 'n_channels': 3,
 'batch_size': 64,
 'model': 'BCELoss',
 'learning_rate': 0.001,
 'optimizador': 'Adam',
 'num_epochs': 15,
 'early_stopping_patience': 5}

## Modified ResNet18

### Choice of model, loss function and optimizer

In [None]:
import torch.optim as optim

exp_config_ResNet18Modificado = exp_config.copy()

wandb.init(project="CNN_CatsvsDogs", entity="ar-um", tags=["BERTOLDI_MANCUSO"], name="Bertoldi_Mancuso_ResNet18ModificadoCNN")
wandb.config.update(exp_config_ResNet18Modificado)

model = ResNet18().to(device)
exp_config_ResNet18Modificado['model'] = 'ResNet18Modificado'

criterion = nn.BCELoss()
# Store the loss under its own key so it does not overwrite 'model'
exp_config_ResNet18Modificado['criterion'] = 'BCELoss'

lr = 0.001
exp_config_ResNet18Modificado['learning_rate'] = lr

# A different optimizer is used here: AdamW with weight decay

weight_decay = 0.01
exp_config_ResNet18Modificado['weight_decay'] = weight_decay

optimizer = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)

exp_config_ResNet18Modificado['optimizador'] = 'AdamW'

model


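Run history:

| Metric | History |
|---|---|
| epochs | ▁▁▂▃▃▃▄▅▅▅▆▇▇▇█ |
| train_acc | ▁▄▅▆▆▇▇▇▇▇██▇██ |
| train_loss | █▅▄▃▃▂▂▂▂▂▂▁▂▁▁ |
| val_acc | ▃▁▅▆▆▇▆▆▆▇▇█▆██ |
| val_loss | ▆█▄▃▃▂▃▃▃▂▂▁▃▁▁ |

Run summary:

| Metric | Value |
|---|---|
| epochs | 14.0 |
| train_acc | 0.99546 |
| train_loss | 0.01306 |
| val_acc | 0.99842 |
| val_loss | 0.0069 |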




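
The only change in this variant is the optimizer: AdamW applies weight decay directly to the parameters, decoupled from the gradient-based Adam step. The scalar sketch below walks through a single update for one parameter; `adamw_step` is a hypothetical helper, the beta and epsilon values are the commonly used defaults (an assumption here), and the parameter and gradient values are made up:

```python
import math


def adamw_step(param, grad, m, v, t, lr=0.001, weight_decay=0.01,
               beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdamW update for a single scalar parameter (illustrative sketch)."""
    param = param * (1 - lr * weight_decay)     # decoupled weight decay
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


param, m, v = adamw_step(param=1.0, grad=0.5, m=0.0, v=0.0, t=1)
print(param)
```

With `lr = 0.001` and `weight_decay = 0.01`, each step shrinks the weights by a factor of `1 - 1e-5` before the Adam update, so the regularization effect is mild but accumulates over many steps.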

### Fitting

In [None]:
num_epochs = 15
early_stopping_patience = 5
epochs_without_improvement = 0

exp_config_ResNet18Modificado['num_epochs'] = num_epochs
exp_config_ResNet18Modificado['early_stopping_patience'] = early_stopping_patience

# Use a separate file so the checkpoint of the plain ResNet18 run is not overwritten
checkpoint_path = './best_model_ResNet18Modificado.pth'

train_and_validate(model, train_dataloader, val_dataloader, criterion, optimizer, device, num_epochs, early_stopping_patience, checkpoint_path)

Epoch [1/15], Train Loss: 0.1396, Train Accuracy: 0.94, Validation Loss: 0.1470, Validation Accuracy: 0.94
Checkpoint saved
Epoch [2/15], Train Loss: 0.0896, Train Accuracy: 0.96, Validation Loss: 0.1232, Validation Accuracy: 0.95
Checkpoint saved
Epoch [3/15], Train Loss: 0.0715, Train Accuracy: 0.97, Validation Loss: 0.0981, Validation Accuracy: 0.96
Checkpoint saved
Epoch [4/15], Train Loss: 0.0597, Train Accuracy: 0.98, Validation Loss: 0.0355, Validation Accuracy: 0.99
Checkpoint saved
Epoch [5/15], Train Loss: 0.0441, Train Accuracy: 0.98, Validation Loss: 0.0384, Validation Accuracy: 0.99


KeyboardInterrupt: 

### Test

In [None]:
model = ResNet18().to(device)

model.load_state_dict(torch.load(checkpoint_path))
model.to(device)

model.eval()


In [None]:
y_true, y_proba = test(model, test_dataloader, device)

In [None]:
y_true, y_pred, y_proba_flat = classify(y_proba, y_true)

In [None]:
accuracy, precision, recall, specificity, conf_matrix, fpr, tpr, roc_auc = calculate_metrics(y_true, y_pred, y_proba_flat)

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"Specificity: {specificity:.2f}")

ValueError: Found array with dim 3. None expected <= 2.

In [None]:
exp_config_ResNet18Modificado

## AdvancedCNN

### Choice of model, loss function and optimizer

In [None]:
import torch.optim as optim

exp_config_AdvancedCNN = exp_config.copy()

wandb.init(project="CNN_CatsvsDogs", entity="ar-um", tags=["BERTOLDI_MANCUSO"], name="Bertoldi_Mancuso_AdvancedCNN")
wandb.config.update(exp_config_AdvancedCNN)

model = AdvancedCNN().to(device)
exp_config_AdvancedCNN['model'] = 'AdvancedCNN'

criterion = nn.BCELoss()
# Store the loss under its own key so it does not overwrite 'model'
exp_config_AdvancedCNN['criterion'] = 'BCELoss'

lr = 0.001
exp_config_AdvancedCNN['learning_rate'] = lr

optimizer = optim.Adam(model.parameters(), lr=lr)
exp_config_AdvancedCNN['optimizador'] = 'Adam'

model

### Fitting

In [None]:
num_epochs = 15
early_stopping_patience = 5
epochs_without_improvement = 0

exp_config_AdvancedCNN['num_epochs'] = num_epochs
exp_config_AdvancedCNN['early_stopping_patience'] = early_stopping_patience

checkpoint_path = './best_model_AdvancedCNN.pth'

train_and_validate(model, train_dataloader, val_dataloader, criterion, optimizer, device, num_epochs, early_stopping_patience, checkpoint_path)

### Test

In [None]:
model = AdvancedCNN().to(device)

model.load_state_dict(torch.load(checkpoint_path))
model.to(device)

model.eval()

We make predictions on the test set.

In [None]:
y_true, y_proba = test(model, test_dataloader, device)

In [None]:
y_true, y_pred, y_proba_flat = classify(y_proba, y_true)

In [None]:
accuracy, precision, recall, specificity, conf_matrix, fpr, tpr, roc_auc = calculate_metrics(y_true, y_pred, y_proba_flat)

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"Specificity: {specificity:.2f}")

In [None]:
exp_config_AdvancedCNN

## InceptionCNN

In [None]:
input_size = (400,400)
exp_config['input_size'] = input_size

Since the images are RGB color images, we define 3 channels.

In [None]:
n_channels = 3
exp_config['n_channels'] = n_channels

We create the *transform* that will be used, which resizes the images to the given resolution, converts them to tensors and normalizes each channel.

In [None]:
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize(input_size),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),  # Standard ImageNet normalization, as used for Inception v3
])
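
The `Normalize` step subtracts a per-channel mean and divides by a per-channel standard deviation after `ToTensor` has scaled pixel values to the range [0, 1]. A minimal sketch of that arithmetic for a single RGB pixel (`normalize_pixel` is an illustrative helper, not a torchvision function):

```python
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Apply (value - mean) / std to each channel of one RGB pixel."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]


print(normalize_pixel([0.485, 0.456, 0.406]))  # [0.0, 0.0, 0.0]
print(normalize_pixel([1.0, 1.0, 1.0]))        # a pure white pixel
```

After normalization the inputs are roughly zero-centered, which matches the statistics the pretrained weights were trained with.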

We create the train, test and val datasets.

In [None]:
train_dataset = CatsDogsDataset(train_df['image_path'].tolist(), train_df['label'].tolist(), transform)
test_dataset = CatsDogsDataset(test_df['image_path'].tolist(), test_df['label'].tolist(), transform)
val_dataset = CatsDogsDataset(val_df['image_path'].tolist(), val_df['label'].tolist(), transform)

### Model, loss function, and optimizer selection

In [None]:
from torch.utils.data import DataLoader

batch_size = 64
exp_config['batch_size'] = batch_size

train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, drop_last=True)
val_dataloader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, drop_last=True)
test_dataloader = DataLoader(test_dataset, batch_size=1, shuffle=False, drop_last=False)

In [None]:
import torch.optim as optim

exp_config_Inception = exp_config.copy()

#wandb.init(project="CNN_CatsvsDogs", entity="ar-um", tags=["BERTOLDI_MANCUSO"], name="Bertoldi_Mancuso_InceptionCNN")
#wandb.config.update(exp_config_Inception)

model = InceptionCNN().to(device)
exp_config_Inception['model'] = 'InceptionCNN'

criterion = nn.BCELoss()
exp_config_Inception['criterion'] = 'BCELoss'

lr = 0.001
exp_config_Inception['learning_rate'] = lr

optimizer = optim.Adam(model.parameters(), lr=lr)
exp_config_Inception['optimizador'] = 'Adam'

model

OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 MiB. GPU 0 has a total capacity of 14.75 GiB of which 3.06 MiB is free. Process 9659 has 14.74 GiB memory in use. Of the allocated memory 14.50 GiB is allocated by PyTorch, and 108.39 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

### Training

In [None]:
num_epochs = 15
early_stopping_patience = 5
epochs_without_improvement = 0

exp_config_Inception['num_epochs'] = num_epochs
exp_config_Inception['early_stopping_patience'] = early_stopping_patience

checkpoint_path = './best_model_InceptionCNN.pth'

train_and_validate(model, train_dataloader, val_dataloader, criterion, optimizer, device, num_epochs, early_stopping_patience, checkpoint_path)

 SHAPE: torch.Size([64, 3, 512, 512])


OutOfMemoryError: CUDA out of memory. Tried to allocate 64.00 MiB. GPU 0 has a total capacity of 14.75 GiB of which 9.06 MiB is free. Process 9659 has 14.74 GiB memory in use. Of the allocated memory 14.55 GiB is allocated by PyTorch, and 51.60 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
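Both errors report that the process already holds about 14.74 GiB before the failing allocation, and the first one is raised while the model is still being moved to the GPU. This suggests, although the notebook does not confirm it, that memory from the models trained earlier in the same session was still allocated. The logged batch shape is also 512×512 even though `input_size` is set to `(400, 400)` above, so this output presumably comes from a run with a different setting. As a quick sanity check, the sketch below computes how much memory a single float32 input batch takes. The helper `batch_mib` is hypothetical, and the figure covers only the input tensor, not the activations or gradients, which are usually far larger.

```python
def batch_mib(batch_size, channels, height, width, bytes_per_element=4):
    """Size in MiB of one input batch (float32 uses 4 bytes per element)."""
    n_elements = batch_size * channels * height * width
    return n_elements * bytes_per_element / 2**20


# Shape taken from the logged output: torch.Size([64, 3, 512, 512]).
logged_batch = batch_mib(64, 3, 512, 512)
# Same resolution with a smaller, hypothetical batch size.
smaller_batch = batch_mib(16, 3, 512, 512)

print(f"batch of 64: {logged_batch:.1f} MiB")
print(f"batch of 16: {smaller_batch:.1f} MiB")
```

The input batch alone is a small fraction of the 14.75 GiB reported for the GPU, so reducing `batch_size` or `input_size` helps mainly through the intermediate activations. Restarting the runtime, or deleting the earlier models before creating this one, would be the first thing to try.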

### Test

In [None]:
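# Rebuild the architecture and restore the best weights saved during training.
model = InceptionCNN().to(device)

# map_location keeps the load working if the checkpoint was saved on another device.
model.load_state_dict(torch.load(checkpoint_path, map_location=device))

# Evaluation mode: layers such as dropout or batch norm, if present, switch to inference behavior.
model.eval()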

In [None]:
y_true, y_proba = test(model, test_dataloader, device)

In [None]:
y_true, y_pred, y_proba_flat = classify(y_proba, y_true)

In [None]:
accuracy, precision, recall, specificity, conf_matrix, fpr, tpr, roc_auc = calculate_metrics(y_true, y_pred, y_proba_flat)

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"Specificity: {specificity:.2f}")

In [None]:
exp_config_Inception

## Modified Inception

### Model, loss function, and optimizer selection

In [None]:
import torch.optim as optim

exp_config_InceptionModificado = exp_config.copy()

wandb.init(project="CNN_CatsvsDogs", entity="ar-um", tags=["BERTOLDI_MANCUSO"], name="Bertoldi_Mancuso_InceptionModificadoCNN")
wandb.config.update(exp_config_InceptionModificado)

model = InceptionCNN().to(device)
exp_config_InceptionModificado['model'] = 'InceptionModificado'

criterion = nn.BCELoss()
exp_config_InceptionModificado['criterion'] = 'BCELoss'

lr = 0.01
exp_config_InceptionModificado['learning_rate'] = lr

# weight_decay is not defined in this section; it is assumed to come from an earlier cell.
optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)

exp_config_InceptionModificado['optimizador'] = 'Adam'

model

### Training

In [None]:
num_epochs = 15
early_stopping_patience = 5
epochs_without_improvement = 0

exp_config_InceptionModificado['num_epochs'] = num_epochs
exp_config_InceptionModificado['early_stopping_patience'] = early_stopping_patience

checkpoint_path = './best_model_Inception.pth'

train_and_validate(model, train_dataloader, val_dataloader, criterion, optimizer, device, num_epochs, early_stopping_patience, checkpoint_path)

### Test

In [None]:
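# Rebuild the architecture and restore the best weights saved during training.
model = InceptionCNN().to(device)

# map_location keeps the load working if the checkpoint was saved on another device.
model.load_state_dict(torch.load(checkpoint_path, map_location=device))

# Evaluation mode: layers such as dropout or batch norm, if present, switch to inference behavior.
model.eval()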

In [None]:
y_true, y_proba = test(model, test_dataloader, device)

In [None]:
y_true, y_pred, y_proba_flat = classify(y_proba, y_true)

In [None]:
accuracy, precision, recall, specificity, conf_matrix, fpr, tpr, roc_auc = calculate_metrics(y_true, y_pred, y_proba_flat)

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"Specificity: {specificity:.2f}")

In [None]:
exp_config_InceptionModificado

In [None]:
wandb.finish()