# **Convolutional Networks**
### By **Josmar Dominguez** (16-10315)

## Training a convolutional neural network
This notebook imports the training and test data, trains the convolutional neural network, and saves the parameters of the trained network.

#### **Import libraries**
The following libraries are imported:
* ```torch``` to build and train the neural network
* ```torchvision``` to load the data and the pretrained model
* ```matplotlib``` to visualize the data

In [None]:
# Import libraries and model
import torch
from torch import nn
from torchvision import datasets, transforms, models
import matplotlib.pyplot as plt

#### **Import data**
The training and test data are loaded, and a *dataloader* is created for each of them.

In [None]:
# Specify the path to the .pt file
file_path = "data/train_data_aug.pt"

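# Load the train data; stop here if the file is missing, since later cells need train_data
try:
    train_data = torch.load(file_path)
except FileNotFoundError as err:
    raise FileNotFoundError(
        "File not found. Please, run the _data_ notebook first."
    ) from err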

# Load the test data
test_data = datasets.CIFAR100(
    root="./data", train=False, download=True, transform=transforms.ToTensor()
)

# Create a dictionary to map the labels to the class names
dict_labels = test_data.class_to_idx
dict_ids = {v: k for k, v in dict_labels.items()}

In [None]:
# Verify the number of images in the training dataset
print(f"Number of training images: {len(train_data)}")

# Show the first 5 images with their transformed versions
fig, ax = plt.subplots(5, 2, figsize=(2, 6))
for i in range(5):
    ax[i][0].imshow(train_data[i][0].permute(1, 2, 0))
    ax[i][0].set_title(dict_ids[train_data[i][1]])
    ax[i][0].axis("off")
    ax[i][1].imshow(train_data[i + len(train_data) // 2][0].permute(1, 2, 0))
    ax[i][1].set_title(dict_ids[train_data[i + len(train_data) // 2][1]])
    ax[i][1].axis("off")
plt.show()

In [None]:
# Create data loaders
batch_size = 64
# Shuffle the training data each epoch so batches mix original and augmented images
train_loader = torch.utils.data.DataLoader(
    train_data, batch_size=batch_size, shuffle=True
)
test_loader = torch.utils.data.DataLoader(
    test_data, batch_size=batch_size, shuffle=False
)

# Verify the size of the data loaders
print(f"Number of batches in the train loader: {len(train_loader)}")
print(f"Number of batches in the test loader: {len(test_loader)}")
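
To illustrate where the batch counts printed above come from, here is a minimal sketch on synthetic data. The tensor shapes and the dataset size of 1000 are assumptions chosen for illustration, not values taken from the real dataset. The number of batches is the dataset size divided by `batch_size`, rounded up, because the last batch may be smaller.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for an image dataset: 1000 RGB images of 32x32 pixels
# with labels in [0, 100). These sizes are illustrative assumptions.
images = torch.rand(1000, 3, 32, 32)
labels = torch.randint(0, 100, (1000,))
toy_data = TensorDataset(images, labels)

toy_loader = DataLoader(toy_data, batch_size=64, shuffle=True)

# len(loader) is ceil(len(dataset) / batch_size) when drop_last=False (the default)
num_batches = len(toy_loader)
expected_batches = math.ceil(len(toy_data) / 64)

# Collect the size of every batch; only the last one can be smaller than 64
batch_sizes = [batch_labels.size(0) for _, batch_labels in toy_loader]
print(num_batches, expected_batches, batch_sizes[-1])
```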

#### **Model training**
The function below trains a given model on the CIFAR100 training dataset for a given number of epochs, using a given optimizer and learning rate.
It also applies a learning rate *scheduler*, which lowers the learning rate every $n$ epochs.
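
A step schedule of this kind multiplies the learning rate by a factor $\gamma$ every $n$ epochs, so at epoch $e$ (counting from 0) the rate is $\text{lr}_0 \cdot \gamma^{\lfloor e / n \rfloor}$. The sketch below evaluates that formula in plain Python. The function name `step_lr` is hypothetical, and the values `0.001`, `step_size=7`, and `gamma=0.1` mirror the ones passed to the optimizer and scheduler further down in this notebook.

```python
def step_lr(lr0: float, epoch: int, step_size: int, gamma: float) -> float:
    """Learning rate at a given 0-based epoch under a step decay schedule."""
    return lr0 * gamma ** (epoch // step_size)


# Learning rate for each of 30 epochs
schedule = [step_lr(0.001, epoch, step_size=7, gamma=0.1) for epoch in range(30)]

# Print the rate at the start and around each decay boundary
for epoch in (0, 6, 7, 14, 21, 28):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.0e}")
```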

In [None]:
def train_model(
    model: nn.Module,
    train_loader: torch.utils.data.DataLoader,
    test_loader: torch.utils.data.DataLoader,
    epochs: int,
    optimizer: torch.optim.Optimizer,
    criterion: nn.Module,
    device: torch.device,
    scheduler: torch.optim.lr_scheduler) -> tuple:
    """
    Train a model using the specified optimizer, criterion, and scheduler.

    Parameters
    ----------
    model : nn.Module
        The model to be trained.
    train_loader : torch.utils.data.DataLoader
        The train data loader.
    test_loader : torch.utils.data.DataLoader
        The test data loader.
    epochs : int
        The number of epochs.
    optimizer : torch.optim.Optimizer
        The optimizer.
    criterion : nn.Module
        The criterion.
    device : torch.device
        The device to be used.
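    scheduler : torch.optim.lr_scheduler
        The learning rate scheduler, stepped once per epoch.

    Returns
    -------
    tuple
        The dictionary of recorded loss and accuracy values and the trained model.
    """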
    
    model_stats = {
        'train_loss': [],
        'train_accuracy': [],
        'test_loss': [],
        'test_accuracy': []
    }
    
    for epoch in range(epochs):
        print(f'Epoch {epoch + 1}')
        
        for phase in ['train', 'test']:
            if phase == 'train':
                # Set the model to training mode
                model.train()
                
                # Set the loader to the train loader
                data_loader = train_loader
            else:
                # Set the model to evaluation mode
                model.eval()
                
                # Set the loader to the test loader
                data_loader = test_loader
            
            running_loss = 0
            running_accuracy = 0
            total_images = 0
                        
            # Iterate over the data loader
            for step, (images, labels) in enumerate(data_loader):
                # Move the images and labels to the specified device
                images, labels = images.to(device), labels.to(device)
                
                # Zero the gradients
                optimizer.zero_grad()

                # Forward pass; gradients are tracked only in the training phase
                with torch.set_grad_enabled(phase == 'train'):
                    output = model(images)

                    _, preds = torch.max(output, 1)
                    
                    loss = criterion(output, labels)

                    # Make backward and optimization
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()
                
                # Accumulate the summed loss (batch mean times batch size)
                running_loss += loss.item() * images.size(0)

                # Accumulate the number of correct predictions
                running_accuracy += (preds == labels).sum().item()

                # Update the total number of images
                total_images += labels.size(0)

            # Average over all images seen in this phase and record one value per epoch
            epoch_loss = running_loss / total_images
            epoch_accuracy = running_accuracy / total_images
            model_stats[f'{phase}_loss'].append(epoch_loss)
            model_stats[f'{phase}_accuracy'].append(epoch_accuracy)

            # Print the loss and accuracy
            if phase == 'train':
                scheduler.step()
                print(f'\tTrain loss: {epoch_loss:.4f}')
                print(f'\tTrain accuracy: {epoch_accuracy:.4f}\n')
            else:
                print(f'\tTest loss: {epoch_loss:.4f}')
                print(f'\tTest accuracy: {epoch_accuracy:.4f}')
                print('-' * 50)
    
    return model_stats, model
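
The training loop multiplies each batch loss by the batch size before accumulating it. The reason, sketched below in plain Python with made-up numbers, is that `nn.CrossEntropyLoss` returns the mean over the batch by default, so weighting by batch size and dividing by the total number of samples gives the exact per-sample average even when the last batch is smaller. The helper name `epoch_average` and the batch values are hypothetical.

```python
def epoch_average(batch_means, batch_sizes):
    """Per-sample average from per-batch mean losses and batch sizes."""
    weighted_sum = sum(m * n for m, n in zip(batch_means, batch_sizes))
    return weighted_sum / sum(batch_sizes)


# Two full batches of 64 and a short final batch of 16 (illustrative values)
batch_means = [2.0, 1.0, 4.0]
batch_sizes = [64, 64, 16]

weighted = epoch_average(batch_means, batch_sizes)
naive = sum(batch_means) / len(batch_means)  # mean of batch means, ignores batch size
print(f"sample-weighted: {weighted:.4f}, mean of batch means: {naive:.4f}")
```

The two numbers differ because the short final batch counts as much as a full batch in the naive average.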

### **Load the model**


##### **VGG16**
The pretrained ```VGG16``` model is loaded and then trained on the training data.
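
Adapting a pretrained classifier to CIFAR100 amounts to swapping its last linear layer for one with 100 outputs. The sketch below shows that pattern on a small stand-in `nn.Sequential` head so that it runs without downloading any weights. The layer sizes are assumptions for illustration and are not the actual dimensions of VGG16.

```python
import torch
from torch import nn

# Hypothetical stand-in for a pretrained classifier head (sizes are illustrative)
classifier = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Dropout(),
    nn.Linear(256, 1000),  # original head: 1000 output classes
)

# Replace the last layer, keeping its input width and emitting 100 classes
num_features = classifier[-1].in_features
classifier[-1] = nn.Linear(num_features, 100)

# A batch of 8 feature vectors now maps to 8 x 100 logits
classifier.eval()
with torch.no_grad():
    logits = classifier(torch.rand(8, 512))
print(logits.shape)
```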

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

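# Import the model with ImageNet weights
# (`weights=` replaces the deprecated `pretrained=True` since torchvision 0.13)
vgg16 = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)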

# Replace the last classifier layer so the network outputs the 100 CIFAR100 classes
num_features = vgg16.classifier[-1].in_features
vgg16.classifier[-1] = nn.Linear(num_features, 100)

# Move the model to the specified device
vgg16.to(device)

# Define the optimizer
optimizer = torch.optim.SGD(vgg16.parameters(), lr=0.001, momentum=0.9)

# Define the criterion
criterion = nn.CrossEntropyLoss()

# Define the scheduler
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

# Train the model
train_results = train_model(
    vgg16,
    train_loader,
    test_loader,
    epochs=30,
    optimizer=optimizer,
    criterion=criterion,
    device=device,
    scheduler=scheduler
)

vgg16_stats = train_results[0]
vgg16 = train_results[1]

#### **Quick analysis of the results**

In [None]:
train_loss = vgg16_stats['train_loss']
train_accuracy = vgg16_stats['train_accuracy']
test_loss = vgg16_stats['test_loss']
test_accuracy = vgg16_stats['test_accuracy']

# Plot the loss and accuracy
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].plot(train_loss, label='train')
ax[0].plot(test_loss, label='test')
ax[0].set_xlabel('Epoch')
ax[0].set_ylabel('Loss')
ax[0].legend()
ax[1].plot(train_accuracy, label='train')
ax[1].plot(test_accuracy, label='test')
ax[1].set_xlabel('Epoch')
ax[1].set_ylabel('Accuracy')
ax[1].legend()
plt.show()

In [None]:
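# Save the model parameters (state_dict), creating the target directory if it is missing
import os
os.makedirs('models', exist_ok=True)
torch.save(vgg16.state_dict(), 'models/vgg16.pt')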
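
To reuse the saved parameters later, the same architecture has to be rebuilt before calling `load_state_dict`. The following sketch shows the round trip with a tiny stand-in model and a temporary file. The model, the helper `make_model`, and the file name are hypothetical and chosen only so that the example runs quickly.

```python
import os
import tempfile

import torch
from torch import nn


def make_model() -> nn.Module:
    """Tiny stand-in architecture (hypothetical, not VGG16)."""
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))


trained = make_model()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_model.pt")
    # Save only the parameters, as done above for VGG16
    torch.save(trained.state_dict(), path)

    # Rebuild the architecture, then load the parameters into it
    restored = make_model()
    restored.load_state_dict(torch.load(path, map_location="cpu"))

# Both models should now produce the same output for the same input
restored.eval()
x = torch.rand(2, 4)
with torch.no_grad():
    same_output = torch.allclose(trained(x), restored(x))
print(same_output)
```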