<center>
    <img src='images/small-uai.jpeg' style="width: 300px;">
</center>

## Lab S04: _Deep Learning_ - _Transfer Learning_

#### Course: Deep Learning
   
<center>
    <img src='https://miro.medium.com/max/665/1*7Ip2_SeOz_BoruHEytEMlQ.png' style="width: 600px;">

    <sub><sup>https://www.viewnext.com/transfer-learning-y-redes-convolucionales/</sup></sub> 
</center>

**Instructor**: Dr. Juan Bekios Calfa

**Degree**: MIA

# Introduction

For this implementation we will use **VGG-16**. The network performs well on this classification task and is faster to train than other models.

Using a pretrained model involves the following steps:

1.   Load the pretrained weights of a network that was trained on a large dataset.
2.   Freeze all weights in the lower (convolutional) layers: which layers to freeze depends on how similar the new task is to the original dataset.
3.   Replace the upper layers of the network with a custom classifier: the number of outputs must equal the number of classes.
4.   Train only the custom classifier layers for the task, thereby optimizing the model for smaller datasets.

# Load the data

In [42]:
import torch
import torchvision
from torchvision import transforms, datasets
import PIL

In [None]:
from google.colab import drive
drive.mount('/gdrive')
%cd /gdrive

In [None]:
# Training transform: resize, random flip for augmentation, ImageNet normalization
data_transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
    ])

# Evaluation transform: same preprocessing, but without random augmentation
eval_transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
    ])

# Load the images
gatos_perros_train = datasets.ImageFolder(root='/gdrive/My Drive/D-UCN/Classes/TecnicasAvanzadasAprendizajeAutomatico/Laboratorios/Laboratorio05.2:DeepLearning/dataset/training_set',
                                           transform=data_transform)
gatos_perros_valid = datasets.ImageFolder(root='/gdrive/My Drive/D-UCN/Classes/TecnicasAvanzadasAprendizajeAutomatico/Laboratorios/Laboratorio05.2:DeepLearning/dataset/valid_set',
                                           transform=eval_transform)
gatos_perros_test = datasets.ImageFolder(root='/gdrive/My Drive/D-UCN/Classes/TecnicasAvanzadasAprendizajeAutomatico/Laboratorios/Laboratorio05.2:DeepLearning/dataset/test_set',
                                           transform=eval_transform)

# Training set
train_loader = torch.utils.data.DataLoader(gatos_perros_train,
                                             batch_size=32, shuffle=True,
                                             num_workers=2)

# Validation set
valid_loader = torch.utils.data.DataLoader(gatos_perros_valid,
                                             batch_size=32, shuffle=False,
                                             num_workers=2)

# Test set
test_loader = torch.utils.data.DataLoader(gatos_perros_test,
                                             batch_size=32, shuffle=False,
                                             num_workers=2)
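The `Normalize` step in the transform above subtracts a per-channel mean and divides by a per-channel standard deviation (the values used are the usual ImageNet statistics). The following is a minimal sketch of that arithmetic in plain tensor operations; the constant-valued `image` is a hypothetical stand-in for a real picture, not data from this lab.

```python
import torch

# Per-channel statistics, reshaped so they broadcast over height and width
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

# Hypothetical 3x2x2 image with every pixel at 0.5 (ToTensor scales pixels to [0, 1])
image = torch.full((3, 2, 2), 0.5)

# Same computation Normalize applies: (x - mean) / std, channel by channel
normalized = (image - mean) / std
print(normalized[:, 0, 0])
```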

In [None]:
trainiter = iter(train_loader)
features, labels = next(trainiter)
features.shape, labels.shape

In [None]:
print(f'There are {len(gatos_perros_train.classes)} classes: {gatos_perros_train.classes}')

# 1. Load the pretrained weights

In [None]:
import torchvision.models as models

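# Load VGG-16 with ImageNet weights (torchvision >= 0.13; older releases use pretrained=True)
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)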

model.classifier

# 2. Freeze all weights of the convolutional layers

In [None]:
# Freeze every pretrained weight in the model
for param in model.parameters():
    param.requires_grad = False

# 3. Replace the upper layers of the network with a custom classifier

We add our own custom classifier with the following layers:

* Fully connected layer with ReLU activation, shape = (n_inputs, 256)
* Dropout with 40% probability
* Fully connected layer with log softmax output, shape = (256, n_classes)

In [None]:
import torch.nn as nn

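# Input size of the layer being replaced (4096 for VGG-16)
n_inputs = model.classifier[6].in_features
n_classes = 2
# Replace the last fully connected layer with the new classifier
model.classifier[6] = nn.Sequential(
    nn.Linear(n_inputs, 256),
    nn.ReLU(),
    nn.Dropout(0.4),
    nn.Linear(256, n_classes),
    nn.LogSoftmax(dim=1))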

model
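As a quick sanity check of the classifier described above, the head can be built on its own and fed a random batch. This is only a sketch: the batch of random features is hypothetical and stands in for the 4096-dimensional activations that VGG-16 would produce. It checks that the output has one column per class and that exponentiating the log softmax output gives probabilities that sum to 1.

```python
import torch
import torch.nn as nn

n_inputs, n_classes = 4096, 2

# Same layer layout as the custom classifier, built standalone
head = nn.Sequential(
    nn.Linear(n_inputs, 256),
    nn.ReLU(),
    nn.Dropout(0.4),
    nn.Linear(256, n_classes),
    nn.LogSoftmax(dim=1),
)
head.eval()  # disable dropout for a deterministic forward pass

# Hypothetical batch of 3 feature vectors (random values, not real images)
batch = torch.randn(3, n_inputs)
with torch.no_grad():
    log_probs = head(batch)

# exp(log softmax) recovers class probabilities
probs = log_probs.exp()
print(log_probs.shape, probs.sum(dim=1))
```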

When additional layers are added to the model, they are trainable by default (`requires_grad=True`). For VGG-16 we only replace the last original fully connected layer. All weights in the convolutional layers and in the first two fully connected layers (classifier modules 0 to 5) stay frozen.
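The freeze-then-replace behavior can be illustrated without downloading VGG-16. The sketch below uses a tiny hypothetical network (`toy`, not part of this lab) as a stand-in for the pretrained model: every parameter is frozen, the last layer is swapped for a fresh one, and only the new layer ends up trainable.

```python
import torch.nn as nn

# Hypothetical stand-in for a pretrained network (not VGG-16)
toy = nn.Sequential(
    nn.Linear(8, 4),  # plays the role of the frozen pretrained layers
    nn.ReLU(),
    nn.Linear(4, 3),  # plays the role of the original output layer
)

# Freeze every existing parameter
for param in toy.parameters():
    param.requires_grad = False

# A newly created layer has requires_grad=True by default
toy[2] = nn.Linear(4, 2)

trainable = [name for name, p in toy.named_parameters() if p.requires_grad]
frozen = [name for name, p in toy.named_parameters() if not p.requires_grad]
print('trainable:', trainable)
print('frozen:', frozen)
```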

In [None]:
# Only classifier[6] will be trained
model.classifier

Count the parameters of the final network

In [None]:
# Total number of parameters and number of trainable parameters
total_params = sum(p.numel() for p in model.parameters())
print(f'{total_params:,} total parameters in the network.')
total_trainable_params = sum(
    p.numel() for p in model.parameters() if p.requires_grad)
print(f'{total_trainable_params:,} trainable parameters.')
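The trainable count can also be worked out by hand, since a fully connected layer with `n_in` inputs and `n_out` outputs holds `n_in * n_out` weights plus `n_out` biases. The sketch below applies that formula to the two linear layers of the custom classifier (4096 to 256, then 256 to 2) and compares it with what PyTorch reports; `linear_param_count` is a helper introduced here for illustration.

```python
import torch.nn as nn

def linear_param_count(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

# Hand calculation for the two linear layers of the custom classifier
expected = linear_param_count(4096, 256) + linear_param_count(256, 2)

# The same count obtained from actual PyTorch layers
layers = [nn.Linear(4096, 256), nn.Linear(256, 2)]
counted = sum(p.numel() for layer in layers for p in layer.parameters())

print(f'{expected:,} expected, {counted:,} counted')
```

ReLU, dropout and log softmax have no parameters, so this figure is what the trainable count should show when only the new classifier is unfrozen.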

# 4. Train only the custom classifier layers for the task

## 4.1 Move the model to the GPU

In [None]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = model.to(device)

## 4.2 Optimization

In [None]:
from torch import optim
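# Define the loss function and the optimizer
# NLLLoss expects log probabilities, which the LogSoftmax output layer provides
criterion = nn.NLLLoss()
# Hand the optimizer only the parameters that are still trainable
optimizer = optim.Adam([p for p in model.parameters() if p.requires_grad])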
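`NLLLoss` is paired with the `LogSoftmax` output layer because it expects log probabilities as input. As a small sketch of that relationship, the snippet below checks on hypothetical random logits that `LogSoftmax` followed by `NLLLoss` gives the same value as `CrossEntropyLoss` applied to the raw logits.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical raw scores for 4 examples and 2 classes
logits = torch.randn(4, 2)
targets = torch.tensor([0, 1, 1, 0])

# Route used in this notebook: log softmax output + negative log likelihood
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, targets)

# Equivalent single-step loss on raw logits
ce = nn.CrossEntropyLoss()(logits, targets)

print(nll.item(), ce.item())
```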

## 4.3 Training function

In [None]:
def train(model,
          criterion,
          optimizer,
          train_loader,
          valid_loader,
          save_file_name,
          max_epochs_stop=3,
          n_epochs=20,
          print_every=2):
    """Train a PyTorch Model

    Params
    --------
        model (PyTorch model): cnn to train
        criterion (PyTorch loss): objective to minimize
        optimizer (PyTorch optimizer): optimizer that updates the model parameters
        train_loader (PyTorch dataloader): training dataloader to iterate through
        valid_loader (PyTorch dataloader): validation dataloader used for early stopping
        save_file_name (str ending in '.pt'): file path to save the model state dict
        max_epochs_stop (int): maximum number of epochs with no improvement in validation loss for early stopping
        n_epochs (int): maximum number of training epochs
        print_every (int): frequency of epochs to print training stats

    Returns
    --------
        model (PyTorch model): trained cnn with best weights
        history (DataFrame): history of train and validation loss and accuracy
    """

    # Early stopping initialization
    epochs_no_improve = 0
    # np.Inf was removed in NumPy 2.0; np.inf works on all versions
    valid_loss_min = np.inf

    valid_max_acc = 0
    history = []

    # Number of epochs already trained (if using loaded in model weights)
    try:
        print(f'Model has been trained for: {model.epochs} epochs.\n')
    except AttributeError:
        # The model has no epoch counter yet, so this is its first training run
        model.epochs = 0
        print('Starting Training from Scratch.\n')

    overall_start = timer()

    # Main loop
    for epoch in range(n_epochs):

        # keep track of training and validation loss each epoch
        train_loss = 0.0
        valid_loss = 0.0

        train_acc = 0
        valid_acc = 0

        # Set to training
        model.train()
        start = timer()

        # Training loop
        for ii, (data, target) in enumerate(train_loader):
            # Move tensors to the same device as the model
            device = next(model.parameters()).device
            data, target = data.to(device), target.to(device)
            # Clear gradients
            optimizer.zero_grad()
            # Predicted outputs are log probabilities
            output = model(data)

            # Loss and backpropagation of gradients
            loss = criterion(output, target)
            loss.backward()

            # Update the parameters
            optimizer.step()

            # Track train loss by multiplying average loss by number of examples in batch
            train_loss += loss.item() * data.size(0)

            # Calculate accuracy by finding max log probability
            _, pred = torch.max(output, dim=1)
            correct_tensor = pred.eq(target.data.view_as(pred))
            # Need to convert correct tensor from int to float to average
            accuracy = torch.mean(correct_tensor.type(torch.FloatTensor))
            # Multiply average accuracy times the number of examples in batch
            train_acc += accuracy.item() * data.size(0)

            # Track training progress
            print(
                f'Epoch: {epoch}\t{100 * (ii + 1) / len(train_loader):.2f}% complete. {timer() - start:.2f} seconds elapsed in epoch.',
                end='\r')

        # After the training loop ends, run validation (for-else: runs because the loop never breaks)
        else:
            model.epochs += 1

            # Don't need to keep track of gradients
            with torch.no_grad():
                # Set to evaluation mode
                model.eval()

                # Validation loop
                for data, target in valid_loader:
                    # Move tensors to the same device as the model
                    device = next(model.parameters()).device
                    data, target = data.to(device), target.to(device)

                    # Forward pass
                    output = model(data)

                    # Validation loss
                    loss = criterion(output, target)
                    # Multiply average loss times the number of examples in batch
                    valid_loss += loss.item() * data.size(0)

                    # Calculate validation accuracy
                    _, pred = torch.max(output, dim=1)
                    correct_tensor = pred.eq(target.data.view_as(pred))
                    accuracy = torch.mean(
                        correct_tensor.type(torch.FloatTensor))
                    # Multiply average accuracy times the number of examples
                    valid_acc += accuracy.item() * data.size(0)

                # Calculate average losses
                train_loss = train_loss / len(train_loader.dataset)
                valid_loss = valid_loss / len(valid_loader.dataset)

                # Calculate average accuracy
                train_acc = train_acc / len(train_loader.dataset)
                valid_acc = valid_acc / len(valid_loader.dataset)

                history.append([train_loss, valid_loss, train_acc, valid_acc])

                # Print training and validation results
                if (epoch + 1) % print_every == 0:
                    print(
                        f'\nEpoch: {epoch} \tTraining Loss: {train_loss:.4f} \tValidation Loss: {valid_loss:.4f}'
                    )
                    print(
                        f'\t\tTraining Accuracy: {100 * train_acc:.2f}%\t Validation Accuracy: {100 * valid_acc:.2f}%'
                    )

                # Save the model if validation loss decreases
                if valid_loss < valid_loss_min:
                    # Save model
                    torch.save(model.state_dict(), save_file_name)
                    # Track improvement
                    epochs_no_improve = 0
                    valid_loss_min = valid_loss
                    valid_best_acc = valid_acc
                    best_epoch = epoch

                # Otherwise increment count of epochs with no improvement
                else:
                    epochs_no_improve += 1
                    # Trigger early stopping
                    if epochs_no_improve >= max_epochs_stop:
                        print(
                            f'\nEarly Stopping! Total epochs: {epoch}. Best epoch: {best_epoch} with loss: {valid_loss_min:.2f} and acc: {100 * valid_best_acc:.2f}%'
                        )
                        total_time = timer() - overall_start
                        print(
                            f'{total_time:.2f} total seconds elapsed. {total_time / (epoch+1):.2f} seconds per epoch.'
                        )

                        # Load the best state dict
                        model.load_state_dict(torch.load(save_file_name))
                        # Attach the optimizer
                        model.optimizer = optimizer

                        # Format history
                        history = pd.DataFrame(
                            history,
                            columns=[
                                'train_loss', 'valid_loss', 'train_acc',
                                'valid_acc'
                            ])
                        return model, history

    # Attach the optimizer
    model.optimizer = optimizer
    # Record overall time and print out stats
    total_time = timer() - overall_start
    print(
        f'\nBest epoch: {best_epoch} with loss: {valid_loss_min:.2f} and acc: {100 * valid_best_acc:.2f}%'
    )
    print(
        f'{total_time:.2f} total seconds elapsed. {total_time / (epoch + 1):.2f} seconds per epoch.'
    )
    # Format history
    history = pd.DataFrame(
        history,
        columns=['train_loss', 'valid_loss', 'train_acc', 'valid_acc'])
    return model, history
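The early stopping rule inside `train` can be isolated from the PyTorch details. The sketch below replays that rule over a hypothetical list of validation losses: the counter resets whenever the loss improves and training stops once `patience` consecutive epochs bring no improvement. The function name `early_stopping_epoch` and the loss values are illustrative assumptions, not part of the notebook.

```python
def early_stopping_epoch(valid_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses."""
    best_loss = float('inf')
    best_epoch = None
    epochs_no_improve = 0
    for epoch, loss in enumerate(valid_losses):
        if loss < best_loss:
            # Improvement: remember this epoch and reset the counter
            best_loss, best_epoch = loss, epoch
            epochs_no_improve = 0
        else:
            epochs_no_improve += 1
            if epochs_no_improve >= patience:
                # Too many epochs without improvement: stop here
                return epoch, best_epoch
    # Ran through every epoch without triggering early stopping
    return len(valid_losses) - 1, best_epoch

# Hypothetical validation losses: the best value is at epoch 2
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(stop_epoch, best_epoch)
```

In this example training stops at epoch 5 and never reaches the lower loss at epoch 6, which is the usual trade-off of a small patience value.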

### Run the training

In [None]:
import numpy as np
import pandas as pd
from timeit import default_timer as timer


model, history = train(
    model,
    criterion,
    optimizer,
    train_loader,
    valid_loader,
    save_file_name='model-1.model',
    max_epochs_stop=5,
    n_epochs=30,
    print_every=2)

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(8, 6))
for c in ['train_loss', 'valid_loss']:
    plt.plot(
        history[c], label=c)
plt.legend()
plt.xlabel('Epoch')
plt.ylabel('Average Negative Log Likelihood')
plt.title('Training and Validation Losses')

In [None]:
plt.figure(figsize=(8, 6))
for c in ['train_acc', 'valid_acc']:
    plt.plot(
        100 * history[c], label=c)
plt.legend()
plt.xlabel('Epoch')
plt.ylabel('Average Accuracy')
plt.title('Training and Validation Accuracy')
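The notebook builds a `test_loader` but does not use it. One possible way to measure accuracy on held-out data is sketched below; `evaluate` is a hypothetical helper, not part of the original lab, and it is checked here on a toy linear model with hand-made data so the result can be verified by inspection.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader):
    """Return the fraction of examples in `loader` that `model` classifies correctly."""
    device = next(model.parameters()).device
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for data, target in loader:
            data, target = data.to(device), target.to(device)
            # The predicted class is the index of the largest output
            pred = model(data).argmax(dim=1)
            correct += (pred == target).sum().item()
            total += target.size(0)
    return correct / total

# Toy model that copies its input, so the prediction is the larger input coordinate
toy_model = nn.Linear(2, 2)
with torch.no_grad():
    toy_model.weight.copy_(torch.eye(2))
    toy_model.bias.zero_()

# Hypothetical data: the last label disagrees with the toy model on purpose
inputs = torch.tensor([[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [0.5, 3.0]])
labels = torch.tensor([0, 1, 0, 0])
toy_loader = DataLoader(TensorDataset(inputs, labels), batch_size=2)

toy_acc = evaluate(toy_model, toy_loader)
print(f'Toy accuracy: {100 * toy_acc:.2f}%')
```

With the trained network from this lab, the same helper would be called as `evaluate(model, test_loader)`.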