# Assignment 1 - Part 3 of 3

[Assignment statement](https://github.com/DiploDatos/AprendizajeProfundo/blob/master/Practico.md).

**Implementation of a [Multilayer Perceptron](https://en.wikipedia.org/wiki/Multilayer_perceptron) (MLP) neural network.**

## Team members
- Mauricio Caggia
- Luciano Monforte
- Gustavo Venchiarutti
- Guillermo Robiglio

In this third part we build the datasets and dataloaders, then train and test the model.

## ⚠ IMPORTANT ⚠

Please read the file [Practico_1.md](https://github.com/grobiglio/deepleaning/blob/master/practico/Practico_1.md#deep-learning---trabajo-pr%C3%A1ctico-1) in the repository that hosts this assignment.

## Imports

In [19]:
import mlflow
import tempfile
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from gensim import corpora
from tqdm.notebook import tqdm, trange
from sklearn.metrics import balanced_accuracy_score
from practico1_modulo import *

## Constants

In [45]:
EPOCHS = 3
BATCH_SIZE = 100

## Data loading

Loading the training data

In [3]:
X_train = torch.load('./data/X_train.pt')
y_train = torch.load('./data/y_train.pt')

In [4]:
# The training set is truncated temporarily.
# This cell will be removed once the pipeline is confirmed to work.
X_train = X_train[:1000000]
X_train.shape

torch.Size([1000000, 17])

In [5]:
# The training targets are truncated temporarily.
# This cell will be removed once the pipeline is confirmed to work.
y_train = y_train[:1000000]
y_train.shape

torch.Size([1000000])

Loading the test data

In [6]:
X_test = torch.load('./data/X_test.pt')
y_test = torch.load('./data/y_test.pt')

In [7]:
# The test set is truncated temporarily.
# This cell will be removed once the pipeline is confirmed to work.
X_test = X_test[:500000]
X_test.shape

torch.Size([500000, 16])

In [8]:
# The test targets are truncated temporarily.
# This cell will be removed once the pipeline is confirmed to work.
y_test = y_test[:500000]
y_test.shape

torch.Size([500000])

## Title embeddings

In [9]:
# https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html#torch.nn.Embedding
embeddings_matrix = torch.load('./data/embeddings_matrix.pt')
embeddings = nn.Embedding.from_pretrained(embeddings_matrix,
                                          padding_idx=0)
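
As a minimal sketch of what `nn.Embedding.from_pretrained` does, the following uses a toy 4 x 3 matrix (an assumption for illustration only; it is not the matrix stored in `embeddings_matrix.pt`). It shows that token indices are mapped to rows of the matrix and that the weights are frozen by default.

```python
import torch
import torch.nn as nn

# Toy pre-trained matrix: 4 tokens, 3 dimensions (illustrative values only).
toy_matrix = torch.tensor([[0.0, 0.0, 0.0],   # index 0: padding
                           [0.1, 0.2, 0.3],
                           [0.4, 0.5, 0.6],
                           [0.7, 0.8, 0.9]])
toy_embeddings = nn.Embedding.from_pretrained(toy_matrix, padding_idx=0)

# A batch of 2 titles, each padded to length 4 with index 0.
token_ids = torch.tensor([[1, 2, 0, 0],
                          [3, 1, 2, 0]])
vectors = toy_embeddings(token_ids)

print(vectors.shape)                        # (batch, sequence length, embedding size)
print(toy_embeddings.weight.requires_grad)  # from_pretrained freezes the weights by default
```

Passing `freeze=False` to `from_pretrained` would let the optimizer fine-tune the vectors; whether that helps on this dataset is not something this notebook tests.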

## Building the Dataset

In [10]:
train_dataset = MeLiChallengeDataset(X_train, y_train)
test_dataset = MeLiChallengeDataset(X_test, y_test)

train_loader = DataLoader(train_dataset,
                          batch_size=BATCH_SIZE,
                          shuffle=True,
                          drop_last=False)
i = 0
for data in tqdm(train_loader):
    i += 1
print(f'Successfully iterated over {i} training batches.')

test_loader = DataLoader(test_dataset,
                         batch_size=BATCH_SIZE,
                         shuffle=True,
                         drop_last=False)
i = 0
for data in tqdm(test_loader):
    i += 1
print(f'Successfully iterated over {i} test batches.')

  0%|          | 0/10000 [00:00<?, ?it/s]

Successfully iterated over 10000 training batches.


  0%|          | 0/5000 [00:00<?, ?it/s]

Successfully iterated over 5000 test batches.
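
The class `MeLiChallengeDataset` comes from `practico1_modulo` and is not shown in this notebook. The sketch below is a hypothetical stand-in (`ToyDictDataset` is an invented name) that returns dictionaries with the `'data'` and `'target'` keys the training code reads. The tensor sizes and the 632 categories are assumptions, the latter taken from a comment in the training function.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDictDataset(Dataset):
    """Hypothetical stand-in for MeLiChallengeDataset (not the real class)."""

    def __init__(self, data, target):
        self.data = data
        self.target = target

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # Each item is a dict; the default collate function stacks each key.
        return {'data': self.data[idx], 'target': self.target[idx]}


X_toy = torch.randint(0, 50, (250, 17))   # 250 fake titles of 17 token ids
y_toy = torch.randint(0, 632, (250,))     # fake category indices
toy_loader = DataLoader(ToyDictDataset(X_toy, y_toy), batch_size=100,
                        shuffle=True, drop_last=False)

# With drop_last=False the final, smaller batch is kept.
batch_shapes = [(b['data'].shape, b['target'].shape) for b in toy_loader]
print(len(toy_loader), batch_shapes[-1])
```

With `drop_last=False`, 250 samples and a batch size of 100 give three batches, the last one holding the remaining 50 samples.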


## Building the Model

[Long Short-Term Memory (LSTM) recurrent neural network](https://en.wikipedia.org/wiki/Long_short-term_memory)

In [22]:
class MeLiChallengeLSTM(nn.Module):
    def __init__(self, embeddings):
        super(MeLiChallengeLSTM, self).__init__()
        output_size = 1
        # Create the Embeddings layer and add pre-trained weights
        self.embeddings = embeddings
        
        # Set our LSTM parameters
        embedding_size = 300
        hidden_layer = 32
        num_layers = 1
        bias = True
        dropout = 0
        bidirectional = False
        self.lstm_config = {'input_size': embedding_size,
                            'hidden_size': hidden_layer,
                            'num_layers': num_layers,
                            'bias': bias,
                            'batch_first': True,
                            'dropout': dropout,
                            'bidirectional': bidirectional}
        
        # Set our fully connected layer parameters
        self.linear_config = {'in_features': hidden_layer,
                              'out_features': output_size,
                              'bias': bias}
        
        # Instantiate the layers
        self.lstm = nn.LSTM(**self.lstm_config)
        self.classification_layer = nn.Linear(**self.linear_config)
        self.activation = nn.Sigmoid()

    def forward(self, inputs):
        emb = self.embeddings(inputs)
        # print(emb.shape)
        lstm_out, _ = self.lstm(emb)
        # print(lstm_out.shape)
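        # Take the last time step of the LSTM output as a representation
        # of the entire text; indexing already yields (batch, hidden_size),
        # so no squeeze is needed (it would drop a batch dimension of 1)
        lstm_out = lstm_out[:, -1, :]
        # print(lstm_out.shape)
        predictions = self.activation(self.classification_layer(lstm_out))
        # print(predictions.shape)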
        return predictions

In [23]:
modelo = MeLiChallengeLSTM(embeddings)
print(modelo)

MeLiChallengeLSTM(
  (embeddings): Embedding(50002, 300, padding_idx=0)
  (lstm): LSTM(300, 32, batch_first=True)
  (classification_layer): Linear(in_features=32, out_features=1, bias=True)
  (activation): Sigmoid()
)
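
To make the tensor shapes in `forward` concrete, here is a small sketch with random inputs (the batch size of 100 and sequence length of 17 are assumptions chosen to match the constants and shapes printed earlier). It also checks that, for a single-layer unidirectional LSTM, the last time step of the output equals the final hidden state.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=300, hidden_size=32, num_layers=1, batch_first=True)

emb = torch.randn(100, 17, 300)          # (batch, sequence length, embedding size)
lstm_out, (h_n, c_n) = lstm(emb)

last_step = lstm_out[:, -1, :]           # already (batch, hidden_size)
print(lstm_out.shape, last_step.shape, h_n.shape)

# For a single-layer, unidirectional LSTM the last time step equals h_n[-1].
same_state = torch.allclose(last_step, h_n[-1], atol=1e-6)
print(same_state)
```

If the titles are padded at the end with index 0, the last time step is computed after the padding tokens, which may dilute the representation; this notebook does not check how the padding is laid out.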


## Optimization algorithm

In [46]:
learning_rate = 0.001
loss_function = nn.BCELoss()
optimizer = optim.Adam(modelo.parameters(), learning_rate)

In [25]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f'Using {device}')
modelo.to(device)

Using cuda


MeLiChallengeLSTM(
  (embeddings): Embedding(50002, 300, padding_idx=0)
  (lstm): LSTM(300, 32, batch_first=True)
  (classification_layer): Linear(in_features=32, out_features=1, bias=True)
  (activation): Sigmoid()
)

## Training and evaluating the model

In [48]:
def train(dataloader, model, loss_fn, optimizer):
    '''Train a neural network for one epoch.
    
    Parameters:
    -----------
    - dataloader: PyTorch DataLoader built on a MeLiChallengeDataset.
    - model: Instance of MeLiChallengeLSTM.
    - loss_fn: Loss function.
    - optimizer: Optimizer.
    
    Returns:
    --------
    train_loss: Mean of the loss over all batches.
    
    '''
    model.train()
    running_loss = []
    for data in tqdm(dataloader):
        X = data['data'].to(device)
        # as_tensor avoids the copy warning raised by torch.tensor(tensor)
        y = torch.as_tensor(data['target'], dtype=torch.float32, device=device)
        optimizer.zero_grad()
        pred = model(X)
        # squeeze(-1) drops only the trailing output dimension
        loss = loss_fn(pred.squeeze(-1), y)
        # pred: dimension 100 x 632
        # y: dimension 100
        # That is why this does not work 😡 How can it be fixed? 😡
        loss.backward()
        optimizer.step()
        running_loss.append(loss.item())

    return sum(running_loss) / len(running_loss)
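
The comment inside `train` asks how to reconcile a 100 x 632 prediction with 100 targets. One common option, sketched here as an assumption and not as the authors' solution, is to output one logit per category and use `nn.CrossEntropyLoss`, which takes logits of shape (batch, classes) and integer class indices of shape (batch). `ToyLSTMClassifier`, the vocabulary size and the random data are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
NUM_CLASSES = 632   # assumed from the "100 x 632" comment above
BATCH = 100


class ToyLSTMClassifier(nn.Module):
    """Hypothetical multiclass variant; names and sizes are illustrative."""

    def __init__(self, vocab_size=1000, embedding_size=300, hidden_size=32):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_size, padding_idx=0)
        self.lstm = nn.LSTM(embedding_size, hidden_size, batch_first=True)
        self.classification_layer = nn.Linear(hidden_size, NUM_CLASSES)

    def forward(self, inputs):
        lstm_out, _ = self.lstm(self.embeddings(inputs))
        # Return raw logits: CrossEntropyLoss applies log-softmax internally.
        return self.classification_layer(lstm_out[:, -1, :])


toy_model = ToyLSTMClassifier()
toy_loss_fn = nn.CrossEntropyLoss()
toy_optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.001)

X = torch.randint(1, 1000, (BATCH, 17))        # fake token ids
y = torch.randint(0, NUM_CLASSES, (BATCH,))    # class indices, dtype int64

toy_optimizer.zero_grad()
logits = toy_model(X)                          # (100, 632)
loss = toy_loss_fn(logits, y)                  # targets stay as integer indices
loss.backward()
toy_optimizer.step()

predicted_classes = logits.argmax(dim=1)       # one predicted category per title
print(logits.shape, predicted_classes.shape, loss.item())
```

In this variant there is no sigmoid and no rounding: the predicted category is the index of the largest logit, and the targets are not cast to `float32`.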

In [49]:
def test(dataloader, model, loss_fn):
    '''Evaluate a neural network.
    
    Parameters:
    -----------
    - dataloader: PyTorch DataLoader built on a MeLiChallengeDataset.
    - model: Instance of MeLiChallengeLSTM.
    - loss_fn: Loss function.
    
    Returns:
    --------
    - test_loss: Mean of the loss over all batches.
    - avp: Balanced accuracy.
    '''
    model.eval()
    running_loss = []
    targets = []
    predictions = []
    with torch.no_grad():
        for data in tqdm(dataloader):
            X = data['data'].to(device)
            y = torch.as_tensor(data['target'], dtype=torch.float32, device=device)
            # Use the local pred and loss_fn instead of undefined or global names
            pred = model(X).squeeze(-1)
            running_loss.append(loss_fn(pred, y).item())
            targets.extend(y.cpu().numpy())
            predictions.extend(pred.round().cpu().numpy())

    test_loss = sum(running_loss) / len(running_loss)
    avp = balanced_accuracy_score(targets, predictions)

    return test_loss, avp
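
Balanced accuracy is the mean of the per-class recall. The pure-Python sketch below (the function name and the toy labels are illustrative; it is not scikit-learn's implementation) shows what that metric yields when a model always predicts the same class in a problem assumed to have 632 categories.

```python
from collections import defaultdict


def balanced_accuracy(targets, predictions):
    """Mean of per-class recall, over the classes present in targets."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(targets, predictions):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


NUM_CLASSES = 632                                             # assumed number of categories
targets = [c for c in range(NUM_CLASSES) for _ in range(3)]   # 3 samples per class
constant_predictions = [1] * len(targets)                     # a model that always answers 1

score = balanced_accuracy(targets, constant_predictions)
print(score, 1 / NUM_CLASSES)
```

Only class 1 has a recall of 1 and every other class has a recall of 0, so the score is 1/632, roughly 0.00158.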

In [50]:
history = {
    'train_loss': [],
    'test_loss': [],
    'test_avp': []
}

for t in range(EPOCHS):
    print(f"Epoch {t+1}\n-------------------------------")
    train_loss = train(train_loader, modelo, loss_function, optimizer)
    print("\t Final train_loss", train_loss)
    history['train_loss'].append(train_loss)
    test_loss, avp = test(test_loader, modelo, loss_function)
    print("\t Final test_loss", test_loss)
    print("\t Final test_avp", avp)
    history['test_loss'].append(test_loss)
    history['test_avp'].append(avp)
print("Done!")

Epoch 1
-------------------------------

  0%|          | 0/10000 [00:00<?, ?it/s]

	 Final train_loss -31227.0784

  0%|          | 0/5000 [00:00<?, ?it/s]

	 Final test_loss -31219.0474
	 Final test_avp 0.0015822784810126582
Epoch 2
-------------------------------

  0%|          | 0/10000 [00:00<?, ?it/s]

	 Final train_loss -31227.0784

  0%|          | 0/5000 [00:00<?, ?it/s]

	 Final test_loss -31219.0474
	 Final test_avp 0.0015822784810126582
Epoch 3
-------------------------------

  0%|          | 0/10000 [00:00<?, ?it/s]

	 Final train_loss -31227.0784

  0%|          | 0/5000 [00:00<?, ?it/s]

	 Final test_loss -31219.0474
	 Final test_avp 0.0015822784810126582
Done!


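## Why do we suspect it does not work?

A full explanation is still pending, but the logs above point to a mismatch between the model and the task:

- `nn.BCELoss` cannot be negative when the targets lie in [0, 1], yet both losses are around -31,220. This suggests the targets are category indices rather than binary labels.
- The losses and the balanced accuracy are identical across the three epochs, so the model is not learning.
- The balanced accuracy, 0.00158..., matches 1/632, which is consistent with every sample being assigned to the same class in a 632-class problem.

The model produces a single sigmoid output per title, whereas the comment in `train` indicates that the task has 632 categories.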
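
The constant, negative loss in the logs can be reproduced with a few lines of arithmetic. The PyTorch documentation states that `nn.BCELoss` clamps its log terms to be at least -100. The sketch below mimics that rule in pure Python (`clamped_bce` is an illustrative helper, not PyTorch code) under the assumption that the sigmoid output has saturated at 1 and that the targets are class indices.

```python
import math


def clamped_bce(p, y, log_floor=-100.0):
    """Per-sample binary cross-entropy with the log terms clamped from below."""
    log_p = max(math.log(p), log_floor) if p > 0 else log_floor
    log_1mp = max(math.log(1 - p), log_floor) if p < 1 else log_floor
    return -(y * log_p + (1 - y) * log_1mp)


# Valid binary targets give a non-negative loss.
valid_losses = (clamped_bce(0.9, 1.0), clamped_bce(0.9, 0.0))
print(valid_losses)

# A saturated output (p = 1) with class indices used as targets.
class_indices = [0, 1, 200, 313, 631]      # illustrative category indices
losses = [clamped_bce(1.0, float(y)) for y in class_indices]
mean_loss = sum(losses) / len(losses)
mean_target = sum(class_indices) / len(class_indices)

print(losses)
print(mean_loss, 100 * (1 - mean_target))
```

With a saturated output, each sample contributes 100 · (1 − y), so the mean loss is 100 · (1 − mean target). Under that assumption, a loss near -31227 would correspond to a mean class index of about 313, which is plausible for indices in the range 0 to 631. A saturated sigmoid also yields gradients close to zero, which would explain why the loss does not change between epochs.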