
# Building a convolutional network with PyTorch

## Libraries

In [1]:
import gzip
import mlflow
import numpy
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import tempfile

from gensim import corpora
from gensim.parsing import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, average_precision_score,confusion_matrix
from tqdm.notebook import tqdm, trange
from torch.utils.data import Dataset, DataLoader

## Convolutional network for images

### CIFAR-10 data

We use the same data as in [notebook 1](https://github.com/DiploDatos/AprendizajeProfundo/blob/master/1_basic_mlp.ipynb).

In [2]:
CIFAR_CLASSES = ('plane', 'car', 'bird', 'cat', 'deer', 
                 'dog', 'frog', 'horse', 'ship', 'truck')
BATCH_SIZE = 128
EPOCHS = 2
transform = transforms.Compose( # Chains all the transformations we request
    [transforms.ToTensor(),      # here: convert to tensors and normalize the data
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
## Train
trainset = torchvision.datasets.CIFAR10(root='./data', 
                                        train=True,
                                        download=True, 
                                        transform=transform)

trainloader = torch.utils.data.DataLoader(trainset, 
                                          batch_size=BATCH_SIZE,
                                          shuffle=True, 
                                          num_workers=2)
## Test
testset = torchvision.datasets.CIFAR10(root='./data', 
                                       train=False,
                                       download=True, 
                                       transform=transform)

testloader = torch.utils.data.DataLoader(testset, 
                                         batch_size=BATCH_SIZE,
                                         shuffle=False, 
                                         num_workers=2)

Files already downloaded and verified
Files already downloaded and verified
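
`transforms.ToTensor()` scales pixel values to the range [0, 1], and `transforms.Normalize(mean, std)` then computes `(x - mean) / std` per channel. With mean and standard deviation both set to 0.5, the values end up in [-1, 1]. A minimal pure-Python sketch of that arithmetic (the `normalize` helper is hypothetical and only mirrors the formula, it is not part of torchvision):

```python
def normalize(value, mean=0.5, std=0.5):
    """Same per-value formula that transforms.Normalize applies: (x - mean) / std."""
    return (value - mean) / std


# Darkest, middle and brightest pixel values after ToTensor().
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```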


### Convolutional network

- A convolutional network is built by stacking [`torch.nn.Conv2d`](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html) layers. 
    - Unlike a linear layer, which works on flattened vectors, this kind of layer takes multi-channel 2D inputs (images). Each layer defines its input channels, its output channels and the kernel size (i.e. the window).
- [`torch.nn.MaxPool2d`](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html) layers, which apply max pooling in 2 dimensions, are also common. 
- The network ends with a few linear layers [`torch.nn.Linear`](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html?highlight=nn%20linear#torch.nn.Linear) that map it to the 10 output dimensions representing the classes.

**Helper functions**:
-  [`Tensor.view`](https://pytorch.org/docs/stable/generated/torch.Tensor.view.html?highlight=view#torch.Tensor.view): Similar to numpy's [`reshape()`](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html). It returns a tensor with a different shape that shares the original data, without copying it or changing its values. Passing `-1` for a dimension tells PyTorch to infer its size from the remaining ones, e.g. when we know the number of columns but not the number of rows.
- [`Optimizer.zero_grad`](https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.zero_grad.html#:~:text=Sets%20the%20gradients%20of%20all,set%20the%20grads%20to%20None.): Gradients accumulate across calls to `backward()`, so in every training step we need to reset them to avoid counting them "twice". For more detail see this [link](https://stackoverflow.com/questions/48001598/why-do-we-need-to-call-zero-grad-in-pytorch#:~:text=384,of%20maximization%20objectives).
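
The two helpers can be tried in isolation. The sketch below uses toy tensors (the shapes and the scalar `w` are assumptions chosen for illustration): it first flattens a batch with `view(-1, ...)` and shows that the result shares storage with the original, then shows how gradients add up across `backward()` calls until they are reset.

```python
import torch

# Tensor.view: -1 lets PyTorch infer that dimension from the others.
x = torch.zeros(4, 16, 5, 5)      # a toy batch of 4 feature maps
flat = x.view(-1, 16 * 5 * 5)     # shape (4, 400)
flat[0, 0] = 1.0                  # the view shares storage with x
print(flat.shape, x[0, 0, 0, 0].item())

# Gradients accumulate across backward() calls unless they are reset.
w = torch.tensor(2.0, requires_grad=True)
(3 * w).backward()
(3 * w).backward()
accumulated = w.grad.item()       # 3 + 3
w.grad.zero_()                    # manual reset; optimizer.zero_grad() resets every parameter
(3 * w).backward()
after_reset = w.grad.item()       # back to 3
print(accumulated, after_reset)
```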

In [3]:
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5) # (input channels, output channels, kernel size)
        self.pool = nn.MaxPool2d(2, 2) # (kernel_size, stride)
        self.conv2 = nn.Conv2d(6, 16, 5) # (input channels, output channels, kernel size)
        self.fc1 = nn.Linear(16 * 5 * 5, 120) # (input, output)
        self.fc2 = nn.Linear(120, 84) # (input, output)
        self.fc3 = nn.Linear(84, 10) # (input, output)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x))) # ReLU on the first convolutional layer, then pooling
        x = self.pool(F.relu(self.conv2(x))) # ReLU on the second convolutional layer, then pooling
        x = x.view(-1, 16 * 5 * 5) # Flatten to a vector so it can feed the linear layer
        x = F.relu(self.fc1(x)) # Finally, ReLU after each hidden linear layer
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

model = CNN()
print(model)

CNN(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
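
The `16 * 5 * 5` input size of `fc1` follows from the 32x32 CIFAR-10 images and the layer settings printed above. For a layer with no dilation, the output size is `(size + 2 * padding - kernel) // stride + 1`. A small sketch of that bookkeeping (the `out_size` helper is hypothetical, written only for this check):

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32                                  # CIFAR-10 images are 32x32
size = out_size(size, kernel=5)            # conv1: 32 -> 28
size = out_size(size, kernel=2, stride=2)  # pool:  28 -> 14
size = out_size(size, kernel=5)            # conv2: 14 -> 10
size = out_size(size, kernel=2, stride=2)  # pool:  10 -> 5
flat_features = 16 * size * size           # 16 output channels of conv2
print(size, flat_features)
```

Changing the kernel sizes or adding padding changes the final spatial size, so the first linear layer has to be updated accordingly.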


### Training

The network is trained just like the multilayer perceptron, except that this time the input does not need to be flattened.

In [4]:
loss_function = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-5)
model.train()
iters_per_epoch = len(trainloader)
for epoch in trange(EPOCHS):  # loop over the dataset multiple times
    pbar = tqdm(trainloader, desc="Train loss: NaN")
    for data in pbar:
        inputs, labels = data
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()
        pbar.set_description(f"Train loss: {loss.item():.3f}")

### Evaluation

Once again, evaluation is similar to the multilayer perceptron case.

In [5]:
y_true = []
y_pred = []
with torch.no_grad():
    for data in tqdm(testloader):
        inputs, labels = data
        outputs = model(inputs)
        _, predicted = torch.max(outputs.data, 1)
        y_true.extend(labels.numpy())
        y_pred.extend(predicted.numpy())

print(classification_report(y_true, y_pred, target_names=CIFAR_CLASSES))

              precision    recall  f1-score   support

       plane       0.53      0.62      0.57      1000
         car       0.66      0.61      0.64      1000
        bird       0.43      0.42      0.42      1000
         cat       0.35      0.40      0.37      1000
        deer       0.48      0.25      0.33      1000
         dog       0.50      0.35      0.42      1000
        frog       0.60      0.55      0.57      1000
       horse       0.53      0.64      0.58      1000
        ship       0.48      0.72      0.57      1000
       truck       0.57      0.52      0.54      1000

    accuracy                           0.51     10000
   macro avg       0.51      0.51      0.50     10000
weighted avg       0.51      0.51      0.50     10000



In [6]:
df = pd.DataFrame((confusion_matrix(y_true, y_pred)),columns=list(CIFAR_CLASSES),index=list(CIFAR_CLASSES))

In [7]:
import seaborn as sns
cm = sns.light_palette("green", as_cmap=True)
df.style.background_gradient(cmap=cm).format(precision=2)  # Styler.set_precision() was removed in pandas 2.0


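| true \ predicted | plane | car | bird | cat | deer | dog | frog | horse | ship | truck |
|---|---|---|---|---|---|---|---|---|---|---|
| plane | 624 | 32 | 39 | 17 | 9 | 6 | 15 | 20 | 208 | 30 |
| car | 60 | 615 | 3 | 18 | 0 | 2 | 9 | 12 | 150 | 131 |
| bird | 98 | 23 | 418 | 105 | 73 | 81 | 61 | 69 | 49 | 23 |
| cat | 50 | 16 | 74 | 398 | 30 | 139 | 90 | 95 | 50 | 58 |
| deer | 70 | 16 | 218 | 86 | 250 | 33 | 118 | 159 | 36 | 14 |
| dog | 23 | 11 | 89 | 264 | 31 | 354 | 33 | 137 | 40 | 18 |
| frog | 16 | 13 | 99 | 127 | 68 | 20 | 547 | 33 | 32 | 45 |
| horse | 35 | 8 | 27 | 81 | 50 | 63 | 23 | 639 | 32 | 42 |
| ship | 145 | 50 | 6 | 18 | 5 | 5 | 4 | 10 | 719 | 38 |
| truck | 58 | 143 | 6 | 24 | 1 | 2 | 18 | 32 | 195 | 521 |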
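
One way to read the confusion matrix is to compute the recall of each class and find the class it is most often mistaken for. The sketch below copies the values from the matrix above and assumes the usual scikit-learn orientation (rows are true classes, columns are predictions); the variable names are hypothetical.

```python
classes = ["plane", "car", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]
# Rows are true classes, columns are predicted classes.
matrix = [
    [624, 32, 39, 17, 9, 6, 15, 20, 208, 30],
    [60, 615, 3, 18, 0, 2, 9, 12, 150, 131],
    [98, 23, 418, 105, 73, 81, 61, 69, 49, 23],
    [50, 16, 74, 398, 30, 139, 90, 95, 50, 58],
    [70, 16, 218, 86, 250, 33, 118, 159, 36, 14],
    [23, 11, 89, 264, 31, 354, 33, 137, 40, 18],
    [16, 13, 99, 127, 68, 20, 547, 33, 32, 45],
    [35, 8, 27, 81, 50, 63, 23, 639, 32, 42],
    [145, 50, 6, 18, 5, 5, 4, 10, 719, 38],
    [58, 143, 6, 24, 1, 2, 18, 32, 195, 521],
]

recall = {}
top_confusion = {}
for i, name in enumerate(classes):
    row = matrix[i]
    # Recall: correctly predicted samples over all samples of this class.
    recall[name] = row[i] / sum(row)
    # Largest off-diagonal entry: the class this one is most often mistaken for.
    j = max((k for k in range(len(classes)) if k != i), key=lambda k: row[k])
    top_confusion[name] = (classes[j], row[j])

for name in classes:
    wrong, count = top_confusion[name]
    print(f"{name:>5}: recall={recall[name]:.2f}, most often predicted as {wrong} ({count})")
```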


**Exercise:** Analyze the results above. Is it worth making changes to the model?

## CNNs for text

### IMDB data

As with the CNN for images, we go back to a dataset we have already used: the IMDB reviews. This time we compare against the multilayer perceptron model that uses the mean of the embeddings.

In [8]:
class IMDBReviewsDataset(Dataset):
    def __init__(self, dataset, transform=None):
        self.dataset = dataset
        self.transform = transform
    
    def __len__(self):
        return self.dataset.shape[0]

    def __getitem__(self, item):
        if torch.is_tensor(item):
            item = item.tolist()
        
        item = {
            "data": self.dataset.loc[item, "review"],
            "target": self.dataset.loc[item, "sentiment"]
        }
        
        if self.transform:
            item = self.transform(item)
        
        return item

### Preprocessing

We apply the same kind of preprocessing.

**Some helper functions:**
-  [`gensim.parsing.preprocessing.preprocess_string`](https://radimrehurek.com/gensim/parsing/preprocessing.html#:~:text=gensim.parsing.preprocessing.preprocess_string(s%2C%20filters%3D%5B%3Cfunction%20%3Clambda%3E%3E%2C%20%3Cfunction%20strip_tags%3E%2C%20%3Cfunction%20strip_punctuation%3E%2C%20%3Cfunction%20strip_multiple_whitespaces%3E%2C%20%3Cfunction%20strip_numeric%3E%2C%20%3Cfunction%20remove_stopwords%3E%2C%20%3Cfunction%20strip_short%3E%2C%20%3Cfunction%20stem_text%3E%5D)): Apply list of chosen filters to a string.

- [`corpora.Dictionary`](https://radimrehurek.com/gensim/corpora/dictionary.html#:~:text=word%3C%2D%3Eid%20mappings-,corpora.dictionary%20%E2%80%93%20Construct%20word%3C%2D%3Eid%20mappings,of%20a%20Dictionary%20%E2%80%93%20a%20mapping%20between%20words%20and%20their%20integer%20ids.,-class): This module implements the concept of a Dictionary – a mapping between words and their integer ids.

-  [`Dictionary.filter_extremes`](https://radimrehurek.com/gensim/corpora/dictionary.html?highlight=filter_extremes#gensim.corpora.dictionary.Dictionary.filter_extremes:~:text=filter_extremes(no_below%3D5%2C%20no_above%3D0.5%2C%20keep_n%3D100000%2C%20keep_tokens%3DNone))

-  [`Dictionary.compactify`](https://radimrehurek.com/gensim/corpora/dictionary.html?highlight=compactify#gensim.corpora.dictionary.Dictionary.compactify:~:text=compactify(),shrinking%20any%20gaps.)

- [`Dictionary.patch_with_special_tokens`](https://radimrehurek.com/gensim/corpora/dictionary.html?highlight=patch_with_special_tokens#gensim.corpora.dictionary.Dictionary.patch_with_special_tokens:~:text=patch_with_special_tokens(special_token_dict)): Patch token2id and id2token using a dictionary of special tokens.

- [`Dictionary.doc2idx`](https://radimrehurek.com/gensim/corpora/dictionary.html#:~:text=doc2idx(document%2C%20unknown_word_index%3D%2D1)): Convert document (a list of words) into a list of indexes = list of token_id. Replace all unknown words i.e, words not in the dictionary with the index as set via unknown_word_index.

- [`__call__`](https://www.geeksforgeeks.org/__call__-in-python/): Lets instances of a class be called like functions.
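
To make the role of `__call__` and of a filter chain concrete, here is a small pure-Python sketch. The `Pipeline` class and the two filter functions are hypothetical stand-ins written for illustration; they only imitate the idea of applying string filters in order and do not reproduce gensim's implementation.

```python
import string


class Pipeline:
    """Hypothetical stand-in: applies a list of string filters in order when called."""

    def __init__(self, filters):
        self.filters = filters

    def __call__(self, text):
        # Because of __call__, an instance can be used like a function.
        for apply_filter in self.filters:
            text = apply_filter(text)
        return text.split()


def strip_punctuation(text):
    """Replace every punctuation character with a space."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return text.translate(table)


def strip_short(text, minsize=3):
    """Drop words shorter than minsize characters."""
    return " ".join(word for word in text.split() if len(word) >= minsize)


to_tokens = Pipeline([str.lower, strip_punctuation, strip_short])
tokens = to_tokens("An OK movie, but it is too long!")
print(tokens)
```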

In [9]:
class RawDataProcessor:
    def __init__(self, 
                 dataset, 
                 ignore_header=True, 
                 filters=None, 
                 vocab_size=50000):
        if filters:
            self.filters = filters
        else:
            self.filters = [ #We set some filters
                lambda s: s.lower(),
                preprocessing.strip_tags,
                preprocessing.strip_punctuation,
                preprocessing.strip_multiple_whitespaces,
                preprocessing.strip_numeric,
                preprocessing.remove_stopwords,
                preprocessing.strip_short,
            ]
        # Create dictionary based on all the reviews (with corresponding preprocessing and filters)
        # The dictionary has idx as a key and word as a value
        # For example one element could be (0, 'accustomed') or (1, 'agenda')
        self.dictionary = corpora.Dictionary(
            dataset["review"].map(self._preprocess_string).tolist()
        )
        
        # Filter the dictionary and compactify it (make the indices contiguous)
        self.dictionary.filter_extremes(no_below=2, no_above=1, keep_n=vocab_size)
        self.dictionary.compactify()
        # Add a couple of special tokens
        self.dictionary.patch_with_special_tokens({
            "[PAD]": 0, #The padding token
            "[UNK]": 1  # The unknown token
        })
        
        self.idx_to_target = sorted(dataset["sentiment"].unique())
        self.target_to_idx = {t: i for i, t in enumerate(self.idx_to_target)}
        

    def _preprocess_string(self, string):
        return preprocessing.preprocess_string(string, filters=self.filters)

    def _sentence_to_indices(self, sentence):
        return self.dictionary.doc2idx(sentence, unknown_word_index=1)
    
    def encode_data(self, data):
        return self._sentence_to_indices(self._preprocess_string(data))
    
    def encode_target(self, target):
        return self.target_to_idx[target]
    
    def __call__(self, item):
        # Encode both the data and the targets, handling single strings and collections of strings separately
        if isinstance(item["data"], str): 
            data = self.encode_data(item["data"])
        else:
            data = [self.encode_data(d) for d in item["data"]]
        
        if isinstance(item["target"], str):
            target = self.encode_target(item["target"])
        else:
            target = [self.encode_target(t) for t in item["target"]]
        
        return {
            "data": data,
            "target": target
        }
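
The encoding step can be pictured with a toy vocabulary. The sketch below is a pure-Python stand-in (the `token2id` mapping and the `doc_to_indices` helper are hypothetical, not gensim objects): known tokens map to their ids, anything else maps to the `[UNK]` index, and the sentiment labels are numbered in sorted order as in `RawDataProcessor`.

```python
PAD, UNK = 0, 1
# Toy vocabulary; the real one is built from the reviews.
token2id = {"[PAD]": PAD, "[UNK]": UNK, "great": 2, "movie": 3, "boring": 4}


def doc_to_indices(tokens, unknown_word_index=UNK):
    """Map tokens to ids, sending out-of-vocabulary tokens to unknown_word_index."""
    return [token2id.get(token, unknown_word_index) for token in tokens]


# Labels are indexed following their sorted order.
idx_to_target = sorted(["positive", "negative"])
target_to_idx = {t: i for i, t in enumerate(idx_to_target)}

encoded = doc_to_indices(["great", "movie", "overall"])
print(encoded, target_to_idx)
```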

### Loading the data

In [10]:
dataset = pd.read_csv("./data/imdb_reviews.csv.gz")
dataset.sample(5)

| | review | sentiment |
|---|---|---|
| 41033 | When it was announced the "King of Pop" was de... | positive |
| 48582 | Not every movie with lesbian chicks and vampir... | negative |
| 16282 | In keeping with Disney's well-known practice o... | negative |
| 34524 | I went to see this with my wife and 3 yr old s... | negative |
| 12096 | This movie is a window on the world of Britain... | negative |


In [11]:
preprocess = RawDataProcessor(dataset)
# Split the data into train and test sets and instantiate both datasets
train_indices, test_indices = train_test_split(dataset.index, test_size=0.2, random_state=42)
train_dataset = IMDBReviewsDataset(dataset.loc[train_indices].reset_index(drop=True), 
                                   transform=preprocess)
test_dataset = IMDBReviewsDataset(dataset.loc[test_indices].reset_index(drop=True), 
                                  transform=preprocess)

In [12]:
train_dataset.dataset.sample(10)

| | review | sentiment |
|---|---|---|
| 15517 | This film is harmless escapist fun. Something ... | positive |
| 15229 | I read the comment of Chris_m_grant from Unite... | negative |
| 1389 | Let me just start out by saying that Tourist T... | positive |
| 9828 | *THIS REVIEW MAY CONTAIN SPOILERS... OR MAYBE ... | negative |
| 29010 | Duchess is a pretty white cat who lives with h... | positive |
| 17680 | From director Barbet Schroder (Reversal of For... | negative |
| 35141 | After high-school graduation, best friends Ali... | positive |
| 8242 | "The Godfather", "Citizen Kane", "Star Wars", ... | positive |
| 30241 | I rented this type of "soft core" before, but ... | negative |
| 25259 | Greystoke: The Legend of Tarzan, Lord of the A... | positive |


### Sequence padding

Since the convolutions are applied over the full sequences, all the sequences in a batch must be brought to the same length before they can be stacked into a tensor.

In [13]:
class PadSequences:
    def __init__(self, 
                 pad_value=0, 
                 max_length=None, 
                 min_length=1):
        
        assert max_length is None or min_length <= max_length #Sanity check
        self.pad_value = pad_value
        self.max_length = max_length
        self.min_length = min_length

    def __call__(self, items):
        data, target = list(zip(*[(item["data"], item["target"]) for item in items]))
        seq_lengths = [len(d) for d in data]

        if self.max_length:
            max_length = self.max_length
            seq_lengths = [min(self.max_length, l) for l in seq_lengths]
        else:
            # If max_length is not set, take the maximum of min_length and
            # the length of the longest sequence
            max_length = max(self.min_length, max(seq_lengths))
        
        
        # Sequences shorter than max_length are filled up with pad_value
        # (0 by default)
        data = [d[:l] + [self.pad_value] * (max_length - l)
                for d, l in zip(data, seq_lengths)]
    
        return {
            "data": torch.LongTensor(data),
            "target": torch.FloatTensor(target)
        }
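
The effect of `max_length` and `min_length` is easier to see on a tiny batch. The sketch below mirrors the padding logic in pure Python and returns lists instead of tensors; `pad_batch` is a hypothetical helper written for illustration, not part of the notebook's pipeline.

```python
def pad_batch(sequences, pad_value=0, max_length=None, min_length=1):
    """Pure-Python mirror of the padding logic, returning lists instead of tensors."""
    lengths = [len(seq) for seq in sequences]
    if max_length:
        # Fixed width: longer sequences are truncated.
        lengths = [min(max_length, length) for length in lengths]
        target = max_length
    else:
        # Dynamic width: longest sequence in the batch, but never below min_length.
        target = max(min_length, max(lengths))
    return [seq[:length] + [pad_value] * (target - length)
            for seq, length in zip(sequences, lengths)]


batch = [[5, 6, 7], [8]]
longest = pad_batch(batch)                  # padded to the longest sequence
at_least_4 = pad_batch(batch, min_length=4)  # padded to at least 4 tokens
exactly_2 = pad_batch(batch, max_length=2)   # truncated or padded to exactly 2 tokens
print(longest, at_least_4, exactly_2)
```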

### DataLoaders

With the padding function in place, we define the `DataLoader`s. One important point: convolutional networks over text need every sequence to be at least as long as the largest kernel; otherwise the convolution fails because the input is smaller than the kernel. That is why we set the `min_length` parameter this time.

In [14]:
EPOCHS = 2
FILTERS_COUNT = 100
FILTERS_LENGTH = [2, 3, 4]

# Instantiate the classes
pad_sequences = PadSequences(min_length=max(FILTERS_LENGTH))
train_loader = DataLoader(train_dataset, 
                          batch_size=128, 
                          shuffle=True,
                          collate_fn=pad_sequences, # Pad the sequences while building the batch
                          drop_last=False)
test_loader = DataLoader(test_dataset, 
                         batch_size=128, 
                         shuffle=False,
                         collate_fn=pad_sequences, # Pad the sequences while building the batch
                         drop_last=False)
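
The reason for `min_length=max(FILTERS_LENGTH)` can be checked with the output-length formula of a 1D convolution with stride 1 and no padding, `seq_len - kernel_size + 1`. In the sketch below (the helper name is hypothetical), a result below 1 means the kernel does not fit in the sequence:

```python
FILTERS_LENGTH = [2, 3, 4]


def conv1d_output_length(seq_len, kernel_size):
    """Output length of a 1D convolution with stride 1 and no padding."""
    return seq_len - kernel_size + 1


for seq_len in (3, 4):
    lengths = {k: conv1d_output_length(seq_len, k) for k in FILTERS_LENGTH}
    fits = all(length >= 1 for length in lengths.values())
    print(f"seq_len={seq_len}: output lengths {lengths}, every kernel fits: {fits}")
```

With a sequence of 3 tokens the kernel of size 4 has nowhere to be applied, while padding the batch to 4 tokens leaves at least one valid position for every kernel.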

### Convolutional network over text

Finally, we have the convolutional network over text. It starts out much like the multilayer perceptron classifier, but here we use [`torch.nn.Conv1d`](https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html) because we only slide along one dimension (i.e. the sequence). Since we use global *max pooling*, we do not need a `torch.nn` module to compute it; we simply call the tensor's `max()` method.

**Some helper functions:**
- [`torch.nn.Embedding`](https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html): A simple lookup table that stores embeddings of a fixed dictionary and size. This module is often used to store word embeddings and retrieve them using indices. The input to the module is a list of indices, and the output is the corresponding word embeddings.

- [`nn.Embedding.from_pretrained`](https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html#:~:text=CLASSMETHOD%20from_pretrained(embeddings%2C%20freeze%3DTrue%2C%20padding_idx%3DNone%2C%20max_norm%3DNone%2C%20norm_type%3D2.0%2C%20scale_grad_by_freq%3DFalse%2C%20sparse%3DFalse)): Creates Embedding instance from given 2-dimensional FloatTensor.

- [`torch.nn.ModuleList`](https://pytorch.org/docs/stable/generated/torch.nn.ModuleList.html?highlight=modulelist#torch.nn.ModuleList:~:text=torch.nn.ModuleList(modules%3DNone)): Holds submodules in a list.

- [`torch.cat`](https://pytorch.org/docs/stable/generated/torch.cat.html?highlight=torch%20cat#torch.cat:~:text=torch.cat(tensors%2C%20dim%3D0%2C%20*%2C%20out%3DNone)%20%E2%86%92%20Tensor): Concatenates the given sequence of seq tensors in the given dimension. All tensors must either have the same shape (except in the concatenating dimension) or be empty.



In [15]:
class IMDBReviewsClassifier(nn.Module):
    def __init__(self, 
                 pretrained_embeddings_path, 
                 dictionary,
                 vector_size,
                 freeze_embedings):
        super().__init__()
        
        # Initialize the embeddings matrix
        embeddings_matrix = torch.randn(len(dictionary), vector_size)
        embeddings_matrix[0] = torch.zeros(vector_size)
        
        # Overwrite the rows of the words found in the pretrained embeddings
        with gzip.open(pretrained_embeddings_path, "rt") as fh:
            for line in fh:
                word, vector = line.strip().split(None, 1)
                if word in dictionary.token2id:
                    embeddings_matrix[dictionary.token2id[word]] =\
                        torch.FloatTensor([float(n) for n in vector.split()])
        
        # Store them in the embeddings layer
        self.embeddings = nn.Embedding.from_pretrained(embeddings_matrix,
                                                       freeze=freeze_embedings,
                                                       padding_idx=0)
        self.convs = []
        for filter_lenght in FILTERS_LENGTH:
            self.convs.append(
                nn.Conv1d(vector_size, FILTERS_COUNT, filter_lenght) #(in_channels, out_channels, kernel_size)
            )
        self.convs = nn.ModuleList(self.convs)
        self.fc = nn.Linear(FILTERS_COUNT * len(FILTERS_LENGTH), 128)
        self.output = nn.Linear(128, 1)
        self.vector_size = vector_size
    
    @staticmethod
    def conv_global_max_pool(x, conv):
        return F.relu(conv(x).transpose(1, 2).max(1)[0])
    
    def forward(self, x):
        x = self.embeddings(x).transpose(1, 2)  
        x = [self.conv_global_max_pool(x, conv) for conv in self.convs]
        x = torch.cat(x, dim=1)
        x = F.relu(self.fc(x))
        x = torch.sigmoid(self.output(x))
        return x
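
Following the tensor shapes through the forward pass helps explain the two `transpose` calls and the `FILTERS_COUNT * len(FILTERS_LENGTH)` input of `fc`. The sketch below uses randomly initialized layers and a toy vocabulary of 20 tokens (both assumptions for illustration, no pretrained embeddings involved), and takes the maximum directly over the sequence dimension, which gives the same features as `conv_global_max_pool` because ReLU is monotonic.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

batch_size, seq_len, vector_size = 4, 12, 50
filters_count, filters_length = 100, [2, 3, 4]

embeddings = nn.Embedding(20, vector_size, padding_idx=0)
convs = nn.ModuleList(
    [nn.Conv1d(vector_size, filters_count, k) for k in filters_length]
)

tokens = torch.randint(0, 20, (batch_size, seq_len))  # (batch, sequence)
# Conv1d expects (batch, channels, length), so the embedding dimension moves to axis 1.
x = embeddings(tokens).transpose(1, 2)                # (4, 50, 12)

# Each convolution yields (4, 100, seq_len - k + 1); the max over the last
# axis keeps one value per filter, whatever the sequence length.
pooled = [F.relu(conv(x)).max(dim=2).values for conv in convs]
features = torch.cat(pooled, dim=1)                   # (4, 300)

print(x.shape, [p.shape for p in pooled], features.shape)
```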

### Experiment

The MLflow experiment is practically the same, except that we change some of the logged parameters.

In [16]:
mlflow.set_experiment("a_naive_experiment")

with mlflow.start_run():
    mlflow.log_param("model_name", "cnn")
    mlflow.log_param("freeze_embedding", True)
    mlflow.log_params({
        "filters_count": FILTERS_COUNT,
        "filters_length": FILTERS_LENGTH,
        "fc_size": 128
    })
    model = IMDBReviewsClassifier("./data/glove.6B.50d.txt.gz", preprocess.dictionary, 50, True)
    loss = nn.BCELoss()
    optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
    for epoch in trange(3):
        model.train()
        running_loss = []
        for idx, batch in enumerate(tqdm(train_loader)):
            optimizer.zero_grad()
            output = model(batch["data"])
            loss_value = loss(output, batch["target"].view(-1, 1))
            loss_value.backward()
            optimizer.step()
            running_loss.append(loss_value.item())        
        mlflow.log_metric("train_loss", sum(running_loss) / len(running_loss), epoch)
        
        model.eval()
        running_loss = []
        targets = []
        predictions = []
        for batch in tqdm(test_loader):
            output = model(batch["data"])
            running_loss.append(
                loss(output, batch["target"].view(-1, 1)).item()
            )
            targets.extend(batch["target"].numpy())
            predictions.extend(output.squeeze().detach().numpy())
        mlflow.log_metric("test_loss", sum(running_loss) / len(running_loss), epoch)
        mlflow.log_metric("test_avp", average_precision_score(targets, predictions), epoch)
    
    with tempfile.TemporaryDirectory() as tmpdirname:
        targets = []
        predictions = []
        for batch in tqdm(test_loader):
            output = model(batch["data"])
            targets.extend(batch["target"].numpy())
            predictions.extend(output.squeeze().detach().numpy())
        pd.DataFrame({"prediction": predictions, "target": targets}).to_csv(
            f"{tmpdirname}/predictions.csv.gz", index=False
        )
        mlflow.log_artifact(f"{tmpdirname}/predictions.csv.gz")
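
The model ends in a sigmoid, so its output is a probability and `nn.BCELoss` (with its default mean reduction) averages the binary cross-entropy `-[y * log(p) + (1 - y) * log(1 - p)]` over the batch. A minimal pure-Python sketch of that formula, with made-up predictions, shows how much more a confident mistake costs than a confident hit:

```python
import math


def binary_cross_entropy(predictions, targets):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over the batch."""
    losses = [-(y * math.log(p) + (1 - y) * math.log(1 - p))
              for p, y in zip(predictions, targets)]
    return sum(losses) / len(losses)


# Made-up probabilities for a positive and a negative review.
confident_right = binary_cross_entropy([0.9, 0.1], [1.0, 0.0])
confident_wrong = binary_cross_entropy([0.1, 0.9], [1.0, 0.0])
print(f"{confident_right:.3f} {confident_wrong:.3f}")
```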

In [17]:
print(model)

IMDBReviewsClassifier(
  (embeddings): Embedding(50002, 50, padding_idx=0)
  (convs): ModuleList(
    (0): Conv1d(50, 100, kernel_size=(2,), stride=(1,))
    (1): Conv1d(50, 100, kernel_size=(3,), stride=(1,))
    (2): Conv1d(50, 100, kernel_size=(4,), stride=(1,))
  )
  (fc): Linear(in_features=300, out_features=128, bias=True)
  (output): Linear(in_features=128, out_features=1, bias=True)
)


In [18]:
mlflow.tracking.MlflowClient().search_experiments()  # list_experiments() was removed in MLflow 2.0


[<Experiment: artifact_location='file:///home/adrian/PycharmProjects/AprendizajeProfundo/mlruns/0', experiment_id='0', lifecycle_stage='active', name='Default', tags={}>,
 <Experiment: artifact_location='file:///home/adrian/PycharmProjects/AprendizajeProfundo/mlruns/1', experiment_id='1', lifecycle_stage='active', name='a_naive_experiment', tags={}>]

In [19]:
mlflow.search_runs().head(5)


Unnamed: 0,run_id,experiment_id,status,artifact_uri,start_time,end_time,metrics.test_loss,metrics.test_avp,metrics.train_loss,params.filters_length,...,params.fc_size,params.freeze_embedding,params.model_name,params.hidden1_size,params.embedding_size,params.hidden2_size,tags.mlflow.source.git.commit,tags.mlflow.source.name,tags.mlflow.user,tags.mlflow.source.type
0,4904522016014aa6ba867ddb32df0646,1,FINISHED,file:///home/adrian/PycharmProjects/Aprendizaj...,2022-10-26 04:11:23.156000+00:00,2022-10-26 04:13:49.300000+00:00,0.337537,0.92623,0.286678,"[2, 3, 4]",...,128.0,True,cnn,,,,1949c7ec3247aed4dd83cc64c892fcc39628c612,/home/adrian/PycharmProjects/AprendizajeProfun...,adrian,LOCAL
1,c1a56f32f7604db4a4bf4790e53215f1,1,FINISHED,file:///home/adrian/PycharmProjects/Aprendizaj...,2022-10-26 03:57:43.985000+00:00,2022-10-26 03:58:25.528000+00:00,0.513786,0.829681,0.510267,,...,,True,mlp,128.0,50.0,128.0,1949c7ec3247aed4dd83cc64c892fcc39628c612,/home/adrian/PycharmProjects/AprendizajeProfun...,adrian,LOCAL
2,e2117e91ce6d413ab89925c7faaf8117,1,FINISHED,file:///home/adrian/PycharmProjects/Aprendizaj...,2022-09-10 16:27:21.074000+00:00,2022-09-10 16:28:40.887000+00:00,0.511009,0.830107,0.509006,,...,,True,mlp,128.0,50.0,128.0,79a6fdd8b00d19f817b3ded7f50f5ce2d1ab1627,/home/adrian/PycharmProjects/AprendizajeProfun...,adrian,LOCAL
