
# Live exercise, class 6: Optimizers

# Task:

Given the following functions and the base model, use KFold to evaluate which of these optimizers gives the best results on the FashionMNIST dataset: ["Adagrad", "RMSprop", "Adadelta", "Adam"]
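
To make the splitting step concrete, here is a minimal sketch of how `KFold` partitions sample indices. The ten-element toy array and the `random_state=0` seed are assumptions chosen for illustration; they are not part of the exercise.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy stand-in for a dataset: 10 samples (an assumption for illustration).
samples = np.arange(10)

# Two shuffled folds; the fixed seed only makes the sketch repeatable.
kfold = KFold(n_splits=2, shuffle=True, random_state=0)
splits = list(kfold.split(samples))

for fold, (train_idx, val_idx) in enumerate(splits):
    # Each sample lands in the validation set of exactly one fold.
    print(f"fold {fold}: train={len(train_idx)} val={len(val_idx)}")
```

Each fold's validation indices are disjoint from its training indices, which is why the index arrays can be passed straight to a sampler.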

In [1]:
import numpy as np
import torch
from torch import nn
from torchvision import transforms, datasets
from sklearn.model_selection import KFold

In [2]:
#### Helper functions from other notebooks

def reset_weights(m):
    """Re-initialize a linear layer so each fold starts from fresh parameters."""
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.01)
        # Reset the bias too; otherwise trained biases carry over between folds.
        nn.init.zeros_(m.bias)

def accuracy(y_hat, y):
    """Compute the number of correct predictions."""
    if len(y_hat.shape) > 1 and y_hat.shape[1] > 1:
        y_hat = y_hat.argmax(axis=1)
    cmp = y_hat.type(y.dtype) == y
    return float(cmp.type(y.dtype).sum())

def get_accuracy(fold, model, device, test_loader):
    """Print and return the fraction of correct predictions over a loader."""
    test_acc = 0.0
    n = 0
    with torch.no_grad():  # evaluation does not need gradients
        for X, y in test_loader:
            X, y = X.to(device), y.to(device)
            n += y.numel()
            test_acc += accuracy(model(X), y)
    print('\nFold {}:  Accuracy: {}/{} ({:.0f}%)'.format(
          fold, test_acc, n,
          (100. * test_acc) / n))
    return test_acc / n

def train(fold, model, device, loss, train_loader, optimizer):
    """Run one training epoch over train_loader."""
    for data, target in train_loader:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        # The loss is built with reduction='none', so average it per batch.
        l = loss(model(data), target).mean()
        l.backward()
        optimizer.step()
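
As a quick check of the `accuracy` helper, the sketch below applies it to a hand-made batch. The logits and labels are hypothetical values chosen for illustration. Note that the helper returns a count of correct predictions, not a fraction; `get_accuracy` divides by the number of samples afterwards.

```python
import torch

def accuracy(y_hat, y):
    """Count correct predictions (same logic as the helper above)."""
    if len(y_hat.shape) > 1 and y_hat.shape[1] > 1:
        y_hat = y_hat.argmax(axis=1)
    cmp = y_hat.type(y.dtype) == y
    return float(cmp.type(y.dtype).sum())

# Hypothetical logits for 3 samples and 3 classes.
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.9, 0.2, 0.4]])
labels = torch.tensor([0, 1, 2])

n_correct = accuracy(logits, labels)   # number of matches, not a fraction
fraction = n_correct / labels.numel()  # divide by the sample count to get a rate
print(n_correct, fraction)
```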

In [3]:
### Function that runs the k-fold cross-validation process
import copy

def train_kfold(model, optimizer, dataset, n_fold, epochs):
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    loss = torch.nn.CrossEntropyLoss(reduction='none')
    batch_size = 32
    train_acc = []
    acc = []
    # Snapshot the optimizer state as received so every fold can start from it.
    initial_state = copy.deepcopy(optimizer.state_dict())
    kfold = KFold(n_splits=n_fold, shuffle=True)
    for fold, (train_idx, test_idx) in enumerate(kfold.split(dataset)):
        print('------------fold no---------{}----------------------'.format(fold))
        train_subsampler = torch.utils.data.SubsetRandomSampler(train_idx)
        test_subsampler = torch.utils.data.SubsetRandomSampler(test_idx)

        trainloader = torch.utils.data.DataLoader(
            dataset, batch_size=batch_size, sampler=train_subsampler)
        testloader = torch.utils.data.DataLoader(
            dataset, batch_size=batch_size, sampler=test_subsampler)

        # Fresh parameters and fresh optimizer statistics for each fold;
        # otherwise accumulated statistics (e.g. Adagrad's sums) leak across folds.
        model.apply(reset_weights)
        model.to(device=device)
        optimizer.load_state_dict(copy.deepcopy(initial_state))

        fold_train_acc = 0
        fold_acc = 0
        for epoch in range(1, epochs + 1):
            train(fold, model, device, loss, trainloader, optimizer)
            fold_train_acc = get_accuracy(fold, model, device, trainloader)
            fold_acc = get_accuracy(fold, model, device, testloader)
        # Keep the accuracies measured after the last epoch of this fold.
        train_acc.append(fold_train_acc)
        acc.append(fold_acc)
    return train_acc, acc

In [4]:
# Base model structure
INPUT = 28 * 28
OUTPUT = 10

model = nn.Sequential(nn.Flatten(),
                      nn.Linear(INPUT, 256),
                      nn.LeakyReLU(0.01),
                      nn.Linear(256, 64),
                      nn.LeakyReLU(0.01),
                      nn.Linear(64, OUTPUT))
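
A quick sanity check of the base model can catch shape mistakes before any training. The sketch below rebuilds the same architecture and passes a dummy batch through it; the batch size of 4 and the all-zeros input are assumptions for illustration.

```python
import torch
from torch import nn

INPUT = 28 * 28
OUTPUT = 10

# Same architecture as the base model above.
model = nn.Sequential(nn.Flatten(),
                      nn.Linear(INPUT, 256),
                      nn.LeakyReLU(0.01),
                      nn.Linear(256, 64),
                      nn.LeakyReLU(0.01),
                      nn.Linear(64, OUTPUT))

# Dummy batch shaped like FashionMNIST images: (batch, channels, height, width).
dummy = torch.zeros(4, 1, 28, 28)
with torch.no_grad():
    logits = model(dummy)

# One row of 10 class scores per image.
print(tuple(logits.shape))

# Total number of trainable parameters (weights plus biases).
n_params = sum(p.numel() for p in model.parameters())
print(n_params)
```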

The exercise starts here:

In [6]:
INPUT = 28 * 28
OUTPUT = 10

def create_model(optimizer_name):
    # Complete the code here
    model = nn.Sequential(nn.Flatten(),
                          nn.Linear(INPUT, 256),
                          nn.LeakyReLU(0.01),
                          nn.Linear(256, 64),
                          nn.LeakyReLU(0.01),
                          nn.Linear(64, OUTPUT))
    # Map each name to its optimizer class so only the requested one is built.
    optimizer_classes = {
        'Adagrad': torch.optim.Adagrad,
        'RMSprop': torch.optim.RMSprop,
        'Adadelta': torch.optim.Adadelta,
        'Adam': torch.optim.Adam,
    }
    # All four optimizers share lr=0.3. The PyTorch defaults differ:
    # Adagrad 0.01, RMSprop 0.01, Adadelta 1.0, Adam 0.001.
    optimizer = optimizer_classes[optimizer_name](model.parameters(), lr=0.3)
    return model, optimizer

# Uses the FashionMNIST test split (train=False), i.e. 10,000 images.
data_iter = datasets.FashionMNIST(
    root="../data", train=False, transform=transforms.ToTensor(), download=True)

N_EPOCHS = 2
N_FOLDS = 2
optimizers = ["Adagrad", "RMSprop", "Adadelta", "Adam"]

results = {}
for opt_name in optimizers:
    # Complete the code here
    model, optimizer = create_model(opt_name)
    train_acc, validation_acc = train_kfold(model, optimizer, data_iter, N_FOLDS, N_EPOCHS)
    # Store the mean (train, validation) accuracy across folds.
    results[opt_name] = (np.mean(train_acc), np.mean(validation_acc))

# Pick the optimizer with the highest mean validation accuracy.
best_optimizer = max(results, key=lambda opt: results[opt][1])
print(f"The best optimizer is {best_optimizer} with a validation accuracy of {results[best_optimizer][1]}")

------------fold no---------0----------------------

Fold 0:  Accuracy: 2551.0/5000 (51%)

Fold 0:  Accuracy: 2566.0/5000 (51%)

Fold 0:  Accuracy: 2924.0/5000 (58%)

Fold 0:  Accuracy: 2954.0/5000 (59%)
------------fold no---------1----------------------

Fold 1:  Accuracy: 3173.0/5000 (63%)

Fold 1:  Accuracy: 3149.0/5000 (63%)

Fold 1:  Accuracy: 3526.0/5000 (71%)

Fold 1:  Accuracy: 3374.0/5000 (67%)
------------fold no---------0----------------------

Fold 0:  Accuracy: 2343.0/5000 (47%)

Fold 0:  Accuracy: 2352.0/5000 (47%)

Fold 0:  Accuracy: 2596.0/5000 (52%)

Fold 0:  Accuracy: 2557.0/5000 (51%)
------------fold no---------1----------------------

Fold 1:  Accuracy: 2323.0/5000 (46%)

Fold 1:  Accuracy: 2244.0/5000 (45%)

Fold 1:  Accuracy: 2123.0/5000 (42%)

Fold 1:  Accuracy: 2082.0/5000 (42%)
------------fold no---------0----------------------

Fold 0:  Accuracy: 2396.0/5000 (48%)

Fold 0:  Accuracy: 2394.0/5000 (48%)

Fold 0:  Accuracy: 3023.0/5000 (60%)

Fold 0:  Accuracy