# Class 3 Exercises

In this notebook we use MLPs to build a classifier for FashionMNIST, so many of the functions from the class 2 exercises will be very useful.

## Exercise 1:

Build a multilayer perceptron with two hidden layers of 512 and 128 neurons, respectively, for classification on the FashionMNIST dataset.

In [3]:
import torch
from torch import nn
import torchvision
from IPython import display
from torchvision import transforms
from torch.utils import data

def load_data_fashion_mnist(batch_size, resize=None):
    trans = [transforms.ToTensor()]
    if resize:
        trans.insert(0, transforms.Resize(resize))
    trans = transforms.Compose(trans)
    mnist_train = torchvision.datasets.FashionMNIST(
        root="../data", train=True, transform=trans, download=True)
    mnist_test = torchvision.datasets.FashionMNIST(
        root="../data", train=False, transform=trans, download=True)
    return (data.DataLoader(mnist_train, batch_size, shuffle=True,
                            num_workers=1),
            data.DataLoader(mnist_test, batch_size, shuffle=False,
                            num_workers=1))

def init_weights(m):
    """Draw the weights of each nn.Linear layer from N(0, 0.01^2); biases are left untouched."""
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.01)

def accuracy(y_hat, y):
    """Compute the number of correct predictions."""
    if len(y_hat.shape) > 1 and y_hat.shape[1] > 1:
        y_hat = y_hat.argmax(axis=1)
    cmp = y_hat.type(y.dtype) == y
    return float(cmp.type(y.dtype).sum())

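def train(net, train_iter, test_iter, loss, num_epochs, trainer):
    """Train `net` and print loss and accuracy after every epoch."""
    for epoch in range(num_epochs):
        L = 0.0        # summed per-sample training loss
        N = 0          # number of training samples seen
        Acc = 0.0      # number of correct training predictions
        TestAcc = 0.0  # number of correct test predictions
        TestN = 0      # number of test samples seen
        net.train()
        for X, y in train_iter:
            l = loss(net(X), y)  # per-sample losses (reduction='none')
            trainer.zero_grad()
            l.mean().backward()
            trainer.step()
            # Accumulate plain floats so no autograd graph is kept alive.
            L += float(l.sum())
            N += l.numel()
            with torch.no_grad():
                # Accuracy is measured after the update, as in the original loop.
                Acc += accuracy(net(X), y)
        net.eval()
        with torch.no_grad():  # evaluation needs no gradients
            for X, y in test_iter:
                TestN += y.numel()
                TestAcc += accuracy(net(X), y)
        print(f'epoch {epoch + 1}, loss {(L/N):f}\
          , train accuracy  {(Acc/N):f}, test accuracy {(TestAcc/TestN):f}')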


In [5]:
net1 = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28*28, 512),
    nn.ReLU(),
    nn.Linear(512,128),
    nn.ReLU(),
    nn.Linear(128,10)
)
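
As a quick sanity check on the architecture above, the sketch below counts the trainable parameters of each fully connected layer. It assumes only the layer widths stated in the exercise (784 inputs from the flattened 28x28 image, 512 and 128 hidden units, 10 classes); the helper name `linear_layer_params` is introduced here for illustration and is not part of the notebook.

```python
# Layer widths of the MLP from this exercise: flattened 28x28 image,
# two hidden layers, and 10 output classes.
sizes = [28 * 28, 512, 128, 10]

def linear_layer_params(n_in, n_out):
    """Weights plus biases of one fully connected layer (hypothetical helper)."""
    return n_in * n_out + n_out

# One entry per nn.Linear layer, in the order they appear in the network.
per_layer = [linear_layer_params(a, b) for a, b in zip(sizes, sizes[1:])]
total_params = sum(per_layer)

for (n_in, n_out), count in zip(zip(sizes, sizes[1:]), per_layer):
    print(f"Linear({n_in}, {n_out}): {count} parameters")
print(f"total: {total_params}")
```

Most of the parameters sit in the first layer, because it connects every input pixel to all 512 hidden units.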

## Exercise 2

Train the model for 10 epochs with a batch size of 256 and a learning rate of 0.3. (We recommend reusing the modular functions from the class 2 exercises.)

In [6]:
# enter your code here
batch_size = 256
epochs = 10
lr = 0.3

net1.apply(init_weights)

train_iter, test_iter = load_data_fashion_mnist(batch_size)

loss = nn.CrossEntropyLoss(reduction='none')
trainer = torch.optim.SGD(net1.parameters(), lr=lr)

train(net1, train_iter, test_iter, loss, epochs, trainer)

epoch 1, loss 1.208882          , train accuracy  0.550983, test accuracy 0.744400
epoch 2, loss 0.572254          , train accuracy  0.804733, test accuracy 0.815300
epoch 3, loss 0.467516          , train accuracy  0.847033, test accuracy 0.804800
epoch 4, loss 0.421262          , train accuracy  0.865600, test accuracy 0.762800
epoch 5, loss 0.390806          , train accuracy  0.877067, test accuracy 0.833500
epoch 6, loss 0.367432          , train accuracy  0.885933, test accuracy 0.815900
epoch 7, loss 0.351476          , train accuracy  0.891417, test accuracy 0.858900
epoch 8, loss 0.333906          , train accuracy  0.899083, test accuracy 0.834500
epoch 9, loss 0.323901          , train accuracy  0.903867, test accuracy 0.847200
epoch 10, loss 0.311968          , train accuracy  0.908700, test accuracy 0.852200
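
To put the hyperparameters of this exercise in perspective, the following sketch works out how many SGD updates the run performs. It assumes the standard FashionMNIST training split of 60,000 images and the default `DataLoader` behaviour of keeping the last, smaller batch; the variable names are illustrative.

```python
import math

train_size = 60_000   # FashionMNIST training images (standard split)
batch_size = 256
num_epochs = 10

# The final batch of each epoch is smaller because 60,000 is not a multiple of 256.
batches_per_epoch = math.ceil(train_size / batch_size)
last_batch_size = train_size - (batches_per_epoch - 1) * batch_size
total_updates = batches_per_epoch * num_epochs

print(f"{batches_per_epoch} batches per epoch, last one with {last_batch_size} samples")
print(f"{total_updates} parameter updates in total")
```

Each of those updates moves the weights by 0.3 times the gradient of the mean batch loss, which is worth keeping in mind when the learning rate is changed later in the notebook.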


## Exercise 3:

Starting from the previous model, analyze what happens if you train for 20 epochs instead of 10.

In [7]:
# enter your code here

batch_size = 256
epochs = 20
lr = 0.3

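# Re-draw the weights so training starts over. Note that init_weights does not
# reset the biases, so they keep the values learned in the previous run.
net1.apply(init_weights)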

train_iter, test_iter = load_data_fashion_mnist(batch_size)

loss = nn.CrossEntropyLoss(reduction='none')
trainer = torch.optim.SGD(net1.parameters(), lr=lr)

train(net1, train_iter, test_iter, loss, epochs, trainer)

epoch 1, loss 1.091739          , train accuracy  0.604700, test accuracy 0.739300
epoch 2, loss 0.551904          , train accuracy  0.815133, test accuracy 0.822500
epoch 3, loss 0.455712          , train accuracy  0.852700, test accuracy 0.794300
epoch 4, loss 0.411932          , train accuracy  0.868250, test accuracy 0.840000
epoch 5, loss 0.384573          , train accuracy  0.879683, test accuracy 0.843600
epoch 6, loss 0.359939          , train accuracy  0.888533, test accuracy 0.809600
epoch 7, loss 0.342818          , train accuracy  0.895900, test accuracy 0.823900
epoch 8, loss 0.327652          , train accuracy  0.901483, test accuracy 0.866600
epoch 9, loss 0.318711          , train accuracy  0.905667, test accuracy 0.869800
epoch 10, loss 0.307725          , train accuracy  0.910417, test accuracy 0.848700
epoch 11, loss 0.299697          , train accuracy  0.914433, test accuracy 0.871600
epoch 12, loss 0.287715          , train accuracy  0.917800, test accuracy 0.854000
[remaining output truncated]
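
One way to analyze this run is to look at the gap between training and test accuracy. The sketch below uses the 12 epochs visible in the log above (the rest of the output is cut off), copied by hand into two lists; it is an illustration of the analysis, not a claim about the epochs that are not shown.

```python
# Accuracies copied from the first 12 epochs of the log above.
train_acc = [0.604700, 0.815133, 0.852700, 0.868250, 0.879683, 0.888533,
             0.895900, 0.901483, 0.905667, 0.910417, 0.914433, 0.917800]
test_acc = [0.739300, 0.822500, 0.794300, 0.840000, 0.843600, 0.809600,
            0.823900, 0.866600, 0.869800, 0.848700, 0.871600, 0.854000]

# Positive gap: the model does better on data it was trained on.
gaps = [tr - te for tr, te in zip(train_acc, test_acc)]
best_epoch = test_acc.index(max(test_acc)) + 1  # epochs are 1-indexed

for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch:2d}: train - test = {gap:+.4f}")
print(f"best test accuracy at epoch {best_epoch}")
```

In these 12 epochs the training accuracy rises every epoch, while the test accuracy fluctuates by several points and the gap widens, which is the usual signature of the model starting to fit the training set more closely than it generalizes.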

## Exercise 4

Increase the learning rate to 1 and train again. How can you explain what happened?

In [8]:
# enter your code here

batch_size = 256
epochs = 20
lr = 1

net1.apply(init_weights)

train_iter, test_iter = load_data_fashion_mnist(batch_size)

loss = nn.CrossEntropyLoss(reduction='none')
trainer = torch.optim.SGD(net1.parameters(), lr=lr)

train(net1, train_iter, test_iter, loss, epochs, trainer)

epoch 1, loss 2.262566          , train accuracy  0.261433, test accuracy 0.289100
epoch 2, loss 1.523291          , train accuracy  0.430683, test accuracy 0.457400
epoch 3, loss 1.449324          , train accuracy  0.442750, test accuracy 0.436500
epoch 4, loss 1.638841          , train accuracy  0.352850, test accuracy 0.208900
epoch 5, loss 1.582961          , train accuracy  0.356183, test accuracy 0.407300
epoch 6, loss 1.536107          , train accuracy  0.378100, test accuracy 0.349400
epoch 7, loss 1.423726          , train accuracy  0.420400, test accuracy 0.431700
epoch 8, loss 1.286654          , train accuracy  0.477000, test accuracy 0.462500
epoch 9, loss 1.282499          , train accuracy  0.485750, test accuracy 0.536600
epoch 10, loss 1.090440          , train accuracy  0.557100, test accuracy 0.528700
epoch 11, loss 1.003532          , train accuracy  0.601600, test accuracy 0.469300
epoch 12, loss 1.524026          , train accuracy  0.390733, test accuracy 0.260900
[remaining output truncated]
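
A possible explanation for the unstable loss in the log above (for example, it rises from 1.449 to 1.639 between epochs 3 and 4) is that the step size is too large for the curvature of the loss. The sketch below illustrates the effect on a toy one-dimensional quadratic, not on the network itself; the function `gradient_descent` and the curvature value 2.5 are assumptions chosen so that a learning rate of 0.3 converges and a learning rate of 1 does not.

```python
def gradient_descent(lr, curvature=2.5, w0=1.0, steps=10):
    """Plain gradient descent on the toy loss f(w) = 0.5 * curvature * w**2."""
    w = w0
    history = [w]
    for _ in range(steps):
        grad = curvature * w   # f'(w)
        w = w - lr * grad      # each step multiplies w by (1 - lr * curvature)
        history.append(w)
    return history

small = gradient_descent(lr=0.3)  # factor 0.25 per step: shrinks towards 0
large = gradient_descent(lr=1.0)  # factor -1.5 per step: overshoots and grows

print(f"lr=0.3 -> |w| after 10 steps: {abs(small[-1]):.2e}")
print(f"lr=1.0 -> |w| after 10 steps: {abs(large[-1]):.2f}")
```

On this toy problem, gradient descent only converges when `lr < 2 / curvature`. With a larger step, every update jumps past the minimum and lands further away on the other side. A real network has many directions with different curvatures, so the loss tends to oscillate rather than blow up cleanly, but the mechanism is similar.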

## Exercise 5:

Analyze the effect of changing the activation functions on accuracy.

In [9]:
# enter your code here

INPUT = 28 * 28
OUTPUT = 10
HIDDEN1 = 512
HIDDEN2 = 128


net2 = nn.Sequential(nn.Flatten(),
                    nn.Linear(INPUT, HIDDEN1),
                    nn.Sigmoid(),
                    nn.Linear(HIDDEN1, HIDDEN2),
                    nn.Sigmoid(),
                    nn.Linear(HIDDEN2, OUTPUT))
batch_size, lr, num_epochs = 256, 0.3, 10

net2.apply(init_weights);

train_iter, test_iter = load_data_fashion_mnist(batch_size)
loss = nn.CrossEntropyLoss(reduction='none')
trainer2 = torch.optim.SGD(net2.parameters(), lr=lr)
train(net2, train_iter, test_iter, loss, num_epochs, trainer2)

epoch 1, loss 2.310005          , train accuracy  0.130950, test accuracy 0.100000
epoch 2, loss 1.886963          , train accuracy  0.300233, test accuracy 0.415900
epoch 3, loss 1.085987          , train accuracy  0.611733, test accuracy 0.635000
epoch 4, loss 0.884283          , train accuracy  0.668750, test accuracy 0.628400
epoch 5, loss 0.804815          , train accuracy  0.705217, test accuracy 0.715300
epoch 6, loss 0.727692          , train accuracy  0.739750, test accuracy 0.721100
epoch 7, loss 0.665764          , train accuracy  0.765750, test accuracy 0.755200
epoch 8, loss 0.620813          , train accuracy  0.782300, test accuracy 0.763500
epoch 9, loss 0.585568          , train accuracy  0.795633, test accuracy 0.774600
epoch 10, loss 0.559369          , train accuracy  0.806733, test accuracy 0.796000
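
The sigmoid network above starts at chance level (test accuracy 0.100 and loss 2.31, close to ln 10 ≈ 2.30) and learns more slowly than the ReLU network from Exercise 2. One plausible contributor is the size of the activation derivatives that the gradient passes through. The sketch below compares them using only the standard definitions of sigmoid and ReLU; the helper names are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid: s(x) * (1 - s(x)), never larger than 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

xs = [-4.0, -1.0, 0.0, 1.0, 4.0]
for x in xs:
    print(f"x={x:+.1f}  sigmoid'={sigmoid_grad(x):.4f}  relu'={relu_grad(x):.1f}")

# Largest factor a gradient can pick up from one sigmoid, and from two in a row.
max_sigmoid_grad = max(sigmoid_grad(x) for x in xs)
two_layer_bound = max_sigmoid_grad ** 2
print(f"two stacked sigmoids scale the gradient by at most {two_layer_bound}")
```

With two sigmoid hidden layers, the gradient reaching the first layer is scaled by at most 0.0625 from the activations alone, whereas active ReLU units pass it through unchanged. This is consistent with the slower start seen in the log, although it is not the only factor.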


## Exercise 6:

Now build a third model in which both hidden layers have 1024 neurons. Analyze whether this changed the performance.

In [10]:

INPUT = 28 * 28
OUTPUT = 10
HIDDEN1 = 1024
HIDDEN2 = 1024
batch_size, lr, num_epochs = 256, 0.3, 20

net3 = nn.Sequential(nn.Flatten(),
                    nn.Linear(INPUT, HIDDEN1),
                    nn.ReLU(),
                    nn.Linear(HIDDEN1, HIDDEN2),
                    nn.ReLU(),
                    nn.Linear(HIDDEN2, OUTPUT))

net3.apply(init_weights);

train_iter, test_iter = load_data_fashion_mnist(batch_size)
loss = nn.CrossEntropyLoss(reduction='none')
trainer3 = torch.optim.SGD(net3.parameters(), lr=lr)
train(net3, train_iter, test_iter, loss, num_epochs, trainer3)


epoch 1, loss 0.920348          , train accuracy  0.682167, test accuracy 0.753900
epoch 2, loss 0.508596          , train accuracy  0.832483, test accuracy 0.808100
epoch 3, loss 0.435078          , train accuracy  0.861217, test accuracy 0.804000
epoch 4, loss 0.398921          , train accuracy  0.874433, test accuracy 0.838800
epoch 5, loss 0.366759          , train accuracy  0.885517, test accuracy 0.848200
epoch 6, loss 0.349583          , train accuracy  0.891450, test accuracy 0.793800
epoch 7, loss 0.338359          , train accuracy  0.899600, test accuracy 0.866100
epoch 8, loss 0.319650          , train accuracy  0.904533, test accuracy 0.862700
epoch 9, loss 0.307662          , train accuracy  0.910050, test accuracy 0.841300
epoch 10, loss 0.299380          , train accuracy  0.913383, test accuracy 0.866400
epoch 11, loss 0.290028          , train accuracy  0.917100, test accuracy 0.859900
epoch 12, loss 0.277718          , train accuracy  0.921350, test accuracy 0.842400
[remaining output truncated]
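
To judge whether the wider model pays off, the sketch below compares its size with the 512/128 model and sets the epoch-12 accuracies from the two 20-epoch logs (Exercises 3 and 6) side by side. The accuracies are copied by hand from the visible part of those logs; `mlp_params` is a helper introduced here for illustration.

```python
def mlp_params(sizes):
    """Total weights and biases of an MLP with the given layer widths."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

small = mlp_params([784, 512, 128, 10])    # model from Exercise 1
wide = mlp_params([784, 1024, 1024, 10])   # model from this exercise
ratio = wide / small

# Epoch-12 accuracies copied from the logs of Exercise 3 and Exercise 6.
logged = {
    "512-128":   {"train": 0.917800, "test": 0.854000},
    "1024-1024": {"train": 0.921350, "test": 0.842400},
}
train_diff = logged["1024-1024"]["train"] - logged["512-128"]["train"]
test_diff = logged["1024-1024"]["test"] - logged["512-128"]["test"]

print(f"parameters: {small} vs {wide} ({ratio:.2f}x)")
print(f"train accuracy difference at epoch 12: {train_diff:+.4f}")
print(f"test accuracy difference at epoch 12:  {test_diff:+.4f}")
```

At epoch 12 the wider model has roughly four times as many parameters and a slightly higher training accuracy, but no clear gain on the test set. Since test accuracy in both logs moves by several points from one epoch to the next, a single-epoch comparison like this should be read with caution.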