# Model in PyTorch
In this section we import the modules used to build the AlexNet model from scratch in PyTorch, and we check whether the PC's graphics card (GPU) can be used through CUDA.

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as transforms
import torchvision
import torch.optim as optim
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

## Building the model
Here we build the AlexNet model with the requirements mentioned in class: 5 convolutions, 3 max-pooling layers, and 3 dense layers with ReLU activations, plus two dropout steps to reduce overfitting.

In [2]:
class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        # Classifier: three fully connected layers, ending in 10 outputs (one per class)
        self.fc1 = nn.Linear(9216, 4096)  # 9216 = 256 channels * 6 * 6
        self.fc2 = nn.Linear(4096, 4096)
        self.fc3 = nn.Linear(4096, 10)
        # Feature extractor: five convolutions, all without padding
        self.cn1 = nn.Conv2d(3, 64, 11, stride=4)
        self.cn2 = nn.Conv2d(64, 192, 5, stride=1)
        self.cn3 = nn.Conv2d(192, 384, 3, stride=1)
        self.cn4 = nn.Conv2d(384, 256, 3, stride=1)
        self.cn5 = nn.Conv2d(256, 256, 3, stride=1)
        self.dropout = nn.Dropout(p=0.5)  # reused after fc1 and fc2

    def forward(self, x):
        x = F.relu(self.cn1(x))
        x = F.max_pool2d(x, (3, 3), stride = 2)
        x = F.relu(self.cn2(x))
        x = F.max_pool2d(x, (3, 3), stride = 2)
        x = F.relu(self.cn3(x))
        x = F.relu(self.cn4(x))
        x = F.relu(self.cn5(x))
        x = F.max_pool2d(x, (3, 3), stride = 2)
        x = F.adaptive_avg_pool2d(x, (6, 6))
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = F.relu(self.fc2(x))
        x = self.dropout(x)
        x = self.fc3(x)
        
        return x
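
To see where the `9216` inputs of `fc1` come from, the spatial size of a 224x224 image can be followed through `forward()` by hand. The sketch below is plain Python and only assumes the usual output-size formula for unpadded convolutions and poolings; `conv_out`, `trace_features`, and the `pool1`..`pool3` labels are names made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_features(size=224):
    """Follow the spatial size through the 5 convolutions and 3 max-poolings."""
    steps = [
        ("cn1", 11, 4), ("pool1", 3, 2),
        ("cn2", 5, 1), ("pool2", 3, 2),
        ("cn3", 3, 1), ("cn4", 3, 1), ("cn5", 3, 1),
        ("pool3", 3, 2),
    ]
    sizes = {}
    for name, kernel, stride in steps:
        size = conv_out(size, kernel, stride)
        sizes[name] = size
    return sizes

sizes = trace_features(224)
for name, side in sizes.items():
    print(f"{name}: {side}x{side}")

# adaptive_avg_pool2d always returns a 6x6 map, so the flattened vector
# has 256 channels * 6 * 6 entries regardless of the size reached above.
flat_features = 256 * 6 * 6
print("flattened features:", flat_features)
```

Under these assumptions the map shrinks to 1x1 after the last pooling, and `adaptive_avg_pool2d` then expands it back to 6x6. Adding padding to the convolutions would keep more spatial detail before that step.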

We move the model to the selected device (the GPU when it is available).

In [3]:
model = AlexNet().to(device)

## Loading the data
This section uses the same data-loading code seen in class, but with a different batch size: 128, which is what the GPU allowed.

In [4]:
# 1. Load CIFAR-10 dataset
transform = transforms.Compose([
    transforms.Resize(224),  # AlexNet expects 224x224 images
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=128,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
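
Two details of this cell can be checked with a little arithmetic: what `Normalize((0.5, ...), (0.5, ...))` does to a pixel value, and how many batches each loader yields with a batch size of 128. The sketch below is plain Python; `normalize` and `batches_per_epoch` are illustrative helpers, and the split sizes are the standard CIFAR-10 ones (50,000 training and 10,000 test images).

```python
import math

def normalize(value, mean=0.5, std=0.5):
    """Per-channel normalization: (value - mean) / std."""
    return (value - mean) / std

def batches_per_epoch(num_samples, batch_size):
    """Batches yielded per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)

# ToTensor produces values in [0, 1]; Normalize maps them to [-1, 1]
print(normalize(0.0), normalize(0.5), normalize(1.0))  # -1.0 0.0 1.0

print(batches_per_epoch(50_000, 128))  # 391 training batches
print(batches_per_epoch(10_000, 128))  # 79 test batches
```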

We define the loss function (criterion) and the optimizer.

In [5]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
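
As a rough illustration of what SGD with momentum does at each `optimizer.step()`, the sketch below applies the update rule to a single parameter of a toy loss `p**2`. It is a simplified, pure-Python version (no dampening, no weight decay, no Nesterov), and `sgd_momentum_step` is a hypothetical helper, not part of PyTorch.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.005, momentum=0.9):
    """One SGD-with-momentum update on plain lists of floats.

    velocity <- momentum * velocity + grad
    param    <- param - lr * velocity
    """
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity

params, velocity = [1.0], [0.0]
for step in range(3):
    grads = [2.0 * p for p in params]  # gradient of the toy loss p**2
    params, velocity = sgd_momentum_step(params, grads, velocity)
    print(step, params[0])
```

The velocity accumulates past gradients, so consecutive steps in the same direction grow larger than with plain SGD at the same learning rate.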

## Training
In this cell the model is trained for 20 epochs, which takes 15 to 20 minutes.

In [6]:
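for epoch in range(20):
    running_loss = 0.0  # sum of the batch losses seen in this epoch
    for i, data in enumerate(trainloader, 0):
        # Move the batch to the same device as the model
        inputs, labels = data[0].to(device), data[1].to(device)

        optimizer.zero_grad()  # clear the gradients of the previous step

        outputs = model(inputs)

        labels = labels.long()  # CrossEntropyLoss expects integer class indices
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
    # The divisor 2000 is a fixed constant, not the number of batches per epoch,
    # so the printed value is a scaled sum rather than the mean batch loss.
    print(f'[{epoch + 1}] loss: {running_loss / 2000:.3f}')
    running_loss = 0.0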

print('Finished Training')

[1] loss: 0.445
[2] loss: 0.375
[3] loss: 0.311
[4] loss: 0.259
[5] loss: 0.225
[6] loss: 0.195
[7] loss: 0.172
[8] loss: 0.152
[9] loss: 0.133
[10] loss: 0.117
[11] loss: 0.102
[12] loss: 0.086
[13] loss: 0.072
[14] loss: 0.062
[15] loss: 0.052
[16] loss: 0.043
[17] loss: 0.038
[18] loss: 0.032
[19] loss: 0.029
[20] loss: 0.025
Finished Training
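
The values printed above are the sum of the batch losses divided by the constant 2000, not by the number of batches. A rough conversion back to a mean loss per batch is sketched below, assuming 50,000 training images and a batch size of 128 (391 batches per epoch). `mean_batch_loss` is an illustrative helper, and the results are approximate because the printed values are rounded to three decimals.

```python
import math

TRAIN_IMAGES = 50_000   # size of the CIFAR-10 training split
BATCH_SIZE = 128        # batch size used by trainloader
PRINT_DIVISOR = 2000    # constant used in the print statement of the training loop

def mean_batch_loss(printed_value):
    """Approximate mean loss per batch recovered from a printed 'loss' value."""
    num_batches = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
    return printed_value * PRINT_DIVISOR / num_batches

for epoch, printed in [(1, 0.445), (20, 0.025)]:
    print(f"epoch {epoch}: ~{mean_batch_loss(printed):.2f} mean loss per batch")
```

For reference, a classifier that guesses uniformly among 10 classes has a cross-entropy of ln(10) ≈ 2.30, which is close to the rescaled value for the first epoch.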


## Testing the model
The model is evaluated on the test data. Accuracy is generally around 80%, and the run recorded below reaches 75.33%.

In [7]:
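correct = 0
total = 0
# Note: model.eval() is not called before this loop, so dropout stays in training mode here
with torch.no_grad():  # gradients are not needed for evaluation
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)  # index of the largest logit = predicted class
        total += labels.size(0)
        correct += (predicted == labels).sum().item()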

print(f'Accuracy of the network on the 10000 test images: {100 * correct / total:.2f} %')

Accuracy of the network on the 10000 test images: 75.33 %
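
The test loop above runs without a prior call to `model.eval()`, so the `nn.Dropout` layers keep dropping activations while the test set is scored. That may make the measured accuracy lower and noisier than it would be in evaluation mode. The pure-Python sketch below illustrates the two behaviours; `dropout` is a hypothetical helper written for this note, not PyTorch's implementation.

```python
import random

def dropout(values, p=0.5, training=True, rng=None):
    """Inverted dropout on a list of floats.

    In training mode each value is zeroed with probability p and the
    survivors are scaled by 1 / (1 - p); in evaluation mode the input
    is returned unchanged.
    """
    if not training or p == 0.0:
        return list(values)
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]

activations = [1.0, 2.0, 3.0, 4.0]
print("train:", dropout(activations, training=True, rng=random.Random(0)))
print("eval: ", dropout(activations, training=False))
```

In the notebook, calling `model.eval()` before the test loop and `model.train()` before any further training switches the real layers between these two modes.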


In [8]:
torch.save(model.state_dict(), 'modelo_entrenado.pth')

In [9]:
torch.save(model, 'modelo_completo.pth')


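The model is evaluated per class. The loop below only looks at the first 4 images of each batch, so these percentages are computed on 316 of the 10,000 test images (79 batches × 4).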

In [10]:
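class_correct = list(0. for i in range(10))  # correct predictions per class
class_total = list(0. for i in range(10))    # images counted per class
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()  # True where the prediction is right
        # Only the first 4 images of each batch are counted;
        # range(labels.size(0)) would cover the whole batch.
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1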

for i in range(10):
    print(f'Accuracy of {classes[i]:5s} : {100 * class_correct[i] / class_total[i]:.2f} %')

Accuracy of plane : 79.31 %
Accuracy of car   : 78.57 %
Accuracy of bird  : 72.73 %
Accuracy of cat   : 52.94 %
Accuracy of deer  : 70.37 %
Accuracy of dog   : 72.73 %
Accuracy of frog  : 80.56 %
Accuracy of horse : 72.00 %
Accuracy of ship  : 84.38 %
Accuracy of truck : 79.49 %
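
A per-class count that uses every sample could look like the sketch below. It works on plain lists of integer class indices so it can be run without a GPU; `per_class_accuracy` is a hypothetical helper, and the toy data is invented for the example.

```python
def per_class_accuracy(predicted, labels, num_classes=10):
    """Accuracy (in percent) for each class, counting every sample."""
    correct = [0] * num_classes
    total = [0] * num_classes
    for pred, label in zip(predicted, labels):
        total[label] += 1
        if pred == label:
            correct[label] += 1
    # Classes with no samples get None instead of a division by zero
    return [100.0 * c / t if t else None for c, t in zip(correct, total)]

# Toy batch with 3 classes: class 0 is right twice, class 1 once out of two, class 2 never
predicted = [0, 0, 1, 2, 0]
labels    = [0, 0, 1, 1, 2]
print(per_class_accuracy(predicted, labels, num_classes=3))  # [100.0, 50.0, 0.0]
```

With the notebook's tensors, the lists could be built by extending two Python lists with `predicted.tolist()` and `labels.tolist()` for each test batch.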


# Pretrained AlexNet model in PyTorch
We load the pretrained AlexNet model and replace its last layer so that it outputs only 10 classes.

In [12]:
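# pretrained=True loads the ImageNet weights; newer torchvision releases use the `weights` argument instead
modelp = torchvision.models.alexnet(pretrained=True)
modelp.eval()  # evaluation mode: dropout stays disabled from here on, including during the training loop below
num_features = modelp.classifier[6].in_features
modelp.classifier[6] = nn.Linear(num_features, 10) # 10 output classes
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(modelp.parameters(), lr=0.001, momentum=0.9)
modelp.to(device)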



AlexNet(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (4): ReLU(inplace=True)
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU(inplace=True)
    (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (9): ReLU(inplace=True)
    (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
  (classifier): Sequential(
    (0): Dropout(p=0.5, inplace=False)
    (1): Linear(in_features=9216, out_features=4096, bias=True)
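
Replacing `classifier[6]` swaps the 1000-class ImageNet output layer for a much smaller 10-class one; only this new layer starts from random weights, while the rest of the network keeps its pretrained values. The arithmetic below gives a sense of the size change, assuming the usual 4096 input features of that layer. `linear_params` is an illustrative helper.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features

original_head = linear_params(4096, 1000)  # ImageNet head of torchvision's AlexNet
new_head = linear_params(4096, 10)         # head used here for the 10 CIFAR-10 classes
print(original_head, new_head)  # 4097000 40970
```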
 

## Training the pretrained model

In [13]:
for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)

        optimizer.zero_grad()

        outputs = modelp(inputs)
        # Ensure labels are long type
        labels = labels.long()
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
    # As in the first training loop, 2000 is a fixed constant, not the number of batches per epoch
    print(f'[{epoch + 1}] loss: {running_loss / 2000:.3f}')
    running_loss = 0.0

print('Finished Training')

[1] loss: 0.127
[2] loss: 0.075
Finished Training


## Evaluating the pretrained model

In [14]:
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = modelp(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy of the network on the 10000 test images: {100 * correct / total:.2f} %')

Accuracy of the network on the 10000 test images: 86.38 %


In [15]:
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = modelp(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        # Only the first 4 images of each batch are counted, as in the earlier per-class loop
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print(f'Accuracy of {classes[i]:5s} : {100 * class_correct[i] / class_total[i]:.2f} %')

Accuracy of plane : 82.76 %
Accuracy of car   : 92.86 %
Accuracy of bird  : 93.94 %
Accuracy of cat   : 82.35 %
Accuracy of deer  : 85.19 %
Accuracy of dog   : 66.67 %
Accuracy of frog  : 88.89 %
Accuracy of horse : 92.00 %
Accuracy of ship  : 93.75 %
Accuracy of truck : 97.44 %


# Comparison of the models in PyTorch
The pretrained model is clearly much quicker to train and more effective than the model built from scratch: it reached 86.38% test accuracy after only two epochs (about two and a half minutes), whereas the new model needed 20 epochs (about twenty minutes) to reach 75.33%. Per epoch, both take roughly the same time. One thing I think I noticed is that the new model seems to overfit: its training loss drops very quickly and even ends up lower than that of the pretrained model, which means it fits the training data very well but is less effective on data it has not seen. I think the model could be improved further with techniques that help prevent overfitting.
Another point worth mentioning is that both models have trouble recognizing cats and dogs: cats are the weakest class for the new model (52.94%) and dogs for the pretrained one (66.67%). The pretrained model also seems to show more variation in per-class accuracy, whereas the new model is worse but more consistent. These per-class figures come from the 316 test images sampled by the per-class loops, so they are rough estimates.
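
One way to check the remark about per-class variation is to summarize the two printed accuracy lists with a mean and a standard deviation. The sketch below copies the percentages from the outputs above; `summarize` is an illustrative helper, and the mean is an unweighted average over classes, not the overall test accuracy.

```python
from statistics import mean, pstdev

classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
# Per-class accuracies (%) copied from the two outputs printed earlier in the notebook
scratch = [79.31, 78.57, 72.73, 52.94, 70.37, 72.73, 80.56, 72.00, 84.38, 79.49]
pretrained = [82.76, 92.86, 93.94, 82.35, 85.19, 66.67, 88.89, 92.00, 93.75, 97.44]

def summarize(name, accuracies):
    """Print mean, spread and weakest class for one model; return the same values."""
    worst = classes[accuracies.index(min(accuracies))]
    stats = (mean(accuracies), pstdev(accuracies), worst)
    print(f"{name:10s} mean={stats[0]:.2f}  std={stats[1]:.2f}  weakest={worst}")
    return stats

scratch_stats = summarize("scratch", scratch)
pretrained_stats = summarize("pretrained", pretrained)
```

With these numbers both standard deviations come out at roughly 8 percentage points, with the pretrained model only slightly higher. The difference in variation is therefore small, and it is measured on few images per class.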