# Convolutional Neural Networks
**Project: Dog Breed Identification**

---

### Roberto Alejandro Gutiérrez Guillén A01019608
### Emilio Hernández López A01336418
### Eduardo Badillo A01020716

We will take the first steps toward developing an algorithm that could be used as part of a web or mobile application.

By the end of this project, your code will accept any user-supplied image as input. If a dog is detected in the image, it will provide an estimate of the dog's breed. If a human is detected, it will provide an estimate of the dog breed that the person most resembles.
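
The behaviour described above can be sketched as a small dispatch function. This is only an illustrative sketch: `describe_image` and its three callable parameters are hypothetical names that do not appear in the notebook, and the handling of images with neither a dog nor a human is an assumption.

```python
def describe_image(img_path, is_dog, is_human, predict_breed):
    """Return a message for one image (hypothetical helper, for illustration).

    is_dog, is_human: callables taking a path and returning a bool.
    predict_breed: callable taking a path and returning a breed name.
    """
    if is_dog(img_path):
        return "Dog detected; estimated breed: " + predict_breed(img_path)
    if is_human(img_path):
        return "Human detected; most similar breed: " + predict_breed(img_path)
    # assumption: the source does not say what happens in this case
    return "Neither a dog nor a human was detected."


# usage with stand-in detectors (assumptions, not the real models)
message = describe_image(
    "example.jpg",
    is_dog=lambda path: False,
    is_human=lambda path: True,
    predict_breed=lambda path: "Akita",
)
print(message)
```

Passing the detectors and the classifier as arguments keeps the control flow independent of which models are eventually chosen.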

### The Road Ahead

The notebook is divided into separate steps.

    Step 0: Import datasets
    Step 1: Detect humans
    Step 2: Detect dogs
    Step 3: Create a CNN to classify dog breeds (from scratch)
    Step 4: Create a CNN to classify dog breeds (using transfer learning)
    Step 5: Test your algorithm

### Step 0: Import datasets

    Download the dog dataset. Unzip the folder and place it in this project's home directory, at the location /dogImages.

    Download the human dataset. Unzip the folder and place it in the home directory, at the location /lfw.

In [1]:
import numpy as np
from glob import glob

# download dog and human dataset
!mkdir dataset
!wget -O dataset/dog_dataset.zip https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/dogImages.zip
!wget -O dataset/human_dataset.zip https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/lfw.zip
!unzip -q dataset/dog_dataset.zip
!unzip -q dataset/human_dataset.zip

# load filenames for human and dog images
human_files = np.array(glob("lfw/*/*"))
dog_files = np.array(glob("dogImages/*/*/*"))

# print number of images in each dataset
print('There are %d total human images.' % len(human_files))
print('There are %d total dog images.' % len(dog_files))

--2020-11-24 23:06:16--  https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/dogImages.zip
Resolving s3-us-west-1.amazonaws.com (s3-us-west-1.amazonaws.com)... 52.219.120.112
Connecting to s3-us-west-1.amazonaws.com (s3-us-west-1.amazonaws.com)|52.219.120.112|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1132023110 (1.1G) [application/zip]
Saving to: ‘dataset/dog_dataset.zip’


2020-11-24 23:06:42 (41.0 MB/s) - ‘dataset/dog_dataset.zip’ saved [1132023110/1132023110]

--2020-11-24 23:06:42--  https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/lfw.zip
Resolving s3-us-west-1.amazonaws.com (s3-us-west-1.amazonaws.com)... 52.219.112.184
Connecting to s3-us-west-1.amazonaws.com (s3-us-west-1.amazonaws.com)|52.219.112.184|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 196739509 (188M) [application/zip]
Saving to: ‘dataset/human_dataset.zip’


2020-11-24 23:06:48 (38.2 MB/s) - ‘dataset/human_dataset.zip’ saved [196739509/1

### Step 1: Detect humans

In this section, we use OpenCV's implementation of Haar feature-based cascade classifiers to detect human faces in images.

OpenCV provides many pre-trained face detectors, stored as XML files on [GitHub](https://github.com/opencv/opencv/tree/master/data/haarcascades).

In [2]:
import cv2

# load the pre-trained Haar cascade face detector from its XML file
face_cascade = cv2.CascadeClassifier('/content/haarcascade_frontalface_default.xml')

def face_detector(img_path):
    """Return True if at least one face is detected in the image stored at img_path."""
    img = cv2.imread(img_path)
    # the cascade classifier works on grayscale images
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray)
    return len(faces) > 0

### Evaluate the human face detector

In [3]:
from tqdm import tqdm

human_files_short = human_files[:100]
dog_files_short = dog_files[:100]

def detect_face(images):
    cnt = 0
    for img in images:
        if face_detector(img):
            cnt += 1
    return cnt

In [4]:
import cv2                
import matplotlib.pyplot as plt                        
%matplotlib inline 
print("detect face in human_files: {} / {}".format(detect_face(human_files_short), len(human_files_short)))
print("detect face in dog_files: {} / {}".format(detect_face(dog_files_short), len(dog_files_short)))

detect face in human_files: 100 / 100
detect face in dog_files: 54 / 100


### Step 2: Detect dogs

In this section, we use a pre-trained model to detect dogs in images.

---

We will use the pre-trained VGG-16 model.

The code below downloads the VGG-16 model along with weights that have been trained on ImageNet, a very large and very popular dataset used for image classification and other vision tasks.

ImageNet contains over 10 million URLs, each linking to an image that contains an object from one of 1000 categories.
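
The detector functions later in this step rely on the convention that, in the ImageNet class ordering used by these models, indices 151 to 268 (inclusive) correspond to dog breeds. A minimal sketch of that check follows; `is_dog_index` and the two constants are hypothetical names introduced only for illustration.

```python
# ImageNet class indices that the notebook treats as dog breeds (inclusive range)
DOG_INDEX_FIRST = 151
DOG_INDEX_LAST = 268


def is_dog_index(class_index):
    """Return True if a predicted ImageNet class index falls in the dog range.

    Hypothetical helper mirroring the `pred >= 151 and pred <= 268` test.
    """
    return DOG_INDEX_FIRST <= class_index <= DOG_INDEX_LAST


print(is_dog_index(151))  # index VGG-16 predicts for the first dog image
print(is_dog_index(515))  # index GoogLeNet predicts for the same image
```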

In [5]:
import torch
import torchvision.models as models

# check if CUDA is available
use_cuda = torch.cuda.is_available()
print("cuda available? {0}".format(use_cuda))

cuda available? True


Before writing the function, follow this [link](https://pytorch.org/docs/stable/torchvision/models.html) and take the time to learn, from the PyTorch documentation, how to properly preprocess tensors for pre-trained models.
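
As a rough illustration of the preprocessing that documentation describes, torchvision's pre-trained ImageNet models expect each RGB channel, scaled to [0, 1], to be normalized with a fixed mean and standard deviation. The sketch below reproduces only that arithmetic in plain Python; `normalize_pixel` is a hypothetical helper, and the constants are the same ones passed to `transforms.Normalize` later in this notebook.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1].

    Hypothetical helper; it imitates the arithmetic of channel-wise normalization.
    """
    return tuple((v - m) / s for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))


# a pixel equal to the channel means maps to zero in every channel
print(normalize_pixel((0.485, 0.456, 0.406)))
# a pure white pixel maps to positive values in every channel
print(normalize_pixel((1.0, 1.0, 1.0)))
```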



In [53]:
# define VGG16 model
VGG16 = models.vgg16(pretrained=True)
googlenet = models.googlenet(pretrained=True)
shufflenet = models.shufflenet_v2_x1_0(pretrained=True)
alexnet = models.alexnet(pretrained=True)
densenet = models.densenet161(pretrained=True)
resnext50_32x4d = models.resnext50_32x4d(pretrained=True)

# move model to GPU if CUDA is available
if use_cuda:
    VGG16 = VGG16.cuda()
    googlenet = googlenet.cuda()
    shufflenet = shufflenet.cuda()
    alexnet = alexnet.cuda()
    densenet = densenet.cuda()
    resnext50_32x4d = resnext50_32x4d.cuda()

In [54]:
from PIL import Image
import torchvision.transforms as transforms


def load_image(img_path):    
    image = Image.open(img_path)

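    # resize to (244, 244); VGG16's adaptive pooling accepts this size, although
    # torchvision documents its pre-trained models for normalized 224 x 224 inputs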
    in_transform = transforms.Compose(
                        [transforms.Resize(size=(244, 244)), transforms.ToTensor()])

    # discard the transparent, alpha channel (that's the :3) and add the batch dimension
    image = in_transform(image)[:3,:,:].unsqueeze(0)
    return image

In [55]:
def VGG16_predict(img_path):
    '''
    Use pre-trained VGG-16 model to obtain index corresponding to 
    predicted ImageNet class for image at specified path
    
    Args:
        img_path: path to an image
        
    Returns:
        Index corresponding to VGG-16 model's prediction
    '''
    ## Load and pre-process an image from the given img_path
    ## Return the *index* of the predicted class for that image
    img = load_image(img_path)
    if use_cuda:
        img = img.cuda()
    ret = VGG16(img)
    return torch.max(ret,1)[1].item() # predicted class index

In [56]:
# predict dog using ImageNet class
VGG16_predict(dog_files_short[0])

151

In [57]:
def googlenet_predict(img_path):
    img = load_image(img_path)
    if use_cuda:
        img = img.cuda()
    ret = googlenet(img)
    return torch.max(ret,1)[1].item() # predicted class index

In [58]:
googlenet_predict(dog_files_short[0])

515

In [59]:
def shufflenet_predict(img_path):
    img = load_image(img_path)
    if use_cuda:
        img = img.cuda()
    ret = shufflenet(img)
    return torch.max(ret,1)[1].item() # predicted class index

In [60]:
shufflenet_predict(dog_files_short[0])

845

In [61]:
def alexnet_predict(img_path):
    img = load_image(img_path)
    if use_cuda:
        img = img.cuda()
    ret = alexnet(img)
    return torch.max(ret,1)[1].item() # predicted class index

In [62]:
alexnet_predict(dog_files_short[0])

284

In [63]:
def densenet_predict(img_path):
    img = load_image(img_path)
    if use_cuda:
        img = img.cuda()
    ret = densenet(img)
    return torch.max(ret,1)[1].item() # predicted class index

In [64]:
densenet_predict(dog_files_short[0])

600

In [65]:
def resnext50_32x4d_predict(img_path):
    img = load_image(img_path)
    if use_cuda:
        img = img.cuda()
    ret = resnext50_32x4d(img)
    return torch.max(ret,1)[1].item() # predicted class index

In [66]:
resnext50_32x4d_predict(dog_files_short[0])

463

In [67]:
### returns "True" if a dog is detected in the image stored at img_path
def dog_detectorVG(img_path):
    pred = VGG16_predict(img_path)
    return pred >= 151 and pred <= 268 # true/false

In [68]:
def dog_detectorGoogle(img_path):
    pred = googlenet_predict(img_path)
    return pred >= 151 and pred <= 268 # true/false

In [69]:
def dog_detectorShuffle(img_path):
    pred = shufflenet_predict(img_path)
    return pred >= 151 and pred <= 268 # true/false

In [70]:
def dog_detectorAlexnet(img_path):
    pred = alexnet_predict(img_path)
    return pred >= 151 and pred <= 268 # true/false

In [71]:
def dog_detectorDensenet(img_path):
    pred = densenet_predict(img_path)
    return pred >= 151 and pred <= 268 # true/false

In [72]:
def dog_detectorResnext50_32x4d(img_path):
    pred = resnext50_32x4d_predict(img_path)
    return pred >= 151 and pred <= 268 # true/false

In [73]:
print("\n VG")
print(dog_detectorVG(dog_files_short[0]))
print(dog_detectorVG(human_files_short[0]))
print("\n Google")
print(dog_detectorGoogle(dog_files_short[0]))
print(dog_detectorGoogle(human_files_short[0]))
print("\n Shuffle")
print(dog_detectorShuffle(dog_files_short[0]))
print(dog_detectorShuffle(human_files_short[0]))
print("\n Alexnet")
print(dog_detectorAlexnet(dog_files_short[0]))
print(dog_detectorAlexnet(human_files_short[0]))
print("\n Densenet")
print(dog_detectorDensenet(dog_files_short[0]))
print(dog_detectorDensenet(human_files_short[0]))
print("\n Resnext50_32x4d")
print(dog_detectorResnext50_32x4d(dog_files_short[0]))
print(dog_detectorResnext50_32x4d(human_files_short[0]))


 VG
True
False

 Google
False
False

 Shuffle
False
False

 Alexnet
False
False

 Densenet
False
False

 Resnext50_32x4d
False
False


In [74]:
### Test the performance of the dog_detector function
### on the images in human_files_short and dog_files_short.
def dog_detector_testVG(files):
    cnt = 0
    for file in files:
        if dog_detectorVG(file):
            cnt += 1
    return cnt, len(files)

In [75]:
def dog_detector_testGoogle(files):
    cnt = 0
    for file in files:
        if dog_detectorGoogle(file):
            cnt += 1
    return cnt, len(files)

In [76]:
def dog_detector_testShuffle(files):
    cnt = 0
    for file in files:
        if dog_detectorShuffle(file):
            cnt += 1
    return cnt, len(files)

In [77]:
def dog_detector_testAlexnet(files):
    cnt = 0
    for file in files:
        if dog_detectorAlexnet(file):
            cnt += 1
    return cnt, len(files)

In [78]:
def dog_detector_testDensenet(files):
    cnt = 0
    for file in files:
        if dog_detectorDensenet(file):
            cnt += 1
    return cnt, len(files)

In [79]:
def dog_detector_testResnext50_32x4d(files):
    cnt = 0
    for file in files:
        if dog_detectorResnext50_32x4d(file):
            cnt += 1
    return cnt, len(files)

In [82]:
# each test function returns (count, total); unpack it so every model runs only once per file list
print("\n VG")
print("detect a dog in human_files: {} / {}".format(*dog_detector_testVG(human_files_short)))
print("detect a dog in dog_files: {} / {}".format(*dog_detector_testVG(dog_files_short)))

print("\n Google")
print("detect a dog in human_files: {} / {}".format(*dog_detector_testGoogle(human_files_short)))
print("detect a dog in dog_files: {} / {}".format(*dog_detector_testGoogle(dog_files_short)))

print("\n Shuffle")
print("detect a dog in human_files: {} / {}".format(*dog_detector_testShuffle(human_files_short)))
print("detect a dog in dog_files: {} / {}".format(*dog_detector_testShuffle(dog_files_short)))

print("\n Alex")
print("detect a dog in human_files: {} / {}".format(*dog_detector_testAlexnet(human_files_short)))
print("detect a dog in dog_files: {} / {}".format(*dog_detector_testAlexnet(dog_files_short)))

print("\n Densenet")
print("detect a dog in human_files: {} / {}".format(*dog_detector_testDensenet(human_files_short)))
print("detect a dog in dog_files: {} / {}".format(*dog_detector_testDensenet(dog_files_short)))

print("\n Resnext50_32x4d")
print("detect a dog in human_files: {} / {}".format(*dog_detector_testResnext50_32x4d(human_files_short)))
print("detect a dog in dog_files: {} / {}".format(*dog_detector_testResnext50_32x4d(dog_files_short)))


 VG
detect a dog in human_files: 0 / 100
detect a dog in dog_files: 90 / 100

 Google
detect a dog in human_files: 3 / 100
detect a dog in dog_files: 7 / 100

 Shuffle
detect a dog in human_files: 11 / 100
detect a dog in dog_files: 24 / 100

 Alex
detect a dog in human_files: 0 / 100
detect a dog in dog_files: 48 / 100

 Densenet
detect a dog in human_files: 0 / 100
detect a dog in dog_files: 0 / 100

 Resnext50_32x4d
detect a dog in human_files: 0 / 100
detect a dog in dog_files: 0 / 100
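
The per-model test functions above differ only in the detector they call, so the same counts could be produced by one generic helper. A minimal sketch, assuming each detector is a callable that takes an image path and returns a boolean; `count_detections` is a hypothetical name, and the stand-in detector below is not one of the real models.

```python
def count_detections(detector, files):
    """Return (number of files the detector flags, total number of files).

    Hypothetical helper that generalizes the dog_detector_test* functions.
    """
    hits = sum(1 for path in files if detector(path))
    return hits, len(files)


# stand-in detector for illustration: flags paths under a 'dogImages' folder
def fake_dog_detector(path):
    return path.startswith("dogImages/")


sample_files = ["dogImages/a.jpg", "dogImages/b.jpg", "lfw/c.jpg"]
hits, total = count_detections(fake_dog_detector, sample_files)
print("detect a dog: {} / {}".format(hits, total))
```

With the notebook's detectors this could be called as, for example, `count_detections(dog_detectorVG, dog_files_short)`.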


We suggest VGG-16 as a potential network for detecting dog images in the algorithm, but you may explore other pre-trained networks (such as Inception-v3, ResNet-50, etc.).

Use the code cell below to test other pre-trained PyTorch models. If you complete this optional task, report the performance on human_files_short and dog_files_short.

### Step 3: Create a CNN to classify dog breeds (from scratch)

Now that we have functions for detecting humans and dogs in images, we need a way to predict the breed from images. In this step, you will create a CNN that classifies dog breeds.

---
You must create your CNN from scratch (so you cannot use transfer learning yet!), and it must reach a test accuracy of at least 10%. In Step 4, you will have the chance to use transfer learning to create a CNN that reaches a much higher accuracy.



In [23]:
import os
from torchvision import datasets
import torchvision.transforms as transforms
import torch
import numpy as np
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

## Specify appropriate transforms, and batch_sizes

batch_size = 20
num_workers = 0

data_dir = 'dogImages/'
train_dir = os.path.join(data_dir, 'train/')
valid_dir = os.path.join(data_dir, 'valid/')
test_dir = os.path.join(data_dir, 'test/')

In [24]:
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

In [25]:
data_transforms = {'train': transforms.Compose([transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     normalize]),
                   'val': transforms.Compose([transforms.Resize(256),
                                     transforms.CenterCrop(224),
                                     transforms.ToTensor(),
                                     normalize]),
                   'test': transforms.Compose([transforms.Resize(size=(224,224)),
                                     transforms.ToTensor(), 
                                     normalize])
                  }

In [26]:
train_data = datasets.ImageFolder(train_dir, transform=data_transforms['train'])
valid_data = datasets.ImageFolder(valid_dir, transform=data_transforms['val'])
test_data = datasets.ImageFolder(test_dir, transform=data_transforms['test'])

In [27]:
train_loader = torch.utils.data.DataLoader(train_data,
                                           batch_size=batch_size, 
                                           num_workers=num_workers,
                                           shuffle=True)
valid_loader = torch.utils.data.DataLoader(valid_data,
                                           batch_size=batch_size, 
                                           num_workers=num_workers,
                                           shuffle=False)
test_loader = torch.utils.data.DataLoader(test_data,
                                           batch_size=batch_size, 
                                           num_workers=num_workers,
                                           shuffle=False)
loaders_scratch = {
    'train': train_loader,
    'valid': valid_loader,
    'test': test_loader
}

RandomResizedCrop and RandomHorizontalFlip have been applied to the training data.

---

These image augmentation techniques effectively provide more training images by producing randomly cropped, resized and flipped versions of the originals. This tends to improve the model's performance and helps reduce overfitting. For the validation data, only resizing and center cropping are applied. For the test data, only resizing is applied. In all three cases the images are then converted to tensors and normalized.

Create a CNN to classify dog breeds. Use the template in the code cell below.
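
To make the two kinds of transform concrete, the sketch below mimics them on plain Python data: a horizontal flip reverses each row of pixels, and a center crop keeps a fixed-size window around the middle of the image. Both helpers are hypothetical simplifications written for illustration, not torchvision's implementation.

```python
def horizontal_flip(image):
    """Flip an image given as a list of rows (hypothetical helper)."""
    return [list(reversed(row)) for row in image]


def center_crop_box(width, height, size):
    """Return (left, top, right, bottom) of a size x size window centered in the image.

    Hypothetical helper; assumes size is no larger than width or height.
    """
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


tiny = [[1, 2, 3],
        [4, 5, 6]]
print(horizontal_flip(tiny))

# a 256 x 256 image cropped to 224 x 224 (the crop size used for validation)
print(center_crop_box(256, 256, 224))
```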

In [28]:
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

num_classes = 133 # total classes of dog breeds

In [29]:
import torch.nn as nn
import torch.nn.functional as F
import numpy as np

# define the CNN architecture
class Net(nn.Module):
    # three convolutional layers followed by two fully connected layers
    def __init__(self):
        super(Net, self).__init__()
        ## Define layers of a CNN
        self.conv1 = nn.Conv2d(3, 32, 3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, stride=2, padding=1)
        self.conv3 = nn.Conv2d(64, 128, 3, padding=1)

        # pool
        self.pool = nn.MaxPool2d(2, 2)
        
        # fully-connected
        self.fc1 = nn.Linear(7 * 7 * 128, 512)
        self.fc2 = nn.Linear(512, num_classes) 
        
        # drop-out
        self.dropout = nn.Dropout(0.3)
    
    def forward(self, x):
        ## Define forward behavior
        x = F.relu(self.conv1(x))
        x = self.pool(x) ##  Pool compresses the images 
        x = F.relu(self.conv2(x))
        x = self.pool(x)
        x = F.relu(self.conv3(x))
        x = self.pool(x)
        
        # flatten
        x = x.view(-1, 7*7*128)
        
        x = self.dropout(x)
        x = F.relu(self.fc1(x))
        
        x = self.dropout(x)
        x = self.fc2(x)
        return x


# instantiate the CNN
model_scratch = Net()
print(model_scratch)

# move tensors to GPU if CUDA is available
if use_cuda:
    model_scratch.cuda()

Net(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
  (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=6272, out_features=512, bias=True)
  (fc2): Linear(in_features=512, out_features=133, bias=True)
  (dropout): Dropout(p=0.3, inplace=False)
)


The first convolutional layer has a kernel size of 3 and a stride of 2, which halves the size of the input image. The second convolutional layer uses the same kernel size and stride, halving the size of its input again. The third convolutional layer has a kernel size of 3 and the default stride of 1, so it keeps the spatial size unchanged.

Max pooling with a stride of 2 is applied after each convolutional layer to halve the spatial size once more. A ReLU activation is also applied to the output of each convolutional layer.

The feature maps are then flattened, and a dropout layer with probability 0.3 is applied. Two fully connected layers follow: the first uses a ReLU activation and is followed by dropout of 0.3, and the second produces the final output used to predict the dog breed classes.
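
The size of the flattened feature vector (7 * 7 * 128) follows from applying the standard output-size formula, floor((n + 2p - k) / s) + 1, layer by layer to a 224 x 224 input. The sketch below traces that arithmetic in plain Python; `out_size` is a hypothetical helper name.

```python
def out_size(n, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer along one dimension.

    Hypothetical helper implementing floor((n + 2p - k) / s) + 1.
    """
    return (n + 2 * padding - kernel) // stride + 1


size = 224                                   # input images are 224 x 224
size = out_size(size, 3, 2, padding=1)       # conv1 -> 112
size = out_size(size, 2, 2)                  # pool  -> 56
size = out_size(size, 3, 2, padding=1)       # conv2 -> 28
size = out_size(size, 2, 2)                  # pool  -> 14
size = out_size(size, 3, 1, padding=1)       # conv3 -> 14
size = out_size(size, 2, 2)                  # pool  -> 7

flattened = size * size * 128                # 128 channels after conv3
print(size, flattened)
```

The result matches the `in_features=6272` reported for `fc1` in the printed model.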



In [30]:
import torch.optim as optim

### select loss function
criterion_scratch = nn.CrossEntropyLoss()

### select optimizer
optimizer_scratch = optim.SGD(model_scratch.parameters(), lr = 0.005)

In [32]:
def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path, last_validation_loss=None):
    """returns trained model"""
    # initialize tracker for minimum validation loss
    if last_validation_loss is not None:
        valid_loss_min = last_validation_loss
    else:
        valid_loss_min = np.inf  # lowercase spelling; np.Inf was removed in NumPy 2.0
    
    for epoch in range(1, n_epochs+1):
        # initialize variables to monitor training and validation loss
        train_loss = 0.0
        valid_loss = 0.0
        
        ###################
        # train the model #
        ###################
        model.train()
        for batch_idx, (data, target) in enumerate(loaders['train']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()        
            # reset the accumulated gradients (not the weights) to zero
            optimizer.zero_grad()
            
            output = model(data)
            
            # calculate loss
            loss = criterion(output, target)
            
            # back prop
            loss.backward()
            
            # update the parameters using the computed gradients
            optimizer.step()
            
            ## update the running average of the training loss
            train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.item() - train_loss))
            
            if batch_idx % 100 == 0:
                print('Epoch %d, Batch %d loss: %.6f' %
                  (epoch, batch_idx + 1, train_loss))
            
        ######################    
        # validate the model #
        ######################
        model.eval()
        for batch_idx, (data, target) in enumerate(loaders['valid']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            ## update the average validation loss (no gradients are needed here)
            with torch.no_grad():
                output = model(data)
                loss = criterion(output, target)
            valid_loss = valid_loss + ((1 / (batch_idx + 1)) * (loss.item() - valid_loss))

            
        # print training/validation statistics 
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
            epoch, 
            train_loss,
            valid_loss
            ))
        
        if valid_loss < valid_loss_min:
            torch.save(model.state_dict(), save_path)
            print('Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(
            valid_loss_min,
            valid_loss))
            valid_loss_min = valid_loss
            
    # return trained model
    return model
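
The loss tracking inside the training loop uses an incremental mean: after each batch, the running value moves toward the new loss by 1 / (number of batches seen so far). A small sketch in plain Python, with made-up loss values and a hypothetical helper name, showing that this matches the ordinary average:

```python
def running_mean(values):
    """Incrementally average values using mean += (x - mean) / count.

    Hypothetical helper mirroring the update used for train_loss and valid_loss.
    """
    mean = 0.0
    for idx, value in enumerate(values):
        mean = mean + (1 / (idx + 1)) * (value - mean)
    return mean


batch_losses = [4.5, 4.3, 4.1, 4.2]  # illustrative values, not real results
print(running_mean(batch_losses))
print(sum(batch_losses) / len(batch_losses))
```

The incremental form avoids storing every batch loss, which is why it suits a long training loop.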

In [36]:
# train the model
model_scratch = train(30, loaders_scratch, model_scratch, optimizer_scratch, 
                      criterion_scratch, use_cuda, 'model_scratch.pt')

Epoch 1, Batch 1 loss: 4.489655
Epoch 1, Batch 101 loss: 4.289608
Epoch 1, Batch 201 loss: 4.277520
Epoch 1, Batch 301 loss: 4.267110
Epoch: 1 	Training Loss: 4.268388 	Validation Loss: 4.153804
Validation loss decreased (inf --> 4.153804).  Saving model ...
Epoch 2, Batch 1 loss: 4.301424
Epoch 2, Batch 101 loss: 4.230970
Epoch 2, Batch 201 loss: 4.225623
Epoch 2, Batch 301 loss: 4.219593
Epoch: 2 	Training Loss: 4.222711 	Validation Loss: 4.097250
Validation loss decreased (4.153804 --> 4.097250).  Saving model ...
Epoch 3, Batch 1 loss: 3.992240
Epoch 3, Batch 101 loss: 4.160795
Epoch 3, Batch 201 loss: 4.161644
Epoch 3, Batch 301 loss: 4.146360
Epoch: 3 	Training Loss: 4.141198 	Validation Loss: 3.897532
Validation loss decreased (4.097250 --> 3.897532).  Saving model ...
Epoch 4, Batch 1 loss: 4.183294
Epoch 4, Batch 101 loss: 4.081383
Epoch 4, Batch 201 loss: 4.099196
Epoch 4, Batch 301 loss: 4.098660
Epoch: 4 	Training Loss: 4.097754 	Validation Loss: 3.950280
Epoch 5, Batch 1 l

In [83]:
# load the model checkpoint with the lowest validation loss
model_scratch.load_state_dict(torch.load('model_scratch.pt'))


<All keys matched successfully>

In [84]:
def test(loaders, model, criterion, use_cuda):

    # monitor test loss and accuracy
    test_loss = 0.
    correct = 0.
    total = 0.

    for batch_idx, (data, target) in enumerate(loaders['test']):
        # move to GPU
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # calculate the loss
        loss = criterion(output, target)
        # update average test loss 
        test_loss = test_loss + ((1 / (batch_idx + 1)) * (loss.data - test_loss))
        # convert output probabilities to predicted class
        pred = output.data.max(1, keepdim=True)[1]
        # compare predictions to true label
        correct += np.sum(np.squeeze(pred.eq(target.data.view_as(pred))).cpu().numpy())
        total += data.size(0)
            
    print('Test Loss: {:.6f}\n'.format(test_loss))

    print('\nTest Accuracy: %2d%% (%2d/%2d)' % (
        100. * correct / total, correct, total))

# call test function    
test(loaders_scratch, model_scratch, criterion_scratch, use_cuda)

Test Loss: 3.347345


Test Accuracy: 21% (183/836)
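
The printed percentage is truncated to an integer by the `%2d` format, so the exact accuracy is slightly higher than the figure shown. A quick check of the arithmetic, using the counts from the output above:

```python
correct, total = 183, 836          # counts reported in the test output
accuracy = 100.0 * correct / total

print("exact accuracy: %.2f%%" % accuracy)
print("as printed by the notebook: %2d%%" % accuracy)
```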


### Step 4: Create a CNN to classify dog breeds (using transfer learning)

You will now use transfer learning to create a CNN that can identify the dog breed from images. Your CNN must reach at least 60% accuracy on the test set.



In [85]:
loaders_transfer = loaders_scratch.copy()

Use transfer learning to create a CNN to classify dog breeds. Use the code cell below and save your initialized model as the variable model_transfer.



In [86]:
import torchvision.models as models
import torch.nn as nn

model_transfer = models.resnet50(pretrained=True)

Downloading: "https://download.pytorch.org/models/resnet50-19c8e357.pth" to /root/.cache/torch/hub/checkpoints/resnet50-19c8e357.pth


HBox(children=(FloatProgress(value=0.0, max=102502400.0), HTML(value='')))




In [87]:
for param in model_transfer.parameters():
    param.requires_grad = False

In [88]:
model_transfer.fc = nn.Linear(2048, 133, bias=True)

In [89]:
fc_parameters = model_transfer.fc.parameters()

In [90]:
for param in fc_parameters:
    param.requires_grad = True

In [91]:
model_transfer

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

In [93]:
if use_cuda:
    model_transfer = model_transfer.cuda()

The pre-trained ResNet50 model was chosen because it performs well on image classification. The main idea of this architecture is the "identity shortcut connection", which skips one or more layers. These shortcuts make very deep networks easier to train by mitigating the vanishing-gradient problem. Finally, the last fully connected layer is replaced with a new one that outputs scores for the 133 dog breed classes.
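
The identity shortcut can be illustrated in a few lines: a residual block outputs its input plus a learned transformation of that input, y = F(x) + x. The sketch below uses plain Python lists and a stand-in function for F; it is a conceptual illustration with hypothetical names, not the ResNet50 bottleneck block.

```python
def residual_block(x, transform):
    """Return transform(x) + x element-wise (identity shortcut).

    Hypothetical helper; `transform` stands in for the learned layers F.
    """
    fx = transform(x)
    return [f + xi for f, xi in zip(fx, x)]


# stand-in for the learned layers F (an assumption for illustration)
def halve(x):
    return [0.5 * v for v in x]


print(residual_block([1.0, 2.0, 4.0], halve))

# if the learned part outputs zeros, the block reduces to the identity
print(residual_block([1.0, 2.0, 4.0], lambda x: [0.0 for _ in x]))
```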

------

Use the following code cell to specify a loss function and an optimizer. Save the chosen loss function as criterion_transfer and the optimizer as optimizer_transfer below.

In [94]:
criterion_transfer = nn.CrossEntropyLoss()
optimizer_transfer = optim.SGD(model_transfer.fc.parameters(), lr=0.0005)
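
For intuition about the loss values printed during training, cross-entropy for a single example is the negative log of the softmax probability assigned to the correct class. The plain-Python sketch below is a simplified stand-in for `nn.CrossEntropyLoss` on one example, with `cross_entropy` as a hypothetical helper. It shows that a model giving equal scores to all 133 breeds has a loss of ln(133), roughly 4.89.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax probability of the target class for one example.

    Hypothetical helper; a single-example simplification of cross-entropy loss.
    """
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[target]


uniform_logits = [0.0] * 133  # equal scores for all 133 breeds
print(cross_entropy(uniform_logits, target=0))
print(math.log(133))
```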

In [95]:
# train the model
# train(n_epochs, loaders_transfer, model_transfer, optimizer_transfer, criterion_transfer, use_cuda, 'model_transfer.pt')

def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path):
    """returns trained model"""
    # initialize tracker for minimum validation loss
    valid_loss_min = np.inf  # lowercase spelling; np.Inf was removed in NumPy 2.0
    
    for epoch in range(1, n_epochs+1):
        # initialize variables to monitor training and validation loss
        train_loss = 0.0
        valid_loss = 0.0
        
        ###################
        # train the model #
        ###################
        model.train()
        for batch_idx, (data, target) in enumerate(loaders['train']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()

            # reset the accumulated gradients (not the weights) to zero
            optimizer.zero_grad()
            
            output = model(data)
            
            # calculate loss
            loss = criterion(output, target)
            
            # back prop
            loss.backward()
            
            # update the parameters using the computed gradients
            optimizer.step()
            
            # update the running average of the training loss
            train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.item() - train_loss))
            
            if batch_idx % 100 == 0:
                print('Epoch %d, Batch %d loss: %.6f' %
                  (epoch, batch_idx + 1, train_loss))
        
        ######################    
        # validate the model #
        ######################
        model.eval()
        for batch_idx, (data, target) in enumerate(loaders['valid']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            ## update the average validation loss (no gradients are needed here)
            with torch.no_grad():
                output = model(data)
                loss = criterion(output, target)
            valid_loss = valid_loss + ((1 / (batch_idx + 1)) * (loss.item() - valid_loss))

            
        # print training/validation statistics 
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
            epoch, 
            train_loss,
            valid_loss
            ))
        
        # save the model whenever the validation loss has decreased
        if valid_loss < valid_loss_min:
            torch.save(model.state_dict(), save_path)
            print('Validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(
            valid_loss_min,
            valid_loss))
            valid_loss_min = valid_loss
            
    # return trained model
    return model

In [96]:
train(30, loaders_transfer, model_transfer, optimizer_transfer, criterion_transfer, use_cuda, 'model_transfer.pt')

Epoch 1, Batch 1 loss: 5.029637
Epoch 1, Batch 101 loss: 4.917074
Epoch 1, Batch 201 loss: 4.901573
Epoch 1, Batch 301 loss: 4.887914
Epoch: 1 	Training Loss: 4.880989 	Validation Loss: 4.787185
Validation loss decreased (inf --> 4.787185).  Saving model ...
Epoch 2, Batch 1 loss: 4.856481
Epoch 2, Batch 101 loss: 4.788405
Epoch 2, Batch 201 loss: 4.771920
Epoch 2, Batch 301 loss: 4.757230
Epoch: 2 	Training Loss: 4.752950 	Validation Loss: 4.651410
Validation loss decreased (4.787185 --> 4.651410).  Saving model ...
Epoch 3, Batch 1 loss: 4.755992
Epoch 3, Batch 101 loss: 4.670811
Epoch 3, Batch 201 loss: 4.662946
Epoch 3, Batch 301 loss: 4.649692
Epoch: 3 	Training Loss: 4.645695 	Validation Loss: 4.525124
Validation loss decreased (4.651410 --> 4.525124).  Saving model ...
Epoch 4, Batch 1 loss: 4.592680
Epoch 4, Batch 101 loss: 4.570858
Epoch 4, Batch 201 loss: 4.556692
Epoch 4, Batch 301 loss: 4.543088
Epoch: 4 	Training Loss: 4.540322 	Validation Loss: 4.397254
Validation loss de

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

In [97]:
# load the checkpoint with the lowest validation loss
model_transfer.load_state_dict(torch.load('model_transfer.pt'))

<All keys matched successfully>

In [98]:
test(loaders_transfer, model_transfer, criterion_transfer, use_cuda)

Test Loss: 2.200948


Test Accuracy: 69% (578/836)
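
To put these figures in context, it can help to compare them with a classifier that guesses uniformly at random. The sketch below assumes the dataset has 133 breed classes (the class count is not printed in this section, so treat `num_classes` as an assumption) and uses the 578/836 result reported above.

```python
import math

num_classes = 133          # assumption: number of breed folders in dogImages/train
correct, total = 578, 836  # taken from the test output above

# A uniform guess assigns probability 1/num_classes to every class,
# so its cross-entropy loss is ln(num_classes).
chance_loss = math.log(num_classes)
chance_accuracy = 1 / num_classes
test_accuracy = correct / total

print('Chance-level loss:     {:.4f}'.format(chance_loss))
print('Chance-level accuracy: {:.2%}'.format(chance_accuracy))
print('Measured accuracy:     {:.2%}'.format(test_accuracy))
```

Under that assumption, the loss of about 4.9 to 5.0 logged at the start of training is roughly what a uniform guess would produce, and the test loss of 2.20 is well below it.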


In [99]:
# list of class names by index, i.e. a name can be accessed like class_names[0]
class_names = [item[4:].replace("_", " ") for item in loaders_transfer['train'].dataset.classes]

In [100]:
loaders_transfer['train'].dataset.classes[:10]

['001.Affenpinscher',
 '002.Afghan_hound',
 '003.Airedale_terrier',
 '004.Akita',
 '005.Alaskan_malamute',
 '006.American_eskimo_dog',
 '007.American_foxhound',
 '008.American_staffordshire_terrier',
 '009.American_water_spaniel',
 '010.Anatolian_shepherd_dog']

In [101]:
class_names[:10]

['Affenpinscher',
 'Afghan hound',
 'Airedale terrier',
 'Akita',
 'Alaskan malamute',
 'American eskimo dog',
 'American foxhound',
 'American staffordshire terrier',
 'American water spaniel',
 'Anatolian shepherd dog']
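
The `item[4:]` slice works because every folder name starts with a three-digit index and a dot. As a minimal sketch of a variant that does not depend on the prefix length, the hypothetical helper `clean_class_name` below splits on the first dot instead.

```python
def clean_class_name(folder_name):
    """Turn a folder name such as '002.Afghan_hound' into 'Afghan hound'."""
    # Split once on the first dot; keep the whole name if there is no prefix.
    _, sep, breed = folder_name.partition(".")
    if not sep:
        breed = folder_name
    return breed.replace("_", " ")


folders = ['001.Affenpinscher', '002.Afghan_hound', '010.Anatolian_shepherd_dog']
print([clean_class_name(name) for name in folders])
```

For the folder names shown above, both approaches produce the same list.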

In [102]:
from PIL import Image
import torchvision.transforms as transforms

def load_input_image(img_path):
    # convert to RGB so grayscale and RGBA images also end up with 3 channels
    image = Image.open(img_path).convert('RGB')
    # `normalize` is assumed to be defined earlier in the notebook
    prediction_transform = transforms.Compose([transforms.Resize(size=(224, 224)),
                                               transforms.ToTensor(),
                                               normalize])

    # apply the transforms and add the batch dimension
    image = prediction_transform(image).unsqueeze(0)
    return image

In [103]:
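def predict_breed_transfer(model, class_names, img_path):
    # load the image and return the predicted breed
    img = load_input_image(img_path)
    model = model.cpu()  # note: this moves the model to the CPU in place
    model.eval()
    with torch.no_grad():  # inference only, no gradients needed
        idx = torch.argmax(model(img)).item()
    return class_names[idx]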
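
`predict_breed_transfer` returns only the single highest-scoring class. When it is useful to see how confident the model is, the raw scores can be turned into probabilities with a softmax and the top few kept. The sketch below does this in plain Python on a hypothetical list of scores. The helper `top_k_breeds` is not part of the notebook; in PyTorch, `torch.softmax` and `torch.topk` would play the same role.

```python
import math


def top_k_breeds(logits, class_names, k=3):
    """Return the k most probable (name, probability) pairs for one image."""
    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(logits)
    exps = [math.exp(score - largest) for score in logits]
    total = sum(exps)
    probs = [value / total for value in exps]
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical scores for three classes, for illustration only.
names = ['Affenpinscher', 'Afghan hound', 'Akita']
for name, prob in top_k_breeds([0.5, 1.0, 2.0], names, k=2):
    print('{}: {:.3f}'.format(name, prob))
```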

### Step 5: Test your algorithm

In this section you will try out your new algorithm! What kind of dog does the algorithm think you look like? If you have a dog, does it predict your dog's breed accurately? If you have a cat, does it mistakenly think your cat is a dog?

In [104]:
def run_app(img_path):
    ## handle cases for a human face, dog, and neither
    img = Image.open(img_path)
    plt.imshow(img)
    plt.show()
    if dog_detector(img_path):
        prediction = predict_breed_transfer(model_transfer, class_names, img_path)
        print("Dogs Detected!\nIt looks like a {0}".format(prediction))  
    elif face_detector(img_path) > 0:
        prediction = predict_breed_transfer(model_transfer, class_names, img_path)
        print("Hello, human!\nIf you were a dog..You may look like a {0}".format(prediction))
    else:
        print("Error! Can't detect anything..")
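
The branching in `run_app` can be exercised without images or models by passing the detectors in as arguments. The sketch below is a hypothetical refactoring, not the notebook's own code: `describe_image` and its parameters are made-up names, and the lambdas stand in for `dog_detector`, `face_detector` and `predict_breed_transfer`.

```python
def describe_image(img_path, is_dog, is_human, predict_breed):
    """Return the message for one image; the dog check takes priority."""
    if is_dog(img_path):
        return "Dog detected! It looks like a {0}".format(predict_breed(img_path))
    if is_human(img_path):
        return "Hello, human! You may look like a {0}".format(predict_breed(img_path))
    return "Error! Can't detect anything."


# Stand-in detectors keyed on the file name, for illustration only.
message = describe_image(
    'dog_01.jpg',
    is_dog=lambda path: path.startswith('dog'),
    is_human=lambda path: path.startswith('person'),
    predict_breed=lambda path: 'Akita',
)
print(message)
```

Returning the message instead of printing it keeps the decision logic separate from the plotting, which makes each of the three branches easy to check.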

In [None]:
import os

# run the app on every file in the folder; the folder must already exist
for img_file in os.listdir('/content/my_images'):
    img_path = os.path.join('/content/my_images', img_file)
    run_app(img_path)

FileNotFoundError: ignored
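
The `FileNotFoundError` above is what `os.listdir` raises when the folder does not exist. A minimal sketch of a more forgiving listing step follows. The helper `list_images` and the set of accepted extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: only these extensions are treated as images.
IMAGE_SUFFIXES = {'.jpg', '.jpeg', '.png'}


def list_images(folder):
    """Return sorted image paths in `folder`, or an empty list if it is missing."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted(
        path for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


images = list_images('/content/my_images')
if not images:
    print('No images found; check that the folder exists and contains image files.')
for img_path in images:
    print(img_path)  # in the notebook, this is where run_app(str(img_path)) would go
```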