## Define the Convolutional Neural Network

In this notebook and in `models.py`:
1. Define a CNN with images as input and keypoints as output
2. Construct the transformed `FacialKeypointsDataset`, just as before
3. Train the CNN on the training data, tracking loss
4. See how the trained model performs on test data
5. If necessary, modify the CNN structure and model hyperparameters, so that it performs well

## CNN Architecture

Recall that CNNs are defined by a few types of layers:
* Convolutional layers
* Max-pooling layers
* Fully-connected layers


In [1]:
import matplotlib.pyplot as plt
from workspace_utils import active_session
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.nn.functional as F
from models import Net

from torch.utils.data import DataLoader, Dataset
from torchvision import transforms, utils

In [2]:
cnn_model = Net()
print(cnn_model)

Net(
  (conv1): Conv2d(1, 32, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
  (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1))
  (conv4): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1))
  (conv5): Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1))
  (fc1): Linear(in_features=18432, out_features=1024, bias=True)
  (fc2): Linear(in_features=1024, out_features=512, bias=True)
  (fc3): Linear(in_features=512, out_features=136, bias=True)
  (dropout): Dropout(p=0.25, inplace=False)
)
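
The `in_features=18432` of `fc1` fixes the input image size the network can accept. The sketch below works through that arithmetic in plain Python. It assumes, since `models.py` is not shown here, that the forward pass applies the 2x2 max-pooling layer after every one of the five convolutions; the helper names `conv_out` and `flattened_features` are illustrative and not part of the project.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Side length after a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size, conv_kernels=(5, 3, 3, 3, 1), out_channels=512):
    """Number of values reaching fc1 for a square input_size x input_size image.

    Assumption: each Conv2d (stride 1, no padding) is followed by MaxPool2d(2, 2).
    """
    size = input_size
    for kernel in conv_kernels:
        size = conv_out(size, kernel)             # convolution
        size = conv_out(size, kernel=2, stride=2)  # max pooling
    return out_channels * size * size


print(flattened_features(224))  # 18432, matches fc1
print(flattened_features(170))  # 8192, does not match fc1
```

Under that assumption, a 224x224 input yields the 18432 features that `fc1` expects, whereas a 170x170 crop would yield 8192, so the crop size used in the transform and the first fully-connected layer need to agree.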


## Create the Transform

In [3]:
from Load import FacialKeypointsDataset
from Load import Normalize, RandomCrop, Rescale, ToTensor
transform = transforms.Compose([
    Normalize(),
    Rescale(220),
    RandomCrop(170),
    ToTensor()
])

In [4]:
transformed_dataset = FacialKeypointsDataset(csv_file='data/training_frames_keypoints.csv',
                                             root_dir='data/training/',
                                             transform=transform)

print('Number of images: ', len(transformed_dataset))

# iterate through the transformed dataset and print some stats about the first few samples
for i in range(4):
    sample = transformed_dataset[i]
    print(i, sample['image'].size(), sample['keypoints'].size())

Number of images:  3462
0 torch.Size([1, 73, 109]) torch.Size([68, 2])
1 torch.Size([1, 79, 22]) torch.Size([68, 2])
2 torch.Size([1, 63, 28]) torch.Size([68, 2])
3 torch.Size([1, 18, 79]) torch.Size([68, 2])


## Define the Train and Test Loaders

In [5]:
# load the training data in shuffled batches
train_loader = DataLoader(transformed_dataset,
                          batch_size=10,
                          shuffle=True,
                          num_workers=4)

In [9]:
test_dataset = FacialKeypointsDataset(csv_file='data/test_frames_keypoints.csv',
                                      root_dir='data/test/',
                                      transform=transform)
test_loader = DataLoader(test_dataset,
                         batch_size=10,
                         shuffle=True,
                         num_workers=4)

## Define Training

In [11]:
from torch import optim

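# keypoint prediction is a regression task, so use a regression loss
# (nn.SmoothL1Loss is a common alternative); CrossEntropyLoss is meant for classification
criterion = nn.MSELoss()
optimizer = optim.Adam(cnn_model.parameters(), lr=0.002)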
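
One training step with a criterion and optimizer like these can be sketched as follows. This is a minimal illustration, not the notebook's actual training loop: the model is a hypothetical stand-in for `Net`, the batch is random data shaped like the loader output (images `(N, 1, H, W)`, keypoints `(N, 68, 2)`), and a regression loss (`nn.MSELoss`) is assumed because the targets are continuous coordinates.

```python
import torch
import torch.nn as nn
from torch import optim

torch.manual_seed(0)

# Hypothetical stand-in for Net: maps a 1x32x32 image to 136 values (68 x/y pairs).
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 136))
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.002)

# Random batch shaped like one DataLoader batch (assumed shapes).
images = torch.randn(4, 1, 32, 32)
key_pts = torch.randn(4, 68, 2)


def train_step(images, key_pts):
    """Run one optimization step and return the batch loss as a float."""
    # Flatten targets to (N, 136) so they line up with the model output.
    targets = key_pts.view(key_pts.size(0), -1).float()
    optimizer.zero_grad()            # clear gradients from the previous step
    output = model(images.float())   # forward pass
    loss = criterion(output, targets)
    loss.backward()                  # backpropagate
    optimizer.step()                 # update the weights
    return loss.item()


losses = [train_step(images, key_pts) for _ in range(5)]
print(losses)
```

Tracking the returned values per batch (for example, averaging them every few batches) is one simple way to monitor the training loss.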