# TOC

  __Chapter 8 - Modern network architectures__

1. [Import](#Import)
1. [Modern network architectures](#Modern-network-architectures)
    1. [ResNet](#ResNet)
    1. [Train a ResNet model](#Train-a-ResNet-model)
        1. [Creating PyTorch datasets](#Creating-PyTorch-datasets)
        1. [Creating loaders for training and validation](#Creating-loaders-for-training-and-validation)
        1. [Creating a ResNet model](#Creating-a-ResNet-model)
        1. [Extracting convolutional features](#Extracting-convolutional-features)

# Import

<a id = 'Import'></a>

In [None]:
# Standard library and settings
import os
import sys
import importlib
import itertools
import warnings; warnings.simplefilter('ignore')
from IPython.display import display, HTML; display(HTML("<style>.container { width:95% !important; }</style>"))

# Data extensions and settings
import numpy as np
np.set_printoptions(threshold = np.inf, suppress = True)
import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.options.display.float_format = '{:,.6f}'.format

# pytorch tools
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
from torch.autograd import Variable
from torchvision import datasets, models, transforms

# Visualization extensions and settings
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
sns.set_style('whitegrid')


# Modern network architectures

Adding layers to a model can improve its predictive ability, but it also introduces the possibility of other problems, such as vanishing or exploding gradients. Modern architectures address these problems with a variety of techniques.



<a id = 'Modern-network-architectures'></a>

## ResNet

ResNet addresses these issues by letting the layers in the network fit the residual. In a typical network, we stack layers to learn a function that maps the input $x$ to the output $H(x)$. Instead of learning the mapping from $x$ to $H(x)$ directly, ResNet learns the difference between the two, known as the residual: $F(x) = H(x) - x$. The layers only need to learn $F(x)$, and the output is recovered by adding the input back: $H(x) = F(x) + x$.
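
The relationship $H(x) = F(x) + x$ can be illustrated with a minimal sketch. It assumes a toy fully connected layer as a stand-in for the stacked layers that learn $F$; the names `residual_fn`, `x` and `h` are illustrative and do not come from the notebook. When $F$ outputs zero, the block reduces to the identity mapping.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the stacked layers that learn the residual F(x)
residual_fn = nn.Linear(4, 4)

# Zero the parameters so that F(x) = 0 for every input
nn.init.zeros_(residual_fn.weight)
nn.init.zeros_(residual_fn.bias)

x = torch.randn(2, 4)

# The block output is H(x) = F(x) + x
h = residual_fn(x) + x

# With F(x) = 0 the block is exactly the identity mapping
print(torch.equal(h, x))
```

A plain stack of layers would have to learn the identity explicitly, whereas here it only has to push the residual toward zero.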

Each ResNet block consists of several layers and a shortcut connection that adds the input of the block to the output of the block. The addition is performed element-wise, so the input and output must be the same size. If they are not naturally the same size, padding can be used to make them match.

In the example below, the init method initializes all of the layers, and the forward method is very similar to the implementations seen so far, except that the input is added back to the output of the layers before it is returned.



<a id = 'ResNet'></a>

In [None]:
# ResNet block demonstration
class ResNetBasicBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride):
        super().__init__()
        # first convolution may change the channel count and the spatial size
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size = 3, stride = stride, padding = 1, bias = False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        # second convolution keeps the shape produced by the first one
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size = 3, stride = 1, padding = 1, bias = False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.stride = stride
        
    def forward(self, x):
        # keep the input for the shortcut connection
        residual = x
        out = self.conv1(x)
        out = F.relu(self.bn1(out), inplace = True)
        out = self.conv2(out)
        out = self.bn2(out)
        # element-wise addition; requires stride = 1 and in_channels == out_channels
        out += residual
        return F.relu(out)
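

The block above adds the input back unchanged, so the addition only works when the input and output have the same shape. The sketch below shows one common way to handle a mismatch: a 1x1 convolution on the shortcut that projects the input to the output shape. This is a swapped-in alternative to the padding mentioned earlier, not the notebook's own method, and the class name `ProjectionBlock` and its `shortcut` attribute are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ProjectionBlock(nn.Module):
    """Residual block with an optional 1x1 projection on the shortcut (illustrative)."""

    def __init__(self, in_channels, out_channels, stride = 1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size = 3,
                               stride = stride, padding = 1, bias = False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size = 3,
                               stride = 1, padding = 1, bias = False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Project the input only when its shape would not match the main path
        self.shortcut = None
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size = 1,
                          stride = stride, bias = False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        residual = x if self.shortcut is None else self.shortcut(x)
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + residual)


# Halve the spatial size and double the channels in a single block
block = ProjectionBlock(in_channels = 16, out_channels = 32, stride = 2)
out = block(torch.randn(4, 16, 8, 8))
print(out.shape)  # torch.Size([4, 32, 4, 4])
```

When the shapes already match, `shortcut` stays `None` and the block behaves like the plain identity shortcut.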
    

## Train a ResNet model



<a id = 'Train-a-ResNet-model'></a>

### Creating PyTorch datasets



<a id = 'Creating-PyTorch-datasets'></a>

In [None]:
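# resize, convert to tensors, and normalize with the ImageNet channel means and standard deviations
data_transform = transforms.Compose([
        transforms.Resize((299,299)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])

# ImageFolder expects one sub-folder per class; its keyword argument is "transform"
train_data = datasets.ImageFolder('../../kaggleDogsVsCats/data/train', transform = data_transform)
val_data = datasets.ImageFolder('../../kaggleDogsVsCats/data/valid', transform = data_transform)
classes = 2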


### Creating loaders for training and validation

The order of the data needs to be preserved so that the pre-convoluted features stay aligned with the original dataset. If the loader shuffles the data, the stored features and labels no longer follow the dataset order. Therefore, it is important to ensure that the shuffle argument is set to False.
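
The effect of the shuffle argument can be checked on a small example. The sketch below uses a made-up dataset of ten samples whose label equals their index; the names `toy_data`, `ordered_loader` and `seen_labels` are illustrative and not part of the notebook.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset (hypothetical): ten samples whose label equals their index
inputs = torch.arange(10, dtype = torch.float32).unsqueeze(1)
labels = torch.arange(10)
toy_data = TensorDataset(inputs, labels)

# shuffle = False yields the batches in dataset order
ordered_loader = DataLoader(toy_data, batch_size = 4, shuffle = False)

seen_labels = []
for _, batch_labels in ordered_loader:
    seen_labels.extend(batch_labels.tolist())

# The labels come back in exactly the order they were stored
print(seen_labels == list(range(10)))
```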



<a id = 'Creating-loaders-for-training-and-validation'></a>

In [None]:
# loaders that preserve the dataset order (shuffle = False)
from torch.utils.data import DataLoader

train_loader = DataLoader(train_data, batch_size = 32, shuffle = False, num_workers = 3)
val_loader = DataLoader(val_data, batch_size = 32, shuffle = False, num_workers = 3)


### Creating a ResNet model

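The nn.Sequential instance enables the rapid creation of a model from a set of PyTorch layers. Here it rebuilds the pretrained ResNet without its final linear layer, so that the model outputs convolutional features. It is important to set requires_grad to False on every parameter so that the pretrained weights stay frozen and no gradients are computed for them.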
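
The same two steps can be tried on a small model without downloading any pretrained weights. In this sketch `toy_model` is a hypothetical stand-in for the pretrained network, and `feature_extractor` and `trainable` are illustrative names.

```python
import torch.nn as nn

# Hypothetical small network standing in for a pretrained model
toy_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size = 3, padding = 1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
)

# Rebuild the model without its last layer to obtain a feature extractor
feature_extractor = nn.Sequential(*list(toy_model.children())[:-1])

# Freeze every parameter so that no gradients are computed for them
for p in feature_extractor.parameters():
    p.requires_grad = False

# Count the parameters that would still be updated by an optimizer
trainable = sum(p.numel() for p in feature_extractor.parameters() if p.requires_grad)
print(len(feature_extractor), trainable)  # 4 0
```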



<a id = 'Creating-a-ResNet-model'></a>

In [None]:
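# load a ResNet-34 pretrained on ImageNet
resnetModel = models.resnet34(pretrained = True)

# use the GPU when one is available
is_cuda = torch.cuda.is_available()
if is_cuda:
    resnetModel = resnetModel.cuda()

# discard the last linear layer
resnetModel = nn.Sequential(*list(resnetModel.children())[:-1])

# evaluation mode, so batch normalization uses its stored statistics
resnetModel = resnetModel.eval()

# freeze the pretrained weights
for p in resnetModel.parameters():
    p.requires_grad = False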


### Extracting convolutional features

Calculating the pre-convoluted features once, ahead of time, can save substantial time in the model training stage. This avoids having to recalculate the features in every iteration.
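
The idea can be sketched on made-up data: the frozen backbone runs once, and only a small classifier head is trained on the stored result. The `backbone`, `head`, random `images` and `targets` below are hypothetical stand-ins, not the notebook's model or data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical frozen backbone standing in for the truncated ResNet
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size = 3, padding = 1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
backbone.eval()

images = torch.randn(16, 3, 32, 32)   # made-up image batch
targets = torch.randint(0, 2, (16,))  # made-up binary labels

# Run the expensive backbone once and keep the result
with torch.no_grad():
    cached_features = backbone(images)

# Only the small classifier head is trained, and it reads the cached features
head = nn.Linear(8, 2)
optimizer = torch.optim.SGD(head.parameters(), lr = 0.1)
for epoch in range(3):
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(head(cached_features), targets)
    loss.backward()
    optimizer.step()

print(cached_features.shape)  # torch.Size([16, 8])
```

Every epoch after the first reuses `cached_features`, so the cost of the backbone is paid a single time.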



<a id = 'Extracting-convolutional-features'></a>

In [None]:
# run the feature extractor on whichever device holds its weights
device = next(resnetModel.parameters()).device

# store the training data labels
trn_labels = []

# store the pre-convoluted features of the training data
trn_features = []

# iterate through training data, store the calculated features and the labels
with torch.no_grad():
    for d, la in train_loader:
        o = resnetModel(d.to(device))
        o = o.view(o.size(0), -1)
        trn_labels.extend(la)
        trn_features.extend(o.cpu())

# iterate through validation data, store the calculated features and the labels
val_labels = []
val_features = []
with torch.no_grad():
    for d, la in val_loader:
        o = resnetModel(d.to(device))
        o = o.view(o.size(0), -1)
        val_labels.extend(la)
        val_features.extend(o.cpu())


### Creating a custom dataset for the pre-convoluted features

With the pre-convoluted features in hand, we need to create a custom dataset that returns each stored feature together with its label.
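
One possible shape for such a dataset is sketched below. The class name `FeaturesDataset` is hypothetical, and the random 512-element vectors are made-up stand-ins for the stored features and labels, so this is an illustration under those assumptions rather than the notebook's own implementation.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class FeaturesDataset(Dataset):
    """Pairs each stored feature vector with its label (hypothetical name)."""

    def __init__(self, features, labels):
        assert len(features) == len(labels)
        self.features = features
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]


# Made-up stand-ins for the stored features and labels
toy_features = [torch.randn(512) for _ in range(6)]
toy_labels = [torch.tensor(i % 2) for i in range(6)]

feature_data = FeaturesDataset(toy_features, toy_labels)

# Shuffling is safe here because each item carries its own label
feature_loader = DataLoader(feature_data, batch_size = 4, shuffle = True)

batch_features, batch_labels = next(iter(feature_loader))
print(batch_features.shape, batch_labels.shape)  # torch.Size([4, 512]) torch.Size([4])
```

Because the features are already computed, a loader over this dataset can feed a small classifier directly.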



<a id = ''></a>

In [None]:
# 




