# Recap for Part 5 to Part 8

This recap is divided into sections, each presenting the main takeaways of one part.

## INFERENCE

Inference is the ability to make predictions. For good predictions we need a model that represents the real behaviour as closely as possible, without fitting the peculiarities of the training data (overfitting). One method to reduce overfitting is dropout. First, though, we need to evaluate the model on the test data, in order to compare the test loss with the training loss.

In [None]:
    for images, labels in testloader:
        
        log_ps = model(images)
        test_loss += criterion(log_ps, labels)
        
        ps = torch.exp(log_ps)
        top_k, top_class = ps.topk(1, dim=1)
        equals = top_class == labels.view(top_class.shape)
        accuracy += torch.mean(equals.type(torch.FloatTensor))

`topk` returns, for each test sample, the highest probability (`top_k`) and the index of the corresponding class (`top_class`).
`equals` is a boolean tensor that is true wherever the prediction matches the label.
`accuracy` is the mean of `equals`, which is converted to a float tensor before taking the mean.

From these three quantities we obtain the model accuracy on the test data.
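The steps above can be traced by hand on a tiny example. The following is a framework-free sketch in plain Python that mirrors `torch.exp`, `topk(1, dim=1)` and the mean of `equals`; the log-probabilities and labels are made-up illustrative values, not real model output.

```python
import math

# Hypothetical log-probabilities for 4 samples and 3 classes
# (illustrative values only, not real model output).
log_ps = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],
    [math.log(0.1), math.log(0.8), math.log(0.1)],
    [math.log(0.3), math.log(0.3), math.log(0.4)],
    [math.log(0.6), math.log(0.3), math.log(0.1)],
]
labels = [0, 1, 2, 1]

# Exponentiate to recover probabilities, as torch.exp(log_ps) does.
ps = [[math.exp(v) for v in row] for row in log_ps]

# Top-1 class per sample: the index of the largest probability.
top_class = [max(range(len(row)), key=row.__getitem__) for row in ps]

# 1.0 where prediction and label match, 0.0 otherwise.
equals = [float(pred == label) for pred, label in zip(top_class, labels)]

# Accuracy is the mean of the matches.
accuracy = sum(equals) / len(equals)
print(top_class, accuracy)
```

Three of the four predictions match their labels here, so the accuracy on this toy batch is 0.75.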

In [None]:
from torch import nn
import torch.nn.functional as F
# Model definition with dropout added
class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 10)
        
        self.dropout = nn.Dropout(p=0.2)
        
    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)
        
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))
        x = F.log_softmax(self.fc4(x), dim=1)
        
        return x

Before evaluating on the test data we turn off gradient tracking: the parameters (weights and biases) are not updated during evaluation, so computing gradients would only waste memory and time. We wrap the evaluation code in `with torch.no_grad():`.

During evaluation we want the model to work at its full potential, without any dropout. Before feeding in the test data we call `model_name.eval()`; when the evaluation is over we switch back to training mode with `model_name.train()`.
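To see why the mode matters, here is a minimal plain-Python sketch of inverted dropout, the scheme `nn.Dropout` follows: in training mode each activation is zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`, while in evaluation mode the input passes through unchanged. The `dropout` helper and its `training` flag are illustrative names, not part of PyTorch.

```python
import random


def dropout(values, p=0.2, training=True, rng=random):
    """Inverted dropout on a list of activations (illustrative helper)."""
    if not training or p == 0.0:
        # Evaluation mode: every unit is kept, nothing is rescaled.
        return list(values)
    scale = 1.0 / (1.0 - p)
    # Training mode: drop each unit with probability p, rescale the rest.
    return [0.0 if rng.random() < p else v * scale for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 10

train_out = dropout(activations, p=0.2, training=True, rng=rng)
eval_out = dropout(activations, p=0.2, training=False, rng=rng)

print("train:", train_out)
print("eval: ", eval_out)
```

In training mode every entry is either dropped to 0 or scaled up to about 1.25; in evaluation mode the activations come back untouched.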

In [None]:
  with torch.no_grad():
      
      model.eval()
      for images, labels in testloader:
          log_ps = model(images)
          test_loss += criterion(log_ps, labels)
          
          ps = torch.exp(log_ps)
          top_k, top_class = ps.topk(1, dim=1)
          equals = top_class == labels.view(top_class.shape)
          accuracy += torch.mean(equals.type(torch.FloatTensor))
  # Back to training mode so that dropout is active again
  model.train()

## DATA AUGMENTATION

This process prepares the input data (images, video, etc.) for analysis, using the `transforms` module. We want a model that generalizes what it is taught, so we introduce some randomness into the training data by rotating, resizing and flipping the images. We also normalize the data: normalizing helps keep the network weights near zero, which in turn makes backpropagation more stable.

In [None]:
train_transforms = transforms.Compose([transforms.RandomRotation(30),
                                       transforms.RandomResizedCrop(25),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor(),
                                       transforms.Normalize([0.5, 0.5, 0.5], 
                                                            [0.5, 0.5, 0.5])])
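As a small worked example of the normalization step: `ToTensor` scales pixel values to the range [0, 1], and `Normalize` then applies `(pixel - mean) / std` per channel. With a mean and standard deviation of 0.5, as above, the values end up in [-1, 1]. The `normalize` helper below is a plain-Python illustration of that formula for a single value, not the torchvision implementation.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Apply (pixel - mean) / std to one value (illustrative helper)."""
    return (pixel - mean) / std


# Darkest, middle and brightest pixel values after ToTensor.
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```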

## TRANSFER LEARNING

Large pretrained models can be exploited to train our own model. Torchvision includes several pretrained models [(Torchvision documentation)](http://pytorch.org/docs/0.3.0/torchvision/models.html). We keep the main structure of such a model and modify only the final layers. If we already have a model that classifies images, we can reuse it to classify a new set of labels.

Since we want to update only the last layers during training, we freeze the earlier layers so that gradients are computed only for the last ones. For each parameter to freeze we set `param.requires_grad = False`.
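The effect of freezing can be checked on a toy network before applying it to a large pretrained one. The sketch below uses a hypothetical `TinyNet` with arbitrary layer sizes as a stand-in for a pretrained model; it freezes all parameters, attaches a new head, and passes only the head's parameters to the optimizer. The choice of Adam and the learning rate are assumptions made for illustration.

```python
import torch
from torch import nn, optim


class TinyNet(nn.Module):
    """Hypothetical stand-in for a pretrained network (arbitrary sizes)."""

    def __init__(self):
        super().__init__()
        self.features = nn.Linear(8, 4)
        self.classifier = nn.Linear(4, 3)

    def forward(self, x):
        return self.classifier(torch.relu(self.features(x)))


model = TinyNet()

# Freeze every existing parameter.
for param in model.parameters():
    param.requires_grad = False

# A newly created head has requires_grad=True by default.
model.classifier = nn.Sequential(nn.Linear(4, 2), nn.LogSoftmax(dim=1))

# Only the new head is handed to the optimizer.
optimizer = optim.Adam(model.classifier.parameters(), lr=0.003)
criterion = nn.NLLLoss()

x = torch.randn(5, 8)
y = torch.tensor([0, 1, 0, 1, 1])

frozen_before = model.features.weight.clone()

# One training step.
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
print(torch.equal(model.features.weight, frozen_before))
```

After the step, only the parameters of the new head are listed as trainable, and the frozen `features` weights are unchanged.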

In the example below, the last layer is called `classifier`; notice how it is replaced.

In [5]:
from torchvision import models
from torch import nn

model = models.densenet121(pretrained=True)

# Freeze parameters so we don't backprop through them
for param in model.parameters():
    param.requires_grad = False

# densenet121's feature extractor outputs 1024 features, so the new head must take 1024 inputs
classifier = nn.Sequential(nn.Linear(1024, 512),
                          nn.ReLU(),
                          nn.Dropout(p=0.2),
                          nn.Linear(512, 2),
                          nn.LogSoftmax(dim=1))

model.classifier = classifier

model

DenseNet(
  (features): Sequential(
    (conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu0): ReLU(inplace=True)
    (pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (denseblock1): _DenseBlock(
      (denselayer1): _DenseLayer(
        (norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU(inplace=True)
        (conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU(inplace=True)
        (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
      (denselayer2): _DenseLayer(
        (norm1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu