### VGG16

In the previous two tutorials we learned the basics of using two of TensorFlow's high-level APIs, <code>tf.keras</code> and <code>tf.estimator</code>, to build, train, and test neural networks. In this tutorial we turn to another major deep learning library, PyTorch. The network we are going to build is another convolutional neural network, VGG16. VGG stands for Visual Geometry Group, the research group that created the architecture in 2014. That year it won first place in the ImageNet localization task and second place in the image classification task.

The key feature of VGG is that it uses small 3 by 3 convolutional kernels throughout the architecture instead of larger ones, and compensates by stacking more layers on top of each other. The success of VGG was an influential demonstration that increasing network depth can improve model performance. The architecture is shown below:

<img src="./files/VGG16.png">
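To make the small-kernel argument concrete, the sketch below compares the weight count of stacked 3 by 3 convolutions with a single larger kernel covering the same receptive field. It is only illustrative arithmetic: the helper names `conv_params` and `stacked_receptive_field` are hypothetical, biases are ignored, and the channel count of 64 is an assumed example value.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Number of weights in one 2D convolution layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)


channels = 64  # assumed example: 64 input and 64 output channels

# Two stacked 3x3 layers see the same 5x5 window as one 5x5 layer.
two_3x3 = 2 * conv_params(3, channels, channels)
one_5x5 = conv_params(5, channels, channels)

# Three stacked 3x3 layers see the same 7x7 window as one 7x7 layer.
three_3x3 = 3 * conv_params(3, channels, channels)
one_7x7 = conv_params(7, channels, channels)

print(stacked_receptive_field(3, 2), two_3x3, one_5x5)
print(stacked_receptive_field(3, 3), three_3x3, one_7x7)
```

The stacked version covers the same input window with fewer weights, and it also applies a nonlinearity after every layer rather than only once.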

Now let's write our first PyTorch model. PyTorch is a library not only for deep learning but also for efficient GPU-accelerated tensor computation in general. Most neural network functionality lives in <code>torch.nn</code>. For example, <code>torch.nn.Module</code> is the base class for neural network models, so we will write our VGG16 model as a subclass of it. To define an architecture we need to implement two things. The first is the constructor of our subclass, where we call the base-class constructor and then define all the layers. The second is the <code>forward</code> method, which defines how input data flows through those layers in a forward pass.
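Before the full model, the two-part pattern can be sketched on a much smaller network. `TinyNet`, its layer sizes, and the random input are hypothetical and chosen only to keep the example short; the structure (constructor plus `forward`) is the same one VGG16 will follow.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyNet(nn.Module):
    """Hypothetical two-layer network, used only to show the subclass pattern."""

    def __init__(self, in_features=4, hidden=8, num_classes=2):
        # Call the base-class constructor before registering any layers.
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, x):
        # The forward pass chains the layers and applies the activation.
        x = F.relu(self.fc1(x))
        return self.fc2(x)  # raw logits, no softmax


net = TinyNet()
batch = torch.randn(5, 4)  # 5 samples with 4 features each
logits = net(batch)        # calling the module invokes forward()
print(logits.shape)
```

Note that we call `net(batch)` rather than `net.forward(batch)`; calling the module object is the usual way to run a forward pass.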

Layers are defined as classes in <code>torch.nn</code>, while activation functions are also available as plain functions in <code>torch.nn.functional</code>. Let's look at the code.

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F

print(torch.__version__)

In [None]:
class VGG16(nn.Module):
    def __init__(self):
        # Call the base class constructor.
        super(VGG16, self).__init__()
        
        # Conv1: 3×3 kernel, 1×1 stride, 64 channels, SAME padding.
        # Note that in PyTorch we specify the amount of padding added to each side explicitly.
        # Here, to preserve the input's height and width, we add 1 pixel of padding per side.
        # For padding, stride and kernel_size, a single integer is applied to both the height
        # and the width dimensions.
        # Also note that there is no argument for activation function. We will use the activation
        # functions when we write the method for forward pass. Here we are just setting up the
        # layers for our model class.
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
        
        # Conv2: 3×3 kernel, 1×1 stride, 64 channels, SAME padding.
        # in_channels must equal the out_channels of the previous layer (64 here, not 3).
        self.conv2 = nn.Conv2d(64, 64, 3, stride=1, padding=1)

        # Pool1: 2×2 kernel, 2×2 stride.
        # In MaxPool2d, stride defaults to kernel_size if not specified.
        self.pool1 = nn.MaxPool2d(kernel_size=2)

        # Conv3-4: 3×3 kernel, 1×1 stride, 128 channels, SAME padding.
        self.conv3 = nn.Conv2d(64, 128, 3, stride=1, padding=1)
        self.conv4 = nn.Conv2d(128, 128, 3, stride=1, padding=1)

        # Pool2: 2×2 kernel, 2×2 stride.
        self.pool2 = nn.MaxPool2d(kernel_size=2)

        # Conv5-7: 3×3 kernel, 1×1 stride, 256 channels, SAME padding.
        self.conv5 = nn.Conv2d(128, 256, 3, stride=1, padding=1)
        self.conv6 = nn.Conv2d(256, 256, 3, stride=1, padding=1)
        self.conv7 = nn.Conv2d(256, 256, 3, stride=1, padding=1)

        # Pool3: 2×2 kernel, 2×2 stride.
        self.pool3 = nn.MaxPool2d(kernel_size=2)

        # Conv8-10: 3×3 kernel, 1×1 stride, 512 channels, SAME padding.
        self.conv8 = nn.Conv2d(256, 512, 3, stride=1, padding=1)
        self.conv9 = nn.Conv2d(512, 512, 3, stride=1, padding=1)
        self.conv10 = nn.Conv2d(512, 512, 3, stride=1, padding=1)

        # Pool4: 2×2 kernel, 2×2 stride.
        self.pool4 = nn.MaxPool2d(kernel_size=2)

        # Conv11-13: 3×3 kernel, 1×1 stride, 512 channels, SAME padding.
        self.conv11 = nn.Conv2d(512, 512, 3, stride=1, padding=1)
        self.conv12 = nn.Conv2d(512, 512, 3, stride=1, padding=1)
        self.conv13 = nn.Conv2d(512, 512, 3, stride=1, padding=1)
        
        # Pool5: 2×2 kernel, 2×2 stride.
        self.pool5 = nn.MaxPool2d(kernel_size=2)
        
        # In TensorFlow we can put a flatten layer here. PyTorch also offers nn.Flatten
        # (since version 1.2), but here we perform the flatten operation explicitly when we
        # write the forward pass method.
        
        # FullyConnected1: 4096 neurons.
        # In PyTorch the dense or fully-connected layer is nn.Linear.
        self.fc1 = nn.Linear(in_features=7*7*512, out_features=4096)
        
        # Dropout1: dropout probability of 0.5.
        self.dropout1 = nn.Dropout(p=0.5)
        
        # FullyConnected2: 4096 neurons
        self.fc2 = nn.Linear(4096, 4096)
        
        # Dropout2: dropout probability of 0.5.
        self.dropout2 = nn.Dropout(p=0.5)
        
        # FullyConnected3: We will use the same dog-vs-cat dataset here. So there will only be
        # 2 classes here.
        self.fc3 = nn.Linear(4096, 2)
        
    
    # Next we write the forward pass function with input data as an argument. Here we specify how
    # the data is transformed by the different layers.
    def forward(self, x):
        x = F.relu(self.conv1(x)) # Activation function is applied here.
        x = F.relu(self.conv2(x))
        x = self.pool1(x)
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        x = self.pool2(x)
        x = F.relu(self.conv5(x))
        x = F.relu(self.conv6(x))
        x = F.relu(self.conv7(x))
        x = self.pool3(x)
        x = F.relu(self.conv8(x))
        x = F.relu(self.conv9(x))
        x = F.relu(self.conv10(x))
        x = self.pool4(x)
        x = F.relu(self.conv11(x))
        x = F.relu(self.conv12(x))
        x = F.relu(self.conv13(x))
        x = self.pool5(x) # Now x should have size [batch_size, 512, 7, 7] (PyTorch is channels-first)
        
        # The counterpart of tf.reshape is the view method for PyTorch tensors.
        x = x.view(-1, 7*7*512) # Similar to TensorFlow, -1 means inferred from other dimensions.
        
        x = F.relu(self.fc1(x))
        x = self.dropout1(x)
        x = F.relu(self.fc2(x))
        x = self.dropout2(x)
        x = self.fc3(x) # We keep the logits
        return x
    
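# Create a VGG16 instance. Trainable parameters are initialized on construction.
# The to() method moves the model to the GPU if one is available; otherwise it stays on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = VGG16().to(device)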
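The value `7*7*512` used for `fc1` follows from the layer settings above. The sketch below traces the spatial size through the five blocks with plain arithmetic. It assumes 224 by 224 RGB inputs, which the `7*7*512` figure implies but the code above does not state, and the helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel_size=3, stride=1, padding=1):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1


def pool_out(size, kernel_size=2, stride=2):
    """Output size of a max-pooling layer along one spatial dimension."""
    return (size - kernel_size) // stride + 1


# (number of conv layers, output channels) for the five VGG16 blocks.
blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

size = 224    # assumed input height and width
channels = 3  # RGB input
for num_convs, out_channels in blocks:
    for _ in range(num_convs):
        size = conv_out(size)  # 3x3 kernel, stride 1, padding 1 keeps the size
    channels = out_channels
    size = pool_out(size)      # 2x2 pooling with stride 2 halves the size
    print(f"block output: {channels} x {size} x {size}")

flattened = channels * size * size
print(flattened)
```

With a different input resolution the final spatial size changes, so `in_features` of `fc1` and the `view` call in `forward` would need to change with it.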

### Loading data from files

Similar to TensorFlow, PyTorch provides convenient ways of preprocessing and loading datasets. There are two important classes at work here: <code>torch.utils.data.Dataset</code> and <code>torch.utils.data.DataLoader</code>. 