# Basic operations in PyTorch to build a CNN

## Agenda
*   Basic operations
  *    Convolution
  *    Activation
  *    Pooling
  *    Batch normalization
  *    Skip connection
  *    Linear
  *    Dropout







## Basic operations

PyTorch has a module called `nn` that provides the building blocks for neural networks. It defines the most commonly used operations, such as convolution, pooling, and activation. We will see how to use the following operations from `nn`:

*   Convolution
*   Activation
*   Pooling
*   Batch normalization
*   Skip connection
*   Linear transformation (dense layer) 
*   Dropout

In [1]:
#import module nn
import torch
from torch import nn as nn

### Convolution

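You must provide the number of input channels, the number of output channels (i.e., the number of kernels), and the kernel size. The default padding is 0, which shrinks the output. To preserve the image dimensions with stride 1 and an odd kernel size, set the padding along each axis to half the kernel size, rounded down.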

CLASS torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
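As a rough illustration of the padding rule, the sketch below compares a padded and an unpadded convolution on an input of the same shape as the one used in this notebook. The helper `conv_out_size` and the names `x_demo`, `same`, and `valid` are hypothetical and introduced only for this example. The helper implements the output-size formula given in the `Conv2d` documentation.

```python
import torch
from torch import nn

# Hypothetical helper (not part of PyTorch): output length along one axis,
# following the formula in the Conv2d documentation.
def conv_out_size(size, kernel, padding=0, stride=1, dilation=1):
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

x_demo = torch.rand(1, 2, 10, 8)  # (batch, channels, height, width)

# padding = kernel_size // 2 along each axis keeps height and width unchanged
same = nn.Conv2d(2, 3, kernel_size=(5, 3), padding=(2, 1))
# default padding=0 shrinks the output
valid = nn.Conv2d(2, 3, kernel_size=(5, 3))

print(same(x_demo).shape)   # torch.Size([1, 3, 10, 8])
print(valid(x_demo).shape)  # torch.Size([1, 3, 6, 6])
print(conv_out_size(10, 5, padding=2), conv_out_size(8, 3, padding=1))  # 10 8
```

Without padding, the height drops from 10 to 6 (10 - 5 + 1) and the width from 8 to 6 (8 - 3 + 1).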

In [2]:
#defines a convolution with 3 kernels of size 5 x 3, each spanning the 2 input channels
conv = nn.Conv2d(in_channels=2, out_channels=3, kernel_size=(5, 3), padding=(2,1))

The weights are stored in a tensor $K_{N, M, P, Q}$, such that
$N$ is the number of kernels, $M$ is the number of input channels, and $P$ and $Q$ are the kernel dimensions (height, width).



In [3]:
print(conv.weight.data.shape)

torch.Size([3, 2, 5, 3])
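The weight shape also determines the number of learnable parameters. The sketch below rebuilds the same layer under the illustrative name `conv_demo` and counts them. It assumes the default `bias=True`, which adds one bias per kernel.

```python
import torch
from torch import nn

# Same configuration as the layer above; conv_demo is an illustrative name
conv_demo = nn.Conv2d(in_channels=2, out_channels=3, kernel_size=(5, 3), padding=(2, 1))

print(conv_demo.weight.shape)  # torch.Size([3, 2, 5, 3])
print(conv_demo.bias.shape)    # torch.Size([3])

# 3 * 2 * 5 * 3 = 90 weights plus 3 biases
n_params = sum(p.numel() for p in conv_demo.parameters())
print(n_params)  # 93
```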


The input is a tensor $I_{B, M, H, W}$, such that $B$ is the batch size, $M$ is the number of channels, and $H$ and $W$ are the input dimensions (height, width).

In [4]:
#creating a random input
x = torch.rand(1, 2, 10, 8)
print("x = ", x)
print(x.size())

x =  tensor([[[[0.2747, 0.5139, 0.9980, 0.1989, 0.2224, 0.2705, 0.5520, 0.3759],
          [0.2472, 0.1175, 0.4607, 0.7245, 0.8898, 0.8048, 0.3509, 0.7238],
          [0.2432, 0.0657, 0.6150, 0.6342, 0.0737, 0.7929, 0.6359, 0.4524],
          [0.8561, 0.0466, 0.6501, 0.2496, 0.8134, 0.0802, 0.7572, 0.2299],
          [0.6945, 0.1838, 0.5563, 0.5890, 0.9777, 0.9087, 0.9511, 0.5522],
          [0.7854, 0.8534, 0.9255, 0.8846, 0.7679, 0.3528, 0.8246, 0.0080],
          [0.6089, 0.5250, 0.5178, 0.0272, 0.4649, 0.3583, 0.2552, 0.6432],
          [0.1629, 0.3228, 0.7963, 0.7595, 0.3518, 0.0852, 0.9543, 0.1898],
          [0.5945, 0.1053, 0.3386, 0.7561, 0.3266, 0.3605, 0.2470, 0.5566],
          [0.7843, 0.4768, 0.6863, 0.1061, 0.8538, 0.1226, 0.3664, 0.9585]],

         [[0.5546, 0.0042, 0.0342, 0.6536, 0.1061, 0.2885, 0.8680, 0.3073],
          [0.2265, 0.1353, 0.8443, 0.9173, 0.8051, 0.7110, 0.3702, 0.1861],
          [0.1571, 0.3355, 0.4285, 0.3836, 0.4881, 0.6915, 0.4885, 0.2060],
     

In [5]:
#Convolving x with the three kernels that have already been randomly initialized
y_conv = conv(x)
print("y_conv = ", y_conv)
print(y_conv.shape)

y_conv =  tensor([[[[-0.0753,  0.0723, -0.0450, -0.0397, -0.0630,  0.0128, -0.0710,
           -0.1065],
          [-0.0521,  0.0519,  0.0249,  0.2887,  0.2359,  0.2658,  0.1755,
           -0.1148],
          [ 0.0489, -0.0487,  0.2979,  0.0810,  0.2983, -0.0079,  0.0636,
           -0.3524],
          [ 0.0613,  0.0467,  0.1490, -0.0018, -0.0293,  0.1407, -0.0218,
           -0.1965],
          [ 0.0446,  0.2361, -0.0519,  0.2742,  0.2436,  0.3737,  0.1018,
           -0.0897],
          [ 0.1584,  0.1284,  0.3471,  0.2345,  0.2385,  0.2761, -0.0151,
           -0.2450],
          [ 0.0385,  0.0678,  0.2113,  0.1713,  0.0109,  0.0700, -0.0316,
           -0.3056],
          [ 0.1268, -0.0824, -0.0295,  0.1803,  0.1304, -0.0103, -0.1066,
           -0.2881],
          [ 0.0471,  0.0346,  0.0873,  0.0589,  0.1668,  0.1306,  0.2361,
           -0.2836],
          [ 0.1287,  0.2169,  0.1114,  0.0583,  0.0617,  0.3635,  0.3490,
           -0.0523]],

         [[-0.2016, -0.2361, -0.0240, 

We can also set the kernel weights as we want.

In [6]:
#initializing weights and biases with values drawn uniformly from [-0.5, 0.5)
#(conv.weight and conv.bias remain trainable parameters, so requires_grad is not needed here)
conv.weight.data = torch.rand(3, 2, 5, 3) - 0.5
conv.bias.data   = torch.rand(3) - 0.5
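An alternative to assigning `.data` directly is to use the in-place initializers in `torch.nn.init`, or to edit the parameters inside a `torch.no_grad()` block. The sketch below is one possible way to do this, not the method used in this notebook. The names `conv_demo` and `out` are illustrative.

```python
import torch
from torch import nn

conv_demo = nn.Conv2d(2, 3, kernel_size=(5, 3), padding=(2, 1))

# In-place uniform initialization in [-0.5, 0.5]
nn.init.uniform_(conv_demo.weight, a=-0.5, b=0.5)
nn.init.uniform_(conv_demo.bias, a=-0.5, b=0.5)

# Manual edits are done without tracking gradients
with torch.no_grad():
    conv_demo.weight[0].fill_(0.0)  # first kernel: all zeros
    conv_demo.bias[0] = 1.0         # first bias: one

out = conv_demo(torch.rand(1, 2, 10, 8))
# With a zero kernel, the first output channel equals its bias everywhere
print(out[0, 0].min().item(), out[0, 0].max().item())  # 1.0 1.0
```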

In [7]:
#Convolving x with our new kernel bank of three kernels
y_conv = conv(x)
print("y_conv = ", y_conv)
print(y_conv.shape)

y_conv =  tensor([[[[ 5.2458e-01,  9.6723e-01,  1.1288e+00,  1.6674e+00,  1.2163e+00,
            1.0846e+00,  1.5027e+00,  1.0782e+00],
          [ 9.9878e-01,  7.5285e-01,  5.8623e-01,  9.8847e-01,  2.0301e+00,
            1.1511e+00,  1.5072e+00,  9.9631e-01],
          [ 1.1690e+00,  8.4741e-01,  5.6139e-01,  1.0843e+00,  6.2335e-01,
            1.3435e+00,  1.7858e+00,  8.9234e-01],
          [ 2.8727e-01,  1.0836e+00,  1.5026e+00,  1.4574e+00,  1.6782e+00,
            1.0704e+00,  1.3914e+00,  1.0564e+00],
          [ 3.3891e-01,  1.7468e+00,  1.8551e+00,  1.2152e+00,  9.1561e-01,
            2.2041e+00,  1.5428e-01,  7.9786e-01],
          [ 3.4737e-01,  8.5049e-01,  7.0544e-01,  1.5567e+00,  1.0281e+00,
            2.0754e-01,  1.1201e+00,  5.9992e-01],
          [ 8.8233e-01,  8.2734e-01,  4.3024e-01,  7.9497e-01,  1.2342e+00,
            1.3178e+00,  1.9594e-01,  1.1652e+00],
          [ 8.7477e-01,  7.8832e-01,  8.5719e-01,  3.1546e-01,  1.5816e+00,
            6.5240e-01,  


### Activation

Now we define a ReLU activation function, which sets negative values to zero and leaves the other values unchanged.

CLASS torch.nn.ReLU(inplace=False)
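To see the effect on values with a known sign, the small sketch below applies ReLU to a hand-written tensor. The names `relu_demo` and `t` are illustrative.

```python
import torch
from torch import nn

relu_demo = nn.ReLU()
t = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

# Negative entries become 0; 0.5 and 2.0 pass through unchanged
out = relu_demo(t)
print(out)

# ReLU is equivalent to clamping at zero from below
print(torch.equal(out, torch.clamp(t, min=0)))  # True
```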

In [8]:
#defining relu function

relu = nn.ReLU()

In [9]:
#applying the relu function to output of a convolution
y_relu = relu(y_conv)
print("y_relu = ",y_relu)

y_relu =  tensor([[[[5.2458e-01, 9.6723e-01, 1.1288e+00, 1.6674e+00, 1.2163e+00,
           1.0846e+00, 1.5027e+00, 1.0782e+00],
          [9.9878e-01, 7.5285e-01, 5.8623e-01, 9.8847e-01, 2.0301e+00,
           1.1511e+00, 1.5072e+00, 9.9631e-01],
          [1.1690e+00, 8.4741e-01, 5.6139e-01, 1.0843e+00, 6.2335e-01,
           1.3435e+00, 1.7858e+00, 8.9234e-01],
          [2.8727e-01, 1.0836e+00, 1.5026e+00, 1.4574e+00, 1.6782e+00,
           1.0704e+00, 1.3914e+00, 1.0564e+00],
          [3.3891e-01, 1.7468e+00, 1.8551e+00, 1.2152e+00, 9.1561e-01,
           2.2041e+00, 1.5428e-01, 7.9786e-01],
          [3.4737e-01, 8.5049e-01, 7.0544e-01, 1.5567e+00, 1.0281e+00,
           2.0754e-01, 1.1201e+00, 5.9992e-01],
          [8.8233e-01, 8.2734e-01, 4.3024e-01, 7.9497e-01, 1.2342e+00,
           1.3178e+00, 1.9594e-01, 1.1652e+00],
          [8.7477e-01, 7.8832e-01, 8.5719e-01, 3.1546e-01, 1.5816e+00,
           6.5240e-01, 1.4775e+00, 6.2472e-01],
          [5.1927e-01, 5.8592e-01, 0.0

### Pooling

We may now define max pooling. Note that a stride greater than 1 reduces the spatial size of the input.

CLASS torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
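The reduction in size can be predicted from the kernel size, stride, and padding. The sketch below uses a hypothetical helper, `pool_out_size`, that follows the output-size formula in the `MaxPool2d` documentation (with the default `ceil_mode=False`), and checks it against a pooling layer applied to a random tensor named `t` for illustration.

```python
import torch
from torch import nn

# Hypothetical helper (not part of PyTorch): output length along one axis
def pool_out_size(size, kernel, stride, padding=0, dilation=1):
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

pool_demo = nn.MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
t = torch.rand(1, 3, 10, 8)

print(pool_demo(t).shape)  # torch.Size([1, 3, 5, 4])
print(pool_out_size(10, 3, 2, padding=1), pool_out_size(8, 3, 2, padding=1))  # 5 4
```

Pooling acts on each channel independently, so the number of channels is unchanged.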

In [13]:
pool = nn.MaxPool2d(kernel_size=(3,3), stride=(2,2), padding=(1,1))

In [14]:
#applying max pooling to the output of ReLU
y_pool = pool(y_relu)
print("y_pool = ", y_pool)
print(y_pool.shape)

y_pool =  tensor([[[[9.9878e-01, 1.6674e+00, 2.0301e+00, 1.5072e+00],
          [1.1690e+00, 1.5026e+00, 2.0301e+00, 1.7858e+00],
          [1.7468e+00, 1.8551e+00, 2.2041e+00, 2.2041e+00],
          [8.8233e-01, 1.5567e+00, 1.5816e+00, 1.4775e+00],
          [8.7477e-01, 8.5719e-01, 1.5816e+00, 1.4775e+00]],

         [[1.4342e-03, 5.2186e-01, 6.0958e-01, 6.0958e-01],
          [7.7914e-01, 3.7380e-01, 6.1597e-01, 6.0958e-01],
          [7.7914e-01, 3.3674e-01, 1.9778e-01, 1.0373e+00],
          [3.4778e-01, 3.3674e-01, 2.2654e-01, 9.2878e-01],
          [3.4778e-01, 2.2654e-01, 4.7604e-01, 9.2878e-01]],

         [[1.0419e+00, 1.0419e+00, 7.2550e-01, 8.2108e-01],
          [1.4543e+00, 2.5417e+00, 2.5417e+00, 1.9998e+00],
          [1.3602e+00, 2.5417e+00, 2.5417e+00, 1.9879e+00],
          [2.0465e+00, 2.0465e+00, 1.8320e+00, 1.8320e+00],
          [2.0465e+00, 2.0465e+00, 1.8320e+00, 1.8320e+00]]]],
       grad_fn=<MaxPool2DWithIndicesBackward>)
torch.Size([1, 3, 5, 4])


### Batch normalization

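Now we may define batch normalization, which normalizes each channel over the batch as follows: $$y = \frac{x - \mathbb{E}[x]}{\sqrt{\mathrm{Var}[x]+\epsilon}}\gamma + \beta.$$ The mean and variance are computed per channel, and $\gamma$ and $\beta$ are learnable parameters. You must indicate the number of input channels (`num_features`).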


CLASS torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, device=None, dtype=None)
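As a sanity check of the normalization formula, the sketch below compares `nn.BatchNorm2d` in training mode with a manual computation. It assumes a freshly created layer, so that $\gamma = 1$ and $\beta = 0$, and uses the biased variance, which is what the layer uses to normalize during training. The names `t`, `bn`, and `manual` are illustrative.

```python
import torch
from torch import nn

torch.manual_seed(0)
t = torch.rand(4, 3, 5, 4)  # (batch, channels, height, width)

bn = nn.BatchNorm2d(num_features=3)  # training mode by default; gamma=1, beta=0

# Per-channel statistics over the batch and spatial axes
mean = t.mean(dim=(0, 2, 3), keepdim=True)
var = t.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
manual = (t - mean) / torch.sqrt(var + bn.eps)

print(torch.allclose(bn(t), manual, atol=1e-5))  # True
```

After `bn.eval()`, the layer uses its running estimates instead of the batch statistics, so the two results would no longer match.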

In [15]:
#defining batch normalization layer
norm = nn.BatchNorm2d(num_features=3)

In [17]:
#applying batch normalization
y_norm = norm(y_pool)
print("y_norm = ", y_norm)

y_norm =  tensor([[[[-1.3426,  0.2873,  1.1715, -0.1031],
          [-0.9276, -0.1143,  1.1715,  0.5761],
          [ 0.4809,  0.7450,  1.5958,  1.5958],
          [-1.6264,  0.0176,  0.0782, -0.1756],
          [-1.6449, -1.6877,  0.0782, -0.1756]],

         [[-1.8967,  0.0270,  0.3513,  0.3513],
          [ 0.9780, -0.5202,  0.3749,  0.3513],
          [ 0.9780, -0.6572, -1.1709,  1.9324],
          [-0.6164, -0.6572, -1.0646,  1.5312],
          [-0.6164, -1.0646, -0.1423,  1.5312]],

         [[-1.3832, -1.3832, -1.9562, -1.7832],
          [-0.6364,  1.3330,  1.3330,  0.3516],
          [-0.8068,  1.3330,  1.3330,  0.3300],
          [ 0.4362,  0.4362,  0.0476,  0.0476],
          [ 0.4362,  0.4362,  0.0476,  0.0476]]]],
       grad_fn=<NativeBatchNormBackward>)


### Skip connection

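A skip connection lets information bypass one or more layers. One way to implement it is to concatenate the outputs of two layers along a given axis, provided that their sizes match along all other axes. For instance, we may concatenate the input of the convolution and the output of the ReLU along the channel axis as follows.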

torch.cat(tensors, dim=0, *, out=None) → Tensor
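The sketch below shows how the shapes behave under concatenation, using two random tensors `a` and `b` as stand-ins for the input and the activation. These names are illustrative only.

```python
import torch

a = torch.rand(1, 2, 10, 8)  # stand-in for the 2-channel input
b = torch.rand(1, 3, 10, 8)  # stand-in for the 3-channel activation

# Concatenating along dim=1 adds the channel counts: 2 + 3 = 5
c = torch.cat((a, b), dim=1)
print(c.shape)  # torch.Size([1, 5, 10, 8])

# The first two channels come from a, the last three from b
print(torch.equal(c[:, :2], a), torch.equal(c[:, 2:], b))  # True True

# Tensors whose other dimensions differ cannot be concatenated
try:
    torch.cat((a, torch.rand(1, 3, 5, 4)), dim=1)
except RuntimeError:
    print("size mismatch")
```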

In [14]:
#concatenating tensors
concat = torch.cat((x, y_relu), dim=1)
print("concat = ", concat)
print(concat.shape)

concat =  tensor([[[[0.0119, 0.1887, 0.8797, 0.7958, 0.2105, 0.3933, 0.3611, 0.2529],
          [0.1982, 0.5942, 0.2462, 0.4993, 0.3310, 0.6911, 0.3058, 0.5796],
          [0.5084, 0.7411, 0.8538, 0.5789, 0.7895, 0.8870, 0.4857, 0.3698],
          [0.4434, 0.7200, 0.4169, 0.2557, 0.4328, 0.0125, 0.7869, 0.9360],
          [0.0312, 0.2187, 0.5572, 0.3212, 0.5839, 0.2899, 0.6221, 0.9115],
          [0.4448, 0.1916, 0.2809, 0.9075, 0.8867, 0.9891, 0.0174, 0.9108],
          [0.7690, 0.3055, 0.9843, 0.8632, 0.9101, 0.4176, 0.6827, 0.8616],
          [0.4205, 0.2882, 0.3889, 0.0679, 0.0190, 0.2159, 0.4348, 0.7251],
          [0.4136, 0.5189, 0.5231, 0.2470, 0.7173, 0.6770, 0.7782, 0.3758],
          [0.7322, 0.0176, 0.5811, 0.6094, 0.9380, 0.5153, 0.1153, 0.6833]],

         [[0.3919, 0.9493, 0.9453, 0.6607, 0.5304, 0.4389, 0.2849, 0.4201],
          [0.3739, 0.9339, 0.3109, 0.5884, 0.2848, 0.6622, 0.5391, 0.0582],
          [0.9036, 0.6026, 0.3229, 0.5628, 0.1724, 0.6641, 0.2110, 0.9666],


### Linear

Dense layers can be implemented by flattening the output activations and then applying a linear transformation. We must indicate the input and output sizes (`in_features` and `out_features`).

torch.flatten(input, start_dim=0, end_dim=-1) → Tensor

CLASS torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)
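The value of `in_features` must equal the number of elements per sample after flattening. The sketch below derives it from a tensor of shape (1, 3, 5, 4) and checks that the layer computes $x W^T + b$. The names `feat`, `flat`, `fc`, and `manual` are illustrative.

```python
import torch
from torch import nn

feat = torch.rand(1, 3, 5, 4)            # (batch, channels, height, width)
flat = torch.flatten(feat, start_dim=1)  # keep the batch axis, merge the rest
print(flat.shape)  # torch.Size([1, 60])

# in_features = 3 * 5 * 4 = 60, read from the flattened tensor
fc = nn.Linear(in_features=flat.shape[1], out_features=2)

# The weight has shape (out_features, in_features)
manual = flat @ fc.weight.T + fc.bias
print(torch.allclose(fc(flat), manual, atol=1e-5))  # True
```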

In [18]:
#transforming it into a 2D tensor
y_flatten = y_pool.flatten(start_dim=1)
print(y_flatten.shape)

torch.Size([1, 60])


In [19]:
#defining the linear layer
linear = nn.Linear(in_features=60, out_features=2, bias=True)

In [20]:
#applying the linear transformation
y_linear = linear(y_flatten)
print(y_linear)
print(y_linear.shape)

tensor([[-0.2201,  1.0148]], grad_fn=<AddmmBackward>)
torch.Size([1, 2])


### Dropout

For regularization, we may also add a dropout layer. During training, it sets each element of the input tensor to zero independently with probability $p$, using samples from a Bernoulli distribution, and scales the remaining elements by a factor of $\frac{1}{1-p}$. During evaluation, the layer returns its input unchanged.

CLASS torch.nn.Dropout(p=0.5, inplace=False)
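The difference between training and evaluation mode can be seen on a tensor of ones. The sketch below is a minimal example. The names `drop_demo`, `t`, `y_train`, and `y_eval` are illustrative, and the seed is set only to make the run repeatable.

```python
import torch
from torch import nn

torch.manual_seed(0)
p = 0.5
drop_demo = nn.Dropout(p)
t = torch.ones(5, 5)

# Training mode: each entry is either 0 or 1 / (1 - p) = 2
drop_demo.train()
y_train = drop_demo(t)
print(y_train)

# Evaluation mode: dropout is disabled and the input is returned unchanged
drop_demo.eval()
y_eval = drop_demo(t)
print(torch.equal(y_eval, t))  # True
```

This is why the nonzero values in a dropout output are twice the corresponding inputs when $p = 0.5$.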

In [21]:
drop = nn.Dropout(0.5)

In [26]:
x = torch.randn((5,5))

In [23]:
y = drop(x)

In [24]:
print("Before dropout: ", x)

Before dropout:  tensor([[-0.6510,  0.9346,  0.4056,  0.8617, -0.4832],
        [-1.4813, -0.7307, -1.4605, -2.3998, -1.3143],
        [-1.9054,  0.1042, -0.3983, -0.3800, -0.2123],
        [-0.1725,  1.0847, -0.1320, -0.1492, -0.6975],
        [ 0.0759, -0.3639, -3.0578,  0.9329,  1.1660]])


In [25]:
print("After dropout", y)

After dropout tensor([[-1.3020,  0.0000,  0.8111,  1.7235, -0.9664],
        [-2.9626, -1.4614, -2.9210, -0.0000, -2.6286],
        [-0.0000,  0.0000, -0.0000, -0.0000, -0.0000],
        [-0.0000,  0.0000, -0.2639, -0.2985, -1.3949],
        [ 0.0000, -0.0000, -6.1156,  1.8658,  0.0000]])
