# Layers

A `Linear` model can be seen as a **layer** in a neural network.

In [1]:
import torch
import torch.nn as nn

device = 'cuda' if torch.cuda.is_available() else 'cpu'

torch.manual_seed(42)
# Building the model from the figure above
model = nn.Sequential(nn.Linear(3, 5), nn.Linear(5, 1)).to(device)
print(model.state_dict())

OrderedDict([('0.weight', tensor([[ 0.4414,  0.4792, -0.1353],
        [ 0.5304, -0.1265,  0.1165],
        [-0.2811,  0.3391,  0.5090],
        [-0.4236,  0.5018,  0.1081],
        [ 0.4266,  0.0782,  0.2784]])), ('0.bias', tensor([-0.0815,  0.4451,  0.0853, -0.2695,  0.1472])), ('1.weight', tensor([[-0.2060, -0.0524, -0.1816,  0.2967, -0.3530]])), ('1.bias', tensor([-0.2062]))])
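
To see how the two layers chain together, the sketch below pushes a small batch through the same architecture and lists the parameter shapes. The input `x` is a hypothetical batch of random values made up for illustration; it is not part of the original example.

```python
import torch
import torch.nn as nn

torch.manual_seed(42)
model = nn.Sequential(nn.Linear(3, 5), nn.Linear(5, 1))

# Hypothetical batch (an assumption): 4 samples with 3 features each
x = torch.randn(4, 3)
out = model(x)
print(out.shape)  # torch.Size([4, 1])

# Each nn.Linear stores its weight as (out_features, in_features)
shapes = {name: tuple(p.shape) for name, p in model.state_dict().items()}
print(shapes)
```

The first layer maps 3 features to 5, and the second maps those 5 to a single output, which is why the weights have shapes `(5, 3)` and `(1, 5)`.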


## Naming Layers

Because the layers in this sequential model were passed positionally and have no attribute names, `state_dict()` identifies them with **numeric prefixes** (`0.weight`, `1.bias`, and so on).

To give the layers explicit names, add them with the model's `add_module()` method:

In [2]:
import torch
import torch.nn as nn

device = 'cuda' if torch.cuda.is_available() else 'cpu'

torch.manual_seed(42)
# Building the model from the figure above
model = nn.Sequential()
model.add_module('layer1', nn.Linear(3, 5))
model.add_module('layer2', nn.Linear(5, 1))
print(model.to(device))

Sequential(
  (layer1): Linear(in_features=3, out_features=5, bias=True)
  (layer2): Linear(in_features=5, out_features=1, bias=True)
)
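
As a quick check, the sketch below shows that the chosen names replace the numeric prefixes in `state_dict()` and that each named layer becomes an attribute of the model. Passing an `OrderedDict` to `nn.Sequential` is shown as an alternative way to name layers; it is not used in the original example.

```python
from collections import OrderedDict

import torch.nn as nn

model = nn.Sequential()
model.add_module('layer1', nn.Linear(3, 5))
model.add_module('layer2', nn.Linear(5, 1))

# The layer names now appear as prefixes in the state dict
keys = list(model.state_dict().keys())
print(keys)  # ['layer1.weight', 'layer1.bias', 'layer2.weight', 'layer2.bias']

# Named layers can be accessed as attributes
print(model.layer1)

# Alternative: name the layers at construction time
model_alt = nn.Sequential(OrderedDict([
    ('layer1', nn.Linear(3, 5)),
    ('layer2', nn.Linear(5, 1)),
]))
alt_keys = list(model_alt.state_dict().keys())
```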


### Types of layers

#### Convolution layers

**conv1d layer**: `nn.Conv1d()` applies a 1D convolution over the input. It expects the input to have the shape `[batch_size, input_channels, signal_length]`.

**Required Parameters**
- `in_channels`: Number of channels in the input data
- `out_channels`: Number of channels produced by the convolution
- `kernel_size`: Size of the convolving kernel

In [3]:
input_1d = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=torch.float)
input_1d.shape

torch.Size([10])

Reshape the input tensor so that its dimensions correspond to `batch_size=1`, `input_channels=1` and `signal_length=10`:

In [4]:
input_1d = input_1d.view(1, 1, 10)

Keeping `out_channels=1`, `kernel_size=3` and `stride=1`:

In [5]:
cnn1d = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, stride=1)
out_1d = cnn1d(input_1d)
print(out_1d.shape)

torch.Size([1, 1, 8])
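
The output length of 8 follows from the standard output-size relation for a 1D convolution: `(length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1`. The helper `conv1d_out_len` below is a hypothetical function written for illustration (it is not a PyTorch API), and the extra kernel, stride and padding combinations are assumptions chosen only to exercise the formula.

```python
import torch
import torch.nn as nn


def conv1d_out_len(length, kernel_size, stride=1, padding=0, dilation=1):
    """Expected output length of a 1D convolution (illustrative helper)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


x = torch.arange(1, 11, dtype=torch.float).view(1, 1, 10)

# (kernel_size, stride, padding) combinations to compare against nn.Conv1d
results = []
for k, s, p in [(3, 1, 0), (3, 2, 0), (5, 1, 2)]:
    layer = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=k, stride=s, padding=p)
    actual = layer(x).shape[-1]
    results.append((actual, conv1d_out_len(10, k, s, p)))
    print(f"kernel_size={k}, stride={s}, padding={p} -> length {actual}")
```

With `kernel_size=3`, `stride=1` and no padding, the formula gives `(10 - 2 - 1) // 1 + 1 = 8`, matching the shape printed above.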


<img src="/Users/mayankanand/Documents/pytorch/images/conv_operation.png" alt="Convolution Operation Formula" height="400" width="600">

In [6]:
signal_2d = torch.randn(2, 2, 10, dtype=torch.float)
print(signal_2d.shape)

torch.Size([2, 2, 10])


Add a 1D convolution layer with `in_channels=2`, `out_channels=1`, `kernel_size=3` and `stride=1`:

In [7]:
conv1d = nn.Conv1d(in_channels=2, out_channels=1, kernel_size=3, stride=1)
out_signal_2d = conv1d(signal_2d)
print(out_signal_2d.shape)

torch.Size([2, 1, 8])


Applying the convolution with `out_channels=5`, `kernel_size=3` and `stride=1`:

In [8]:
conv1d = nn.Conv1d(in_channels=2, out_channels=5, kernel_size=3, stride=1)
out_signal_2d = conv1d(signal_2d)
print(out_signal_2d.shape)

torch.Size([2, 5, 8])
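
Each output channel has its own kernel that spans all input channels, so the layer's weight has shape `[out_channels, in_channels, kernel_size]`. The sketch below checks this and recomputes a single output value by hand. The seed, the tensor `signal` and the indices `b`, `o`, `t` are arbitrary choices made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
signal = torch.randn(2, 2, 10)  # batch_size=2, input_channels=2, signal_length=10
conv1d = nn.Conv1d(in_channels=2, out_channels=5, kernel_size=3, stride=1)
out = conv1d(signal)

print(conv1d.weight.shape)  # torch.Size([5, 2, 3])
print(conv1d.bias.shape)    # torch.Size([5])

# Recompute one output value: sample b, output channel o, position t
b, o, t = 0, 4, 6
window = signal[b, :, t:t + 3]  # all input channels, 3 time steps
manual = (conv1d.weight[o] * window).sum() + conv1d.bias[o]
matches = torch.allclose(manual, out[b, o, t], atol=1e-5)
print(matches)
```

The manual value is the element-wise product of one kernel with the input window, summed over channels and positions, plus that channel's bias.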


**conv2d layer**: `nn.Conv2d()` applies a 2D convolution over the input. It expects the input to have the shape `[batch_size, input_channels, input_height, input_width]`.

**Required Parameters**:
- `in_channels`: Number of channels in the 2D input, e.g. an image
- `out_channels`: Number of channels produced by the convolution
- `kernel_size`: Size of the convolving kernel

In [9]:
# Generate a batch of 3 images with 3 input channels, height=25 and width=25
input_img_2d = torch.randn(3, 3, 25, 25, dtype=torch.float)

Applying convolution with `in_channels=3`, `out_channels=1`, `kernel_size=3`, `stride=1`

In [10]:
conv2d = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3, stride=1)
output_img_2d = conv2d(input_img_2d)
print(output_img_2d.shape)

torch.Size([3, 1, 23, 23])
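
The spatial size shrinks from 25 to 23 because a 3x3 kernel with no padding cannot be centered on the border pixels. The sketch below shows how the optional `padding` and `stride` arguments, and a rectangular `kernel_size`, change the output shape. These three configurations are illustrative assumptions and do not appear in the original example.

```python
import torch
import torch.nn as nn

images = torch.randn(3, 3, 25, 25)  # batch_size=3, channels=3, height=25, width=25

# padding=1 with a 3x3 kernel keeps height and width unchanged
same = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3, stride=1, padding=1)
same_shape = same(images).shape
print(same_shape)  # torch.Size([3, 1, 25, 25])

# stride=2 roughly halves the spatial size: (25 - 3) // 2 + 1 = 12
strided = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3, stride=2)
strided_shape = strided(images).shape
print(strided_shape)  # torch.Size([3, 1, 12, 12])

# A rectangular kernel is given as (height, width)
rect = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=(3, 5))
rect_shape = rect(images).shape
print(rect_shape)  # torch.Size([3, 1, 23, 21])
```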


#### Pooling Layers

Pooling is commonly used in Convolutional Neural Networks (CNNs) to downsample the input data, reducing its size while retaining its important features. There are several types of pooling layers; the two most popular are:

- **Max Pooling**: As the kernel slides over the input, it takes the maximum value in each window and maps it to the feature map.
- **Average Pooling**: As the kernel slides over the input, it takes the average of the values in each window and maps it to the feature map.
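
The difference between the two is easiest to see on a tiny hand-checkable input. The values in `x` below are made up for illustration; with `kernel_size=2` the windows are `[1, 3]`, `[2, 8]` and `[5, 4]`.

```python
import torch
import torch.nn as nn

# Hypothetical signal: batch_size=1, input_channels=1, signal_length=6
x = torch.tensor([1., 3., 2., 8., 5., 4.]).view(1, 1, 6)

max_out = nn.MaxPool1d(kernel_size=2)(x)
avg_out = nn.AvgPool1d(kernel_size=2)(x)

print(max_out)  # maximum of each window: 3, 8, 5
print(avg_out)  # average of each window: 2, 5, 4.5
```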

In [11]:
# 1D signal input
signal_1D = torch.randn(3, 2, 10)  # batch_size=3, input_channels=2, signal_length=10
print(signal_1D.shape)

torch.Size([3, 2, 10])


In [12]:
max_pool_1d = nn.MaxPool1d(kernel_size=3)
pooled_signal_1d = max_pool_1d(signal_1D)
pooled_signal_1d.shape

torch.Size([3, 2, 3])

In [13]:
pooled_signal_1d

tensor([[[ 0.5328,  2.7864,  0.7368],
         [-0.8156,  0.4567,  0.9101]],

        [[ 1.1840,  1.4583,  0.4836],
         [ 2.2199, -0.0557, -0.1635]],

        [[ 0.3526,  2.2495,  2.0145],
         [ 0.7511,  0.9250,  0.8122]]])
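
The pooled length is 3 rather than 8 because the `stride` of a pooling layer defaults to its `kernel_size`, so the windows do not overlap and the leftover tenth element is dropped. The sketch below contrasts this default with two other settings, `stride=1` and `ceil_mode=True`, which are shown here as illustrative assumptions and are not used in the original example.

```python
import torch
import torch.nn as nn

signal = torch.randn(3, 2, 10)  # batch_size=3, input_channels=2, signal_length=10

# Default: stride == kernel_size, windows cover positions 0-2, 3-5, 6-8
default_len = nn.MaxPool1d(kernel_size=3)(signal).shape[-1]
print(default_len)  # 3

# Overlapping windows: 10 - 3 + 1 = 8 positions
overlap_len = nn.MaxPool1d(kernel_size=3, stride=1)(signal).shape[-1]
print(overlap_len)  # 8

# ceil_mode=True keeps a final partial window that covers the last element
ceil_len = nn.MaxPool1d(kernel_size=3, ceil_mode=True)(signal).shape[-1]
print(ceil_len)  # 4
```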

In [14]:
image_2d = torch.randn(5, 3, 25, 25) # batch_size=5, input_channels=3, height=25 and width=25
print(image_2d.shape)

torch.Size([5, 3, 25, 25])


In [15]:
maxpool_2d = nn.MaxPool2d(kernel_size=3)
output_img_2d = maxpool_2d(image_2d)
print(output_img_2d.shape)

torch.Size([5, 3, 8, 8])


Average pooling works the same way: instead of taking the maximum of each kernel window, it computes the average and maps it to the feature map.
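
A minimal sketch of the average-pooling counterpart, assuming the same input shape as the max-pooling example. The seed and the tensor `image` are illustrative, and the final check compares one output value with the mean of its 3x3 window.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
image = torch.randn(5, 3, 25, 25)  # batch_size=5, input_channels=3, height=25, width=25

avgpool_2d = nn.AvgPool2d(kernel_size=3)
pooled = avgpool_2d(image)
print(pooled.shape)  # torch.Size([5, 3, 8, 8])

# The top-left output value is the mean of the top-left 3x3 window
window_mean = image[0, 0, 0:3, 0:3].mean()
matches = torch.allclose(pooled[0, 0, 0, 0], window_mean, atol=1e-6)
print(matches)
```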

#### Padding layers

Padding layers pad the input along one or more of its dimensions. For example, a 1D padding layer adds values on both sides of the input along the signal length.

##### `nn.ReflectionPad1d`
- Performs reflection padding. Reflection padding mirrors the input tensor at the boundaries, effectively creating a reflection of the input data. This can be useful when you want to maintain the overall structure of the data while avoiding the introduction of artificial values during padding.

In [16]:
input = torch.arange(8, dtype=torch.float).view(1, 2, 4)
print(input.shape)
ref_padding_1d = nn.ReflectionPad1d(2)
output = ref_padding_1d(input)
print(output.shape)

torch.Size([1, 2, 4])
torch.Size([1, 2, 8])


In [18]:
input, output

(tensor([[[0., 1., 2., 3.],
          [4., 5., 6., 7.]]]),
 tensor([[[2., 1., 0., 1., 2., 3., 2., 1.],
          [6., 5., 4., 5., 6., 7., 6., 5.]]]))

To pad the input with a different length on each side, pass a `(left, right)` tuple when defining the layer.

In [19]:
ref_padding_1d = nn.ReflectionPad1d((3, 2))
output = ref_padding_1d(input)
print(output)

tensor([[[3., 2., 1., 0., 1., 2., 3., 2., 1.],
         [7., 6., 5., 4., 5., 6., 7., 6., 5.]]])
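
To make the mirroring explicit, the sketch below rebuilds the `(3, 2)` reflection padding by hand with slicing and `flip`, and compares it with both the layer and the functional form `F.pad(..., mode='reflect')`. The variable names are illustrative. Note that the boundary value itself is not repeated: the reflection starts from the element next to the edge.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.arange(8, dtype=torch.float).view(1, 2, 4)

layer_out = nn.ReflectionPad1d((3, 2))(x)
func_out = F.pad(x, (3, 2), mode='reflect')

# Manual reflection: mirror around the first and last elements
left = x[..., 1:4].flip(-1)     # 3 values to the left, e.g. [3, 2, 1] for the first row
right = x[..., -3:-1].flip(-1)  # 2 values to the right, e.g. [2, 1] for the first row
manual = torch.cat([left, x, right], dim=-1)

print(manual)
print(torch.equal(manual, layer_out), torch.equal(manual, func_out))
```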


##### `nn.ReflectionPad2d`

In [21]:
# input image
input = torch.arange(120).view(2, 3, 4, 5)
print(input[0])

tensor([[[ 0,  1,  2,  3,  4],
         [ 5,  6,  7,  8,  9],
         [10, 11, 12, 13, 14],
         [15, 16, 17, 18, 19]],

        [[20, 21, 22, 23, 24],
         [25, 26, 27, 28, 29],
         [30, 31, 32, 33, 34],
         [35, 36, 37, 38, 39]],

        [[40, 41, 42, 43, 44],
         [45, 46, 47, 48, 49],
         [50, 51, 52, 53, 54],
         [55, 56, 57, 58, 59]]])


In [22]:
ref_padding_2d = nn.ReflectionPad2d(2)
output = ref_padding_2d(input)
print(output[0])

tensor([[[12, 11, 10, 11, 12, 13, 14, 13, 12],
         [ 7,  6,  5,  6,  7,  8,  9,  8,  7],
         [ 2,  1,  0,  1,  2,  3,  4,  3,  2],
         [ 7,  6,  5,  6,  7,  8,  9,  8,  7],
         [12, 11, 10, 11, 12, 13, 14, 13, 12],
         [17, 16, 15, 16, 17, 18, 19, 18, 17],
         [12, 11, 10, 11, 12, 13, 14, 13, 12],
         [ 7,  6,  5,  6,  7,  8,  9,  8,  7]],

        [[32, 31, 30, 31, 32, 33, 34, 33, 32],
         [27, 26, 25, 26, 27, 28, 29, 28, 27],
         [22, 21, 20, 21, 22, 23, 24, 23, 22],
         [27, 26, 25, 26, 27, 28, 29, 28, 27],
         [32, 31, 30, 31, 32, 33, 34, 33, 32],
         [37, 36, 35, 36, 37, 38, 39, 38, 37],
         [32, 31, 30, 31, 32, 33, 34, 33, 32],
         [27, 26, 25, 26, 27, 28, 29, 28, 27]],

        [[52, 51, 50, 51, 52, 53, 54, 53, 52],
         [47, 46, 45, 46, 47, 48, 49, 48, 47],
         [42, 41, 40, 41, 42, 43, 44, 43, 42],
         [47, 46, 45, 46, 47, 48, 49, 48, 47],
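         [52, 51, 50, 51, 52, 53, 54, 53, 52],
         [57, 56, 55, 56, 57, 58, 59, 58, 57],
         [52, 51, 50, 51, 52, 53, 54, 53, 52],
         [47, 46, 45, 46, 47, 48, 49, 48, 47]]])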

In the example above, values are added along both the height and the width dimensions of the image.

**Zero padding works the same way, except that instead of mirroring the values at the boundary it fills the padded positions with zeros.**
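
A minimal sketch of zero padding with `nn.ZeroPad2d`, using a small made-up input so the result is easy to read. A single integer pads all four sides equally, while a 4-tuple is interpreted as `(left, right, top, bottom)`.

```python
import torch
import torch.nn as nn

# Hypothetical input: batch_size=1, channels=1, height=2, width=3
x = torch.arange(6, dtype=torch.float).view(1, 1, 2, 3)

# Pad one zero on every side: (2, 3) -> (4, 5)
padded = nn.ZeroPad2d(1)(x)
print(padded[0, 0])

# Different padding per side: (left=1, right=2, top=0, bottom=1) -> (3, 6)
padded_asym = nn.ZeroPad2d((1, 2, 0, 1))(x)
print(padded_asym.shape)  # torch.Size([1, 1, 3, 6])
```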