## Residual block
- A residual block is defined as
  - y=σ(F(x)+G(x))
- where x and y represent the input and output tensors of the block, σ is the ReLU activation function, F is the residual function to be learned and G is a projection shortcut used to match dimensions between F(x) and x.

- Your code needs to define a ResidualBlock class (inherited from nn.Module) which implements a residual block. In your code, F will be implemented with two convolutional layers with a ReLU non-linearity between them, i.e. F(x) = conv2(σ(conv1(x))). Batch normalization is also applied right after each convolution operation.

- The constructor of the ResidualBlock class needs to take the following arguments as input:

  - inplanes, the number of channels of x;
  - planes, the number of output channels of conv1 and conv2;
  - stride, the stride of conv1.
- If the shapes of F(x) and x do not match (either because inplanes != planes, or because stride > 1), ResidualBlock also needs to apply a projection shortcut G, composed of a convolutional layer with kernel size 1×1, no bias, no padding and a stride equal to stride, followed by a batch normalization. Otherwise G is simply the identity function.

- The forward method of ResidualBlock will take as input the input tensor x and return the output tensor y, after performing all the operations of a residual block.
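
The shape mismatch that motivates the projection shortcut can be checked directly. The sketch below is only an illustration: the input size (batch of 2, 32×32 feature maps), the values inplanes=16, planes=32, stride=2, and the use of a 3×3 kernel with padding 1 for conv1 are assumptions, not part of the specification above. It shows that F(x) no longer has the shape of x, while a 1×1 convolution with the same stride produces a tensor that can be added to F(x).

```python
import torch
import torch.nn as nn

# Illustrative values (assumed, not prescribed by the text)
inplanes, planes, stride = 16, 32, 2
x = torch.randn(2, inplanes, 32, 32)

# First convolution of F: 3x3 kernel with padding 1 (assumed), stride `stride`
conv1 = nn.Conv2d(inplanes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
fx = conv1(x)

# Projection shortcut G: 1x1 kernel, no bias, no padding, same stride
proj = nn.Conv2d(inplanes, planes, kernel_size=1, stride=stride, padding=0, bias=False)
gx = proj(x)

print(tuple(x.shape))   # shape of the block input
print(tuple(fx.shape))  # shape after the strided 3x3 convolution
print(tuple(gx.shape))  # shape after the 1x1 projection

# x cannot be added to fx directly, but gx can
total = fx + gx
```

With stride 1 and inplanes equal to planes, fx would keep the shape of x and the identity shortcut would be enough.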

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, inplanes, planes, stride):
        super(ResidualBlock, self).__init__()

        # Define the convolutional layers for F
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)

        # Define the shortcut connection G
        if stride != 1 or inplanes != planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(inplanes, planes, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes)
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        # F(x) computation
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))

        # Shortcut connection
        shortcut = self.shortcut(x)

        # Residual connection
        out += shortcut
        out = F.relu(out)

        return out
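
A quick way to sanity-check a block like the one above is to feed it a random tensor and inspect the output shape. The helper below is a hypothetical sketch: `check_block` and its `size` parameter are names introduced here for illustration, and the tested configurations are assumed examples. It works with any class whose constructor takes (inplanes, planes, stride).

```python
import torch
import torch.nn as nn

def check_block(block_cls, inplanes, planes, stride, size=32):
    """Build a block and return its output shape for a random input."""
    block = block_cls(inplanes, planes, stride)
    block.eval()  # batch norm layers use running statistics in eval mode
    x = torch.randn(2, inplanes, size, size)
    with torch.no_grad():
        y = block(x)
    return tuple(y.shape)

# Run the checks only if ResidualBlock is already defined in this session
if "ResidualBlock" in globals():
    # Identity shortcut case: same channels, stride 1
    print(check_block(ResidualBlock, 16, 16, 1))
    # Projection shortcut case: more channels, stride 2
    print(check_block(ResidualBlock, 16, 32, 2))
```

For the block defined above, the first call should keep the input shape, (2, 16, 32, 32), and the second should double the channels and halve the spatial size, (2, 32, 16, 16).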