# Creating Models in PyTorch

This notebook is a tutorial accompanying the manuscript "Perspectives: Comparison of Deep Learning Based Segmentation Models on Typical Biophysics and Biomedical Data" by JS Bryan IV, M Tavakoli, and S Presse. In this tutorial, we will learn the basics of using the `nn.Module` class in PyTorch with prebuilt and custom layers.

**Before reading this tutorial, make sure you have properly installed PyTorch and downloaded the data as explained in this repository's README.**

## Introduction

Welcome to the tutorial on models in PyTorch! In this tutorial, we will learn how to create models in PyTorch using the `nn.Module` class. We will also learn how to use prebuilt layers and create custom layers. The specific aim of this tutorial is to explain the models used in the accompanying manuscript, which can be found in the `models/` directory of this repository.

Models in PyTorch are a convenient way to package together a neural network architecture. The `nn.Module` class is the base class for all models in PyTorch. It provides a convenient way to define the forward pass of a neural network, which is the process of passing input data through the network to get an output. The `nn.Module` class also provides a way to define the parameters of the network, which are the weights and biases that are learned during training.

### Importing libraries

Before we start, let's import the necessary libraries. We will be using the `torch` and `torch.nn` modules from PyTorch.

In [2]:
# Import libraries
import torch
import torch.nn as nn

## Basics of nn.Module

The Module class is the core of model creation in PyTorch. It provides a simple way to set up the parameters of the network and define the operations that are performed on the input data. The Module class has two main methods that need to be implemented: `__init__` and `forward`. The `__init__` method is used to define the layers and parameters of the network, while the `forward` method specifies how the input data is processed through these layers to produce the output.

Let's start by creating a very simple model that scales and shifts the input data. To do this, we will create a custom module called `ScaleShift` that inherits from `nn.Module`. In the `__init__` method, we will initialize the parent class of `ScaleShift` and then define two parameters, `scale` and `shift`, that will be learned during training. Notice that to define parameters in a module, we use the `nn.Parameter` class. The `nn.Parameter` class is a wrapper around a tensor that tells PyTorch that this tensor should be treated as a parameter of the network during training. In the `forward` method, we will apply the scaling and shifting operations to the input data. Lastly, we apply our custom module to some input data to see how it works.

In [10]:
## Create simple model that simply scales and shifts data
class ScaleShift(nn.Module):
    def __init__(self):
        super(ScaleShift, self).__init__()         # Call parent class constructor, this is required for model
        self.scale = nn.Parameter(torch.rand(1))   # Create scale parameter, initialized to random value
        self.shift = nn.Parameter(torch.randn(1))  # Create shift parameter, initialized to random value
    def forward(self, x):
        return x * self.scale + self.shift         # Return scaled and shifted data
    
# Instantiate model and data
model = ScaleShift()
data = torch.linspace(1, 10, 10)

# Run model on data
output = model(data)

# Print results
print("Data: ", data)
print("Output: ", output)
print("Scale: ", model.scale)
print("Shift: ", model.shift)

Data:  tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])
Output:  tensor([-0.3858, -0.2925, -0.1991, -0.1058, -0.0125,  0.0808,  0.1741,  0.2674,
         0.3607,  0.4541], grad_fn=<AddBackward0>)
Scale:  Parameter containing:
tensor([0.0933], requires_grad=True)
Shift:  Parameter containing:
tensor([-0.4791], requires_grad=True)


A key convenience of building models with `nn.Module` is that every `nn.Parameter` we assign to the module is registered automatically, and PyTorch's `autograd` engine tracks the operations applied to those parameters during the forward pass. This means that we don't have to manually derive the gradients of the loss function with respect to the parameters of the network. Calling `.backward()` on the loss computes these gradients for us.
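
As a minimal sketch of this behaviour, the snippet below redefines `ScaleShift`, computes a loss, and inspects the gradients that `autograd` stores on each parameter. The loss used here (the mean squared output) is a hypothetical choice made purely for illustration and is not the loss used in the manuscript.

```python
import torch
import torch.nn as nn


class ScaleShift(nn.Module):
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.rand(1))
        self.shift = nn.Parameter(torch.randn(1))

    def forward(self, x):
        return x * self.scale + self.shift


model = ScaleShift()
data = torch.linspace(1, 10, 10)

# Hypothetical loss for illustration only: mean of the squared output
loss = (model(data) ** 2).mean()

# Before backward() is called, no gradients have been stored
grads_before = [p.grad for p in model.parameters()]

# Backpropagate: autograd fills in .grad for every registered parameter
loss.backward()

for name, param in model.named_parameters():
    print(name, "gradient:", param.grad)
```

After `loss.backward()`, each parameter's `.grad` attribute holds the gradient of the loss with respect to that parameter, which is what an optimizer would use to update it.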

Additionally, an `nn.Module` instance is callable, so we can apply it to input data as if it were a function. For example, `output = model(input)` runs the module's `forward` method on the input and returns the output of the network. Calling the module this way is preferred over calling `model.forward(input)` directly, because it also runs any hooks registered on the module.
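
The following sketch illustrates the point above. It assumes the same `ScaleShift` module as before and registers a forward hook (here a hypothetical one that just records output shapes) to show how calling the module differs from calling `forward` directly.

```python
import torch
import torch.nn as nn


class ScaleShift(nn.Module):
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.rand(1))
        self.shift = nn.Parameter(torch.randn(1))

    def forward(self, x):
        return x * self.scale + self.shift


model = ScaleShift()
x = torch.tensor([1.0, 2.0, 3.0])

# Illustrative hook: record the shape of every output produced via model(...)
calls = []
handle = model.register_forward_hook(
    lambda module, inputs, output: calls.append(output.shape)
)

out_call = model(x)             # runs forward() and the registered hook
out_forward = model.forward(x)  # runs forward() only; the hook is skipped

handle.remove()                 # detach the hook once we are done

same = torch.equal(out_call, out_forward)
print("Same output:", same)
print("Hook calls:", len(calls))
```

Both calls return the same values, but only `model(x)` triggers the hook, which is why calling the module itself is the recommended style.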

## Using Prebuilt Layers

PyTorch provides a rich set of prebuilt layers in the `torch.nn` module, which makes it easy to build complex neural network architectures. These layers include convolutional layers, pooling layers, activation functions, and more. Using these prebuilt layers can save you time and ensure that your model components are optimized for performance. In this section, we will build a simple convolutional neural network (CNN) for image segmentation.

Before we start, it's important to note that PyTorch models generally expect input tensors to have a batch dimension. This means that even if we are processing a single grayscale image, we need to structure our input as a 4D tensor with dimensions (batch, channel, height, width). This is especially important to keep in mind when loading data from files, where the channel dimension may be missing or stored last, as in (height, width, channel). Each layer has its own expected input shape, so it is important to read the [documentation](https://pytorch.org/docs/stable/index.html) for each layer to understand how to properly structure the input data.
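
As a short sketch of how this reshaping might look in practice, the snippet below converts two stand-in images into the (batch, channel, height, width) layout. The random tensors and the 32x32 size are assumptions standing in for images loaded from file.

```python
import torch

# Stand-in grayscale image stored as (height, width)
gray = torch.rand(32, 32)
# Add a channel dimension, then a batch dimension -> (1, 1, 32, 32)
gray_batch = gray.unsqueeze(0).unsqueeze(0)

# Stand-in RGB image stored channel-last as (height, width, channel)
rgb_hwc = torch.rand(32, 32, 3)
# Move channels first, then add a batch dimension -> (1, 3, 32, 32)
rgb_batch = rgb_hwc.permute(2, 0, 1).unsqueeze(0)

print(gray_batch.shape)
print(rgb_batch.shape)
```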

Now that we understand the basics of Modules and inputs, let us move on to creating our CNN. A simple model will consist of layers and blocks. Layers are individual components of the network, such as convolutional layers, pooling layers, and activation functions. Blocks are groups of layers that are repeated multiple times in the network. Our network will have three main blocks: an input block, a convolutional block, and an output block. Let us go over these in detail.

### Attributes

Our model will have several attributes that define the architecture of the network. These attributes include the number of input channels, the number of feature channels, and the number of output channels. The number of input channels is the number of channels in the input data, which is typically 1 for grayscale images and 3 for color images. The number of feature channels is the number of channels in the intermediate feature maps of the network, which is a hyperparameter that can be tuned to control the capacity of the network. The number of output channels is the number of channels in the output data, which is typically the number of classes in a segmentation task.

```python

        # Set up attributes
        self.in_channels = 3
        self.out_channels = 2
        self.n_features = 8

```

We specify 3 input channels for RGB images, 2 output channels for binary segmentation, and 8 feature channels for the intermediate feature maps of the network. These values can be adjusted depending on the specific task and dataset.

### Input Block

The input block is the first part of the network that preprocesses the input data. For our purposes, we need to normalize the input data to have zero mean and unit variance, and then add in additional feature channels so that the input data has the correct number of channels for the network. To do this we will use `nn.GroupNorm` to normalize the input data, and `nn.Conv2d` to add additional feature channels.

```python

        # Set up input block
        self.input_block = nn.Sequential(
            nn.GroupNorm(1, in_channels, affine=False),  # Normalize input
            nn.Conv2d(in_channels, n_features, kernel_size=3, padding=1),
        )

```

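The `nn.GroupNorm` layer, used here with a single group, normalizes each sample to zero mean and unit variance across all of its channels and pixels. Setting `affine=False` means the layer has no learnable scale or shift. The `nn.Conv2d` layer maps the `in_channels` input channels to `n_features` feature channels by applying a set of learned filters. The `kernel_size` parameter specifies the size of the filters, and the `padding` parameter specifies the amount of zero padding added to each border of the input. With `kernel_size=3` and `padding=1`, the output has the same height and width as the input.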

Notice that we group layers together using the `nn.Sequential` class. This is a convenient way to define a sequence of layers in PyTorch. Because `nn.Sequential` is itself an `nn.Module`, we can call the resulting object on the input data to apply the layers in sequence.
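
To make the effect of the input block concrete, here is a small sketch that builds it outside of a model and checks the statistics and shapes. The input tensor, with an arbitrary offset and scale, is an assumption used only to show the normalization at work.

```python
import torch
import torch.nn as nn

in_channels, n_features = 3, 8

input_block = nn.Sequential(
    nn.GroupNorm(1, in_channels, affine=False),  # Normalize input
    nn.Conv2d(in_channels, n_features, kernel_size=3, padding=1),
)

# Stand-in data with a mean near 5 and a standard deviation near 2
x = 5.0 + 2.0 * torch.randn(1, in_channels, 32, 32)

# Layers inside nn.Sequential can be indexed; apply only the normalization
normed = input_block[0](x)
print("Mean after GroupNorm:", normed.mean().item())  # close to 0
print("Std after GroupNorm: ", normed.std().item())   # close to 1

# Calling the whole block applies both layers in order
out = input_block(x)
print("Output shape:", out.shape)  # 8 feature channels, same height and width
```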

### Convolutional Block

The convolutional block is the main part of the network that processes the input data. A standard convolutional block consists of a convolutional layer, followed by a normalization layer, and finally an activation layer. The convolutional layer applies a set of filters to the input data to extract features. The normalization layer normalizes the output of the convolutional layer to stabilize the training process. The activation layer applies a non-linear function to the output of the normalization layer to introduce non-linearity into the network.

```python

        # Set up convolutional block
        self.conv_block = nn.Sequential(
            nn.Conv2d(n_features, n_features, kernel_size=3, padding=1),
            nn.InstanceNorm2d(n_features),
            nn.ReLU(),
        )

```

When we construct `nn.Conv2d`, the first three positional arguments are `(in_channels, out_channels, kernel_size)`. The `in_channels` parameter specifies the number of input channels to the convolutional layer, the `out_channels` parameter specifies the number of output channels, and the `kernel_size` parameter specifies the size of the filters. The `padding` parameter specifies the amount of zero padding to add to the input data. It should be passed by keyword, as we do here, because the fourth positional argument of `nn.Conv2d` is `stride`, not `padding`.
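
The sketch below shows how these arguments affect the output shape. The layer sizes mirror the ones used in this tutorial, and the input is a random stand-in feature map.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 8, 32, 32)  # stand-in feature map: 8 channels, 32x32 pixels

# 3x3 kernel with one pixel of zero padding: height and width are preserved
same_size = nn.Conv2d(8, 8, kernel_size=3, padding=1)

# 3x3 kernel without padding: each spatial dimension shrinks by 2
no_padding = nn.Conv2d(8, 8, kernel_size=3)

# 1x1 kernel: changes only the number of channels
pointwise = nn.Conv2d(8, 2, kernel_size=1)

print(same_size(x).shape)   # (1, 8, 32, 32)
print(no_padding(x).shape)  # (1, 8, 30, 30)
print(pointwise(x).shape)   # (1, 2, 32, 32)
```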

Notice that there are many different choices for normalization. Batch normalization is the most common choice in the literature. However, in this model we use instance normalization, which normalizes each channel of each sample in the batch independently. This can be useful for style transfer and other tasks where we do not want the output for one sample to be affected by the statistics of the rest of the batch.
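
One way to see the difference is to change the other samples in a batch and check whether the result for the first sample changes. The sketch below does this with random stand-in data; the factor of 10 and offset of 3 are arbitrary values chosen to shift the batch statistics.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.randn(4, 8, 16, 16)  # stand-in batch of 4 samples

inst_norm = nn.InstanceNorm2d(8)
batch_norm = nn.BatchNorm2d(8)  # in training mode (the default) it uses batch statistics

# Copy the batch, then alter every sample except the first
x_mod = x.clone()
x_mod[1:] = x_mod[1:] * 10 + 3

# Compare the normalized first sample before and after the change
inst_same = torch.allclose(inst_norm(x)[0], inst_norm(x_mod)[0], atol=1e-6)
batch_same = torch.allclose(batch_norm(x)[0], batch_norm(x_mod)[0], atol=1e-6)

print("InstanceNorm output for sample 0 unchanged:", inst_same)
print("BatchNorm output for sample 0 unchanged:   ", batch_same)
```

With instance normalization the first sample's output does not depend on its neighbours, whereas with batch normalization in training mode it does.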

Lastly, we use a Rectified Linear Unit (ReLU) activation function to introduce non-linearity into the network. The ReLU function applies the function `f(x) = max(0, x)` to the output of the normalization layer, which ensures that the output of the block is never negative. Note that this applies to the block's activations only, since later layers such as the output convolution can still produce negative values. There are many other activation functions available in PyTorch, such as the sigmoid and tanh functions, but ReLU is one of the most commonly used activation functions in deep learning.
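
As a quick illustration, the sketch below applies ReLU, along with the sigmoid and tanh alternatives mentioned above, to a few hand-picked values.

```python
import torch
import torch.nn as nn

x = torch.tensor([-2.0, -0.5, 0.0, 1.5])

relu = nn.ReLU()
relu_out = relu(x)            # negative entries become 0, others pass through
sigmoid_out = torch.sigmoid(x)  # squashes values into (0, 1)
tanh_out = torch.tanh(x)        # squashes values into (-1, 1)

print("ReLU:   ", relu_out)
print("Sigmoid:", sigmoid_out)
print("Tanh:   ", tanh_out)
```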

### Output Block

The output block is the final part of the network that produces the output data. For our purposes, we need to convert the intermediate feature maps of the network into the final output data. To do this we will use `nn.Conv2d` to ensure that the output data has the correct number of channels.

```python

        # Set up output block
        self.output_block = nn.Sequential(
            nn.Conv2d(n_features, out_channels, kernel_size=1),
        )

```

**One big consideration:** There are many conventions for outputs of segmentation networks. We can choose to output the probability of each class for each pixel, or the logits of each class, or the class with the highest probability for each pixel. In this model, we output the logits of each class for each pixel. The logits are the raw output of the network before applying the softmax function, which converts the logits into probabilities. This is a common choice for segmentation networks, as PyTorch's cross-entropy loss (`nn.CrossEntropyLoss`) takes logits as its input. However, we must keep in mind that when we want to use our model for evaluation, we must apply the softmax function over the channel dimension to get the probabilities of each class. If we only need the predicted class for each pixel, we can take the argmax over the channel dimension instead.
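
The conversion between these conventions can be sketched as follows. The `logits` tensor is a random stand-in for the output of a two-class model, and `target` is a hypothetical label map; neither comes from the manuscript's data.

```python
import torch
import torch.nn as nn

# Stand-in for model output: (batch, classes, height, width)
logits = torch.randn(1, 2, 32, 32)

# Hypothetical ground-truth label map: one integer class index per pixel
target = torch.randint(0, 2, (1, 32, 32))

# Training: nn.CrossEntropyLoss takes raw logits, so no softmax is applied first
loss = nn.CrossEntropyLoss()(logits, target)

# Evaluation: softmax over the class (channel) dimension gives probabilities
probs = torch.softmax(logits, dim=1)

# The predicted class for each pixel is the one with the highest probability
pred = probs.argmax(dim=1)

print("Loss:", loss.item())
print("Probabilities shape:", probs.shape)
print("Prediction shape:   ", pred.shape)
```

Because softmax preserves the ordering of the logits, `logits.argmax(dim=1)` gives the same predicted classes as `probs.argmax(dim=1)`.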

Lastly, one thing we should point out about output blocks is that in the code in the `models/` directory of this repository, we define a function, `set_output_block`, that sets the output block. This is because the number of output channels can change depending on the task. For example, in the manuscript, we use a model with 2 output channels for binary segmentation, and a model with 3 output channels for multiclass segmentation. By creating a separate function to set the output block, we can easily change the number of output channels without having to modify the rest of the model. This is beyond the scope of this tutorial and we will not cover it here.

### Forward Method

The `forward` method of the model specifies how the input data is processed through the network to produce the output. In our model, we apply the input block to the input data, then pass the result through the convolutional block, and finally apply the output block to produce the final output. The snippet below applies the same convolutional block twice, which reuses the same weights on each pass. The complete model in the next section applies it only once.

```python

    def forward(self, x):
        # Apply input block
        x = self.input_block(x)

        # Apply convolutional block
        for _ in range(2):
            x = self.conv_block(x)

        # Apply output block
        x = self.output_block(x)

        return x

```

### Putting it all together

Now that we have defined the input block, convolutional block, and output block, we can put them all together to create the full model. We can do this by defining a new class called `ConvolutionalNet` that inherits from `nn.Module` and combines the input block, convolutional block, and output block into a single model.

In [13]:
# Define the ConvolutionalNet class
class ConvolutionalNet(nn.Module):
    def __init__(self, in_channels, out_channels, n_features=8):
        super(ConvolutionalNet, self).__init__()

        # Set up attributes
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.n_features = n_features


        ### SET UP BLOCKS ###

        # Set up input block
        self.input_block = nn.Sequential(
            nn.GroupNorm(1, in_channels, affine=False),  # Normalize input
            nn.Conv2d(in_channels, n_features, kernel_size=3, padding=1),
        )

        # Set up convolutional block
        self.conv_block = nn.Sequential(
            nn.Conv2d(n_features, n_features, kernel_size=3, padding=1),
            nn.InstanceNorm2d(n_features),
            nn.ReLU(),
        )

        # Set up output block
        self.output_block = nn.Sequential(
            nn.Conv2d(self.n_features, out_channels, kernel_size=1),
        )

    def forward(self, x):
        """Forward pass."""

        # Input block
        x = self.input_block(x)

        # Convolutional block
        x = self.conv_block(x)

        # Output block
        x = self.output_block(x)

        # Return
        return x
    

# Set up model and data
model = ConvolutionalNet(3, 2, n_features=8)  # 3 input channels (RGB), 2 output channels, 8 features
data = torch.randn(1, 3, 32, 32)              # 1 batch, 3 channels, 32x32 image

# Run model on data
output = model(data)

# Print results
print("Data: ", data)
print("Output: ", output)

Data:  tensor([[[[-1.1278, -0.0214, -0.5631,  ...,  0.7420, -0.1258,  0.7185],
          [ 1.0944, -0.8027,  0.8063,  ...,  1.2051,  1.1187, -0.1114],
          [ 1.5907,  0.5595, -1.9239,  ..., -0.8093, -0.9308, -0.2255],
          ...,
          [ 0.5830,  0.7052, -0.4028,  ..., -0.0998, -0.3303, -0.8037],
          [ 0.1542,  0.6144, -0.1969,  ..., -1.2185, -0.1592, -1.3697],
          [-0.8191,  0.4712,  0.5140,  ..., -0.2821, -2.1205,  1.7506]],

         [[ 0.8654, -0.1745,  1.4989,  ..., -2.2217, -1.6593, -0.0299],
          [ 0.3860,  0.8777,  0.0254,  ..., -1.7589, -0.9202, -0.4260],
          [ 0.5673, -0.8840,  0.4599,  ..., -0.9532, -1.0387, -0.1461],
          ...,
          [ 0.6346, -0.0481,  0.4742,  ..., -0.2908,  0.3976,  1.1244],
          [-1.9532, -0.4765,  0.2615,  ...,  1.1766, -0.0833,  0.2104],
          [-1.0329,  0.0407, -1.8705,  ...,  1.5924, -2.2502, -1.8930]],

         [[-0.6035,  1.1668, -0.2917,  ..., -1.2814, -1.4492,  0.4326],
          [ 0.4711, -1.
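
Printing full tensors quickly becomes hard to read. As an alternative sketch, the snippet below rebuilds the same `ConvolutionalNet` and checks the output shape and the number of learnable parameters instead. The variable `n_params` is introduced here only for illustration.

```python
import torch
import torch.nn as nn


class ConvolutionalNet(nn.Module):
    def __init__(self, in_channels, out_channels, n_features=8):
        super().__init__()
        self.input_block = nn.Sequential(
            nn.GroupNorm(1, in_channels, affine=False),
            nn.Conv2d(in_channels, n_features, kernel_size=3, padding=1),
        )
        self.conv_block = nn.Sequential(
            nn.Conv2d(n_features, n_features, kernel_size=3, padding=1),
            nn.InstanceNorm2d(n_features),
            nn.ReLU(),
        )
        self.output_block = nn.Sequential(
            nn.Conv2d(n_features, out_channels, kernel_size=1),
        )

    def forward(self, x):
        x = self.input_block(x)
        x = self.conv_block(x)
        x = self.output_block(x)
        return x


model = ConvolutionalNet(3, 2, n_features=8)
data = torch.randn(1, 3, 32, 32)

# Disable gradient tracking, since we are only inspecting the output
with torch.no_grad():
    output = model(data)

# Count every learnable weight and bias registered on the model
n_params = sum(p.numel() for p in model.parameters())

print("Input shape: ", data.shape)
print("Output shape:", output.shape)  # one logit map per class
print("Learnable parameters:", n_params)
```

The output keeps the height and width of the input and has one channel per class, which is the shape expected for per-pixel logits.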

## Conclusion

This covers all the basics of creating models in PyTorch. We have learned how to use the `nn.Module` class to define the architecture of a neural network, how to use prebuilt layers to build complex models, and how to create custom layers for specific tasks. We have also learned how to structure the input data for a neural network and how to define the forward pass of the network to produce the output. By following these steps, you can create your own models in PyTorch for a wide range of tasks.