## Examples of model creation and usage

### U-Net

The U-Net implemented has the following architecture:

<img src="resunet.png" width="600"/>

The building blocks of the network are the following:

<img src="resunet_blocks.png" width="600"/>

The network is divided into an encoder and a decoder. The encoder is composed of stages, each containing a number of residual blocks. At the beginning of each stage, the activations are downsampled and the number of channels is changed. The first stage is the exception: it downsamples but keeps the number of channels unchanged.

The decoder mirrors the encoder. Each decoder stage upsamples the signal, concatenates it with the output of the corresponding encoder stage, and passes the result through a number of residual blocks. Note that stage i of the decoder uses the output of stage i-1 of the encoder. This is by design, so that the last stage of the decoder operates at the full resolution of the input image.
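
The resolution bookkeeping described above can be traced with a small sketch. It is not taken from the library: the function name `trace_resolutions` is hypothetical, and it assumes a square input, a downsampling factor of 2 per encoder stage, and an input size divisible by 2**n. It also assumes one reading of the indexing, in which decoder stage 0 receives the output of the input layer.

```python
def trace_resolutions(input_size, num_stages):
    """Return (encoder_sizes, decoder_sizes) for a square input.

    Assumptions (not confirmed by the library): every encoder stage halves
    the resolution and every decoder stage doubles it.
    """
    if input_size % (2 ** num_stages) != 0:
        raise ValueError("input_size must be divisible by 2**num_stages")

    # sizes[0] is the output of the input layer (full resolution),
    # sizes[i + 1] is the output of encoder stage i.
    sizes = [input_size]
    for _ in range(num_stages):
        sizes.append(sizes[-1] // 2)

    encoder_sizes = sizes[1:]
    # Decoder stage i works at the resolution of encoder stage i-1
    # (the input layer for i = 0), so it never sees the deepest resolution.
    decoder_sizes = sizes[:-1]
    return encoder_sizes, decoder_sizes


enc, dec = trace_resolutions(256, 3)
print(enc)  # [128, 64, 32]
print(dec)  # [256, 128, 64]
```

Under these assumptions, the decoder stage that runs last works at 256x256, the same resolution as the input image.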

Therefore, a U-Net with n stages is defined by the following parameters:

* in_channels = ci
* num_classes = m
* blocks_per_encoder_stage = [e1,e2,...,en]
* blocks_per_decoder_stage = [d1,d2,...,dn]
* channels_per_stage = [c1,c2,...,cn]
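
As a minimal sketch of how these parameters fit together, the hypothetical `UNetSpec` container below (not part of torchtrainer) checks that the three per-stage tuples share the same length n. The value `num_classes=2` is only an illustrative assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class UNetSpec:
    """Hypothetical container for the parameters listed above."""
    in_channels: int
    num_classes: int
    blocks_per_encoder_stage: Tuple[int, ...]
    blocks_per_decoder_stage: Tuple[int, ...]
    channels_per_stage: Tuple[int, ...]

    def __post_init__(self):
        # All per-stage tuples must describe the same number of stages n.
        lengths = {
            len(self.blocks_per_encoder_stage),
            len(self.blocks_per_decoder_stage),
            len(self.channels_per_stage),
        }
        if len(lengths) != 1:
            raise ValueError("all per-stage tuples must have the same length n")

    @property
    def num_stages(self):
        return len(self.channels_per_stage)


spec = UNetSpec(
    in_channels=1,
    num_classes=2,  # illustrative value, not a library default
    blocks_per_encoder_stage=(1, 3, 1),
    blocks_per_decoder_stage=(1, 1, 1),
    channels_per_stage=(16, 32, 64),
)
print(spec.num_stages)  # 3
```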

In [1]:
import torch

from torchtrainer import models

# blocks_per_encoder_stage and blocks_per_decoder_stage set the number of residual
# blocks at each stage of the encoder and of the decoder. Each value of a tuple sets
# the number of residual blocks at one stage, so the tuple length sets the number of stages.
blocks_per_encoder_stage = (1, 3, 1)
blocks_per_decoder_stage = (1, 1, 1)

# channels_per_stage sets the number of channels of each stage
channels_per_stage = (16, 32, 64)

# Given the parameters above, the created model will have three stages in the encoder.
# The first stage will have 1 residual block with 16 channels, the second stage
# 3 residual blocks with 32 channels, and the last stage 1 residual block with
# 64 channels. The decoder will have 1 residual block in each stage.
model = models.UNetCustom(
    blocks_per_encoder_stage, blocks_per_decoder_stage, channels_per_stage, in_channels=1
    )
model

UNetCustom(
  (stage_input): Sequential(
    (0): Conv2d(1, 16, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), bias=False)
    (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
  )
  (encoder): ModuleDict(
    (stage_0): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(16, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU(inplace=True)
        (conv2): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU(inplace=True)
        (residual_adj): Sequential(
          (0): Conv2d(16, 16, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )

In [2]:
# Check that the number of convolution layers in the model matches the expected count.
# Only convolutions with kernel size > 1 are counted, which excludes 1x1 convolutions
# such as those in residual_adj.
num_conv = sum([
    1,                                  # Input layer
    2*sum(blocks_per_encoder_stage),    # Two convolutions for each residual block (encoder)
    2*sum(blocks_per_decoder_stage),    # Two convolutions for each residual block (decoder)
    1                                   # Output layer
])

num_conv_model = 0
for module in model.modules():
    if isinstance(module, torch.nn.Conv2d) and module.kernel_size[0]>1:
        num_conv_model += 1
print(num_conv, num_conv_model)

18 18


Some examples of architectures:

In [3]:
# Only one stage with 12 residual blocks, each having 16 channels. 
model = models.UNetCustom((12,), (12,), (16,))
# 3 stages with different number of channels each. Maximum downsampling will be 2**3=8.
model = models.UNetCustom((3,3,3), (3,1,1), (16, 32, 64))
# 5 stages with different number of channels each. Maximum downsampling will be 2**5=32.
model = models.UNetCustom((2,2,2,2,2), (2,2,2,2,2), (16, 32, 64, 128, 256))
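
The counting rule from the check in cell 2 can be applied to these three configurations without building the models. This is a sketch only: the helper names are hypothetical, and it assumes that the rule (one input convolution, two convolutions per residual block, one output convolution) and a downsampling factor of 2 per stage hold for every configuration.

```python
def expected_num_conv(blocks_per_encoder_stage, blocks_per_decoder_stage):
    """Expected number of convolutions with kernel size > 1 (assumed rule)."""
    return (
        1                                      # input layer
        + 2 * sum(blocks_per_encoder_stage)    # two per encoder residual block
        + 2 * sum(blocks_per_decoder_stage)    # two per decoder residual block
        + 1                                    # output layer
    )


def max_downsampling(blocks_per_encoder_stage):
    """Maximum downsampling factor, assuming a factor of 2 per stage."""
    return 2 ** len(blocks_per_encoder_stage)


configs = [
    ((12,), (12,)),
    ((3, 3, 3), (3, 1, 1)),
    ((2, 2, 2, 2, 2), (2, 2, 2, 2, 2)),
]
for enc, dec in configs:
    print(len(enc), expected_num_conv(enc, dec), max_downsampling(enc))
# prints:
# 1 50 2
# 3 30 8
# 5 42 32
```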

### ResNetSeg

The ResNetSeg model mimics a ResNet without downsampling. Its parameters are similar to those of the U-Net above. The layers are arranged as follows:

conv->stage_0->stage_1->...->stage_n->conv

where each stage is

residual_block_0->residual_block_1->...->residual_block_m

The number of channels is changed at the beginning of each stage.
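
The layout above can be written out as a per-block channel plan. The sketch below is illustrative and does not inspect the actual model. The helper `block_channel_plan` is hypothetical, and it assumes that the input convolution already outputs `channels[0]` and that only the first block of a stage changes the channel count.

```python
def block_channel_plan(layers, channels):
    """Return a list of (stage, block, in_channels, out_channels) tuples."""
    if len(layers) != len(channels):
        raise ValueError("layers and channels must have the same length")

    plan = []
    prev = channels[0]  # assumption: the input conv outputs channels[0]
    for stage_idx, (num_blocks, out_ch) in enumerate(zip(layers, channels)):
        for block_idx in range(num_blocks):
            # Only the first block of a stage sees the previous stage's channels.
            in_ch = prev if block_idx == 0 else out_ch
            plan.append((stage_idx, block_idx, in_ch, out_ch))
        prev = out_ch
    return plan


plan = block_channel_plan((3, 3, 3, 3), (16, 32, 64, 64))
# Show only the first block of each stage, where the channels may change.
for stage, block, in_ch, out_ch in plan:
    if block == 0:
        print(f"stage_{stage}: {in_ch} -> {out_ch}")
# prints:
# stage_0: 16 -> 16
# stage_1: 16 -> 32
# stage_2: 32 -> 64
# stage_3: 64 -> 64
```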

In [None]:
# Creates a model with 4 stages, each having 3 residual blocks.
layers = (3,3,3,3)
# First stage has 16 channels, second stage has 32, and third and fourth have 64
channels = (16,32,64,64)
model = models.ResNetSeg(layers, channels)

In [None]:
# Check that the number of convolution layers in the model matches the expected count.
# Only convolutions with kernel size > 1 are counted.
num_conv = sum([
    1,                # Input layer
    2*sum(layers),    # Two convolutions for each residual block
    1                 # Output layer
])

num_conv_model = 0
for module in model.modules():
    if isinstance(module, torch.nn.Conv2d) and module.kernel_size[0]>1:
        num_conv_model += 1
print(num_conv, num_conv_model)

26 26
