[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](
https://colab.research.google.com/github/CMU-IDeeL/CMU-IDeeL.github.io/blob/master/F25/document/Recitation_0_Series/0.21/0_21_Block_Processing.ipynb)

# Rec 0: Block Processing

## What is a `Layer`?
A layer is the atomic computation unit in a neural network. It:

- Takes input tensors
- Applies an operation such as `nn.Conv2d`, `nn.Linear`, `nn.BatchNorm2d`, or `nn.ReLU`
- Produces output tensors
- May contain learnable parameters that are optimized during training (for example, `nn.Linear` has weights and a bias, while `nn.ReLU` has none)
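
As a minimal sketch of the points above, the snippet below passes a tensor through a single `nn.Linear` layer and counts its learnable parameters. The sizes (4 inputs, 3 outputs, batch of 2) are arbitrary values chosen only for illustration.

```python
import torch
import torch.nn as nn

# Sizes are illustrative assumptions, not taken from the recitation.
layer = nn.Linear(in_features=4, out_features=3)

x = torch.randn(2, 4)   # input tensor: batch of 2 samples, 4 features each
y = layer(x)            # the layer applies its operation (x @ W.T + b)
print(y.shape)          # torch.Size([2, 3])

# Learnable parameters: a 3x4 weight matrix plus a bias of length 3
n_params = sum(p.numel() for p in layer.parameters())
print(n_params)         # 15

# An activation layer such as ReLU has no learnable parameters
relu_params = sum(p.numel() for p in nn.ReLU().parameters())
print(relu_params)      # 0
```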

## What is a `Block`?
A neural network block:

- Can represent a single layer, a set of layers, or the entire model.
- Can be combined recursively with other blocks into larger, more complex structures.
- Keeps the code compact while still allowing complex networks to be implemented.

A block groups layers into a reusable component (e.g., Conv→BN→ReLU, residual unit), enabling modular, reusable, and comprehensible neural architectures.

Motivation:
- For large models with many layers, it's easier to manage and understand if we break them down into smaller, more manageable blocks.
- This approach allows us to focus on the core functionality of each block, rather than the entire model.
- It also makes the code more modular and easier to modify.

![Neural Network Blocks](https://classic.d2l.ai/_images/blocks.svg "Multiple layers are combined into blocks, forming repeating patterns of larger models.")

# PyTorch Tools for Modularization
PyTorch provides tools to help modularize and manage complexity in model building.

In [None]:
!pip install torchinfo
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchinfo import summary



# `nn.Module`: The main building block!
- Base class for all neural network modules. Any custom model or submodule must inherit from this base class.
- Modules can also contain other Modules, which lets you nest them in a tree structure.
- You can assign submodules as regular attributes. `Conv2d`, `Linear`, and `BatchNorm2d` all subclass `nn.Module`.
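
To make the tree structure concrete, here is a small sketch with two hypothetical classes, `Inner` and `Outer` (names invented for this example). Assigning a module as an attribute registers it, so its parameters show up under a dotted path.

```python
import torch.nn as nn

class Inner(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)    # registered as submodule "fc"

    def forward(self, x):
        return self.fc(x)

class Outer(nn.Module):
    def __init__(self):
        super().__init__()
        self.inner = Inner()         # a Module that contains another Module
        self.head = nn.Linear(2, 1)

    def forward(self, x):
        return self.head(self.inner(x))

model = Outer()

# named_modules() walks the tree; the root module has the empty name ''
names = [name for name, _ in model.named_modules()]
print(names)        # ['', 'inner', 'inner.fc', 'head']

# Parameters of nested modules are collected automatically
param_names = [name for name, _ in model.named_parameters()]
print(param_names)  # ['inner.fc.weight', 'inner.fc.bias', 'head.weight', 'head.bias']
```

Because registration is automatic, an optimizer built from `model.parameters()` sees every nested layer without any extra bookkeeping.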


## Example: Simple CNN Classifier
The following is a simple CNN classifier with two convolutional layers and two fully connected layers.

Problems:
- To add a layer, we have to write more code in both `__init__` and `forward`.
- If we want to reuse a common block in another model, e.g. the 3x3 conv + batchnorm + ReLU, we have to write it out again.




In [None]:
class SimpleCNNClassifier(nn.Module):
    def __init__(self, in_c, n_classes):
        super().__init__()
        self.conv1 = nn.Conv2d(in_c, 32, kernel_size=3, stride=1, padding=1)
        self.bn1 = nn.BatchNorm2d(32)

        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(64)

        self.fc1 = nn.Linear(64 * 28 * 28, 1024)
        self.fc2 = nn.Linear(1024, n_classes)

    def forward(self, x):
        x = self.conv1(x) # 3x3 convolution
        x = self.bn1(x)   # batch normalization
        x = F.relu(x)     # ReLU activation

        x = self.conv2(x) # 3x3 convolution
        x = self.bn2(x)   # batch normalization
        x = F.relu(x)     # ReLU activation

        x = x.view(x.size(0), -1) # flatten the tensor

        x = self.fc1(x)  # fully connected layer
        x = F.sigmoid(x) # sigmoid activation
        x = self.fc2(x)  # fully connected layer

        return x

# Summary
summary(SimpleCNNClassifier(in_c=1, n_classes=10), input_size=(1, 1, 28, 28))

Layer (type:depth-idx)                   Output Shape              Param #
SimpleCNNClassifier                      [1, 10]                   --
├─Conv2d: 1-1                            [1, 32, 28, 28]           320
├─BatchNorm2d: 1-2                       [1, 32, 28, 28]           64
├─Conv2d: 1-3                            [1, 64, 28, 28]           18,496
├─BatchNorm2d: 1-4                       [1, 64, 28, 28]           128
├─Linear: 1-5                            [1, 1024]                 51,381,248
├─Linear: 1-6                            [1, 10]                   10,250
Total params: 51,410,506
Trainable params: 51,410,506
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 66.14
Input size (MB): 0.00
Forward/backward pass size (MB): 1.21
Params size (MB): 205.64
Estimated Total Size (MB): 206.86
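
The `Param #` column in the summary above can be reproduced by hand. The sketch below uses three hypothetical helper functions (not part of PyTorch or torchinfo) that encode the usual parameter-count formulas, checks them against PyTorch on the small layers, and then sums them for the whole model.

```python
import torch.nn as nn

def conv2d_params(in_c, out_c, k):
    # one k x k kernel per (input, output) channel pair, plus one bias per output channel
    return in_c * out_c * k * k + out_c

def batchnorm_params(c):
    # learnable scale and shift per channel (running statistics are buffers, not parameters)
    return 2 * c

def linear_params(in_f, out_f):
    # weight matrix plus one bias per output feature
    return in_f * out_f + out_f

def count(module):
    return sum(p.numel() for p in module.parameters())

# Check the formulas against PyTorch on the small layers
assert count(nn.Conv2d(1, 32, kernel_size=3, padding=1)) == conv2d_params(1, 32, 3)  # 320
assert count(nn.BatchNorm2d(32)) == batchnorm_params(32)                              # 64
assert count(nn.Linear(1024, 10)) == linear_params(1024, 10)                          # 10,250

# The first Linear layer dominates: 64 * 28 * 28 inputs feeding 1024 outputs
print(linear_params(64 * 28 * 28, 1024))  # 51381248

total = (conv2d_params(1, 32, 3) + batchnorm_params(32)
         + conv2d_params(32, 64, 3) + batchnorm_params(64)
         + linear_params(64 * 28 * 28, 1024) + linear_params(1024, 10))
print(total)  # 51410506
```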

## `nn.Sequential`: Stack and merge layers!
- A sequential container. Modules are added to it in the order they are passed to the constructor.
- Data is passed through the modules in the same order in which they were added to the `Sequential` container.
- You can use `nn.Sequential` to create a simple neural network.
- You can also build a more complex network by nesting several `nn.Sequential` containers inside one another.
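
A short sketch of these properties, using arbitrary layer sizes chosen for illustration: the container gives the same result as calling the modules one by one, supports indexing, and can be nested.

```python
import torch
import torch.nn as nn

fc1, act, fc2 = nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2)
seq = nn.Sequential(fc1, act, fc2)

x = torch.randn(3, 8)
manual = fc2(act(fc1(x)))            # calling the modules by hand, in order
assert torch.equal(seq(x), manual)   # Sequential does exactly the same thing

print(len(seq))   # 3
print(seq[0])     # Linear(in_features=8, out_features=4, bias=True)

# A Sequential is itself a Module, so it can sit inside another Sequential
nested = nn.Sequential(seq, nn.ReLU(), nn.Linear(2, 1))
print(nested(x).shape)  # torch.Size([3, 1])
```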



In [None]:
class SimpleCNNClassifier(nn.Module):
    def __init__(self, in_c, n_classes):
        super().__init__()
        self.conv_block1 = nn.Sequential(
            nn.Conv2d(in_c, 32, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU()
        )

        self.conv_block2 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU()
        )

        self.decoder = nn.Sequential(
            nn.Linear(64 * 28 * 28, 1024),
            nn.Sigmoid(),
            nn.Linear(1024, n_classes)
        )


    def forward(self, x):
        x = self.conv_block1(x)
        x = self.conv_block2(x)
        x = x.view(x.size(0), -1) # flat
        x = self.decoder(x)
        return x


# Summary
summary(SimpleCNNClassifier(in_c=1, n_classes=10), input_size=(1, 1, 28, 28))

Layer (type:depth-idx)                   Output Shape              Param #
SimpleCNNClassifier                      [1, 10]                   --
├─Sequential: 1-1                        [1, 32, 28, 28]           --
│    └─Conv2d: 2-1                       [1, 32, 28, 28]           320
│    └─BatchNorm2d: 2-2                  [1, 32, 28, 28]           64
│    └─ReLU: 2-3                         [1, 32, 28, 28]           --
├─Sequential: 1-2                        [1, 64, 28, 28]           --
│    └─Conv2d: 2-4                       [1, 64, 28, 28]           18,496
│    └─BatchNorm2d: 2-5                  [1, 64, 28, 28]           128
│    └─ReLU: 2-6                         [1, 64, 28, 28]           --
├─Sequential: 1-3                        [1, 10]                   --
│    └─Linear: 2-7                       [1, 1024]                 51,381,248
│    └─Sigmoid: 2-8                      [1, 1024]                 --
│    └─Linear: 2-9                       [1, 10]                   10,250

Much better! But still not perfect. Did you notice that `self.conv_block1` and `self.conv_block2` have exactly the same structure? Can we do better?
- Yes! We can write a function that returns an `nn.Sequential` to simplify the code even further, and then just call this function in our Module.


In [None]:
def conv_block(in_f, out_f, *args, **kwargs):
    return nn.Sequential(
        nn.Conv2d(in_f, out_f, *args, **kwargs),
        nn.BatchNorm2d(out_f),
        nn.ReLU()
    )

class SimpleCNNClassifier(nn.Module):
    def __init__(self, in_c, n_classes):
        super().__init__()
        self.conv_block1 = conv_block(in_c, 32, kernel_size=3, padding=1)
        self.conv_block2 = conv_block(32, 64, kernel_size=3, padding=1)

        self.decoder = nn.Sequential(
            nn.Linear(64 * 28 * 28, 1024),
            nn.Sigmoid(),
            nn.Linear(1024, n_classes)
        )

    def forward(self, x):
        x = self.conv_block1(x)
        x = self.conv_block2(x)
        x = x.view(x.size(0), -1) # flat
        x = self.decoder(x)
        return x

# Summary
summary(SimpleCNNClassifier(in_c=1, n_classes=10), input_size=(1, 1, 28, 28))

Layer (type:depth-idx)                   Output Shape              Param #
SimpleCNNClassifier                      [1, 10]                   --
├─Sequential: 1-1                        [1, 32, 28, 28]           --
│    └─Conv2d: 2-1                       [1, 32, 28, 28]           320
│    └─BatchNorm2d: 2-2                  [1, 32, 28, 28]           64
│    └─ReLU: 2-3                         [1, 32, 28, 28]           --
├─Sequential: 1-2                        [1, 64, 28, 28]           --
│    └─Conv2d: 2-4                       [1, 64, 28, 28]           18,496
│    └─BatchNorm2d: 2-5                  [1, 64, 28, 28]           128
│    └─ReLU: 2-6                         [1, 64, 28, 28]           --
├─Sequential: 1-3                        [1, 10]                   --
│    └─Linear: 2-7                       [1, 1024]                 51,381,248
│    └─Sigmoid: 2-8                      [1, 1024]                 --
│    └─Linear: 2-9                       [1, 10]                   10,250

Even cleaner! Still, `conv_block1` and `conv_block2` are almost the same. We can keep the `conv_block` function and merge the two blocks into a single encoder using `nn.Sequential`.

In [None]:
class SimpleCNNClassifier(nn.Module):
    def __init__(self, in_c, n_classes):
        super().__init__()
        self.encoder = nn.Sequential(
            conv_block(in_c, 32, kernel_size=3, padding=1),
            conv_block(32, 64, kernel_size=3, padding=1)
        )
        self.decoder = nn.Sequential(
            nn.Linear(64 * 28 * 28, 1024),
            nn.Sigmoid(),
            nn.Linear(1024, n_classes)
        )


    def forward(self, x):
        x = self.encoder(x)
        x = x.view(x.size(0), -1) # flat
        x = self.decoder(x)
        return x

# Summary
summary(SimpleCNNClassifier(in_c=1, n_classes=10), input_size=(1, 1, 28, 28))

Layer (type:depth-idx)                   Output Shape              Param #
SimpleCNNClassifier                      [1, 10]                   --
├─Sequential: 1-1                        [1, 64, 28, 28]           --
│    └─Sequential: 2-1                   [1, 32, 28, 28]           --
│    │    └─Conv2d: 3-1                  [1, 32, 28, 28]           320
│    │    └─BatchNorm2d: 3-2             [1, 32, 28, 28]           64
│    │    └─ReLU: 3-3                    [1, 32, 28, 28]           --
│    └─Sequential: 2-2                   [1, 64, 28, 28]           --
│    │    └─Conv2d: 3-4                  [1, 64, 28, 28]           18,496
│    │    └─BatchNorm2d: 3-5             [1, 64, 28, 28]           128
│    │    └─ReLU: 3-6                    [1, 64, 28, 28]           --
├─Sequential: 1-2                        [1, 10]                   --
│    └─Linear: 2-3                       [1, 1024]                 51,381,248
│    └─Sigmoid: 2-4                      [1, 1024]                 --
│    └─Linear: 2-5                       [1, 10]                   10,250

## Dynamic Sequential: Create multiple layers at once!
Dynamic Sequential refers to building a sequence of layers programmatically (i.e., with loops or logic) rather than hardcoding each one manually. In PyTorch, this usually means creating an `nn.Sequential` block whose number of layers, configurations, or layer types are determined at runtime by a loop or a condition. Without it, every stage of a deeper encoder has to be written out by hand:

```python
self.encoder = nn.Sequential(
    conv_block(in_c, 32, kernel_size=3, padding=1),
    conv_block(32, 64, kernel_size=3, padding=1),
    conv_block(64, 128, kernel_size=3, padding=1),
    conv_block(128, 256, kernel_size=3, padding=1),
)
```

Wouldn't it be nice if we could define the sizes in a list and create all the layers automatically, without writing each one out? Fortunately, we can build a list of blocks with a loop and unpack it into `nn.Sequential`.



In [None]:
class SimpleCNNClassifier(nn.Module):
    def __init__(self, in_c, n_classes):
        super().__init__()
        self.enc_sizes = [in_c, 32, 64]

        # zip stops when the shortest list is exhausted
        conv_blocks = [conv_block(in_f, out_f, kernel_size=3, padding=1)
                       for in_f, out_f in zip(self.enc_sizes, self.enc_sizes[1:])]

        # Sequential cannot take a list of modules, so we need to unpack it
        self.encoder = nn.Sequential(*conv_blocks)

        self.decoder = nn.Sequential(
            nn.Linear(self.enc_sizes[-1] * 28 * 28, 1024),
            nn.Sigmoid(),
            nn.Linear(1024, n_classes)
        )


    def forward(self, x):
        x = self.encoder(x)
        x = x.view(x.size(0), -1) # flat
        x = self.decoder(x)
        return x

# Summary
summary(SimpleCNNClassifier(in_c=1, n_classes=10), input_size=(1, 1, 28, 28))

Layer (type:depth-idx)                   Output Shape              Param #
SimpleCNNClassifier                      [1, 10]                   --
├─Sequential: 1-1                        [1, 64, 28, 28]           --
│    └─Sequential: 2-1                   [1, 32, 28, 28]           --
│    │    └─Conv2d: 3-1                  [1, 32, 28, 28]           320
│    │    └─BatchNorm2d: 3-2             [1, 32, 28, 28]           64
│    │    └─ReLU: 3-3                    [1, 32, 28, 28]           --
│    └─Sequential: 2-2                   [1, 64, 28, 28]           --
│    │    └─Conv2d: 3-4                  [1, 64, 28, 28]           18,496
│    │    └─BatchNorm2d: 3-5             [1, 64, 28, 28]           128
│    │    └─ReLU: 3-6                    [1, 64, 28, 28]           --
├─Sequential: 1-2                        [1, 10]                   --
│    └─Linear: 2-3                       [1, 1024]                 51,381,248
│    └─Sigmoid: 2-4                      [1, 1024]                 --
│    └─Linear: 2-5                       [1, 10]                   10,250

In [None]:
sizes = [1, 32, 64]

for in_f,out_f in zip(sizes, sizes[1:]):
    print(in_f,out_f)

1 32
32 64
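
The same `zip` pairing also combines with conditions. As a sketch, the hypothetical helper `make_encoder` below (its name and its `use_bn` flag are assumptions made for this example) decides at construction time whether each stage gets a batch-norm layer.

```python
import torch
import torch.nn as nn

def make_encoder(sizes, use_bn=True):
    """Hypothetical helper: build Conv(-BN)-ReLU stages from a list of channel sizes."""
    layers = []
    for in_f, out_f in zip(sizes, sizes[1:]):
        layers.append(nn.Conv2d(in_f, out_f, kernel_size=3, padding=1))
        if use_bn:  # this layer is added only when the condition holds
            layers.append(nn.BatchNorm2d(out_f))
        layers.append(nn.ReLU())
    # nn.Sequential takes modules as separate arguments, so unpack the list
    return nn.Sequential(*layers)

with_bn = make_encoder([1, 8, 16], use_bn=True)
without_bn = make_encoder([1, 8, 16], use_bn=False)
print(len(with_bn), len(without_bn))  # 6 4

x = torch.randn(2, 1, 28, 28)
print(with_bn(x).shape)     # torch.Size([2, 16, 28, 28])
print(without_bn(x).shape)  # torch.Size([2, 16, 28, 28])
```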


Now, to add another stage, we only need to add a new number to the list. It is common practice to make the sizes a constructor parameter.

In [None]:
class SimpleCNNClassifier(nn.Module):
    def __init__(self, in_c, enc_sizes, n_classes):
        super().__init__()
        self.enc_sizes = [in_c, *enc_sizes]

        conv_blocks = [conv_block(in_f, out_f, kernel_size=3, padding=1)
                       for in_f, out_f in zip(self.enc_sizes, self.enc_sizes[1:])]

        self.encoder = nn.Sequential(*conv_blocks)


        self.decoder = nn.Sequential(
            nn.Linear(self.enc_sizes[-1] * 28 * 28, 1024),
            nn.Sigmoid(),
            nn.Linear(1024, n_classes)
        )


    def forward(self, x):
        x = self.encoder(x)
        x = x.view(x.size(0), -1) # flat
        x = self.decoder(x)
        return x

# Summary
summary(SimpleCNNClassifier(in_c=1, enc_sizes=[32, 64, 128], n_classes=10), input_size=(1, 1, 28, 28))

Layer (type:depth-idx)                   Output Shape              Param #
SimpleCNNClassifier                      [1, 10]                   --
├─Sequential: 1-1                        [1, 128, 28, 28]          --
│    └─Sequential: 2-1                   [1, 32, 28, 28]           --
│    │    └─Conv2d: 3-1                  [1, 32, 28, 28]           320
│    │    └─BatchNorm2d: 3-2             [1, 32, 28, 28]           64
│    │    └─ReLU: 3-3                    [1, 32, 28, 28]           --
│    └─Sequential: 2-2                   [1, 64, 28, 28]           --
│    │    └─Conv2d: 3-4                  [1, 64, 28, 28]           18,496
│    │    └─BatchNorm2d: 3-5             [1, 64, 28, 28]           128
│    │    └─ReLU: 3-6                    [1, 64, 28, 28]           --
│    └─Sequential: 2-3                   [1, 128, 28, 28]          --
│    │    └─Conv2d: 3-7                  [1, 128, 28, 28]          73,856
│    │    └─BatchNorm2d: 3-8             [1, 128, 28, 28]          256

We can do the same for the decoder part!

In [None]:
def dec_block(in_f, out_f):
    return nn.Sequential(
        nn.Linear(in_f, out_f),
        nn.Sigmoid()
    )

class SimpleCNNClassifier(nn.Module):
    def __init__(self, in_c, enc_sizes, dec_sizes,  n_classes):
        super().__init__()
        self.enc_sizes = [in_c, *enc_sizes]
        self.dec_sizes = [enc_sizes[-1] * 28 * 28, *dec_sizes]

        conv_blocks = [conv_block(in_f, out_f, kernel_size=3, padding=1)
                       for in_f, out_f in zip(self.enc_sizes, self.enc_sizes[1:])]

        self.encoder = nn.Sequential(*conv_blocks)

        dec_blocks = [dec_block(in_f, out_f)
                       for in_f, out_f in zip(self.dec_sizes, self.dec_sizes[1:])]

        self.decoder = nn.Sequential(*dec_blocks)
        self.last = nn.Linear(self.dec_sizes[-1], n_classes)


    def forward(self, x):
        x = self.encoder(x)
        x = x.view(x.size(0), -1) # flat
        x = self.decoder(x)
        x = self.last(x)
        return x

# Summary
summary(SimpleCNNClassifier(in_c=1, enc_sizes=[32, 64, 128], dec_sizes=[1024, 512], n_classes=10), input_size=(1, 1, 28, 28))

Layer (type:depth-idx)                   Output Shape              Param #
SimpleCNNClassifier                      [1, 10]                   --
├─Sequential: 1-1                        [1, 128, 28, 28]          --
│    └─Sequential: 2-1                   [1, 32, 28, 28]           --
│    │    └─Conv2d: 3-1                  [1, 32, 28, 28]           320
│    │    └─BatchNorm2d: 3-2             [1, 32, 28, 28]           64
│    │    └─ReLU: 3-3                    [1, 32, 28, 28]           --
│    └─Sequential: 2-2                   [1, 64, 28, 28]           --
│    │    └─Conv2d: 3-4                  [1, 64, 28, 28]           18,496
│    │    └─BatchNorm2d: 3-5             [1, 64, 28, 28]           128
│    │    └─ReLU: 3-6                    [1, 64, 28, 28]           --
│    └─Sequential: 2-3                   [1, 128, 28, 28]          --
│    │    └─Conv2d: 3-7                  [1, 128, 28, 28]          73,856
│    │    └─BatchNorm2d: 3-8             [1, 128, 28, 28]          256

We can make things even more modular by splitting the encoder and decoder into separate classes!

In [None]:
class SimpleEncoder(nn.Module):
    def __init__(self, enc_sizes):
        super().__init__()
        self.conv_blocks = nn.Sequential(*[conv_block(in_f, out_f, kernel_size=3, padding=1)
                       for in_f, out_f in zip(enc_sizes, enc_sizes[1:])])

    def forward(self, x):
        return self.conv_blocks(x)

class SimpleDecoder(nn.Module):
    def __init__(self, dec_sizes, n_classes):
        super().__init__()
        self.dec_blocks = nn.Sequential(*[dec_block(in_f, out_f)
                       for in_f, out_f in zip(dec_sizes, dec_sizes[1:])])
        self.last = nn.Linear(dec_sizes[-1], n_classes)

    def forward(self, x):
        x = self.dec_blocks(x)
        x = self.last(x)
        return x


class SimpleCNNClassifier(nn.Module):
    def __init__(self, in_c, enc_sizes, dec_sizes,  n_classes):
        super().__init__()
        self.enc_sizes = [in_c, *enc_sizes]
        self.dec_sizes = [self.enc_sizes[-1] * 28 * 28, *dec_sizes]

        self.encoder = SimpleEncoder(self.enc_sizes)
        self.decoder = SimpleDecoder(self.dec_sizes, n_classes)

    def forward(self, x):
        x = self.encoder(x)
        x = x.flatten(1) # flat
        x = self.decoder(x)
        return x

# Summary
summary(SimpleCNNClassifier(in_c=1, enc_sizes=[32, 64, 128], dec_sizes=[1024, 512], n_classes=10), input_size=(1, 1, 28, 28))


Layer (type:depth-idx)                   Output Shape              Param #
SimpleCNNClassifier                      [1, 10]                   --
├─SimpleEncoder: 1-1                     [1, 128, 28, 28]          --
│    └─Sequential: 2-1                   [1, 128, 28, 28]          --
│    │    └─Sequential: 3-1              [1, 32, 28, 28]           384
│    │    └─Sequential: 3-2              [1, 64, 28, 28]           18,624
│    │    └─Sequential: 3-3              [1, 128, 28, 28]          74,112
├─SimpleDecoder: 1-2                     [1, 10]                   --
│    └─Sequential: 2-2                   [1, 512]                  --
│    │    └─Sequential: 3-4              [1, 1024]                 102,761,472
│    │    └─Sequential: 3-5              [1, 512]                  524,800
│    └─Linear: 2-3                       [1, 10]                   5,130
Total params: 103,384,522
Trainable params: 103,384,522
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 175.95
Input size (MB): 0.00

By dividing our module into submodules, the code becomes easier to share, debug, and test.
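
As one illustration of the testing benefit, the encoder can be checked on its own without building the large decoder. The sketch below repeats `conv_block` and `SimpleEncoder` so that it runs standalone; `check_encoder_shape` is a hypothetical smoke test, and the small channel sizes are chosen only to keep it fast.

```python
import torch
import torch.nn as nn

def conv_block(in_f, out_f, *args, **kwargs):
    return nn.Sequential(
        nn.Conv2d(in_f, out_f, *args, **kwargs),
        nn.BatchNorm2d(out_f),
        nn.ReLU(),
    )

class SimpleEncoder(nn.Module):
    def __init__(self, enc_sizes):
        super().__init__()
        self.conv_blocks = nn.Sequential(*[
            conv_block(in_f, out_f, kernel_size=3, padding=1)
            for in_f, out_f in zip(enc_sizes, enc_sizes[1:])
        ])

    def forward(self, x):
        return self.conv_blocks(x)

def check_encoder_shape():
    """Hypothetical smoke test for the encoder alone."""
    enc = SimpleEncoder([1, 8, 16]).eval()
    with torch.no_grad():
        out = enc(torch.randn(2, 1, 28, 28))
    # 3x3 convs with padding=1 keep height and width; channels follow enc_sizes[-1]
    assert out.shape == (2, 16, 28, 28)
    # ReLU is the last operation, so no output value can be negative
    assert (out >= 0).all()
    return out.shape

print(check_encoder_shape())  # torch.Size([2, 16, 28, 28])
```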

# `nn.ModuleList`: When we need to iterate!
`nn.ModuleList` lets you store `Module` objects in a list. It is useful when you need to iterate through layers and store or use some information along the way.

The main difference from `nn.Sequential` is that `nn.ModuleList` has no `forward` method, so the inner layers are not connected to each other; you call them yourself. Assuming we need the output of each layer in the decoder, we can store it like this:

In [None]:
class SimpleModule(nn.Module):
    def __init__(self, sizes):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(in_f, out_f) for in_f, out_f in zip(sizes, sizes[1:])])
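        self.trace = []

    def forward(self, x):
        self.trace = []  # reset so outputs from earlier calls do not accumulate
        for layer in self.layers:
            x = layer(x)
            self.trace.append(x)  # keep each layer's output for later inspection
        return x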

# Summary
summary(SimpleModule([1, 16, 32]), input_size=(1, 1))

# Example: Use for storing intermediate outputs/debugging
model = SimpleModule([1, 16, 32])
model(torch.rand((4,1)))

for trace in model.trace:
    print(trace.shape)


torch.Size([4, 16])
torch.Size([4, 32])
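
One reason to prefer `nn.ModuleList` over a plain Python list is parameter registration. The sketch below uses two hypothetical classes, invented for this comparison, to show that layers kept in a plain list are invisible to `parameters()`, so an optimizer would never update them.

```python
import torch.nn as nn

class WithPlainList(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain Python list: PyTorch does not register these layers
        self.layers = [nn.Linear(4, 4), nn.Linear(4, 2)]

class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers every layer as a submodule
        self.layers = nn.ModuleList([nn.Linear(4, 4), nn.Linear(4, 2)])

def n_params(module):
    return sum(p.numel() for p in module.parameters())

print(n_params(WithPlainList()))   # 0
print(n_params(WithModuleList()))  # 30, i.e. (4*4 + 4) + (4*2 + 2)
```

The same applies to `.to(device)` and `state_dict()`: only registered submodules are moved and saved.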


# `nn.ModuleDict`: When we need to choose!
`nn.ModuleDict` lets you store `Module` objects in a dictionary. It is useful when you need to choose a module based on some condition.
What if we want to switch to `LeakyReLU` in our `conv_block`? We can use `nn.ModuleDict` to create a dictionary of modules and select one by key when building the block.

In [None]:
# activation is keyword-only so that extra positional args go to nn.Conv2d
def conv_block(in_f, out_f, *args, activation='relu', **kwargs):

    activations = nn.ModuleDict([
                ['lrelu', nn.LeakyReLU()],
                ['relu', nn.ReLU()]
    ])

    return nn.Sequential(
        nn.Conv2d(in_f, out_f, *args, **kwargs),
        nn.BatchNorm2d(out_f),
        activations[activation]
    )

# Summary
print(conv_block(1, 32, kernel_size=3, padding=1, activation='lrelu'))
print(conv_block(1, 32, kernel_size=3, padding=1, activation='relu'))



Sequential(
  (0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (2): LeakyReLU(negative_slope=0.01)
)
Sequential(
  (0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (2): ReLU()
)
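
An `nn.ModuleDict` can also live inside a module as an attribute, with the key chosen at construction time. The sketch below is one possible arrangement: the class `ActivationChoice` and the slope value `0.1` are assumptions made for illustration, not part of the recitation code.

```python
import torch
import torch.nn as nn

class ActivationChoice(nn.Module):
    """Hypothetical module that picks its activation by name."""
    def __init__(self, activation='relu'):
        super().__init__()
        self.activations = nn.ModuleDict({
            'relu': nn.ReLU(),
            'lrelu': nn.LeakyReLU(negative_slope=0.1),
        })
        # Fail early with a clear message instead of a KeyError in forward()
        if activation not in self.activations:
            raise ValueError(f"unknown activation: {activation!r}")
        self.activation = activation

    def forward(self, x):
        return self.activations[self.activation](x)

x = torch.tensor([-1.0, 2.0])
relu_out = ActivationChoice('relu')(x)    # negative input is clamped to 0
lrelu_out = ActivationChoice('lrelu')(x)  # negative input is scaled by 0.1
print(relu_out)
print(lrelu_out)
print(list(ActivationChoice().activations.keys()))  # ['relu', 'lrelu']
```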


# Putting it all together!

In [None]:
# activation is keyword-only so that extra positional args go to nn.Conv2d
def conv_block(in_f, out_f, *args, activation='relu', **kwargs):
    activations = nn.ModuleDict([
                ['lrelu', nn.LeakyReLU()],
                ['relu', nn.ReLU()]
    ])

    return nn.Sequential(
        nn.Conv2d(in_f, out_f, *args, **kwargs),
        nn.BatchNorm2d(out_f),
        activations[activation]
    )

def dec_block(in_f, out_f):
    return nn.Sequential(
        nn.Linear(in_f, out_f),
        nn.Sigmoid()
    )

class MyEncoder(nn.Module):
    def __init__(self, enc_sizes, *args, **kwargs):
        super().__init__()
        self.conv_blocks = nn.Sequential(*[conv_block(in_f, out_f, kernel_size=3, padding=1, *args, **kwargs)
                       for in_f, out_f in zip(enc_sizes, enc_sizes[1:])])

    def forward(self, x):
        return self.conv_blocks(x)

class MyDecoder(nn.Module):
    def __init__(self, dec_sizes, n_classes):
        super().__init__()
        self.dec_blocks = nn.Sequential(*[dec_block(in_f, out_f)
                       for in_f, out_f in zip(dec_sizes, dec_sizes[1:])])
        self.last = nn.Linear(dec_sizes[-1], n_classes)

    def forward(self, x):
        x = self.dec_blocks(x)
        x = self.last(x)
        return x


class MyCNNClassifier(nn.Module):
    def __init__(self, in_c, enc_sizes, dec_sizes,  n_classes, activation='relu'):
        super().__init__()
        self.enc_sizes = [in_c, *enc_sizes]
        self.dec_sizes = [self.enc_sizes[-1] * 28 * 28, *dec_sizes]

        self.encoder = MyEncoder(self.enc_sizes, activation=activation)
        self.decoder = MyDecoder(self.dec_sizes, n_classes)

    def forward(self, x):
        x = self.encoder(x)
        x = x.flatten(1) # flat
        x = self.decoder(x)
        return x

# Summary
summary(MyCNNClassifier(in_c=1, enc_sizes=[32, 64], dec_sizes=[1024, 512], n_classes=10, activation='lrelu'), input_size=(1, 1, 28, 28))

Layer (type:depth-idx)                   Output Shape              Param #
MyCNNClassifier                          [1, 10]                   --
├─MyEncoder: 1-1                         [1, 64, 28, 28]           --
│    └─Sequential: 2-1                   [1, 64, 28, 28]           --
│    │    └─Sequential: 3-1              [1, 32, 28, 28]           384
│    │    └─Sequential: 3-2              [1, 64, 28, 28]           18,624
├─MyDecoder: 1-2                         [1, 10]                   --
│    └─Sequential: 2-2                   [1, 512]                  --
│    │    └─Sequential: 3-3              [1, 1024]                 51,381,248
│    │    └─Sequential: 3-4              [1, 512]                  524,800
│    └─Linear: 2-3                       [1, 10]                   5,130
Total params: 51,930,186
Trainable params: 51,930,186
Non-trainable params: 0
Total mult-adds (Units.MEGABYTES): 66.66
Input size (MB): 0.00
Forward/backward pass size (MB): 1.22
Params size (MB): 207.72
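
The classifier above hardcodes `28 * 28` for the flattened size, so it only accepts 28x28 inputs. One possible way around this, sketched below, is to run a dummy batch through the encoder and read off the size. The helper `flattened_size` is hypothetical, and the encoder here is a plain `nn.Sequential` stand-in for `MyEncoder`.

```python
import torch
import torch.nn as nn

def flattened_size(encoder, in_c, height, width):
    """Hypothetical helper: find the encoder's flattened output size with a dummy batch."""
    was_training = encoder.training
    encoder.eval()                    # avoid updating BatchNorm running statistics
    with torch.no_grad():
        out = encoder(torch.zeros(1, in_c, height, width))
    encoder.train(was_training)       # restore the original mode
    return out.flatten(1).size(1)

# Stand-in for the encoder with enc_sizes=[32, 64]
encoder = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
)

print(flattened_size(encoder, 1, 28, 28))  # 50176, which equals 64 * 28 * 28
print(flattened_size(encoder, 1, 32, 32))  # 65536, which equals 64 * 32 * 32
```

The returned value could then be used as the first entry of `dec_sizes` in place of the hardcoded product.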


# Conclusion
- In summary, we have seen how to create a modular neural network using `nn.Module`, `nn.Sequential`, `nn.ModuleList`, and `nn.ModuleDict`.
- Use `Module` when you have a big block composed of multiple smaller blocks.
- Use `Sequential` when you want to create a small block from layers.
- Use `ModuleList` when you need to iterate through some layers or building blocks and do something with each one.
- Use `ModuleDict` when you need to parameterize some blocks of your model, for example the activation function.

For more information, you can check out the [official documentation](https://pytorch.org/docs/stable/nn.html#containers).
