# **Artificial Neural Networks and Deep Learning**

---

## **Lecture 7: Advancements in Convolutional Block Design**

<img src="https://drive.google.com/uc?export=view&id=1Ruszte0iwJ-i5VgTCApvJXz7yXWgnZzi" width="500"/>

## ⚙️ Import libraries

In [None]:
# Set seed for reproducibility
SEED = 42

# Import necessary libraries
import os

# Set environment variables before importing modules
os.environ['PYTHONHASHSEED'] = str(SEED)
os.environ['MPLCONFIGDIR'] = os.getcwd() + '/configs/'

# Suppress warnings
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
warnings.simplefilter(action='ignore', category=Warning)

# Import necessary modules
import logging
import random
import numpy as np

# Set seeds for random number generators in NumPy and Python
np.random.seed(SEED)
random.seed(SEED)

# Import PyTorch
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchsummary import summary
from torch.utils.tensorboard import SummaryWriter
!pip install torchview
from torchview import draw_graph

torch.manual_seed(SEED)
if torch.cuda.is_available():
    device = torch.device("cuda")
    torch.cuda.manual_seed_all(SEED)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
else:
    device = torch.device("cpu")

print(f"PyTorch version: {torch.__version__}")
print(f"Device: {device}")

# Import visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns

# Configure plot display settings
sns.set(font_scale=1.4)
sns.set_style('white')
plt.rc('font', size=14)
%matplotlib inline

# TensorBoard setup
logs_dir = "tensorboard_blocks"
!rm -rf {logs_dir}
!mkdir -p {logs_dir}
%load_ext tensorboard

In [None]:
# Define input and output dimensions
input_shape = (3, 64, 64)
output_shape = 10

# Define the batch size
BATCH_SIZE = 128

# Initialize configuration for convolutional layers
stack = 1
filters = 32
kernel_size = 3


print(f"Input shape: {input_shape}")
print(f"Output shape: {output_shape}")
print(f"Batch size: {BATCH_SIZE}")
print(f"Stack: {stack}")
print(f"Filters: {filters}")
print(f"Kernel size: {kernel_size}")

## 🛠️ **First Convolutional Neural Network Block (AlexNet, 2012)**

<img src="https://miro.medium.com/v2/resize:fit:1400/1*bD_DMBtKwveuzIkQTwjKQQ.png" width="800"/>

---
**Key Features and Achievements**


*   First deep CNN to win the ImageNet challenge (ILSVRC 2012)
*   Popularized ReLU, which trains faster than saturating activations and mitigates vanishing gradients

**Key building block:**

*   Conv -> ReLU -> MaxPool sequence
*   Multiple layers stacked sequentially

**Impact:**

*   Started the "deep learning revolution"
*   Established basic CNN design patterns

**📜 Paper:** ["ImageNet Classification with Deep Convolutional Neural Networks", Krizhevsky et al.](https://proceedings.neurips.cc/paper_files/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf)
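
To make the Conv -> ReLU -> MaxPool pattern concrete, the short sketch below traces tensor shapes through one such stage. The sizes (a 3x64x64 input, 32 filters, 3x3 kernels) mirror the configuration cell above; the variable names are illustrative only and are not used elsewhere in the notebook.

```python
import torch
import torch.nn as nn

# One Conv -> ReLU -> MaxPool stage; sizes mirror the notebook's configuration
conv = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding='same')
relu = nn.ReLU()
pool = nn.MaxPool2d(2)

x = torch.zeros(1, 3, 64, 64)    # (batch, channels, height, width)
after_conv = conv(x)             # 'same' padding keeps the 64x64 spatial size
after_relu = relu(after_conv)    # element-wise, shape unchanged
after_pool = pool(after_relu)    # 2x2 max pooling halves height and width

print(tuple(after_conv.shape))   # (1, 32, 64, 64)
print(tuple(after_pool.shape))   # (1, 32, 32, 32)
```

Only the convolution changes the channel count and only the pooling changes the spatial size, which is why the two can be tuned independently when stacking blocks.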



In [None]:
class BasicCNNBlock(nn.Module):
    """Basic CNN block with Conv -> ReLU -> MaxPool pattern (AlexNet style)."""

    def __init__(self, in_channels, filters, kernel_size=3, padding='same',
                 downsample=True, stack=2):
        super().__init__()

        layers_list = []
        current_channels = in_channels

        for i in range(stack):
            layers_list.append(
                nn.Conv2d(current_channels, filters, kernel_size, padding=padding)
            )
            layers_list.append(nn.ReLU())
            current_channels = filters

        if downsample:
            layers_list.append(nn.MaxPool2d(2))

        self.block = nn.Sequential(*layers_list)

    def forward(self, x):
        return self.block(x)


class BasicCNNModel(nn.Module):
    """Complete model using BasicCNNBlock."""

    def __init__(self, input_shape, output_shape, filters=32, kernel_size=3, stack=1):
        super().__init__()

        self.block0 = BasicCNNBlock(
            in_channels=input_shape[0],
            filters=filters,
            kernel_size=kernel_size,
            downsample=True,
            stack=stack
        )

        # Calculate flattened size
        with torch.no_grad():
            dummy = torch.zeros(1, *input_shape)
            dummy_out = self.block0(dummy)
            flatten_size = dummy_out.view(1, -1).shape[1]

        self.flatten = nn.Flatten()
        self.dense = nn.Linear(flatten_size, output_shape)

    def forward(self, x):
        x = self.block0(x)
        x = self.flatten(x)
        x = self.dense(x)
        # Returns class probabilities; note that nn.CrossEntropyLoss expects raw logits instead
        return F.softmax(x, dim=1)

In [None]:
# Create and display the Basic CNN model
basic_cnn = BasicCNNModel(input_shape, output_shape, filters, kernel_size, stack).to(device)
summary(basic_cnn, input_size=input_shape)
model_graph = draw_graph(basic_cnn, input_size=(BATCH_SIZE,)+input_shape, expand_nested=True)
model_graph.visual_graph

## 🛠️ **Global Average Pooling (NiN, 2013)**

<img src="https://www.researchgate.net/publication/363231491/figure/fig5/AS:11431281179419529@1691187457237/Illustration-of-global-average-pooling-GAP.png" width="800"/>

---
**Key Features and Achievements**


*   Replaced Flatten and Dense layers
*   Enforced correspondence between feature maps and categories

**Key building block:**

*   Global spatial average of each feature map
*   Direct feature-to-category mapping

**Impact:**

*   Dramatic parameter reduction
*   Better generalization with fewer parameters

**📜 Paper:** ["Network In Network", Lin et al.](https://arxiv.org/pdf/1312.4400)
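
The parameter reduction can be checked with a little arithmetic. The sketch below assumes a 32-channel feature map at 32x32 resolution (what one downsampling block produces from a 64x64 input with 32 filters) and 10 output classes; `count_params` is a helper introduced here for illustration.

```python
import torch
import torch.nn as nn

# Assumed feature map: 32 channels at 32x32, classified into 10 classes
channels, height, width, num_classes = 32, 32, 32, 10

def count_params(module):
    """Total number of trainable parameters in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

# Classification head after Flatten vs. after Global Average Pooling
flatten_head = nn.Linear(channels * height * width, num_classes)
gap_head = nn.Linear(channels, num_classes)

flatten_params = count_params(flatten_head)   # 32*32*32*10 + 10 = 327,690
gap_params = count_params(gap_head)           # 32*10 + 10 = 330

# GAP itself has no parameters: it is the mean over the spatial dimensions
features = torch.randn(4, channels, height, width)
gap = nn.AdaptiveAvgPool2d(1)(features).flatten(1)
same_as_mean = torch.allclose(gap, features.mean(dim=(2, 3)), atol=1e-6)

print(flatten_params, gap_params)
```

Under these assumptions the head shrinks by roughly three orders of magnitude, and its size no longer depends on the input resolution.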



In [None]:
class GAPModel(nn.Module):
    """Model using Global Average Pooling instead of Flatten + Dense."""

    def __init__(self, input_shape, output_shape, filters=32, kernel_size=3, stack=1):
        super().__init__()

        self.block0 = BasicCNNBlock(
            in_channels=input_shape[0],
            filters=filters,
            kernel_size=kernel_size,
            downsample=True,
            stack=stack
        )

        self.gap = nn.AdaptiveAvgPool2d(1)
        self.flatten = nn.Flatten()
        self.dense = nn.Linear(filters, output_shape)

    def forward(self, x):
        x = self.block0(x)
        x = self.gap(x)
        x = self.flatten(x)
        x = self.dense(x)
        return F.softmax(x, dim=1)

In [None]:
# Create and display the GAP model
gap_model = GAPModel(input_shape, output_shape, filters, kernel_size, stack).to(device)
summary(gap_model, input_size=input_shape)
model_graph = draw_graph(gap_model, input_size=(BATCH_SIZE,)+input_shape, expand_nested=True)
model_graph.visual_graph

## 🛠️ **Inception Block (GoogLeNet, 2014)**

<img src="https://ar5iv.labs.arxiv.org/html/1707.07128/assets/googlenetInception.png" width="800"/>


---
**Key Features and Achievements**


*   Multi-scale feature processing
*   Winner of ILSVRC 2014

**Key building block:**

*   Parallel paths with different kernels
*   1x1 bottleneck for efficiency
*   Feature concatenation
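
As a minimal sketch of the parallel-path idea (without the 1x1 reductions, and with sizes chosen here purely for illustration), each path below produces a quarter of the output channels at the same spatial size, so the results can be concatenated along the channel axis:

```python
import torch
import torch.nn as nn

filters = 32                      # assumed total number of output channels
x = torch.randn(2, 3, 64, 64)     # (batch, channels, height, width)

# Four parallel paths, each padded so that the spatial size is preserved
branches = nn.ModuleList([
    nn.Conv2d(3, filters // 4, 1),                        # 1x1 path
    nn.Conv2d(3, filters // 4, 3, padding=1),             # 3x3 path
    nn.Conv2d(3, filters // 4, 5, padding=2),             # 5x5 path
    nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),   # pooling path
                  nn.Conv2d(3, filters // 4, 1)),
])

outputs = [branch(x) for branch in branches]
merged = torch.cat(outputs, dim=1)   # concatenate along the channel axis

print(tuple(merged.shape))           # (2, 32, 64, 64)
```

Concatenation only works because every path keeps the 64x64 resolution; the channel counts add up to `4 * (filters // 4)`.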

**Impact:**

*   Established multi-path processing
*   Popularized 1x1 convolutions as dimension-reducing bottlenecks

**📜 Paper:** ["Going deeper with convolutions", Szegedy et al.](https://arxiv.org/pdf/1409.4842)
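
The efficiency argument for the 1x1 bottleneck comes down to counting weights. The following sketch compares a direct 5x5 convolution with a 1x1 reduction followed by the same 5x5 convolution. The channel sizes (192 in, 16 reduced, 32 out) are illustrative assumptions, and `n_weights` is a helper defined only for this example.

```python
import torch
import torch.nn as nn

def n_weights(module):
    """Number of parameters in a module."""
    return sum(p.numel() for p in module.parameters())

in_ch, reduce_ch, out_ch = 192, 16, 32   # illustrative channel sizes

# 5x5 convolution applied directly to the wide input
direct = nn.Conv2d(in_ch, out_ch, 5, padding=2, bias=False)

# 1x1 reduction first, then the 5x5 convolution on fewer channels
bottleneck = nn.Sequential(
    nn.Conv2d(in_ch, reduce_ch, 1, bias=False),
    nn.ReLU(),
    nn.Conv2d(reduce_ch, out_ch, 5, padding=2, bias=False),
)

direct_weights = n_weights(direct)           # 5*5*192*32 = 153,600
bottleneck_weights = n_weights(bottleneck)   # 192*16 + 5*5*16*32 = 15,872

# Both variants map the same input to the same output shape
x = torch.randn(1, in_ch, 28, 28)
same_shape = direct(x).shape == bottleneck(x).shape

print(direct_weights, bottleneck_weights)
```

With these sizes the bottleneck version uses roughly a tenth of the weights while producing an output of identical shape.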

In [None]:
class InceptionBlock(nn.Module):
    """Original Inception block (2014) with parallel convolution paths."""

    def __init__(self, in_channels, filters, downsample=True, stack=2):
        super().__init__()
        self.stack = stack
        self.downsample = downsample

        # Build stacked inception modules
        self.inception_modules = nn.ModuleList()
        current_channels = in_channels

        for s in range(stack):
            module = nn.ModuleDict({
                # 1x1 path
                'conv1': nn.Sequential(
                    nn.Conv2d(current_channels, filters // 4, 1, padding='same'),
                    nn.ReLU()
                ),
                # 3x3 path with reduction
                'conv3_reduce': nn.Sequential(
                    nn.Conv2d(current_channels, filters // 8, 1, padding='same'),
                    nn.ReLU()
                ),
                'conv3': nn.Sequential(
                    nn.Conv2d(filters // 8, filters // 4, 3, padding='same'),
                    nn.ReLU()
                ),
                # 5x5 path with reduction
                'conv5_reduce': nn.Sequential(
                    nn.Conv2d(current_channels, filters // 12, 1, padding='same'),
                    nn.ReLU()
                ),
                'conv5': nn.Sequential(
                    nn.Conv2d(filters // 12, filters // 4, 5, padding='same'),
                    nn.ReLU()
                ),
                # Pool path
                'pool': nn.MaxPool2d(3, stride=1, padding=1),
                'pool_proj': nn.Sequential(
                    nn.Conv2d(current_channels, filters // 4, 1, padding='same'),
                    nn.ReLU()
                )
            })
            self.inception_modules.append(module)
            # After concatenation: 4 * (filters // 4) channels, equal to filters when it is divisible by 4
            current_channels = filters

        if downsample:
            self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        for module in self.inception_modules:
            conv1 = module['conv1'](x)
            conv3 = module['conv3'](module['conv3_reduce'](x))
            conv5 = module['conv5'](module['conv5_reduce'](x))
            pool_proj = module['pool_proj'](module['pool'](x))
            x = torch.cat([conv1, conv3, conv5, pool_proj], dim=1)

        if self.downsample:
            x = self.pool(x)
        return x


class InceptionModel(nn.Module):
    """Complete model using InceptionBlock."""

    def __init__(self, input_shape, output_shape, filters=32, stack=1):
        super().__init__()

        self.block0 = InceptionBlock(
            in_channels=input_shape[0],
            filters=filters,
            downsample=True,
            stack=stack
        )

        self.gap = nn.AdaptiveAvgPool2d(1)
        self.flatten = nn.Flatten()
        self.dense = nn.Linear(filters, output_shape)

    def forward(self, x):
        x = self.block0(x)
        x = self.gap(x)
        x = self.flatten(x)
        x = self.dense(x)
        return F.softmax(x, dim=1)

In [None]:
# Create and display the Inception model
inception_model = InceptionModel(input_shape, output_shape, filters, stack).to(device)
summary(inception_model, input_size=input_shape)
model_graph = draw_graph(inception_model, input_size=(BATCH_SIZE,)+input_shape, expand_nested=True)
model_graph.visual_graph

## 🛠️ **Batch Normalization (Inception Block with BN, 2015)**

<img src="https://miro.medium.com/v2/resize:fit:898/0*pSSzicm1IH4hXOHc.png" width="800"/>


---
**Key Features and Achievements**


*   Normalized activations in each layer
*   Reduced internal covariate shift

**Key building block:**

*   Normalize: $\hat{x} = \frac{x-\mu_B}{\sqrt{\sigma^2_B+\epsilon}}$
*   Scale and shift: $y = \gamma\hat{x} + \beta$
*   Placed before activation

**Impact:**

*   Enabled much faster training
*   Reduced sensitivity to initialization
*   Became standard in modern networks

**📜 Paper:** ["Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift", Ioffe and Szegedy](https://arxiv.org/pdf/1502.03167)
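
The two formulas above can be verified numerically. The sketch below applies them by hand to a random batch and compares the result with `nn.BatchNorm2d` in training mode. The tensor sizes are arbitrary, and the check assumes a freshly initialised layer (so $\gamma = 1$, $\beta = 0$).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 4, 16, 16) * 3.0 + 2.0   # batch with non-zero mean and non-unit variance

bn = nn.BatchNorm2d(4)   # gamma initialised to 1, beta to 0
bn.train()               # training mode: normalise with the current batch statistics
reference = bn(x)

# Per-channel statistics over the batch and spatial dimensions
mu = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)   # biased variance

# Normalize, then scale and shift
x_hat = (x - mu) / torch.sqrt(var + bn.eps)
gamma = bn.weight.view(1, -1, 1, 1)
beta = bn.bias.view(1, -1, 1, 1)
manual = gamma * x_hat + beta

max_diff = (manual - reference).abs().max().item()
print(f"Largest difference from nn.BatchNorm2d: {max_diff:.2e}")
```

In evaluation mode the layer would use its running averages instead of the batch statistics, so this equality is only expected to hold while training.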

In [None]:
class InceptionBlockBN(nn.Module):
    """Inception block with Batch Normalization."""

    def __init__(self, in_channels, filters, downsample=True, stack=2):
        super().__init__()
        self.stack = stack
        self.downsample = downsample

        self.inception_modules = nn.ModuleList()
        current_channels = in_channels

        for s in range(stack):
            module = nn.ModuleDict({
                # 1x1 path with BN
                'conv1': nn.Sequential(
                    nn.Conv2d(current_channels, filters // 4, 1, padding='same', bias=False),
                    nn.BatchNorm2d(filters // 4),
                    nn.ReLU()
                ),
                # 3x3 path with reduction and BN
                'conv3_reduce': nn.Sequential(
                    nn.Conv2d(current_channels, filters // 8, 1, padding='same', bias=False),
                    nn.BatchNorm2d(filters // 8),
                    nn.ReLU()
                ),
                'conv3': nn.Sequential(
                    nn.Conv2d(filters // 8, filters // 4, 3, padding='same', bias=False),
                    nn.BatchNorm2d(filters // 4),
                    nn.ReLU()
                ),
                # 5x5 path with reduction and BN
                'conv5_reduce': nn.Sequential(
                    nn.Conv2d(current_channels, filters // 12, 1, padding='same', bias=False),
                    nn.BatchNorm2d(filters // 12),
                    nn.ReLU()
                ),
                'conv5': nn.Sequential(
                    nn.Conv2d(filters // 12, filters // 4, 5, padding='same', bias=False),
                    nn.BatchNorm2d(filters // 4),
                    nn.ReLU()
                ),
                # Pool path with BN
                'pool': nn.MaxPool2d(3, stride=1, padding=1),
                'pool_proj': nn.Sequential(
                    nn.Conv2d(current_channels, filters // 4, 1, padding='same', bias=False),
                    nn.BatchNorm2d(filters // 4),
                    nn.ReLU()
                )
            })
            self.inception_modules.append(module)
            current_channels = filters

        if downsample:
            self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        for module in self.inception_modules:
            conv1 = module['conv1'](x)
            conv3 = module['conv3'](module['conv3_reduce'](x))
            conv5 = module['conv5'](module['conv5_reduce'](x))
            pool_proj = module['pool_proj'](module['pool'](x))
            x = torch.cat([conv1, conv3, conv5, pool_proj], dim=1)

        if self.downsample:
            x = self.pool(x)
        return x


class InceptionBNModel(nn.Module):
    """Complete model using InceptionBlockBN."""

    def __init__(self, input_shape, output_shape, filters=32, stack=1):
        super().__init__()

        self.block0 = InceptionBlockBN(
            in_channels=input_shape[0],
            filters=filters,
            downsample=True,
            stack=stack
        )

        self.gap = nn.AdaptiveAvgPool2d(1)
        self.flatten = nn.Flatten()
        self.dense = nn.Linear(filters, output_shape)

    def forward(self, x):
        x = self.block0(x)
        x = self.gap(x)
        x = self.flatten(x)
        x = self.dense(x)
        return F.softmax(x, dim=1)

In [None]:
# Create and display the Inception with BN model
inception_bn_model = InceptionBNModel(input_shape, output_shape, filters, stack).to(device)
summary(inception_bn_model, input_size=input_shape)
model_graph = draw_graph(inception_bn_model, input_size=(BATCH_SIZE,)+input_shape, expand_nested=True)
model_graph.visual_graph

## 🛠️ **Residual Block (ResNet, 2015)**

<img src="https://upload.wikimedia.org/wikipedia/commons/b/ba/ResBlock.png" width="800"/>


---
**Key Features and Achievements**


*   Enabled 1000+ layer networks
*   Winner of ILSVRC 2015

**Key building block:**

*   Skip connection: F(x) + x
*   Two conv layers with BN and ReLU

**Impact:**

*   Addressed the degradation problem of very deep networks
*   Revolutionized network design

**📜 Paper:** ["Deep Residual Learning for Image Recognition", He et al.](https://arxiv.org/pdf/1512.03385)
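
One way to see why the skip connection F(x) + x helps is to look at the extreme case where the residual branch contributes nothing. In the toy sketch below, a single convolution with zeroed weights stands in for F (an assumption made for illustration, not the block used in this notebook): the unit reduces to the identity and the gradient reaches the input unchanged.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for the residual branch F, with weights zeroed so that F(x) = 0
conv = nn.Conv2d(8, 8, 3, padding=1, bias=False)
nn.init.zeros_(conv.weight)

x = torch.randn(2, 8, 16, 16, requires_grad=True)
out = conv(x) + x        # skip connection: F(x) + x
out.sum().backward()

# The unit behaves as the identity, and the gradient w.r.t. x is exactly 1
identity_preserved = torch.equal(out.detach(), x.detach())
grad_is_one = torch.allclose(x.grad, torch.ones_like(x))

print(identity_preserved, grad_is_one)
```

A plain stack of layers has no such fallback: it would have to learn the identity mapping explicitly, which is part of the degradation argument made in the paper.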

In [None]:
class ResidualBlock(nn.Module):
    """Residual block with skip connections."""

    def __init__(self, in_channels, filters, kernel_size=3, downsample=True, stack=2):
        super().__init__()
        self.stack = stack
        self.downsample = downsample

        self.residual_units = nn.ModuleList()
        current_channels = in_channels

        for s in range(stack):
            unit = nn.ModuleDict({
                'conv1': nn.Conv2d(current_channels, filters, kernel_size, padding='same', bias=False),
                'bn1': nn.BatchNorm2d(filters),
                'conv2': nn.Conv2d(filters, filters, kernel_size, padding='same', bias=False),
                'bn2': nn.BatchNorm2d(filters),
            })

            # Projection for skip connection if dimensions don't match
            if current_channels != filters:
                unit['proj'] = nn.Sequential(
                    nn.Conv2d(current_channels, filters, 1, padding='same', bias=False),
                    nn.BatchNorm2d(filters)
                )

            self.residual_units.append(unit)
            current_channels = filters

        if downsample:
            self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        for unit in self.residual_units:
            skip = x

            # Main path
            x = unit['conv1'](x)
            x = unit['bn1'](x)
            x = F.relu(x)

            x = unit['conv2'](x)
            x = unit['bn2'](x)

            # Adjust skip connection if needed
            if 'proj' in unit:
                skip = unit['proj'](skip)

            # Add skip connection and apply activation
            x = F.relu(x + skip)

        if self.downsample:
            x = self.pool(x)
        return x


class ResNetModel(nn.Module):
    """Complete model using ResidualBlock."""

    def __init__(self, input_shape, output_shape, filters=32, kernel_size=3, stack=1):
        super().__init__()

        # Initial convolution
        self.conv0 = nn.Conv2d(input_shape[0], filters, kernel_size, padding='same', bias=False)
        self.bn0 = nn.BatchNorm2d(filters)

        # Residual block
        self.block0 = ResidualBlock(
            in_channels=filters,
            filters=filters,
            kernel_size=kernel_size,
            downsample=False,
            stack=stack
        )

        self.gap = nn.AdaptiveAvgPool2d(1)
        self.flatten = nn.Flatten()
        self.dense = nn.Linear(filters, output_shape)

    def forward(self, x):
        x = self.conv0(x)
        x = self.bn0(x)
        x = F.relu(x)
        x = self.block0(x)
        x = self.gap(x)
        x = self.flatten(x)
        x = self.dense(x)
        return F.softmax(x, dim=1)

In [None]:
# Create and display the ResNet model
resnet_model = ResNetModel(input_shape, output_shape, filters, kernel_size, stack).to(device)
summary(resnet_model, input_size=input_shape)
model_graph = draw_graph(resnet_model, input_size=(BATCH_SIZE,)+input_shape, expand_nested=True)
model_graph.visual_graph

## 🛠️ **Squeeze-and-Excitation Block (SENet, 2017)**

<img src="https://miro.medium.com/v2/resize:fit:1400/1*QK1TVTasgdRYpVC31CuPyA.png" width="800"/>


---
**Key Features and Achievements**


*   Channel "attention" mechanism
*   Winner of ILSVRC 2017

**Key building block:**

*   Squeeze: global pooling
*   Excitation: channel recalibration
*   Feature rescaling

**Impact:**

*   Introduced "attention" in CNNs
*   Minimal overhead, significant gain

**📜 Paper:** ["Squeeze-and-Excitation Networks", Hu et al.](https://arxiv.org/pdf/1709.01507)
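
The three steps (squeeze, excitation, rescaling) can be followed shape by shape. The sketch below uses 32 channels and a reduction ratio of 16, so the hidden layer has only 2 units; the spatial size and batch size are arbitrary choices made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch, channels, reduction = 4, 32, 16
x = torch.randn(batch, channels, 8, 8)

# Squeeze: one scalar per channel via global average pooling -> (4, 32)
squeezed = x.mean(dim=(2, 3))

# Excitation: small bottleneck MLP producing one gate per channel
excite = nn.Sequential(
    nn.Linear(channels, channels // reduction),   # 32 -> 2
    nn.ReLU(),
    nn.Linear(channels // reduction, channels),   # 2 -> 32
    nn.Sigmoid(),
)
weights = excite(squeezed)                        # (4, 32), values between 0 and 1

# Rescaling: broadcast each gate over its channel's spatial positions
rescaled = x * weights.view(batch, channels, 1, 1)

print(tuple(squeezed.shape), tuple(weights.shape), tuple(rescaled.shape))
```

Because the gates lie between 0 and 1, the block can only attenuate channels relative to each other; it never changes the shape of the feature map.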

In [None]:
class SEBlock(nn.Module):
    """Squeeze-and-Excitation block."""

    def __init__(self, channels, reduction=16):
        super().__init__()

        self.squeeze = nn.AdaptiveAvgPool2d(1)
        self.excitation = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid()
        )

    def forward(self, x):
        b, c, _, _ = x.size()

        # Squeeze
        se = self.squeeze(x).view(b, c)

        # Excitation
        se = self.excitation(se).view(b, c, 1, 1)

        # Scale
        return x * se


class SENetBlock(nn.Module):
    """Convolutional block with Squeeze-and-Excitation."""

    def __init__(self, in_channels, filters, kernel_size=3, downsample=True, stack=2):
        super().__init__()
        self.stack = stack
        self.downsample = downsample

        self.conv_layers = nn.ModuleList()
        current_channels = in_channels

        for s in range(stack):
            layer = nn.Sequential(
                nn.Conv2d(current_channels, filters, kernel_size, padding='same', bias=False),
                nn.BatchNorm2d(filters),
                nn.ReLU(),
                SEBlock(filters)
            )
            self.conv_layers.append(layer)
            current_channels = filters

        if downsample:
            self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        for layer in self.conv_layers:
            x = layer(x)

        if self.downsample:
            x = self.pool(x)
        return x


class SENetModel(nn.Module):
    """Complete model using SENetBlock."""

    def __init__(self, input_shape, output_shape, filters=32, kernel_size=3, stack=1):
        super().__init__()

        self.block0 = SENetBlock(
            in_channels=input_shape[0],
            filters=filters,
            kernel_size=kernel_size,
            downsample=False,
            stack=stack
        )

        self.gap = nn.AdaptiveAvgPool2d(1)
        self.flatten = nn.Flatten()
        self.dense = nn.Linear(filters, output_shape)

    def forward(self, x):
        x = self.block0(x)
        x = self.gap(x)
        x = self.flatten(x)
        x = self.dense(x)
        return F.softmax(x, dim=1)

In [None]:
# Create and display the SENet model
senet_model = SENetModel(input_shape, output_shape, filters, kernel_size, stack).to(device)
summary(senet_model, input_size=input_shape)
model_graph = draw_graph(senet_model, input_size=(BATCH_SIZE,)+input_shape, expand_nested=True, depth=5)
model_graph.visual_graph

## 🛠️ **Inverted Residual Bottleneck with SE (MobileNetV3, 2019)**

<img src="https://www.researchgate.net/publication/378806327/figure/fig5/AS:11431281232045939@1711587010842/MobileNetV3-network-structure.jpg" width="600"/>


---
**Key Features and Achievements**


*   Applied platform-aware Neural Architecture Search (NAS), complemented by NetAdapt for per-layer refinement
*   Introduced hardware-aware network design
*   Combined manual design with automated search
*   Optimized for mobile inference latency

**Key building block:**

*   "Enhanced" inverted residual block: expansion ratio tuned per block, SE module redesigned for efficiency, hard-swish (h-swish) activation
*   Efficient last stage: fewer channels in the first layer, SE moved to cheaper layers, platform-aware operator selection
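
To illustrate why the inverted residual layout is cheap, the sketch below counts the weights of an expand -> depthwise -> project sequence and compares them with a standard 3x3 convolution at the expanded width. Batch normalization, activations and the SE module are left out for brevity, the channel sizes are assumptions, and `n_weights` is a helper defined only here.

```python
import torch
import torch.nn as nn

def n_weights(module):
    """Number of parameters in a module."""
    return sum(p.numel() for p in module.parameters())

in_ch, out_ch, expansion, k = 32, 32, 6, 3   # illustrative sizes
hidden = in_ch * expansion                   # 192 expanded channels

inverted = nn.Sequential(
    nn.Conv2d(in_ch, hidden, 1, bias=False),             # expand: 32*192 = 6,144
    nn.Conv2d(hidden, hidden, k, padding=k // 2,
              groups=hidden, bias=False),                # depthwise: 192*3*3 = 1,728
    nn.Conv2d(hidden, out_ch, 1, bias=False),            # project: 192*32 = 6,144
)

# A standard 3x3 convolution operating at the same expanded width
standard = nn.Conv2d(hidden, hidden, k, padding=k // 2, bias=False)

inverted_weights = n_weights(inverted)   # 14,016
standard_weights = n_weights(standard)   # 192*192*3*3 = 331,776

x = torch.randn(1, in_ch, 16, 16)
out = inverted(x) + x   # residual is valid here: stride 1 and in_ch == out_ch

print(inverted_weights, standard_weights)
```

The depthwise step filters each of the 192 channels independently, so almost all of the cost sits in the two 1x1 convolutions.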

**Impact:**

*   Set new SOTA for mobile networks
*   Demonstrated successful NAS and human design fusion
*   Showed importance of hardware-aware architecture design
*   Influenced automated architecture search methods

**📜 Paper:** ["Searching for MobileNetV3", Howard et al.](https://arxiv.org/pdf/1905.02244)
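
The hard activations are piecewise-linear stand-ins for sigmoid and swish built from ReLU6. As a quick sanity check (a sketch, evaluated on an arbitrary grid of points), the hand-written formulas agree with PyTorch's built-in `F.hardsigmoid` and `F.hardswish`:

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-6.0, 6.0, steps=25)

# Hand-written definitions based on ReLU6
h_sigmoid = F.relu6(x + 3.0) / 6.0   # 0 for x <= -3, 1 for x >= 3, linear in between
h_swish = x * h_sigmoid              # 0 for x <= -3, x for x >= 3

# Comparison with the built-in functional versions
matches_builtin_sigmoid = torch.allclose(h_sigmoid, F.hardsigmoid(x))
matches_builtin_swish = torch.allclose(h_swish, F.hardswish(x))

print(matches_builtin_sigmoid, matches_builtin_swish)
```

The built-in `nn.Hardswish` and `nn.Hardsigmoid` modules could therefore be used in place of custom classes, if preferred.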

---

<img src="https://drive.google.com/uc?export=view&id=1EcBp60nEorTDLROT_1L4a4Mvw9kLdKqd" width="300"/>

In [None]:
class HardSwish(nn.Module):
    """Hard-Swish activation function."""

    def forward(self, x):
        return x * F.relu6(x + 3.0) / 6.0


class HardSigmoid(nn.Module):
    """Hard-Sigmoid activation function."""

    def forward(self, x):
        return F.relu6(x + 3.0) / 6.0


class InvertedResidualSE(nn.Module):
    """Single MobileNetV3 Inverted Residual Block with SE."""

    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1,
                 expansion_factor=6, use_hard_swish=True):
        super().__init__()

        self.use_residual = (stride == 1 and in_channels == out_channels)
        expanded_channels = in_channels * expansion_factor

        activation = HardSwish() if use_hard_swish else nn.ReLU()

        layers = []

        # Expansion phase (skipped when expansion_factor == 1)
        if expansion_factor != 1:
            layers.extend([
                nn.Conv2d(in_channels, expanded_channels, 1, bias=False),
                nn.BatchNorm2d(expanded_channels),
                activation
            ])

        # Depthwise convolution
        layers.extend([
            nn.Conv2d(expanded_channels, expanded_channels, kernel_size,
                     stride=stride, padding=kernel_size//2, groups=expanded_channels, bias=False),
            nn.BatchNorm2d(expanded_channels),
            activation
        ])

        self.conv = nn.Sequential(*layers)

        # Squeeze-and-Excitation
        se_channels = max(1, expanded_channels // 4)
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(expanded_channels, se_channels, 1),
            nn.ReLU(),
            nn.Conv2d(se_channels, expanded_channels, 1),
            HardSigmoid()
        )

        # Projection phase
        self.project = nn.Sequential(
            nn.Conv2d(expanded_channels, out_channels, 1, bias=False),
            nn.BatchNorm2d(out_channels)
        )

    def forward(self, x):
        residual = x

        x = self.conv(x)
        x = x * self.se(x)
        x = self.project(x)

        if self.use_residual:
            x = x + residual

        return x


class MobileNetV3Block(nn.Module):
    """MobileNetV3 block with multiple Inverted Residual units."""

    def __init__(self, in_channels, filters, kernel_size=3, downsample=True,
                 stack=1, use_hard_swish=True):
        super().__init__()

        blocks = []
        current_channels = in_channels

        for s in range(stack):
            # Determine expansion factor
            expansion_factor = 1 if current_channels == filters else 6
            stride = 2 if (downsample and s == 0) else 1

            blocks.append(InvertedResidualSE(
                in_channels=current_channels,
                out_channels=filters,
                kernel_size=kernel_size,
                stride=stride,
                expansion_factor=expansion_factor,
                use_hard_swish=use_hard_swish
            ))

            current_channels = filters

        self.blocks = nn.Sequential(*blocks)

    def forward(self, x):
        return self.blocks(x)


class MobileNetV3Model(nn.Module):
    """Complete model using MobileNetV3Block."""

    def __init__(self, input_shape, output_shape, filters=32, kernel_size=3, stack=1):
        super().__init__()

        self.block0 = MobileNetV3Block(
            in_channels=input_shape[0],
            filters=filters,
            kernel_size=kernel_size,
            downsample=False,
            stack=stack
        )

        self.gap = nn.AdaptiveAvgPool2d(1)
        self.flatten = nn.Flatten()
        self.dense = nn.Linear(filters, output_shape)

    def forward(self, x):
        x = self.block0(x)
        x = self.gap(x)
        x = self.flatten(x)
        x = self.dense(x)
        return F.softmax(x, dim=1)

In [None]:
# Create and display the MobileNetV3 model
mobilenetv3_model = MobileNetV3Model(input_shape, output_shape, filters, kernel_size, stack).to(device)
summary(mobilenetv3_model, input_size=input_shape)
model_graph = draw_graph(mobilenetv3_model, input_size=(BATCH_SIZE,)+input_shape, expand_nested=True, depth=5)
model_graph.visual_graph

#  
<img src="https://airlab.deib.polimi.it/wp-content/uploads/2019/07/airlab-logo-new_cropped.png" width="350">

##### Connect with us:
- <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/8/81/LinkedIn_icon.svg/2048px-LinkedIn_icon.svg.png" width="14"> **LinkedIn:**  [AIRLab Polimi](https://www.linkedin.com/company/airlab-polimi/)
- <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/9/95/Instagram_logo_2022.svg/800px-Instagram_logo_2022.svg.png" width="14"> **Instagram:** [airlab_polimi](https://www.instagram.com/airlab_polimi/)

##### Contributors:
- **Eugenio Lomurno**: eugenio.lomurno@polimi.it
- **Alberto Archetti**: alberto.archetti@polimi.it
- **Roberto Basla**: roberto.basla@polimi.it
- **Carlo Sgaravatti**: carlo.sgaravatti@polimi.it

```
   Copyright 2025 Eugenio Lomurno, Alberto Archetti, Roberto Basla, Carlo Sgaravatti

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
```