<image src="https://raw.githubusercontent.com/ramiro999/pytorch-exploration/main/images/Banner-NiN.png" width=100%>


# <font color='#4C5FDA'> **Network in Network** </font>

The paper **<font color="EB9A54">"Network In Network"</font>** (NiN) proposes an architecture that extends traditional convolutional neural networks (CNNs) by placing micro neural networks, namely multilayer perceptrons (MLPs), within each convolutional layer. This structure allows each layer to build a more abstract representation of the local patches it sees, potentially improving classification performance.

<image src="https://raw.githubusercontent.com/ramiro999/pytorch-exploration/main/images/NiN-1.png" >




<image src="https://raw.githubusercontent.com/ramiro999/pytorch-exploration/main/images/NiN-2.png" >

**<font color="EB9A54">MLP Convolution Layer (mlpconv): </font>** This layer replaces the standard linear convolution in CNNs with a small MLP that is applied to each local patch of the input. It consists of a regular convolution followed by fully connected layers with ReLU activations, which in practice are implemented as 1×1 convolutions.
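
To see why a 1×1 convolution can be read as a fully connected layer applied at every pixel, the following minimal sketch copies the weights of a 1×1 `nn.Conv2d` into an `nn.Linear` and compares the two outputs. The channel counts and tensor sizes are arbitrary, illustrative values and are not taken from the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A 1x1 convolution mapping 4 input channels to 3 output channels (illustrative sizes).
conv1x1 = nn.Conv2d(4, 3, kernel_size=1)

# A linear layer that shares the same weights; the Conv2d weight has shape (3, 4, 1, 1).
linear = nn.Linear(4, 3)
with torch.no_grad():
    linear.weight.copy_(conv1x1.weight.view(3, 4))
    linear.bias.copy_(conv1x1.bias)

x = torch.rand(2, 4, 8, 8)  # (batch, channels, height, width)

out_conv = conv1x1(x)
# Move channels last so the linear layer acts on each pixel's channel vector,
# then move channels back to dimension 1.
out_linear = linear(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)

same = torch.allclose(out_conv, out_linear, atol=1e-5)
print(out_conv.shape, same)
```

Both paths compute the same per-pixel affine map, which is why stacking 1×1 convolutions with ReLU after a regular convolution behaves like a small MLP sliding over the input.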

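**<font color="EB9A54"> Global Average Pooling (GAP): </font>** Instead of using fully connected layers on top of the network, NiN uses a global average pooling layer followed by a softmax activation for classification. The last mlpconv layer produces one feature map per class, and each map is averaged into a single score. Because this pooling has no trainable parameters, it reduces the total number of parameters and helps reduce overfitting.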
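
As a rough illustration of this idea, the sketch below applies global average pooling to a random tensor standing in for the final feature maps, and compares its parameter count with a fully connected head on the same input. The 14×14 map size and the `fc_head` layer are assumptions made only for the comparison.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for the last mlpconv output: 10 class maps of size 14x14 (assumed size).
feature_maps = torch.rand(2, 10, 14, 14)

gap = nn.AdaptiveAvgPool2d(1)
pooled = gap(feature_maps).flatten(1)    # shape (2, 10): one score per class
manual = feature_maps.mean(dim=(2, 3))   # the same average computed by hand

# Softmax turns the pooled scores into class probabilities.
probs = torch.softmax(pooled, dim=1)

# Hypothetical fully connected head on the flattened maps, for comparison only.
fc_head = nn.Linear(10 * 14 * 14, 10)
fc_params = sum(p.numel() for p in fc_head.parameters())
gap_params = sum(p.numel() for p in gap.parameters())

print(pooled.shape, gap_params, fc_params)
```

When training with `nn.CrossEntropyLoss`, the softmax is usually left out of the model, since that loss applies log-softmax to the raw scores internally.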

In [15]:
import torch
import torch.nn as nn
import torch.nn.functional as F

In [16]:
print(torch.__version__)

2.3.0+cu121


In [17]:
class MLPConv(nn.Module):
  def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0):
    super(MLPConv, self).__init__()
    # A spatial convolution followed by two 1x1 convolutions that act as per-pixel fully connected layers
    self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
    self.fc1 = nn.Conv2d(out_channels, out_channels, 1)
    self.fc2 = nn.Conv2d(out_channels, out_channels, 1)

  # Forward pass: the input goes through each convolution, and each one is followed by a ReLU activation.
  def forward(self, x):
    x = self.conv(x)
    x = F.relu(x)
    x = self.fc1(x)
    x = F.relu(x)
    x = self.fc2(x)
    x = F.relu(x)
    return x

In [18]:
class NIN(nn.Module):
  def __init__(self):
    super(NIN, self).__init__()
    # First MLPConv layer: processes the input image
    self.mlpconv1 = MLPConv(3, 192, kernel_size=5, padding=2)
    # Second MLPConv layer: deeper processing of features
    self.mlpconv2 = MLPConv(192, 160, kernel_size=5, padding=2)
    # Third MLPConv layer: further feature processing
    self.mlpconv3 = MLPConv(160, 96, kernel_size=5, padding=2)
    # Average pooling layer to reduce spatial dimensions; reused after each of the first three MLPConv layers
    self.pooling = nn.AvgPool2d(kernel_size=3, stride=2, padding=1)
    # Additional MLPConv layer applied after the pooling stages
    self.mlpconv4 = MLPConv(96, 192, kernel_size=3, padding=1)
    # 1x1 MLPConv layers that combine features across channels before the final classification
    self.mlpconv5 = MLPConv(192, 192, kernel_size=1)
    self.mlpconv6 = MLPConv(192, 10, kernel_size=1)
    # Global average pooling to aggregate each feature map over its spatial dimensions
    self.global_avg_pool = nn.AdaptiveAvgPool2d(1)

  def forward(self, x):
    # Pass through the first three MLPConv layers, each followed by average pooling.
    x = self.mlpconv1(x)
    x = self.pooling(x)
    x = self.mlpconv2(x)
    x = self.pooling(x)
    x = self.mlpconv3(x)
    x = self.pooling(x)
    # Remaining layers, with no intermediate pooling.
    x = self.mlpconv4(x)
    x = self.mlpconv5(x)
    x = self.mlpconv6(x)
    # Apply global average pooling to get one score per class.
    x = self.global_avg_pool(x)
    # Flatten the output to shape (batch_size, num_classes) for classification.
    x = x.view(x.size(0), -1)
    return x

model = NIN()


In [20]:
# Run a sample batch through the network to check the output shape
x = torch.rand([2, 3, 112, 112])
print(x.shape)
output = model(x)
print(output.shape)

torch.Size([2, 3, 112, 112])
torch.Size([2, 10])
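
The intermediate spatial sizes can be traced by hand. The convolutions in this network preserve height and width (kernel 5 with padding 2, kernel 3 with padding 1, kernel 1 with no padding), so only the pooling layer changes them: `nn.AvgPool2d(kernel_size=3, stride=2, padding=1)` maps a side length `H` to `(H + 2*1 - 3) // 2 + 1`. The sketch below checks this formula against PyTorch for a 112×112 input; `pooled_size` is a hypothetical helper introduced only for this illustration.

```python
import torch
import torch.nn as nn

def pooled_size(size, kernel=3, stride=2, padding=1):
    """Output side length of a pooling layer (hypothetical helper, floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1

pool = nn.AvgPool2d(kernel_size=3, stride=2, padding=1)

x = torch.rand(1, 1, 112, 112)
sizes = [x.shape[-1]]   # side lengths measured from PyTorch
expected = [112]        # side lengths predicted by the formula
for _ in range(3):      # the network applies the pooling layer three times
    x = pool(x)
    sizes.append(x.shape[-1])
    expected.append(pooled_size(expected[-1]))

print(sizes)     # [112, 56, 28, 14]
print(expected)  # [112, 56, 28, 14]
```

Since `nn.AdaptiveAvgPool2d(1)` averages over whatever spatial size remains, the final output shape depends only on the batch size and the number of classes.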


---
# <font color='#4C5FDA'> **References** </font>

Network In Network

https://arxiv.org/pdf/1312.4400

---

**Prepared by Ramiro Santiago Avila Chacon**