<a href="https://colab.research.google.com/github/Krishanu2206/Some_Implementations-/blob/main/Implementations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#THE LENET ARCHITECTURE
The LeNet architecture is a simple and classic convolutional neural network (CNN) architecture, originally designed for handwritten digit recognition (MNIST dataset) by Yann LeCun. The architecture consists of two sets of convolutional and pooling layers, followed by fully connected layers.

Convolutional Layer 1: Takes a 1-channel input (e.g., grayscale image) and outputs 6 feature maps with a kernel size of 5x5.

Pooling Layer 1: Applies 2x2 average pooling with stride 2 (the code below uses `nn.AvgPool2d`).

Convolutional Layer 2: Takes 6 input channels and outputs 16 feature maps with a kernel size of 5x5.

Pooling Layer 2: Applies 2x2 average pooling with stride 2.

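Fully Connected Layer 1: Outputs 120 features. The code below implements this layer as a 5x5 convolution (`conv3`) with 120 output channels, which is equivalent because its input is exactly 5x5.

Flattening: Converts the resulting 120x1x1 feature maps into a 1D feature vector.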

Fully Connected Layer 2: Outputs 84 features.

Output Layer: Outputs 10 features, corresponding to the number of classes.

This architecture works well for small, low-resolution image classification tasks such as handwritten digit recognition.

`1x32x32 Input -> conv (5x5), s=1, p=0 -> avg pool (2x2), s=2, p=0 -> conv (5x5), s=1, p=0 -> avg pool (2x2), s=2, p=0 -> conv (5x5) to 120 channels -> Linear 120 -> 84 -> Linear 84 -> 10`
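
The spatial sizes in this pipeline can be checked with the usual output-size formula, out = floor((n + 2p - k) / s) + 1. The following is a minimal sketch in plain Python; the helper `out_size` and the `layers` list are illustrative names, not part of the notebook, and the layer order assumes a pooling step after each of the first two convolutions.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial size after a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1

# (layer name, kernel size, stride); padding is 0 everywhere in this LeNet
layers = [
    ("conv1 5x5", 5, 1),
    ("pool 2x2", 2, 2),
    ("conv2 5x5", 5, 1),
    ("pool 2x2", 2, 2),
    ("conv3 5x5", 5, 1),
]

size = 32            # LeNet expects a 32x32 input
sizes = [size]       # spatial size before and after every layer
for name, kernel, stride in layers:
    size = out_size(size, kernel, stride)
    sizes.append(size)
    print(f"{name:10s} -> {size}x{size}")
```

The last convolution sees a 5x5 map and produces a 1x1 map, which is why its 120 channels can be flattened straight into a 120-dimensional vector.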

In [None]:
import torch
import torch.nn as nn

In [None]:
class LeNet(nn.Module):
  def __init__(self):
    super(LeNet, self).__init__()
    self.relu = nn.ReLU()
    self.flatten = nn.Flatten()
    self.pool = nn.AvgPool2d(kernel_size=(2,2), stride=(2,2))
    self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=(5,5), stride=(1,1), padding=(0,0))
    self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=(5,5), stride=(1,1), padding=(0,0))
    self.conv3 = nn.Conv2d(in_channels=16, out_channels=120, kernel_size=(5,5), stride=(1,1), padding=(0,0))
    self.linear1 = nn.Linear(in_features=120, out_features=84)
    self.linear2 = nn.Linear(in_features=84, out_features=10)

  def forward(self, x):
    x = self.relu(self.conv1(x))
    x = self.pool(x)
    x = self.relu(self.conv2(x))
    x = self.pool(x)
    x = self.relu(self.conv3(x))  # non-linearity, so conv3 and linear1 do not collapse into one linear map
    x = self.flatten(x)
    x = self.relu(self.linear1(x))
    x = self.linear2(x)
    return x

In [None]:
x = torch.randn(64, 1, 32, 32)

In [None]:
##Creating an instance of the model
model = LeNet()

In [None]:
print(model(x).shape)

torch.Size([64, 10])
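
To get a feel for how small this network is, its learnable parameters can be counted by hand. The sketch below is plain Python and assumes every convolution and linear layer keeps its bias term (the PyTorch default); `conv_params`, `linear_params` and `counts` are illustrative names.

```python
def conv_params(in_ch, out_ch, k):
    """Weights (k*k*in_ch per filter) plus one bias per output channel."""
    return (k * k * in_ch + 1) * out_ch

def linear_params(in_f, out_f):
    """Weights (in_f per output) plus one bias per output feature."""
    return (in_f + 1) * out_f

# Layer sizes copied from the LeNet class above
counts = {
    "conv1": conv_params(1, 6, 5),
    "conv2": conv_params(6, 16, 5),
    "conv3": conv_params(16, 120, 5),
    "linear1": linear_params(120, 84),
    "linear2": linear_params(84, 10),
}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name:8s} {n:6d}")
print(f"{'total':8s} {total:6d}")
```

The total can be cross-checked against the model with `sum(p.numel() for p in model.parameters())`. Most of the parameters sit in `conv3`, the layer that acts as the first fully connected layer.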


#GOOGLENET/INCEPTIONNET ARCHITECTURE

GoogLeNet (often written GoogleNet), also known as Inception v1 or InceptionNet, is a deep convolutional neural network architecture introduced by Szegedy et al. in 2014. It won the ILSVRC 2014 classification challenge and is known for its Inception modules, which capture multi-scale features.

Key Features:

- Inception Module: Runs convolutions with different kernel sizes (1x1, 3x3, 5x5) and a max pooling branch in parallel to extract features at several scales. The branch outputs are concatenated along the channel dimension.
- Dimensionality Reduction: Applies 1x1 convolutions to reduce the number of input channels before the more expensive 3x3 and 5x5 convolutions, which lowers the overall computation cost.
- Auxiliary Classifiers: The original network attaches auxiliary classifiers to intermediate layers to combat the vanishing gradient problem during training. The implementation below omits them.
- Deep Architecture: Contains 22 layers with parameters, built by stacking Inception modules. This makes it much deeper than AlexNet (8 layers) and slightly deeper than the contemporaneous VGG-19 (19 layers).

Benefits:

- Multi-Scale Feature Learning: Captures fine to coarse features in the same module.
- Efficient Computation: Uses about 12x fewer parameters than AlexNet while being more accurate.
- State-of-the-Art Performance: Achieved a top-5 error of 6.67% in the ILSVRC 2014 classification task.

GoogLeNet's design influenced many subsequent deep learning architectures and remains an important milestone in the development of convolutional neural networks.
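
The saving from the 1x1 reduction can be estimated by counting multiply-accumulate operations. The sketch below is plain Python with illustrative names (`conv_macs`, `direct`, `reduced`); the channel sizes are those of the 5x5 branch of the first Inception module in the implementation further down (192 input channels, reduced to 16, then 32 output channels), and a 28x28 feature map is assumed.

```python
def conv_macs(h, w, k, in_ch, out_ch):
    """Multiply-accumulates for a k x k convolution producing an h x w map."""
    return h * w * k * k * in_ch * out_ch

h = w = 28                          # assumed feature-map size at this stage
in_ch, red_ch, out_ch = 192, 16, 32

# 5x5 convolution applied directly to all 192 input channels
direct = conv_macs(h, w, 5, in_ch, out_ch)

# 1x1 reduction to 16 channels, followed by the 5x5 convolution
reduced = conv_macs(h, w, 1, in_ch, red_ch) + conv_macs(h, w, 5, red_ch, out_ch)

print(f"direct 5x5:        {direct:,} MACs")
print(f"1x1 then 5x5:      {reduced:,} MACs")
print(f"reduction factor:  {direct / reduced:.2f}x")
```

Under these assumptions the reduced branch needs roughly a tenth of the operations of the direct 5x5 convolution, while producing the same number of output channels.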

For More - https://sahiltinky94.medium.com/know-about-googlenet-and-implementation-using-pytorch-92f827d675db

https://arxiv.org/pdf/1409.4842v1







In [None]:
import torch
import torch.nn as nn

In [None]:
class conv_block(nn.Module):
  def __init__(self, in_channels, out_channels, **kwargs):
    super(conv_block, self).__init__()
    self.relu = nn.ReLU()
    self.conv = nn.Conv2d(in_channels, out_channels, **kwargs) # **kwargs = kernel size/stride/padding
    self.batchnorm = nn.BatchNorm2d(num_features = out_channels)

  def forward(self, x):
    return self.relu(self.batchnorm(self.conv(x)))


In [None]:
class Inception(nn.Module):
  def __init__(self, in_channels, out_1x1, red_3x3, out_3x3, red_5x5, out_5x5, out_1x1pool): #red = reduction
    super(Inception, self).__init__()

    self.branch1 = conv_block(in_channels, out_1x1, kernel_size=1)

    self.branch2 = nn.Sequential(
        conv_block(in_channels, red_3x3, kernel_size=(1,1)),
        conv_block(red_3x3, out_3x3, kernel_size=(3,3), padding=(1,1))
    )

    self.branch3 = nn.Sequential(
        conv_block(in_channels, red_5x5, kernel_size=(1,1)),
        conv_block(red_5x5, out_5x5, kernel_size=(5,5), padding=(2,2))
    )

    self.branch_pool = nn.Sequential(
        nn.MaxPool2d(kernel_size=(3,3), stride=(1,1), padding=(1,1)),
        conv_block(in_channels, out_1x1pool, kernel_size=(1,1))
    )

  def forward(self, x):
    #N * filters * height * width
    # dim=1 is the channel dimension, so the four branch outputs are concatenated channel-wise
    return torch.cat([self.branch1(x), self.branch2(x), self.branch3(x), self.branch_pool(x)], 1)

In [None]:
class GoogleNet(nn.Module):
  def __init__(self, in_channels=3, num_classes=1000):
    super(GoogleNet, self).__init__()

    self.conv1 = conv_block(in_channels=in_channels, out_channels=64, kernel_size=(7,7), stride=(2,2), padding=(3,3))
    self.maxpool1 = nn.MaxPool2d(kernel_size=(3,3), stride=(2,2), padding=(1,1))

    self.conv2 = conv_block(in_channels=64, out_channels=192, kernel_size=(3,3), stride=(1,1), padding=(1,1))
    self.maxpool2 = nn.MaxPool2d(kernel_size=(3,3), stride=(2,2), padding=(1,1))

    self.inception3a = Inception(in_channels=192, out_1x1=64, red_3x3=96, out_3x3=128, red_5x5=16, out_5x5=32, out_1x1pool=32)
    self.inception3b = Inception(in_channels=256, out_1x1=128, red_3x3=128, out_3x3=192, red_5x5=32, out_5x5=96, out_1x1pool=64)
    self.maxpool3 = nn.MaxPool2d(kernel_size=(3,3), stride=(2,2), padding=(1,1))

    self.inception4a = Inception(480, 192, 96, 208, 16, 48, 64)
    self.inception4b = Inception(512, 160, 112, 224, 24, 64, 64)
    self.inception4c = Inception(512, 128, 128, 256, 24, 64, 64)
    self.inception4d = Inception(512, 112, 144, 288, 32, 64, 64)
    self.inception4e = Inception(528, 256, 160, 320, 32, 128, 128)
    self.maxpool4 = nn.MaxPool2d(kernel_size=(3,3), stride=(2,2), padding=(1,1))

    self.inception5a = Inception(832, 256, 160, 320, 32, 128, 128)
    self.inception5b = Inception(832, 384, 192, 384, 48, 128, 128)
    self.avgpool = nn.AvgPool2d(kernel_size=(7,7), stride=(1,1))
    self.dropout = nn.Dropout(p=0.4)
    self.fc1 = nn.Linear(1024, num_classes)

  def forward(self, x):
    x = self.conv1(x)
    x = self.maxpool1(x)
    x = self.conv2(x)
    x = self.maxpool2(x)
    x = self.inception3a(x)
    x = self.inception3b(x)
    x = self.maxpool3(x)
    x = self.inception4a(x)
    x = self.inception4b(x)
    x = self.inception4c(x)
    x = self.inception4d(x)
    x = self.inception4e(x)
    x = self.maxpool4(x)
    x = self.inception5a(x)
    x = self.inception5b(x)
    x = self.avgpool(x)
    x = x.reshape(x.shape[0], -1) # N x 1024 x 1 x 1 -> N x 1024
    x = self.dropout(x)
    x = self.fc1(x)
    return x


In [None]:
x=torch.randn(3, 3, 224, 224)

In [None]:
#create an instance of the class
model = GoogleNet()

In [None]:
y = model(x).shape
y

torch.Size([3, 1000])
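
Because each Inception module concatenates its four branches along the channel dimension, its output channel count is the sum of the four branch outputs, and that sum must equal the `in_channels` of the next module. The sketch below is a plain-Python consistency check of the configuration used in `GoogleNet`; `configs` and `channels` are illustrative names.

```python
# (name, in_channels, out_1x1, red_3x3, out_3x3, red_5x5, out_5x5, out_1x1pool)
configs = [
    ("3a", 192,  64,  96, 128, 16,  32,  32),
    ("3b", 256, 128, 128, 192, 32,  96,  64),
    ("4a", 480, 192,  96, 208, 16,  48,  64),
    ("4b", 512, 160, 112, 224, 24,  64,  64),
    ("4c", 512, 128, 128, 256, 24,  64,  64),
    ("4d", 512, 112, 144, 288, 32,  64,  64),
    ("4e", 528, 256, 160, 320, 32, 128, 128),
    ("5a", 832, 256, 160, 320, 32, 128, 128),
    ("5b", 832, 384, 192, 384, 48, 128, 128),
]

channels = 192  # channels produced by conv2, before the first Inception module
for name, in_ch, o1, r3, o3, r5, o5, op in configs:
    # Each module must accept exactly what the previous one produced
    assert in_ch == channels, f"inception{name} expects {in_ch}, got {channels}"
    # The reduction sizes (r3, r5) are internal; only branch outputs are concatenated
    channels = o1 + o3 + o5 + op
    print(f"inception{name}: {in_ch} -> {channels}")
```

The max pooling layers between modules change only the spatial size, so they do not affect this chain. The final count is the input size of the last linear layer.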

#RESNET ARCHITECTURE

For background on ResNet and the motivation behind it, see:
https://medium.com/@karuneshu21/resnet-paper-walkthrough-b7f3bdba55f0

https://arxiv.org/pdf/1512.03385

In [1]:
import torch
import torch.nn as nn

In [21]:
class Bottleneck(nn.Module):
  """
        Residual block used to build ResNet: a bottleneck (conv 1x1->3x3->1x1) or a basic block (conv 3x3->3x3).

        Note:
          1. The shortcut (input feature maps) is added to the block output just before the final ReLU.
          2. If the input and output channel counts differ, a projection shortcut (1x1 conv + batchnorm) is used; otherwise an identity shortcut.
          3. If is_Bottleneck=False, (3x3->3x3) is used, else (1x1->3x3->1x1). The bottleneck is required for resnet-50/101/152.
        Args:
            in_channels (int) : input channels to the block
            intermediate_channels (int) : number of channels in the 3x3 conv
            expansion (int) : factor by which intermediate_channels is multiplied to get the output channels
            is_Bottleneck (bool) : True for the bottleneck block, False for the basic block
            stride (int) : stride of the first 3x3 conv and of the projection shortcut. 2 for the first block of a downsampling stage, 1 otherwise

        Attributes:
            Layers of the form conv->batchnorm->relu

        """
  def __init__(self, in_channels, intermediate_channels, expansion, is_Bottleneck, stride=1):
    super(Bottleneck, self).__init__()

    self.in_channels = in_channels
    self.expansion = expansion
    self.is_Bottleneck = is_Bottleneck
    self.intermediate_channels = intermediate_channels
    self.stride = stride

    self.relu = nn.ReLU()

    ##checking if the dimension of F(x) = dim(x)
    if self.in_channels == self.expansion * self.intermediate_channels:
      self.identity = True
    else:
      self.identity = False
      projection_layer = []
      projection_layer.append(nn.Conv2d(in_channels = self.in_channels, out_channels = self.expansion * self.intermediate_channels, kernel_size = 1, stride=stride, padding = 0, bias = False))
      projection_layer.append(nn.BatchNorm2d(num_features = self.expansion * self.intermediate_channels))
      self.projection = nn.Sequential(*projection_layer)

    if self.is_Bottleneck:
      self.convblock1_1x1 = nn.Conv2d(in_channels = self.in_channels, out_channels = self.intermediate_channels, kernel_size = 1, stride=1, padding = 0, bias = False)
      self.batchnorm1 = nn.BatchNorm2d(num_features = self.intermediate_channels)

      self.convblock2_3x3 = nn.Conv2d(in_channels = self.intermediate_channels, out_channels = self.intermediate_channels, kernel_size = 3, stride=self.stride, padding = 1, bias = False)
      self.batchnorm2 = nn.BatchNorm2d(num_features = intermediate_channels)

      self.convblock3_1x1 = nn.Conv2d(in_channels = self.intermediate_channels, out_channels = self.expansion * self.intermediate_channels, kernel_size = 1, stride=1, padding = 0, bias = False)
      self.batchnorm3 = nn.BatchNorm2d(num_features = self.expansion * self.intermediate_channels)

    else:
      #basic block
      self.convblock1_3x3 = nn.Conv2d(in_channels = self.in_channels, out_channels = self.intermediate_channels, kernel_size = 3, stride=self.stride, padding=1, bias = False)
      self.batchnorm1 = nn.BatchNorm2d(num_features = intermediate_channels)

      self.convblock2_3x3 = nn.Conv2d(in_channels = self.intermediate_channels, out_channels = self.intermediate_channels, kernel_size = 3, stride=1, padding=1, bias = False)
      self.batchnorm2 = nn.BatchNorm2d(num_features = self.intermediate_channels)

  def forward(self, x):
    copy = x
    if self.is_Bottleneck:
      x = self.batchnorm3(self.convblock3_1x1(self.relu(self.batchnorm2(self.convblock2_3x3(self.relu(self.batchnorm1(self.convblock1_1x1(x))))))))
    else:
      x = self.batchnorm2(self.convblock2_3x3(self.relu(self.batchnorm1(self.convblock1_3x3(x)))))

    if self.identity:
      x = x + copy
    else:
      x = x + self.projection(copy)

    return self.relu(x)


In [22]:
## Now let's create the ResNet
class ResNet(nn.Module):
  def __init__(self, resnet_variant, in_channels, num_classes):
    """
        Creates the ResNet architecture for the given variant (18/34/50/101/152).
        The variant supplies the channels list, the repetition list, the expansion factor (4 for bottleneck variants)
        and the block type. Each stage is built with make_blocks as a sequence of Bottleneck blocks;
        the first block of a stage uses stride 2 (stride 1 in the first stage) and the rest use stride 1.
        Average Pool at the end before the FC layer

        Args:
            resnet_variant (list) : [channels_list, repetition_list, expansion, is_Bottleneck], eg. [[64,128,256,512],[3,4,6,3],4,True]
            in_channels (int) : image channels (3)
            num_classes (int) : output #classes

        Attributes:
            Layer consisting of conv->batchnorm->relu

        """
    super(ResNet, self).__init__()
    self.in_channels = in_channels
    self.num_classes = num_classes
    self.channels_list = resnet_variant[0]
    self.repeatition_list = resnet_variant[1]
    self.expansion = resnet_variant[2]
    self.is_Bottleneck = resnet_variant[3]

    self.convblock1 = nn.Conv2d(in_channels = self.in_channels, out_channels = 64, kernel_size =7, stride=2, padding=3, bias = False)
    self.batchnorm1 = nn.BatchNorm2d(num_features = 64)
    self.relu = nn.ReLU()

    self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

    self.convblock2 = self.make_blocks(in_channels = 64, intermediate_channels = self.channels_list[0], repeat = self.repeatition_list[0], expansion = self.expansion, is_Bottleneck = self.is_Bottleneck, stride = 1)
    self.convblock3 = self.make_blocks(self.channels_list[0]*self.expansion, self.channels_list[1], self.repeatition_list[1], self.expansion, self.is_Bottleneck, stride=2)
    self.convblock4 = self.make_blocks(self.channels_list[1]*self.expansion, self.channels_list[2], self.repeatition_list[2], self.expansion, self.is_Bottleneck, stride=2)
    self.convblock5 = self.make_blocks(self.channels_list[2]*self.expansion, self.channels_list[3], self.repeatition_list[3], self.expansion, self.is_Bottleneck, stride=2)

    self.avgpool = nn.AdaptiveAvgPool2d((1,1))
    self.flatten = nn.Flatten()
    self.fc = nn.Linear(self.channels_list[3]*self.expansion, self.num_classes)

  def forward(self, x):
    x = self.relu(self.batchnorm1(self.convblock1(x)))
    x = self.maxpool(x)
    x = self.convblock5(self.convblock4(self.convblock3(self.convblock2(x))))
    x = self.flatten(self.avgpool(x))
    x = self.fc(x)

    return x

  def make_blocks(self, in_channels, intermediate_channels, repeat, expansion, is_Bottleneck, stride):

    layers = []
    layers.append(Bottleneck(in_channels = in_channels, intermediate_channels = intermediate_channels, expansion = expansion, is_Bottleneck = is_Bottleneck, stride=stride))
    for i in range(repeat-1):
      layers.append(Bottleneck(in_channels = expansion * intermediate_channels, intermediate_channels = intermediate_channels, expansion = expansion, is_Bottleneck = is_Bottleneck, stride=1))

    return nn.Sequential(*layers)

In [23]:
params = [[64,128,256,512],[3,4,6,3],4,True]
model = ResNet( params, in_channels=3, num_classes=1000)
x = torch.randn(1,3,224,224)
output = model(x)
print(output.shape)

torch.Size([1, 1000])
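
The parameter list passed to `ResNet` selects the variant. As a hedged sketch, the standard configurations from the ResNet paper can be written in the same four-element format; the dictionary `resnet_variants` and the helper `describe` are illustrative names, not part of the notebook. The helper derives the nominal depth (stem conv + convs inside the blocks + final FC layer) and the input size of the FC layer.

```python
# [channels_list, repetition_list, expansion, is_Bottleneck]
resnet_variants = {
    "resnet18":  [[64, 128, 256, 512], [2, 2, 2, 2], 1, False],
    "resnet34":  [[64, 128, 256, 512], [3, 4, 6, 3], 1, False],
    "resnet50":  [[64, 128, 256, 512], [3, 4, 6, 3], 4, True],
    "resnet101": [[64, 128, 256, 512], [3, 4, 23, 3], 4, True],
    "resnet152": [[64, 128, 256, 512], [3, 8, 36, 3], 4, True],
}

def describe(variant):
    """Return (nominal depth, FC input features) for a variant list."""
    channels, repeats, expansion, is_bottleneck = variant
    convs_per_block = 3 if is_bottleneck else 2   # 1x1->3x3->1x1 vs 3x3->3x3
    depth = 1 + convs_per_block * sum(repeats) + 1  # stem conv + blocks + FC
    fc_in = channels[-1] * expansion
    return depth, fc_in

for name, variant in resnet_variants.items():
    depth, fc_in = describe(variant)
    print(f"{name}: depth={depth}, fc input features={fc_in}")
```

Any of these lists should be usable as the first argument of the class above, for example `ResNet(resnet_variants["resnet18"], in_channels=3, num_classes=1000)`. Projection shortcuts are not counted in the nominal depth.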


#EFFICIENT NET ARCHITECTURE

For more info - https://medium.com/@aniketthomas27/efficientnet-implementation-from-scratch-in-pytorch-a-step-by-step-guide-a7bb96f2bdaa

In [13]:
import torch
import torch.nn as nn
import math
from math import ceil

In [14]:
##initialising the required parameters
basic_mb_params = [
    # expand_ratio(k), channels(c), repeats, stride(s), kernel_size(n)
    [1, 16, 1, 1, 3],
    [6, 24, 2, 2, 3],
    [6, 40, 2, 2, 5],
    [6, 80, 3, 2, 3],
    [6, 112, 3, 1, 5],
    [6, 192, 4, 2, 5],
    [6, 320, 1, 1, 3],
]

alpha, beta = 1.2, 1.1

scale_values = {
    # (phi, resolution, dropout)
    "b0": (0, 224, 0.2),
    "b1": (0.5, 240, 0.2),
    "b2": (1, 260, 0.3),
    "b3": (2, 300, 0.3),
    "b4": (3, 380, 0.4),
    "b5": (4, 456, 0.4),
    "b6": (5, 528, 0.5),
    "b7": (6, 600, 0.5),
}
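
These values drive compound scaling: the depth of every stage is multiplied by `alpha ** phi` and its width by `beta ** phi`. The sketch below is plain Python and mirrors the rounding used in the `feature_extractor` method later in this section (widths rounded up to a multiple of 4, depths truncated); `base_stages` and `scale_stages` are illustrative names, and other EfficientNet implementations may round differently.

```python
from math import ceil

alpha, beta = 1.2, 1.1  # depth and width multipliers, as defined above

# (channels, repeats) of the seven baseline stages
base_stages = [(16, 1), (24, 2), (40, 2), (80, 3), (112, 3), (192, 4), (320, 1)]

def scale_stages(phi):
    """Return (channels, repeats) per stage for a given compound coefficient."""
    depth_factor, width_factor = alpha ** phi, beta ** phi
    return [
        (4 * ceil(int(c * width_factor) / 4), int(r * depth_factor))
        for c, r in base_stages
    ]

for name, phi in [("b0", 0), ("b1", 0.5), ("b4", 3)]:
    print(name, scale_stages(phi))
```

With `phi = 0` both factors are 1, so `b0` reproduces the baseline stages unchanged; larger `phi` values make every stage both wider and deeper.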

In [15]:
#Conv Block
class ConvBlock(nn.Module):
  def __init__(self, in_channels, out_channels, kernel_size, stride, padding, groups=1):
    super(ConvBlock, self).__init__()
    self.convblock = nn.Sequential(
        nn.Conv2d(in_channels = in_channels, out_channels = out_channels, kernel_size = kernel_size, stride = stride, padding = padding, groups = groups),
        nn.BatchNorm2d(num_features = out_channels),
        nn.SiLU()
    )

  def forward(self, x):
    return self.convblock(x)

In [25]:
#SqueezeExcitation
class SqueezeExcitation(nn.Module):
  def __init__(self, in_channels, reduced_channels):
    super(SqueezeExcitation, self).__init__()
    self.se = nn.Sequential(
      nn.AdaptiveAvgPool2d(1),# C x H x W -> C x 1 x 1
      nn.Conv2d(in_channels = in_channels, out_channels = reduced_channels, kernel_size = 1),
      nn.SiLU(),
      nn.Conv2d(in_channels = reduced_channels, out_channels = in_channels, kernel_size = 1),
      nn.Sigmoid()
    )
  def forward(self,x):
    return x * self.se(x)


In [26]:
#MBBlock
class MBBlock(nn.Module):
  def __init__(self, in_channels, out_channels, kernel_size, stride, padding, ratio, reduction=2):
    super(MBBlock, self).__init__()
    hidden_channels = in_channels * ratio
    self.expand=0
    if hidden_channels != in_channels:
      self.expand = 1

    reduced_channels = int(in_channels /reduction)

    if self.expand == 1:
      # a 1x1 convolution needs padding=0 to keep the spatial size unchanged
      self.expand_conv = ConvBlock(in_channels = in_channels, out_channels = hidden_channels, kernel_size = 1, stride = 1, padding = 0)

    self.conv = nn.Sequential(
        ConvBlock(in_channels = hidden_channels, out_channels = hidden_channels, kernel_size = kernel_size, stride = stride, padding = padding, groups = hidden_channels),
        SqueezeExcitation(hidden_channels, reduced_channels),
        nn.Conv2d(in_channels = hidden_channels, out_channels = out_channels, kernel_size = 1),
        nn.BatchNorm2d(num_features = out_channels)
    )

  def forward(self, x):
    if self.expand:
      x = self.expand_conv(x)

    return self.conv(x)

In [27]:
#EfficientNet
class EfficientNet(nn.Module):
  def __init__(self, model_name, output):
    super(EfficientNet, self).__init__()
    phi, resolution, dropout = scale_values[model_name]
    self.depth_factor, self.width_factor = alpha ** phi, beta ** phi
    self.num_classes = output
    self.last_channels = ceil(1280 * self.width_factor)
    self.avgpool = nn.AdaptiveAvgPool2d(1)
    self.feature_extractor()
    self.flatten = nn.Flatten()
    self.classifier = nn.Sequential(
        nn.Dropout(dropout),
        nn.Linear(self.last_channels, self.num_classes)
    )

  def feature_extractor(self):
    channels = int(32 * self.width_factor)
    features = [ConvBlock(in_channels = 3, out_channels = channels, kernel_size = 3, stride = 2, padding = 1)]
    in_channels = channels

    for k, c, repeat, s, n in basic_mb_params:
      out_channels = 4*ceil(int(c*self.width_factor)/4)
      depth = int(repeat * self.depth_factor)

      for layer in range(depth):
        if layer == 0:
          stride = s
        else:
          stride =1
        features.append(MBBlock(in_channels, out_channels, n, stride, n//2, k))
        in_channels = out_channels

    features.append(ConvBlock(in_channels = in_channels, out_channels = self.last_channels, kernel_size=1, stride=1, padding=0))

    self.extractor = nn.Sequential(*features)

  def forward(self, x):
    return self.classifier(self.flatten(self.avgpool(self.extractor(x))))


In [28]:
model_name = 'b1'
num_classes = 1000
#creating an instance of the model
model = EfficientNet(model_name, num_classes)

In [29]:
#testing
x = torch.randn(1,3,224,224)
output = model(x)
print(output.shape)

torch.Size([1, 1000])
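
`MBBlock` sets `groups = hidden_channels` in its k x k convolution, which makes it a depthwise convolution, and follows it with a 1x1 (pointwise) convolution. The sketch below compares the weight count of that pair with a single standard convolution. It is plain Python with illustrative names, ignores biases, batch norm and the squeeze-excitation layers, and uses the sizes of the second baseline stage (16 input channels expanded by 6 to 96 hidden channels, 24 output channels, 3x3 kernel).

```python
def standard_conv_weights(in_ch, out_ch, k):
    """One k x k filter over all input channels for every output channel."""
    return k * k * in_ch * out_ch

def depthwise_separable_weights(in_ch, out_ch, k):
    """One k x k filter per input channel, then a 1x1 conv to mix channels."""
    depthwise = k * k * in_ch
    pointwise = in_ch * out_ch
    return depthwise + pointwise

hidden_channels, out_channels, kernel_size = 16 * 6, 24, 3

standard = standard_conv_weights(hidden_channels, out_channels, kernel_size)
separable = depthwise_separable_weights(hidden_channels, out_channels, kernel_size)

print(f"standard conv:       {standard} weights")
print(f"depthwise separable: {separable} weights")
print(f"ratio:               {standard / separable:.2f}x")
```

The saving grows with the kernel size, which is part of why the 5x5 stages remain affordable.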
