## Module 4 - Core Components of Neural Networks

## CNNs

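* Linear (fully connected) layers treat every pixel as an independent input, so they ignore how neighbouring pixels are arranged.
* Convolutional layers slide small filters (kernels) across the image to detect local patterns.
* CNNs are loosely inspired by biological vision.
* Assigning different weights to a filter makes it respond to different kinds of patterns, such as edges or textures.
* The model learns the filter weights that work best for the task during training.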
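
As a rough illustration of how filter weights pick out patterns, the sketch below applies two hand-written 3x3 filters to a toy image with `torch.nn.functional.conv2d`. The image and both filters are made up for this example; a real CNN would learn its filter weights rather than have them set by hand.

```python
import torch
import torch.nn.functional as F

# Toy grayscale image, shape (batch=1, channels=1, height=3, width=6):
# dark on the left (0), bright on the right (1).
image = torch.tensor([[0., 0., 0., 1., 1., 1.]] * 3).reshape(1, 1, 3, 6)

# Hand-picked filter that responds where brightness increases left to right.
vertical_edge = torch.tensor([[-1., 0., 1.],
                              [-1., 0., 1.],
                              [-1., 0., 1.]]).reshape(1, 1, 3, 3)

# Uniform weights simply average each 3x3 neighbourhood (a blur).
blur = torch.full((1, 1, 3, 3), 1.0 / 9.0)

edge_response = F.conv2d(image, vertical_edge)  # shape (1, 1, 1, 4)
blur_response = F.conv2d(image, blur)           # shape (1, 1, 1, 4)

print(edge_response.flatten())  # non-zero only where the window straddles the edge
print(blur_response.flatten())  # rises smoothly from dark to bright
```

The same image gives very different outputs depending only on the filter weights, which is the property the convolutional layers exploit.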


In [None]:
import torch.nn as nn

# basic convolutional layer
conv_layer = nn.Conv2d(
    in_channels = 3,  # no. of input channels (rgb - 3, grayscale - 1)
    out_channels = 16, # no. of filters
    kernel_size = 3,
    stride = 1,
    padding = 1
)
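
To check what these settings do to the tensor shape, here is a small sketch that passes a dummy batch through the same layer. The batch size (8) and image size (32x32) are arbitrary choices for illustration, and `conv_output_size` is a hypothetical helper that applies the usual formula `(n + 2*padding - kernel_size) // stride + 1` (with dilation 1).

```python
import torch
import torch.nn as nn

def conv_output_size(n, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension (dilation 1)."""
    return (n + 2 * padding - kernel_size) // stride + 1

conv_layer = nn.Conv2d(in_channels=3, out_channels=16,
                       kernel_size=3, stride=1, padding=1)

# Dummy batch: 8 RGB images of 32x32 pixels (sizes chosen only for this example).
batch = torch.randn(8, 3, 32, 32)
out = conv_layer(batch)

# kernel_size=3 with padding=1 and stride=1 keeps height and width unchanged;
# only the channel dimension changes, from 3 to 16.
print(out.shape)
print(conv_output_size(32, kernel_size=3, stride=1, padding=1))

# Each of the 16 filters has 3*3*3 weights plus one bias.
n_params = sum(p.numel() for p in conv_layer.parameters())
print(n_params)
```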

In [None]:
# full CNN: two convolutional blocks followed by a fully connected classifier
class SimpleCNN(nn.Module):

    def __init__(self):
        super(SimpleCNN, self).__init__()

        #first convolutional layer
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(kernel_size=2)  

        #second convolutional layer
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(kernel_size=2)

        self.dropout = nn.Dropout(0.5) # added after activation but before final classification layer

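        # fully connected layer: 64 channels x 7 x 7 feature map -> 10 outputs
        # (7x7 is what a 28x28 input becomes after the two 2x2 max-pools)
        self.fc = nn.Linear(64 * 7 * 7, 10)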

    def forward(self,x):
        # first conv block
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.pool1(x)

        #second conv block
        x = self.conv2(x)
        x = self.relu2(x)
        x = self.pool2(x)

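        #flatten before the fully connected layer (keep the batch dimension)
        x = x.flatten(start_dim=1)

        #dropout before the final classification layer (only active in training mode)
        x = self.dropout(x)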

        #fully connected layer
        x = self.fc(x)

        return x

# create instance of CNN

model = SimpleCNN()
print(model)
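
To see where the `64*7*7` input size of the fully connected layer comes from, the sketch below traces tensor shapes through the same sequence of layers. It rebuilds the two convolutional blocks as a standalone `nn.Sequential` (a stand-in, not the class above) and assumes 28x28 single-channel inputs, which is the size the `64*7*7` figure implies.

```python
import torch
import torch.nn as nn

# Stand-in for the two conv blocks: conv -> ReLU -> 2x2 max-pool, twice.
features = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(kernel_size=2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(kernel_size=2),
)

# Dummy batch of 4 grayscale 28x28 images (assumed input size).
x = torch.randn(4, 1, 28, 28)

shapes = []
for layer in features:
    x = layer(x)
    shapes.append((layer.__class__.__name__, tuple(x.shape)))

for name, shape in shapes:
    print(f"{name:10s} {shape}")

# Flatten everything except the batch dimension: 64 * 7 * 7 = 3136 features.
flat = x.flatten(start_dim=1)
print(flat.shape)
```

Each max-pool halves the height and width (28 to 14 to 7), while the convolutions with `padding=1` leave them unchanged. A different input size would need a different `in_features` value for the final linear layer.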



Dropout: randomly zero out a fraction of neurons on each training step

* This breaks the network's reliance on any single pattern.
* It prevents co-adaptation: the model cannot get lazy and lean on shortcut patterns, so it is pushed to learn more robust features.
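
A minimal sketch of this behaviour with `nn.Dropout`, using an all-ones tensor so the effect is easy to read. The tensor size and seed are arbitrary. In training mode roughly half the values become zero and the survivors are scaled by `1 / (1 - p)`; in evaluation mode the layer passes its input through unchanged.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only so repeated runs match

p = 0.5
dropout = nn.Dropout(p=p)
x = torch.ones(1, 1000)

# Training mode: each element is zeroed with probability p,
# and the remaining ones are scaled by 1 / (1 - p) = 2.0.
dropout.train()
train_out = dropout(x)
fraction_zeroed = (train_out == 0).float().mean().item()
print(train_out.unique())
print(fraction_zeroed)

# Evaluation mode: dropout is a no-op.
dropout.eval()
eval_out = dropout(x)
print(torch.equal(eval_out, x))
```

This is why `model.train()` and `model.eval()` matter for a network that contains dropout.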

In [None]:
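import torch.nn as nn
import torch.optim as optim

# loss function: cross-entropy for multi-class classification (expects raw logits)
loss_function = nn.CrossEntropyLoss()

# optimizer: Adam with a small weight-decay penalty
optimizer = optim.Adam(model.parameters(), lr=0.0005, weight_decay=0.0005)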
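
As a sketch of how the loss function and optimizer fit together, here is a single training step on random data. The tiny linear model, the batch size, and the random images and labels are placeholders invented for this example so that the block runs on its own; in practice the CNN and a real data loader would take their place.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # arbitrary seed for repeatability

# Hypothetical stand-in model: flatten 28x28 images and map them to 10 logits.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

loss_function = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.0005, weight_decay=0.0005)

# Fake batch: 16 grayscale images with random integer class labels in [0, 10).
images = torch.randn(16, 1, 28, 28)
labels = torch.randint(0, 10, (16,))

weights_before = model[1].weight.detach().clone()

model.train()                         # enable training-mode layers such as dropout
optimizer.zero_grad()                 # clear gradients from any previous step
logits = model(images)                # forward pass, shape (16, 10)
loss = loss_function(logits, labels)  # scalar loss computed from raw logits
loss.backward()                       # backpropagate
optimizer.step()                      # update the weights

print(loss.item())
```

`nn.CrossEntropyLoss` applies log-softmax internally, so the model should output raw logits rather than probabilities.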

Weight decay: instead of turning neurons off, it discourages the network from using very large weights.

Large weights can be a sign of overfitting. Weight decay adds a small penalty for large weights, nudging the model towards simpler and more robust solutions.
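
To make the penalty concrete, the sketch below isolates the effect of `weight_decay` by taking one optimizer step with a loss whose gradient is zero. It uses plain `optim.SGD` rather than Adam, purely because the arithmetic is easier to follow: with SGD, each weight is multiplied by `1 - lr * weight_decay` per step. The weights, learning rate, and decay value are made up for illustration.

```python
import torch
import torch.optim as optim

# Two example weights (values chosen arbitrarily).
w = torch.nn.Parameter(torch.tensor([1.0, -2.0]))

# SGD adds weight_decay * w to the gradient before the update.
optimizer = optim.SGD([w], lr=0.1, weight_decay=0.5)

# A loss with zero gradient, so any change in w comes from weight decay alone.
loss = (w * 0.0).sum()
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Each weight shrinks towards zero by the factor 1 - 0.1 * 0.5 = 0.95.
print(w.data)
```

`optim.Adam` adds the same `weight_decay * w` term to the gradient, but its adaptive step scaling means the shrinkage is no longer this simple constant factor.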