# Convolutional Neural Networks

<div style="display: flex; align-items: center;">
    <img src="../imgs/CNN.jpg" alt="Your Image" width="600" style="margin-right: 20px;">
    <div>
        <p>Convolutional neural networks (CNNs) are deep learning models, most commonly used to analyze images. Like multi-layer perceptrons, they are feed-forward artificial neural networks, but they use convolutional layers to exploit the spatial structure of their input. Yann LeCun is widely regarded as a pioneer of CNNs: his LeNet models were among the first to recognize handwritten digits successfully, and LeNet-5 became a standard baseline on the MNIST dataset.</p>
        <p>The architecture of a CNN is similar to that of a conventional artificial neural network, especially in its final layers, which are fully connected. The main difference is the input: a CNN operates on multi-channel feature maps (for example, an image with height, width, and channels) rather than on flat vectors.</p>
    </div>
</div>

A convolutional neural network mainly consists of the following five layer types:
- **Data input layer/Input layer**: Preprocesses the raw image data. (We don't need it in this chapter, because the `MNISTTrainer` class already preprocesses the MNIST data.)
- **Convolutional computation layer/CONV layer**: Extracts image features by sliding filters (also known as convolution kernels) over the input, multiplying each filter element-wise with the patch beneath it, and summing the products.
- **ReLU activation layer**: Applies a nonlinear mapping to the output of the convolutional layer.
- **Pooling layer**: Downsamples the feature maps, which reduces the amount of data and the number of parameters in later layers and helps reduce overfitting.
- **Fully connected layer/FC layer**: Works like the layers of an MLP.
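
The CONV, ReLU, and pooling steps listed above can be sketched in plain Python on a tiny single-channel image. This is only an illustration of the arithmetic, not how PyTorch implements these layers: the function names `conv2d`, `relu`, and `max_pool2d`, the 6×6 image, and the edge-detecting kernel are all made up for this sketch, and the convolution uses stride 1 with no padding.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel element-wise with the patch under it, then sum.
            row.append(sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            ))
        out.append(row)
    return out


def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size=2, stride=2):
    """Keep the largest value in each size x size window."""
    out = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            row.append(max(
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ))
        out.append(row)
    return out


# Hypothetical 6x6 image: a bright vertical stripe on a dark background.
image = [[0, 1, 1, 1, 0, 0] for _ in range(6)]
# Hypothetical 3x3 kernel that responds to bright-to-dark vertical edges.
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]

features = conv2d(image, kernel)   # 4x4, every row is [-3, 0, 3, 3]
activated = relu(features)         # 4x4, every row is [0, 0, 3, 3]
pooled = max_pool2d(activated)     # 2x2, [[0, 3], [0, 3]]
print(pooled)
```

The 6×6 input shrinks to 4×4 after the unpadded 3×3 convolution and to 2×2 after pooling, while the strong response at the stripe's right edge survives both steps.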

## CNN
Before building the model, let's look at the parameters of the two new layers:<br>
**nn.Conv2d**:<br>`in_channels`: Number of channels in the input.<br>`out_channels`: Number of channels produced by the convolution.<br>`kernel_size`: Size of the convolutional kernel.<br>`stride`: Stride of the convolution (default 1).<br>`padding`: Zero-padding added to all four sides of the input (default 0).<br>`groups`: The number of groups; optional, default 1. When `groups` is greater than 1, the input and output channels are split into `groups` groups, and each group of output channels is computed only from the corresponding group of input channels, so both `in_channels` and `out_channels` must be divisible by `groups`. In particular, when `groups` equals `in_channels`, each input channel is convolved with its own filters. This is a depthwise convolution, the building block of depthwise separable convolutions, and it significantly reduces the number of parameters and the amount of computation.<br><br>
**nn.MaxPool2d**:<br>`kernel_size`: Size of the max pooling window.<br>`stride`: Stride of the max pooling operation (defaults to `kernel_size`).
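
How these parameters interact can be checked with a little arithmetic. The sketch below uses the output-size formula given in the PyTorch documentation for `nn.Conv2d` and `nn.MaxPool2d` (with dilation 1) and the usual weight count for a grouped convolution. The helper names `out_size` and `conv2d_param_count` are illustrative and not part of PyTorch.

```python
def out_size(n, kernel_size, stride=1, padding=0):
    """Output height or width of a convolution or pooling window (dilation 1)."""
    return (n + 2 * padding - kernel_size) // stride + 1


def conv2d_param_count(in_channels, out_channels, kernel_size, groups=1, bias=True):
    """Number of learnable parameters in a square-kernel 2D convolution."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    # Each output channel only sees in_channels // groups input channels.
    weights = out_channels * (in_channels // groups) * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


# A 3x3 kernel with padding=1 and stride=1 preserves a 28-pixel side.
same = out_size(28, kernel_size=3, stride=1, padding=1)    # 28
# A 2x2 pooling window with stride 2 halves it.
halved = out_size(28, kernel_size=2, stride=2)             # 14

# 32 -> 64 channels with a 3x3 kernel: standard versus groups=32.
standard = conv2d_param_count(32, 64, 3)                   # 18496
grouped = conv2d_param_count(32, 64, 3, groups=32)         # 640
print(same, halved, standard, grouped)
```

With `groups=32`, each output channel is computed from a single input channel, which is why the parameter count drops from 18,496 to 640 in this example.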

In [1]:
import torch
import torch.nn as nn

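class CNN(nn.Module):
    def __init__(self, input_channels, output_size):
        super().__init__()
        # 3x3 kernels with padding=1 keep the spatial size unchanged.
        self.conv1 = nn.Conv2d(input_channels, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        # 2x2 max pooling with stride 2 halves the height and width.
        self.pool = nn.MaxPool2d(2, 2)
        # fc1 and fc2 together form an MLP on the flattened feature maps.
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, output_size)

    def forward(self, x):
        # Shape comments assume 28x28 MNIST inputs; N is the batch size.
        x = self.pool(torch.relu(self.conv1(x)))  # (N, 32, 14, 14)
        x = self.pool(torch.relu(self.conv2(x)))  # (N, 64, 7, 7)
        x = x.view(-1, 64 * 7 * 7)                # flatten to (N, 3136)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)                           # raw class scores (logits)
        return x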
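
The `64 * 7 * 7` input size of `fc1` follows from the shapes of the feature maps. Assuming the 28×28 single-channel MNIST images used in this chapter, a plain-Python trace of the forward pass (no PyTorch required; `conv_out` and `pool_out` are illustrative helpers, not library functions) looks roughly like this:

```python
def conv_out(n, kernel_size=3, padding=1, stride=1):
    """Side length after a convolution with the settings used by conv1/conv2."""
    return (n + 2 * padding - kernel_size) // stride + 1


def pool_out(n, kernel_size=2, stride=2):
    """Side length after the 2x2, stride-2 max pooling."""
    return (n - kernel_size) // stride + 1


size = 28  # assumption: MNIST images are 28x28 pixels
trace = [("input", 1, size)]
for name, out_channels in [("conv1", 32), ("conv2", 64)]:
    size = conv_out(size)                      # padding=1 keeps the size
    trace.append((name, out_channels, size))
    size = pool_out(size)                      # pooling halves the size
    trace.append((name + " + pool", out_channels, size))

_, channels, side = trace[-1]
flat_features = channels * side * side         # 64 * 7 * 7 = 3136

for name, c, s in trace:
    print(f"{name:<13} -> {c} x {s} x {s}")
print(f"{'flattened':<13} -> {flat_features}")
```

If the input images had a different size, the hard-coded `64 * 7 * 7` in the model would have to change accordingly.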

## Use the MNISTTrainer class to train the CNN

In [None]:
import sys
sys.path.append('../tools')
from MNIST import MNISTTrainer

model = CNN(input_channels=1, output_size=10)
trainer = MNISTTrainer(model, loss='CE', lr=0.01, optimizer='Adam', batch_size=256, epoch=10, model_type='classification')
trainer.train()

2024-05-18 15:24:46
Epoch 1 / 10


[Train]: 100%|████████████████████████| 211/211 [00:11<00:00, 17.62it/s, train_loss=0.309]
[Valid]: 100%|█████████████████████| 24/24 [00:01<00:00, 17.33it/s, val_acc=97.5, val_loss=0.000343]


2024-05-18 15:24:59
Epoch 2 / 10


[Train]: 100%|███████████████████████| 211/211 [00:13<00:00, 16.02it/s, train_loss=0.0632]
[Valid]: 100%|█████████████████████| 24/24 [00:01<00:00, 17.89it/s, val_acc=98.6, val_loss=0.000177]


2024-05-18 15:25:14
Epoch 3 / 10


[Train]: 100%|███████████████████████| 211/211 [00:13<00:00, 15.74it/s, train_loss=0.0491]
[Valid]: 100%|█████████████████████| 24/24 [00:01<00:00, 17.09it/s, val_acc=98.7, val_loss=0.000174]


2024-05-18 15:25:28
Epoch 4 / 10


[Train]: 100%|███████████████████████| 211/211 [00:14<00:00, 14.73it/s, train_loss=0.0411]
[Valid]: 100%|█████████████████████| 24/24 [00:01<00:00, 19.02it/s, val_acc=98.2, val_loss=0.000264]


2024-05-18 15:25:44
Epoch 5 / 10


[Train]: 100%|███████████████████████| 211/211 [00:12<00:00, 16.90it/s, train_loss=0.0381]
[Valid]: 100%|█████████████████████| 24/24 [00:01<00:00, 17.33it/s, val_acc=98.3, val_loss=0.000226]


2024-05-18 15:25:58
Epoch 6 / 10


[Train]: 100%|███████████████████████| 211/211 [00:14<00:00, 14.33it/s, train_loss=0.0368]
[Valid]: 100%|█████████████████████| 24/24 [00:01<00:00, 16.59it/s, val_acc=98.8, val_loss=0.000189]


2024-05-18 15:26:14
Epoch 7 / 10


[Train]: 100%|█████████████████████████| 211/211 [00:14<00:00, 14.53it/s, train_loss=0.03]
[Valid]: 100%|█████████████████████| 24/24 [00:01<00:00, 16.76it/s, val_acc=98.3, val_loss=0.000251]


2024-05-18 15:26:30
Epoch 8 / 10


[Train]: 100%|███████████████████████| 211/211 [00:14<00:00, 14.32it/s, train_loss=0.0322]
[Valid]: 100%|██████████████████████| 24/24 [00:01<00:00, 16.30it/s, val_acc=98.2, val_loss=0.00027]


2024-05-18 15:26:46
Epoch 9 / 10


[Train]: 100%|███████████████████████| 211/211 [00:14<00:00, 14.52it/s, train_loss=0.0296]
[Valid]: 100%|█████████████████████| 24/24 [00:01<00:00, 17.36it/s, val_acc=98.6, val_loss=0.000202]


2024-05-18 15:27:02
Epoch 10 / 10


[Train]: 100%|███████████████████████| 211/211 [00:14<00:00, 14.42it/s, train_loss=0.0307]
[Valid]: 100%|█████████████████████| 24/24 [00:01<00:00, 17.43it/s, val_acc=98.7, val_loss=0.000197]


In [None]:
trainer.test()