#### Convolutional Neural Networks

* Inspired by the visual cortex, the part of the brain that processes visual input.

* Performs best on grid-like data, for example images (2D grids of pixels) and time series (1D grids of samples).

* CNNs have been sweeping the board in image-recognition competitions for several years, but perhaps their first big success came in the late 1990s, when Yann LeCun's LeNet classified the MNIST handwritten digits with roughly 99% accuracy.

* A convolutional neural network (CNN) takes an input image and assigns it to one of the output classes. Each image passes through a series of layers: primarily convolutional layers, pooling layers, and fully connected layers. The picture below summarizes what an image passes through in a CNN:


![](https://blog.paperspace.com/content/images/size/w1000/2021/05/image-32.png)


* I will implement a CNN from scratch in PyTorch.


##### Import Libraries

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

#### Convolutional Layer
* Used to extract features from an image.
* The convolution operation is performed between the image and a filter (kernel): the filter slides over the image, and at each position the element-wise products are summed, as shown in the image.
![](https://blog.paperspace.com/content/images/size/w1000/2021/08/Convolutional.webp)
* Different filters extract different kinds of features (the filter weights are tuned during training).


In [2]:
# Naive 2D convolution of a single-channel image with a kernel (no padding, stride 1)
def apply_kernel(image, kernel):
    ri, ci = image.shape       # image dimensions
    rk, ck = kernel.shape      # kernel dimensions
    ro, co = ri-rk+1, ci-ck+1  # output dimensions
    output = torch.zeros([ro, co])
    for i in range(ro): 
        for j in range(co):
            output[i,j] = torch.sum(image[i:i+rk,j:j+ck] * kernel)
    return output
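
As a sanity check on the loop above, the same sliding-window sum can be computed with PyTorch's built-in `torch.nn.functional.conv2d`. The sketch below is only an illustration: `apply_kernel_builtin` is a hypothetical helper name, and the 4x4 image and 2x2 kernel are made-up values chosen so the result is easy to verify by hand.

```python
import torch
import torch.nn.functional as F

def apply_kernel_builtin(image, kernel):
    """Sliding-window sum of products for one 2D image and one 2D kernel."""
    # conv2d expects the input as (batch, channels, height, width)
    # and the weight as (out_channels, in_channels, kernel_h, kernel_w).
    out = F.conv2d(image[None, None], kernel[None, None])
    return out[0, 0]

# Made-up example: image[i, j] = 4 * i + j
image = torch.arange(16, dtype=torch.float32).reshape(4, 4)
kernel = torch.tensor([[1.0, 0.0],
                       [0.0, -1.0]])

result = apply_kernel_builtin(image, kernel)
print(result.shape)  # torch.Size([3, 3]), i.e. (4 - 2 + 1) in each dimension
print(result)        # every entry is -5, since image[i, j] - image[i+1, j+1] = -5
```

The output size follows the same rule as `ro, co = ri-rk+1, ci-ck+1` in the loop version.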

#### Pooling Layer
* Used to reduce the spatial size of an image (or feature map) while keeping its most important features.
* Common types are max pooling and average pooling, which take the maximum and the average value, respectively, of each window of the given size.
![](https://blog.paperspace.com/content/images/2021/05/image-35.png)
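
A small worked example may make the two pooling types concrete. This is a minimal sketch on a made-up 4x4 input, using `F.max_pool2d` and `F.avg_pool2d` with a 2x2 window and stride 2 (the same window settings the network below uses for max pooling).

```python
import torch
import torch.nn.functional as F

# A single-channel 4x4 "image" with shape (batch, channels, height, width)
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

# Each 2x2 window is reduced to one value, halving height and width
max_pooled = F.max_pool2d(x, kernel_size=2, stride=2)
avg_pooled = F.avg_pool2d(x, kernel_size=2, stride=2)

print(max_pooled[0, 0])  # values: [[5, 7], [13, 15]]
print(avg_pooled[0, 0])  # values: [[2.5, 4.5], [10.5, 12.5]]
```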

##### Dataset Loading

In [3]:
# Load the CIFAR-10 dataset: 60,000 color images (50,000 train, 10,000 test) belonging to 10 different classes.

# Define relevant variables for the ML task
batch_size = 64
num_classes = 10
learning_rate = 0.001
num_epochs = 20

# Device will determine whether to run the training on GPU or CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

In [4]:
# Use transforms.Compose to chain the preprocessing steps applied to every image,
# and save to variable all_transforms for later use
all_transforms = transforms.Compose([transforms.Resize((32,32)),
                                     transforms.ToTensor(),
                                     transforms.Normalize(mean=[0.4914, 0.4822, 0.4465],
                                                          std=[0.2023, 0.1994, 0.2010])
                                     ])
# Create Training dataset
train_dataset = torchvision.datasets.CIFAR10(root = './data',
                                             train = True,
                                             transform = all_transforms,
                                             download = True)

# Create Testing dataset
test_dataset = torchvision.datasets.CIFAR10(root = './data',
                                            train = False,
                                            transform = all_transforms,
                                            download=True)

# Instantiate loader objects to facilitate processing
train_loader = torch.utils.data.DataLoader(dataset = train_dataset,
                                           batch_size = batch_size,
                                           shuffle = True)


test_loader = torch.utils.data.DataLoader(dataset = test_dataset,
                                          batch_size = batch_size,
                                          shuffle = False)  # no need to shuffle for evaluation

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz


100%|██████████| 170498071/170498071 [00:15<00:00, 11203538.84it/s]


Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified
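
To make the `Normalize` step less opaque: it subtracts a per-channel mean and divides by a per-channel standard deviation. The sketch below reproduces that arithmetic by hand on a random stand-in tensor, using the constants from the transform above. `normalize` and `denormalize` are hypothetical helper names, not torchvision functions.

```python
import torch

# Per-channel constants copied from the transform above
mean = torch.tensor([0.4914, 0.4822, 0.4465])
std = torch.tensor([0.2023, 0.1994, 0.2010])

def normalize(img, mean, std):
    """Per-channel (x - mean) / std for a (3, H, W) tensor."""
    return (img - mean[:, None, None]) / std[:, None, None]

def denormalize(img, mean, std):
    """Inverse of normalize, handy before plotting an image."""
    return img * std[:, None, None] + mean[:, None, None]

img = torch.rand(3, 32, 32)   # stand-in for one ToTensor() output in [0, 1]
normed = normalize(img, mean, std)
restored = denormalize(normed, mean, std)

print(normed.shape)                              # torch.Size([3, 32, 32])
print(torch.allclose(restored, img, atol=1e-5))  # True: the round trip recovers the input
```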


##### Model Class Implementation

In [5]:
# CNN class: two conv-conv-pool blocks followed by two fully connected layers
class ScratchCNN(nn.Module):
    
    def __init__(self, num_classes):
        super(ScratchCNN, self).__init__()
        self.conv_layer1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3)
        self.conv_layer2 = nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3)
        self.max_pool1 = nn.MaxPool2d(kernel_size = 2, stride = 2)
        
        self.conv_layer3 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3)
        self.conv_layer4 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3)
        self.max_pool2 = nn.MaxPool2d(kernel_size = 2, stride = 2)
        
        self.fc1 = nn.Linear(1600, 128)  # 1600 = 64 channels * 5 * 5 after the second pooling layer
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(128, num_classes)
        
    def forward(self, x):
        out = self.conv_layer1(x)
        out = self.conv_layer2(out)
        out = self.max_pool1(out)
        
        out = self.conv_layer3(out)
        out = self.conv_layer4(out)
        out = self.max_pool2(out)
                
        out = out.reshape(out.size(0), -1)
        
        out = self.fc1(out)
        out = self.relu1(out)
        out = self.fc2(out)
        return out
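
The input size of `fc1` (1600) is not arbitrary: it follows from how each layer shrinks a 32x32 image. The sketch below traces the spatial size through the layers, assuming the default stride 1 and no padding for the convolutions as in the class above. `out_size` is a hypothetical helper written for this illustration.

```python
def out_size(size, kernel_size, stride=1, padding=0):
    """Output length along one spatial dimension of a conv or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 32                                        # CIFAR-10 images are 32x32
size = out_size(size, kernel_size=3)             # conv_layer1: 32 -> 30
size = out_size(size, kernel_size=3)             # conv_layer2: 30 -> 28
size = out_size(size, kernel_size=2, stride=2)   # max_pool1:   28 -> 14
size = out_size(size, kernel_size=3)             # conv_layer3: 14 -> 12
size = out_size(size, kernel_size=3)             # conv_layer4: 12 -> 10
size = out_size(size, kernel_size=2, stride=2)   # max_pool2:   10 -> 5

flattened = 64 * size * size                     # 64 output channels * 5 * 5
print(size, flattened)                           # 5 1600
```

If the input resolution or any kernel size changes, this number changes too, and `fc1` must be resized to match.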


    

#### Set Model, Loss Function, and Optimizer

In [6]:
model = ScratchCNN(num_classes).to(device)

# Loss function: cross-entropy for multi-class classification
criterion = nn.CrossEntropyLoss()

# Optimizer: SGD with momentum and L2 weight decay
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate, weight_decay = 0.005, momentum = 0.9)

total_step = len(train_loader)
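
For readers wondering what `momentum` and `weight_decay` do in the optimizer above, here is a minimal sketch of one SGD update written by hand and compared against `torch.optim.SGD` on a toy quadratic loss. `sgd_step` is a hypothetical helper, and the three-element parameter vector is made up for the illustration. The hyperparameter values match the cell above.

```python
import torch

def sgd_step(param, grad, buf, lr=0.001, momentum=0.9, weight_decay=0.005):
    """One SGD update with L2 weight decay and (non-Nesterov) momentum."""
    g = grad + weight_decay * param                         # weight decay adds wd * param to the gradient
    buf = g.clone() if buf is None else momentum * buf + g  # momentum buffer
    return param - lr * buf, buf

# Reference: let torch.optim.SGD minimise sum(w^2) for a few steps
w_ref = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)
opt = torch.optim.SGD([w_ref], lr=0.001, momentum=0.9, weight_decay=0.005)

w, buf = w_ref.detach().clone(), None
for _ in range(3):
    opt.zero_grad()
    loss = (w_ref ** 2).sum()
    loss.backward()
    opt.step()

    grad = 2 * w                       # analytic gradient of sum(w^2)
    w, buf = sgd_step(w, grad, buf)

matches = torch.allclose(w, w_ref.detach(), atol=1e-6)
print(matches)  # True when the manual update agrees with the optimizer
```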

#### Training

In [7]:
# Train the network for the pre-defined number of epochs
for epoch in range(num_epochs):
    # Load the data in batches using the train_loader object
    for i, (images, labels) in enumerate(train_loader):  
        # Move tensors to the configured device
        images = images.to(device)
        labels = labels.to(device)
        
        optimizer.zero_grad()
        
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        
        # Backward and optimize
        
        loss.backward()
        optimizer.step()

    # Note: this reports the loss of the last batch only, so it fluctuates from epoch to epoch
    print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, num_epochs, loss.item()))

Epoch [1/20], Loss: 1.6282
Epoch [2/20], Loss: 1.5802
Epoch [3/20], Loss: 0.8911
Epoch [4/20], Loss: 1.2056
Epoch [5/20], Loss: 1.2577
Epoch [6/20], Loss: 1.1080
Epoch [7/20], Loss: 1.3959
Epoch [8/20], Loss: 0.7590
Epoch [9/20], Loss: 0.9595
Epoch [10/20], Loss: 0.8614
Epoch [11/20], Loss: 0.7860
Epoch [12/20], Loss: 0.9462
Epoch [13/20], Loss: 0.9958
Epoch [14/20], Loss: 1.3702
Epoch [15/20], Loss: 0.3938
Epoch [16/20], Loss: 0.9042
Epoch [17/20], Loss: 0.6285
Epoch [18/20], Loss: 0.6525
Epoch [19/20], Loss: 0.4563
Epoch [20/20], Loss: 0.5478
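
The printed losses jump around (for example 0.3938 at epoch 15, then 0.9042 at epoch 16) because each value comes from a single batch. One common remedy is to average the loss over all batches in an epoch. The sketch below shows a small running-average helper. `AverageMeter` is a hypothetical name and the three batch losses are made-up numbers.

```python
class AverageMeter:
    """Running, sample-weighted average of a scalar such as the batch loss."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value, n=1):
        # value is the mean loss over a batch of n samples
        self.total += value * n
        self.count += n

    @property
    def average(self):
        return self.total / self.count if self.count else 0.0

# Made-up (loss, batch size) pairs standing in for one epoch
meter = AverageMeter()
for batch_loss, n_samples in [(2.0, 64), (1.0, 64), (0.5, 32)]:
    meter.update(batch_loss, n_samples)

print(meter.average)  # (2.0*64 + 1.0*64 + 0.5*32) / 160 = 1.3
```

In the training loop, this would mean creating a fresh meter at the start of each epoch, calling `meter.update(loss.item(), labels.size(0))` after every batch, and printing `meter.average` instead of the last batch loss.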


#### Evaluation on the Training Set

In [8]:
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in train_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)  # index of the highest score is the predicted class
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    
    print('Accuracy of the network on the {} train images: {} %'.format(total, 100 * correct / total))




Accuracy of the network on the 50000 train images: 82.76 %
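
The figure above is measured on the training images, so it says little about how the model generalises. A reusable evaluation function makes it easy to run the same measurement on the held-out test set. The sketch below is one possible way to write it: `evaluate` is a hypothetical helper, and the identity "model" and four-sample dataset are synthetic stand-ins used only to check the arithmetic.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, device):
    """Return top-1 accuracy (in percent) of model over all batches in loader."""
    model.eval()                       # switch off training-only behaviour
    correct, total = 0, 0
    with torch.no_grad():              # no gradients needed for evaluation
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            predicted = model(images).argmax(dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return 100 * correct / total

# Synthetic check: an identity "model" predicts the index of the largest input
device = torch.device('cpu')
inputs = torch.eye(4)                  # 4 samples, predictions are 0, 1, 2, 3
labels = torch.tensor([0, 1, 2, 0])    # the last label is deliberately a mismatch
loader = DataLoader(TensorDataset(inputs, labels), batch_size=2)

accuracy = evaluate(nn.Identity(), loader, device)
print(accuracy)                        # 75.0 (3 of 4 correct)
```

With the objects defined in this notebook, `evaluate(model, test_loader, device)` would give the test accuracy, which is usually lower than the training accuracy.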


##### Advantages of CNNs
* Fewer parameters: A small set of parameters (the kernel) is reused to compute the outputs over the entire image, so a convolutional layer has far fewer parameters than a fully connected layer of comparable size.

* Sparsity of connections: In each layer, each output element depends only on a small number of input elements, which makes the forward and backward passes more efficient.

* Parameter sharing and spatial invariance: The features learned by a kernel in one part of an image can be used to detect a similar pattern in a different part of the same image or of another image.
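
To put a number on the "fewer parameters" point, the sketch below counts the parameters of a convolution like the first layer of the network above and compares it with a hypothetical fully connected layer that maps the same input (3x32x32) to the same output size (32x30x30). The dense count is computed arithmetically rather than by building the layer, since it would be very large. `count_parameters` is a helper name introduced here.

```python
import torch.nn as nn

def count_parameters(module):
    """Total number of learnable values in a module."""
    return sum(p.numel() for p in module.parameters())

# Same configuration as conv_layer1 in the network above
conv = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3)
conv_params = count_parameters(conv)      # 3*32*3*3 weights + 32 biases

# Hypothetical dense layer with the same input and output sizes
in_features = 3 * 32 * 32                 # flattened input image
out_features = 32 * 30 * 30               # flattened conv output
dense_params = in_features * out_features + out_features

print(conv_params)    # 896
print(dense_params)   # 88502400
```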