## **A CNN operates on images. Below is an example of the layers it consists of.**

<img src="./convolutional/convolutional_0.png" alt="convolutional"/>

<img src="./convolutional/convolutional_1.png" alt="convolutional" />
<img src="./convolutional/convolutional_2.png" alt="convolutional" />

## After the convolutional layers come the pooling layers.

![pooling](./convolutional/max_pool.png)

![CNN](./convolutional/convolutional_4.png)

In [15]:
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import torch.nn.functional as F 

In [16]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

In [17]:
#Hyper Parameter
num_epochs = 10
batch_size = 4
learning_rate = 0.001

## The dataset contains PIL images with RGB values in the range [0, 255].
## We have to normalize those RGB values.
## <span style = 'color:cyan'>To normalize with PyTorch, the image array first needs to be a Tensor.</span>

## **`ToTensor()` already rescales the values from [0, 255] to [0, 1]. `Normalize` with mean 0.5 and std 0.5 then computes (x - 0.5) / 0.5, <span style='color:cyan'>which maps [0, 1] to [-1, 1] so the inputs are centered around zero.</span>**

## **RGB means 3 channels. PyTorch lets us normalize each channel separately, which is why we pass 3 means and 3 stds.**

## **If we only want the range [0, 1], `ToTensor()` alone is enough. To standardize each channel to zero mean and unit std instead, we have to calculate the dataset's per-channel means and stds ourselves and fill them in.**
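
As a rough sketch of the arithmetic, the NumPy snippet below mimics what `ToTensor()` followed by `Normalize` does to pixel values. The helper name `to_unit_then_normalize` and the random image are illustrative assumptions, not part of this notebook.

```python
import numpy as np

def to_unit_then_normalize(img_uint8, mean, std):
    """Mimic ToTensor() + Normalize(mean, std) on an HxWxC uint8 image (sketch only)."""
    x = img_uint8.astype(np.float32) / 255.0   # ToTensor: [0, 255] -> [0, 1]
    x = np.transpose(x, (2, 0, 1))             # ToTensor: HWC -> CHW
    mean = np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(-1, 1, 1)
    return (x - mean) / std                    # Normalize: per-channel (x - mean) / std

rng = np.random.default_rng(0)
fake_img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # hypothetical image

# Mean 0.5 and std 0.5 per channel map [0, 1] to [-1, 1]
out = to_unit_then_normalize(fake_img, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
print(out.shape, out.min() >= -1.0, out.max() <= 1.0)

# Standardizing instead: compute the per-channel statistics ourselves
unit = fake_img.astype(np.float32) / 255.0
channel_mean = unit.mean(axis=(0, 1))
channel_std = unit.std(axis=(0, 1))
standardized = to_unit_then_normalize(fake_img, channel_mean, channel_std)
print(standardized.mean(axis=(1, 2)), standardized.std(axis=(1, 2)))
```

For a real dataset the statistics would be computed over all training images rather than a single one.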

In [18]:
transforming = transforms.Compose([transforms.ToTensor(),transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))])

In [19]:
train_dataset = torchvision.datasets.CIFAR10(root='./dataset',train=True,download=True,transform=transforming)
test_dataset = torchvision.datasets.CIFAR10(root='./dataset',train=False,download=True,transform=transforming)

Files already downloaded and verified
Files already downloaded and verified


# The shape of the dataset is as follows.

In [20]:
print(f'The shape of training dataset is {train_dataset.data.shape} and that of test dataset is {test_dataset.data.shape}')
print(f'Type of data is {type(train_dataset.data)}')

The shape of training dataset is (50000, 32, 32, 3) and that of test dataset is (10000, 32, 32, 3)
Type of data is <class 'numpy.ndarray'>


# TRANSFORMING A SINGLE IMAGE

In [21]:
test_img = train_dataset.data[0]
print(type(test_img))
print(test_img.dtype)
print(f'Mean and std before normalize:\nmean = {test_img.mean()}, std = {test_img.std()}')

<class 'numpy.ndarray'>
uint8
Mean and std before normalize:
mean = 103.447265625, std = 51.97371837166704


In [22]:
test_tensor = transforming(test_img)
print(type(test_tensor))
print(f'Mean and std after normalize:\nmean = {test_tensor.mean()}, std = {test_tensor.std()}')

<class 'torch.Tensor'>
Mean and std after normalize:
mean = -0.18864887952804565, std = 0.4077033996582031


# <span style='color:cyan'>***The transform argument passed to train_dataset in cell [19] is not applied to train_dataset.data. It runs only when a sample is fetched, for example through the DataLoader.***</span>
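
The lazy behaviour can be illustrated with a small stand-in class. This is a hypothetical sketch of the pattern (`TinyImageDataset` and `scale_to_minus_one_one` are made-up names), not the actual torchvision `CIFAR10` implementation.

```python
import numpy as np

class TinyImageDataset:
    """Hypothetical stand-in for a torchvision-style dataset (not the real CIFAR10 class)."""

    def __init__(self, data, transform=None):
        self.data = data            # raw uint8 array, never modified
        self.transform = transform  # only stored here, applied on item access

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        img = self.data[index]
        if self.transform is not None:
            img = self.transform(img)  # the transform runs here, once per fetched sample
        return img

def scale_to_minus_one_one(img):
    # Same arithmetic as ToTensor() + Normalize(0.5, 0.5), without the HWC -> CHW reorder
    return (img.astype(np.float32) / 255.0 - 0.5) / 0.5

raw = np.full((2, 4, 4, 3), 255, dtype=np.uint8)
dataset = TinyImageDataset(raw, transform=scale_to_minus_one_one)

print(dataset.data.dtype, dataset.data.max())  # raw storage is untouched
print(dataset[0].dtype, dataset[0].max())      # a fetched sample is transformed
```

A DataLoader fetches samples by indexing the dataset, which is why the batches it yields are already transformed.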

In [None]:
#back to pipeline
train_loader = torch.utils.data.DataLoader(train_dataset,batch_size=batch_size,shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset,batch_size=batch_size,shuffle=False)

<span style='color:cyan'>kernel_size in nn.Conv2d is the filter size of the convolutional layer.</span>

In [None]:
classes = ('plane','car','bird','cat','deer','dog','frog','horse','ship','truck')

# <span style='color:cyan'>See test_cnn.py for the details of how the image size shrinks.</span>

In [None]:
class ConvNet(nn.Module):
    def __init__(self,inputs_channel_size,outputs_channel_size,filter_size,pool_kernel_size):
        super(ConvNet,self).__init__()
        self.conv1 = nn.Conv2d(in_channels = inputs_channel_size,
                               out_channels = outputs_channel_size, 
                               kernel_size = filter_size)
        self.pool = nn.MaxPool2d(kernel_size = pool_kernel_size)
        self.conv2 = nn.Conv2d(in_channels = outputs_channel_size,
                               out_channels = 16,
                               kernel_size = filter_size)
        self.fc1 = nn.Linear(in_features = 16*5*5,
                             out_features = 120)
        self.fc2 = nn.Linear(in_features = 120,
                             out_features = 84)
        self.fc3 = nn.Linear(in_features = 84,
                             out_features = 10)    
    
    def forward(self,x):
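        # Shape comments assume 32x32 RGB input, 6 conv1 channels, a 5x5 filter and 2x2 pooling
        x = self.pool(F.relu(self.conv1(x)))  # [N,3,32,32] -> [N,6,28,28] -> [N,6,14,14]
        x = self.pool(F.relu(self.conv2(x)))  # -> [N,16,10,10] -> [N,16,5,5]
        x = x.view(-1,16*5*5)  # flatten to [N,400]
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)  # raw class scores; CrossEntropyLoss applies softmax itself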
        return x
    

## <span style='color:cyan'>The image gets smaller after passing through a convolutional layer.</span>

The image size after a convolutional or pooling layer is calculated as below.

![conv calculation](./convolutional/output_size_after_pooling.png)

The original size is 32x32.

The filter (AKA kernel) size is 5, with no padding and a stride of 1 (the nn.Conv2d defaults).

That's why <span style='color:cyan'>32x32 becomes 28x28</span> after the first convolutional layer: (32 - 5 + 2*0)/1 + 1 = 28.
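
The same calculation can be traced through the whole network. A minimal sketch, assuming the standard output-size formula (W - F + 2P)/S + 1 and the layer settings used in this notebook; `output_size` is a hypothetical helper, not a PyTorch function.

```python
def output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer: (W - F + 2P) // S + 1."""
    return (size - kernel + 2 * padding) // stride + 1

size = 32                                       # CIFAR-10 images are 32x32
after_conv1 = output_size(size, kernel=5)       # 5x5 filter, stride 1, no padding
after_pool1 = output_size(after_conv1, kernel=2, stride=2)  # 2x2 max pool, stride 2
after_conv2 = output_size(after_pool1, kernel=5)
after_pool2 = output_size(after_conv2, kernel=2, stride=2)

print(after_conv1, after_pool1, after_conv2, after_pool2)  # 28 14 10 5

# conv2 has 16 output channels, so the flattened feature size is 16 * 5 * 5
flat_features = 16 * after_pool2 * after_pool2
print(flat_features)  # 400
```

This is where the `16*5*5` input size of the first fully connected layer comes from. Changing the filter or pooling size changes that number.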

In [None]:
in_channel_size = 3 # rgb colors channels
out_channel_size = 6 # anything you want
filter_size = 5
pool_kernel_size = (2,2)

In [None]:
model = ConvNet(inputs_channel_size=in_channel_size,
                outputs_channel_size=out_channel_size,pool_kernel_size=pool_kernel_size,
                filter_size = filter_size).to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr = learning_rate)

In [None]:
n_total_steps = len(train_loader)
for epoch in range(num_epochs):
    for i, (images,labels) in enumerate(train_loader):
        # see test_cnn.py
        # input shape: [4,3,32,32], i.e. 4 images x 3 channels x 1024 pixels
        images = images.to(device)
        labels = labels.to(device)
        #forward pass
        outputs = model(images)
        loss = criterion(outputs,labels)
        #backward
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        if (i+1)%2000 == 0:
            print(f'Epoch {epoch+1}/{num_epochs},step {i+1}/{n_total_steps}, loss {loss.item():.3f}')

In [None]:
with torch.no_grad():
    n_correct = 0
    n_sample = 0
    n_class_correct = [0 for i in range(10)]
    n_class_samples = [0 for i in range(10)]
    for images,labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        # max returns (value,index)
        _,predicted_index = torch.max(outputs,1)
        n_sample += labels.size(0)
        n_correct += (predicted_index==labels).sum().item()
        
        for i in range(labels.size(0)):  # the last batch may hold fewer than batch_size samples
            label = labels[i].item()
            pred = predicted_index[i].item()
            if (label == pred):
                n_class_correct[label] += 1
            n_class_samples[label] += 1
        
    acc = 100.0 * n_correct/n_sample
    print(f'Accuracy is {acc}')
    
    for i in range(10):
        acc = 100.0 * n_class_correct[i]/n_class_samples[i]
        print(f'Accuracy of {classes[i]}:{acc}')
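
The per-class tally above can also be computed without a Python loop over the batch. A minimal sketch using `torch.bincount`, assuming non-negative integer class labels; `per_class_counts` is a hypothetical helper and the tensors are toy data.

```python
import torch

def per_class_counts(predicted, labels, num_classes):
    """Return (correct, total) counts per class as 1-D tensors of length num_classes."""
    total = torch.bincount(labels, minlength=num_classes)
    # Keep only the labels of correctly classified samples, then count them per class
    correct = torch.bincount(labels[predicted == labels], minlength=num_classes)
    return correct, total

labels = torch.tensor([0, 1, 1, 2])
predicted = torch.tensor([0, 1, 0, 2])
correct, total = per_class_counts(predicted, labels, num_classes=3)
print(correct.tolist(), total.tolist())  # [1, 1, 1] [1, 2, 1]
```

Summing the returned tensors over all batches would give the same totals as `n_class_correct` and `n_class_samples`.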
        