<a href="https://colab.research.google.com/github/meliksahb/501-DeepLearning/blob/main/CENG_501_PyTorch_CNN_on_MNIST.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Training a CNN model on MNIST using PyTorch

Sample MNIST images:

![MNIST examples](https://www.researchgate.net/profile/Stefan_Elfwing/publication/266205382/figure/fig5/AS:267913563209738@1440886979379/Example-images-of-the-ten-handwritten-digits-in-the-MNIST-training-set.png)

- 10 classes
- 60 thousand training images
- 10 thousand testing images
- Each image is monochrome, 28-by-28 pixels.

In [2]:
#@title Import the required modules
import torch
import torch.nn as nn
import torchvision.datasets as datasets
import torchvision.transforms as transforms

In [3]:
# Define the "device". If GPU is available, device is set to use it, otherwise CPU will be used.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

In [4]:
#@title Download the dataset
train_data = datasets.MNIST(root = './data', train = True,
                        transform = transforms.ToTensor(), download = True)

test_data = datasets.MNIST(root = './data', train = False,
                       transform = transforms.ToTensor())

100%|██████████| 9.91M/9.91M [00:00<00:00, 18.2MB/s]
100%|██████████| 28.9k/28.9k [00:00<00:00, 477kB/s]
100%|██████████| 1.65M/1.65M [00:00<00:00, 4.44MB/s]
100%|██████████| 4.54k/4.54k [00:00<00:00, 9.62MB/s]


In [5]:
# About the ToTensor() transformation.

# PyTorch networks expect a tensor as input with dimensions N*C*H*W  where
# N: batch size
# C: channel size
# H: height
# W: width

# Normally an image is of size H*W*C.
# ToTensor() transformation moves the channel dimension to the beginning as needed by PyTorch.

In [6]:
#@title Define the data loaders
batch_size = 100
train_loader = torch.utils.data.DataLoader(dataset = train_data,
                                             batch_size = batch_size,
                                             shuffle = True)

test_loader = torch.utils.data.DataLoader(dataset =  test_data ,
                                      batch_size = batch_size,
                                      shuffle = False)

In [24]:
#@title Define a CNN network

class CNN(nn.Module):
    #This defines the structure of the NN.
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5,padding=2)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5,padding=2)
        self.conv3 = nn.Conv2d(20, 40, kernel_size=5,padding=2)
        self.conv4 = nn.Conv2d(40, 80, kernel_size=5,padding=2)
        self.conv_drop = nn.Dropout2d()  #Dropout
        self.fc1 = nn.Linear(720, 128) # Adjusted input size for fc1
        self.fc2 = nn.Linear(128, 10) # Adjusted input size for fc2
        self.pool = nn.MaxPool2d(2,padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        #print(x.shape)
        x = self.pool(self.relu(self.conv2(x)))
        x = self.conv_drop(x)  # Apply dropout here
        #print(x.shape)
        x = self.pool(self.relu(self.conv3(x)))
        x = self.conv_drop(x)
        #print(x.shape) # Print shape after the new layer and pooling
        x = self.pool(self.relu(self.conv4(x)))
        x = self.conv_drop(x)
        #print(x.shape) # Print shape after the new layer and pooling
        x = x.view(-1, 720)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x


# Create an instance
net = CNN().to(device)

Calculating dimensions:


1.   After conv layer

dimension of output = (Input_dim + 2*pooling - kernel size)/(stride) + 1


2.   After max pooling

dimension of output = (Input_dim + 2*pooling - dilation*(kernel size-1) - 1)/(stride) + 1



---


For example:

Input= 1x28x28 (CxHxW)

Conv1 = (1,10,5,padding=2) # No stride, number of filters are 10, kernel size is 5x5 ➞ Output: 10x28x28

Maxpool1 = (2,padding=1) ➞ Output: 10x15x15

Conv2 = (10,20,5,padding=2) # No stride, number of filters are 20, kernel size is 5x5 ➞ Output: 20x15x15

Maxpool2 = (2,padding=1) ➞ Output: 20x8x8  # 8.5 round up to 8


In [25]:
print(net)

CNN(
  (conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (conv2): Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (conv3): Conv2d(20, 40, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (conv4): Conv2d(40, 80, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (conv_drop): Dropout2d(p=0.5, inplace=False)
  (fc1): Linear(in_features=720, out_features=128, bias=True)
  (fc2): Linear(in_features=128, out_features=10, bias=True)
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=1, dilation=1, ceil_mode=False)
  (relu): ReLU()
)


In [26]:
#@title Define the loss function and the optimizer
loss_fun = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam( net.parameters(), lr=1.e-3)

In [27]:
#@title Train the model

num_epochs = 5
for epoch in range(num_epochs):
  for i ,(images,labels) in enumerate(train_loader):
    images = images.to(device)
    labels = labels.to(device)

    optimizer.zero_grad()
    output = net(images)
    loss = loss_fun(output, labels)
    loss.backward()
    optimizer.step()

    if (i+1) % batch_size == 0:
      print('Epoch [%d/%d], Step [%d/%d], Loss: %.4f'
                 %(epoch+1, num_epochs, i+1, len(train_data)//batch_size, loss.item()))

Epoch [1/5], Step [100/600], Loss: 0.6062
Epoch [1/5], Step [200/600], Loss: 0.5638
Epoch [1/5], Step [300/600], Loss: 0.2168
Epoch [1/5], Step [400/600], Loss: 0.1573
Epoch [1/5], Step [500/600], Loss: 0.2267
Epoch [1/5], Step [600/600], Loss: 0.1919
Epoch [2/5], Step [100/600], Loss: 0.1885
Epoch [2/5], Step [200/600], Loss: 0.1062
Epoch [2/5], Step [300/600], Loss: 0.1097
Epoch [2/5], Step [400/600], Loss: 0.1048
Epoch [2/5], Step [500/600], Loss: 0.0849
Epoch [2/5], Step [600/600], Loss: 0.1487
Epoch [3/5], Step [100/600], Loss: 0.2629
Epoch [3/5], Step [200/600], Loss: 0.2253
Epoch [3/5], Step [300/600], Loss: 0.1843
Epoch [3/5], Step [400/600], Loss: 0.1199
Epoch [3/5], Step [500/600], Loss: 0.0829
Epoch [3/5], Step [600/600], Loss: 0.0163
Epoch [4/5], Step [100/600], Loss: 0.2304
Epoch [4/5], Step [200/600], Loss: 0.0766
Epoch [4/5], Step [300/600], Loss: 0.0265
Epoch [4/5], Step [400/600], Loss: 0.1153
Epoch [4/5], Step [500/600], Loss: 0.0512
Epoch [4/5], Step [600/600], Loss:

In [28]:
#@title Run the trained model on the testing set

correct = 0
total = 0
for images,labels in test_loader:
  images = images.to(device)
  labels = labels.to(device)

  out = net(images)
  _, predicted_labels = torch.max(out,1)
  correct += (predicted_labels == labels).sum()
  total += labels.size(0)

print('Percent correct: %.3f %%' %((100*correct)/(total+1)))

Percent correct: 97.630 %


End of the notebook.