<p style="font-family:ComicSansMS; font-size: 30px;"> Convolutional Neural Network with PyTorch</p>

<p style="font-family:ComicSansMS; font-size: 24px; color: magenta"> Model A</p>

<p style="font-family:ComicSansMS; font-size: 16px; color: magenta"> 2 Convolutional Layers</p>
<p style="font-family:ComicSansMS; font-size: 16px; color: magenta"> Same Padding (same output size)</p>
<p style="font-family:ComicSansMS; font-size: 22px; color: yellow"> 2 Max Pooling Layers</p>
<p style="font-family:ComicSansMS; font-size: 16px; color: magenta"> 1 Fully Connected Layer</p>

In [1]:
# Steps
# Step 1: Load Dataset
# Step 2: Make Dataset Iterable
# Step 3: Create Model Class
# Step 4: Instantiate Model Class
# Step 5: Instantiate Loss Class
# Step 6: Instantiate Optimizer Class
# Step 7: Train Model

> Step 1: Loading MNIST Train Dataset 

> Images of handwritten digits from 0 to 9

> MNIST Dataset and Size of Training Dataset (Excluding Labels)

In [10]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets

train_dataset = dsets.MNIST(root='./data', 
                            train=True, 
                            transform=transforms.ToTensor(),
                            download=True)

test_dataset = dsets.MNIST(root='./data', 
                           train=False, 
                           transform=transforms.ToTensor())

# `.data` replaces the deprecated `.train_data` attribute
print(train_dataset.data.size())

torch.Size([60000, 28, 28])





> Size of our training dataset labels

In [11]:
# `.targets` replaces the deprecated `.train_labels` attribute
print(train_dataset.targets.size())

torch.Size([60000])


> Size of our testing dataset (excluding labels)

In [12]:
# `.data` replaces the deprecated `.test_data` attribute
print(test_dataset.data.size())

torch.Size([10000, 28, 28])


> Size of our testing dataset labels

In [13]:
# `.targets` replaces the deprecated `.test_labels` attribute
print(test_dataset.targets.size())

torch.Size([10000])


> Step 2: Make Dataset Iterable

> Load Dataset into Dataloader

In [14]:
batch_size = 100
n_iters = 3000
num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 
                                          batch_size=batch_size, 
                                          shuffle=False)
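
The epoch count in the cell above comes from simple arithmetic: one epoch is one full pass over the training set, so the number of epochs is the iteration budget divided by the number of batches per epoch. A minimal sketch of that calculation, assuming the 60,000 training images reported earlier (the variable `iters_per_epoch` is introduced here only for illustration):

```python
# Same values as the notebook cell; 60000 is the MNIST training-set size.
train_size = 60000
batch_size = 100
n_iters = 3000

# One epoch is one full pass over the training set.
iters_per_epoch = train_size // batch_size   # batches needed to see every image once
num_epochs = n_iters // iters_per_epoch      # full passes that fit in the iteration budget

print(iters_per_epoch, num_epochs)
```

With these numbers, each epoch takes 600 iterations, so 3,000 iterations correspond to 5 epochs.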

> Step 3: Create Model Class

> Define our simple 2 convolutional layer CNN

In [15]:
class CNNModel(nn.Module):
    def __init__(self):
        super(CNNModel, self).__init__()

        # Convolution 1
        self.cnn1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, stride=1, padding=2)
        self.relu1 = nn.ReLU()

        # Max pool 1
        self.maxpool1 = nn.MaxPool2d(kernel_size=2)

        # Convolution 2
        self.cnn2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, stride=1, padding=2)
        self.relu2 = nn.ReLU()

        # Max pool 2
        self.maxpool2 = nn.MaxPool2d(kernel_size=2)

        # Fully connected 1 (readout)
        self.fc1 = nn.Linear(32 * 7 * 7, 10) 

    def forward(self, x):
        # Convolution 1
        out = self.cnn1(x)
        out = self.relu1(out)

        # Max pool 1
        out = self.maxpool1(out)

        # Convolution 2 
        out = self.cnn2(out)
        out = self.relu2(out)

        # Max pool 2 
        out = self.maxpool2(out)

        # Resize
        # Original size: (100, 32, 7, 7)
        # out.size(0): 100
        # New out size: (100, 32*7*7)
        out = out.view(out.size(0), -1)

        # Linear function (readout)
        out = self.fc1(out)

        return out
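
The `32 * 7 * 7` input size of the readout layer can be checked by hand. The sketch below applies the standard output-size formula, floor((W - K + 2P) / S) + 1, to each layer of the model above. The helper names `conv_out` and `pool_out` are hypothetical and not part of PyTorch; the pooling helper assumes the stride equals the kernel size, which is the default for `nn.MaxPool2d`.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution: floor((W - K + 2P) / S) + 1."""
    return (size - kernel + 2 * padding) // stride + 1


def pool_out(size, kernel):
    """Output width/height of a pooling layer whose stride equals its kernel size."""
    return size // kernel


size = 28                                           # MNIST images are 28 x 28
size = conv_out(size, kernel=5, stride=1, padding=2)  # same padding keeps 28
size = pool_out(size, kernel=2)                       # 14
size = conv_out(size, kernel=5, stride=1, padding=2)  # same padding keeps 14
size = pool_out(size, kernel=2)                       # 7

flattened = 32 * size * size                        # 32 output channels from cnn2
print(size, flattened)
```

The final feature map is 7 x 7 with 32 channels, giving 1,568 features, which matches `nn.Linear(32 * 7 * 7, 10)`.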

> Step 4: Instantiate Model Class

In [16]:
model = CNNModel()

> Step 5: Instantiate Loss Class

In [18]:
# Convolutional Neural Network: Cross Entropy Loss
# Feedforward Neural Network: Cross Entropy Loss
# Logistic Regression: Cross Entropy Loss
# Linear Regression: MSE

> Our cross entropy loss

In [19]:
criterion = nn.CrossEntropyLoss()
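
`nn.CrossEntropyLoss` takes raw logits and combines a log-softmax with the negative log-likelihood. As a rough illustration of what it computes for a single sample, here is a plain-Python sketch; the function `cross_entropy` and the example logits are made up for this illustration and are not the PyTorch implementation.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the max before exponentiating for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


logits = [2.0, 1.0, 0.1]                   # hypothetical scores for 3 classes
loss_correct = cross_entropy(logits, 0)    # target is the highest-scoring class
loss_wrong = cross_entropy(logits, 2)      # target is the lowest-scoring class
uniform = cross_entropy([0.0] * 10, 3)     # 10 equal logits, as in an uninformed model

print(loss_correct, loss_wrong, uniform)
```

The loss is small when the target class has the largest logit and grows as the target's logit falls behind. With 10 equal logits the loss is ln(10), roughly 2.30, which is a useful reference point for a 10-class problem such as MNIST.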

> Step 6: Instantiate Optimizer Class

In [20]:
# Even simpler equation
# parameters = parameters - learning_rate * parameters_gradients
# At every iteration, we update our model's parameters

> Optimizer

In [21]:
learning_rate = 0.01

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)  
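
The update rule `parameters = parameters - learning_rate * parameters_gradients` can be tried on a toy problem without PyTorch. The sketch below is only an illustration of plain gradient descent: the function `sgd_step`, the quadratic objective, and the learning rate of 0.1 are assumptions chosen so the example converges quickly, not values from this notebook.

```python
def sgd_step(params, grads, lr):
    """One plain gradient-descent update: p <- p - lr * g for every parameter."""
    return [p - lr * g for p, g in zip(params, grads)]


# Toy objective: f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = [0.0]
lr = 0.1  # illustrative value; the notebook itself uses 0.01

for _ in range(100):
    grads = [2 * (w[0] - 3)]
    w = sgd_step(w, grads, lr)

print(w[0])
```

Each step moves `w` a fraction of the way toward the minimum at 3. `torch.optim.SGD` applies the same rule to every tensor returned by `model.parameters()`.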

> Print the model's parameters

In [22]:
print(model.parameters())

print(len(list(model.parameters())))

# Convolution 1: 16 Kernels
print(list(model.parameters())[0].size())

# Convolution 1 Bias: 16 values, one per kernel
print(list(model.parameters())[1].size())

# Convolution 2: 32 Kernels with depth = 16
print(list(model.parameters())[2].size())

# Convolution 2 Bias: 32 values, one per kernel
print(list(model.parameters())[3].size())

# Fully Connected Layer 1
print(list(model.parameters())[4].size())

# Fully Connected Layer Bias
print(list(model.parameters())[5].size())

<generator object Module.parameters at 0x0000019B30578BA0>
6
torch.Size([16, 1, 5, 5])
torch.Size([16])
torch.Size([32, 16, 5, 5])
torch.Size([32])
torch.Size([10, 1568])
torch.Size([10])
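
The six shapes printed above are enough to count the model's trainable parameters by hand. A small sketch, with the shapes copied from the output (the names `shapes` and `counts` are introduced here for illustration):

```python
from math import prod

# Parameter shapes copied from the printed output above.
shapes = [
    (16, 1, 5, 5),    # convolution 1 weights
    (16,),            # convolution 1 bias
    (32, 16, 5, 5),   # convolution 2 weights
    (32,),            # convolution 2 bias
    (10, 1568),       # fully connected weights
    (10,),            # fully connected bias
]

counts = [prod(shape) for shape in shapes]   # number of values in each tensor
total = sum(counts)

print(counts, total)
```

The fully connected layer holds more than half of the parameters (15,690 of 28,938), even though the model has two convolutional layers.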


> Step 7: Train Model

In [23]:
# Process
# Convert inputs to tensors with gradient accumulation abilities
# CNN Input: (1, 28, 28)
# Feedforward NN Input: (1, 28*28)
# Clear gradient buffers
# Get output given inputs
# Get loss
# Get gradients w.r.t. parameters
# Update parameters using gradients
# parameters = parameters - learning_rate * parameters_gradients
# REPEAT

> Model training

In [24]:
iter = 0
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Load images
        images = images.requires_grad_()

        # Clear gradients w.r.t. parameters
        optimizer.zero_grad()

        # Forward pass to get output/logits
        outputs = model(images)

        # Calculate Loss: softmax --> cross entropy loss
        loss = criterion(outputs, labels)

        # Getting gradients w.r.t. parameters
        loss.backward()

        # Updating parameters
        optimizer.step()

        iter += 1

        if iter % 500 == 0:
            # Calculate Accuracy         
            correct = 0
            total = 0
            # Iterate through test dataset (no gradients are needed for evaluation)
            with torch.no_grad():
                for images, labels in test_loader:
                    # Forward pass only to get logits/output
                    outputs = model(images)

                    # Get predictions from the maximum value
                    _, predicted = torch.max(outputs, 1)

                    # Total number of labels
                    total += labels.size(0)

                    # Total correct predictions
                    correct += (predicted == labels).sum()

            accuracy = 100 * correct / total

            # Print Loss
            print('Iteration: {}. Loss: {}. Accuracy: {}'.format(iter, loss.item(), accuracy))

Iteration: 500. Loss: 0.4485189914703369. Accuracy: 88.88999938964844
Iteration: 1000. Loss: 0.3324435353279114. Accuracy: 91.69999694824219
Iteration: 1500. Loss: 0.23597347736358643. Accuracy: 94.66000366210938
Iteration: 2000. Loss: 0.06913601607084274. Accuracy: 95.51000213623047
Iteration: 2500. Loss: 0.10545193403959274. Accuracy: 96.2300033569336
Iteration: 3000. Loss: 0.05854206904768944. Accuracy: 96.63999938964844
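
The accuracy figures above come from taking the class with the largest logit as the prediction and comparing it with the label. A plain-Python sketch of that bookkeeping, using made-up logits and labels for four samples and three classes (`argmax` is a hypothetical helper standing in for `torch.max(outputs, 1)`):

```python
def argmax(row):
    """Index of the largest value in a list of scores."""
    return max(range(len(row)), key=lambda i: row[i])


# Made-up logits for 4 samples and 3 classes, with their true labels.
logits = [
    [0.1, 2.0, -1.0],
    [1.5, 0.2, 0.3],
    [0.0, 0.1, 0.9],
    [0.3, 0.2, 0.1],
]
labels = [1, 0, 1, 0]

predicted = [argmax(row) for row in logits]
correct = sum(p == y for p, y in zip(predicted, labels))
accuracy = 100 * correct / len(labels)

print(predicted, accuracy)
```

Three of the four predictions match their labels, so the accuracy is 75%.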


<p style="font-family:ComicSansMS; font-size: 24px; color: magenta"> Model B</p>

<p style="font-family:ComicSansMS; font-size: 16px; color: magenta"> 2 Convolutional Layers</p>
<p style="font-family:ComicSansMS; font-size: 16px; color: magenta"> Same Padding (same output size)</p>
<p style="font-family:ComicSansMS; font-size: 22px; color: yellow"> 2 Average Pooling Layers</p>
<p style="font-family:ComicSansMS; font-size: 16px; color: magenta"> 1 Fully Connected Layer</p>

In [25]:
# Steps
# Step 1: Load Dataset
# Step 2: Make Dataset Iterable
# Step 3: Create Model Class
# Step 4: Instantiate Model Class
# Step 5: Instantiate Loss Class
# Step 6: Instantiate Optimizer Class
# Step 7: Train Model

> 2 Conv + 2 Average Pool + 1 FC (Zero Padding, Same Padding)

In [26]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets

'''
STEP 1: LOADING DATASET
'''

train_dataset = dsets.MNIST(root='./data', 
                            train=True, 
                            transform=transforms.ToTensor(),
                            download=True)

test_dataset = dsets.MNIST(root='./data', 
                           train=False, 
                           transform=transforms.ToTensor())

'''
STEP 2: MAKING DATASET ITERABLE
'''

batch_size = 100
n_iters = 3000
num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 
                                          batch_size=batch_size, 
                                          shuffle=False)

'''
STEP 3: CREATE MODEL CLASS
'''
class CNNModel(nn.Module):
    def __init__(self):
        super(CNNModel, self).__init__()

        # Convolution 1
        self.cnn1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, stride=1, padding=2)
        self.relu1 = nn.ReLU()

        # Average pool 1
        self.avgpool1 = nn.AvgPool2d(kernel_size=2)

        # Convolution 2
        self.cnn2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, stride=1, padding=2)
        self.relu2 = nn.ReLU()

        # Average pool 2
        self.avgpool2 = nn.AvgPool2d(kernel_size=2)

        # Fully connected 1 (readout)
        self.fc1 = nn.Linear(32 * 7 * 7, 10) 

    def forward(self, x):
        # Convolution 1
        out = self.cnn1(x)
        out = self.relu1(out)

        # Average pool 1
        out = self.avgpool1(out)

        # Convolution 2 
        out = self.cnn2(out)
        out = self.relu2(out)

        # Average pool 2
        out = self.avgpool2(out)

        # Resize
        # Original size: (100, 32, 7, 7)
        # out.size(0): 100
        # New out size: (100, 32*7*7)
        out = out.view(out.size(0), -1)

        # Linear function (readout)
        out = self.fc1(out)

        return out

'''
STEP 4: INSTANTIATE MODEL CLASS
'''

model = CNNModel()

'''
STEP 5: INSTANTIATE LOSS CLASS
'''
criterion = nn.CrossEntropyLoss()


'''
STEP 6: INSTANTIATE OPTIMIZER CLASS
'''
learning_rate = 0.01

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

'''
STEP 7: TRAIN THE MODEL
'''
iter = 0
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Load images as tensors with gradient accumulation abilities
        images = images.requires_grad_()

        # Clear gradients w.r.t. parameters
        optimizer.zero_grad()

        # Forward pass to get output/logits
        outputs = model(images)

        # Calculate Loss: softmax --> cross entropy loss
        loss = criterion(outputs, labels)

        # Getting gradients w.r.t. parameters
        loss.backward()

        # Updating parameters
        optimizer.step()

        iter += 1

        if iter % 500 == 0:
            # Calculate Accuracy         
            correct = 0
            total = 0
            # Iterate through test dataset (no gradients are needed for evaluation)
            with torch.no_grad():
                for images, labels in test_loader:
                    # Forward pass only to get logits/output
                    outputs = model(images)

                    # Get predictions from the maximum value
                    _, predicted = torch.max(outputs, 1)

                    # Total number of labels
                    total += labels.size(0)

                    # Total correct predictions
                    correct += (predicted == labels).sum()

            accuracy = 100 * correct / total

            # Print Loss
            print('Iteration: {}. Loss: {}. Accuracy: {}'.format(iter, loss.item(), accuracy))

Iteration: 500. Loss: 0.49990642070770264. Accuracy: 85.27999877929688
Iteration: 1000. Loss: 0.5022627115249634. Accuracy: 88.37000274658203
Iteration: 1500. Loss: 0.26689252257347107. Accuracy: 90.41999816894531
Iteration: 2000. Loss: 0.33730411529541016. Accuracy: 91.47000122070312
Iteration: 2500. Loss: 0.225868359208107. Accuracy: 91.81999969482422
Iteration: 3000. Loss: 0.42214828729629517. Accuracy: 93.25


> Comparison of accuracies

In [27]:
# The average pooling test accuracy (93.25%) is lower than the max pooling test accuracy (96.64%).
# Does this mean max pooling is better? Not necessarily: the result is not definitive and depends on many factors,
# including the model's architecture, the seed (which affects random weight initialization), and more.
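
To make the difference between the two pooling types concrete, the sketch below applies 2 x 2 max pooling and 2 x 2 average pooling (stride 2) to a small hand-made 4 x 4 grid. The helpers `pool2x2` and `mean` and the input values are assumptions for illustration; they are not PyTorch functions.

```python
def pool2x2(image, reduce):
    """Apply a 2x2 pooling window with stride 2, combining each window with `reduce`."""
    rows, cols = len(image), len(image[0])
    return [
        [
            reduce([image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1]])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def mean(values):
    return sum(values) / len(values)


# Hand-made 4x4 "feature map".
image = [
    [1, 2, 0, 0],
    [3, 4, 0, 8],
    [0, 0, 1, 1],
    [0, 4, 1, 1],
]

max_pooled = pool2x2(image, max)    # keeps the strongest activation per window
avg_pooled = pool2x2(image, mean)   # averages all activations per window

print(max_pooled)
print(avg_pooled)
```

Max pooling preserves the isolated strong activation (the 8 in the top-right window), while average pooling dilutes it to 2.0. Whether that helps or hurts depends on the task.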

<p style="font-family:ComicSansMS; font-size: 24px; color: magenta"> Model C</p>


<p style="font-family:ComicSansMS; font-size: 16px; color: magenta"> 2 Convolutional Layers</p>
<p style="font-family:ComicSansMS; font-size: 22px; color: yellow"> Valid Padding (smaller output size)</p>
<p style="font-family:ComicSansMS; font-size: 22px; color: yellow"> 2 Max Pooling Layers</p>
<p style="font-family:ComicSansMS; font-size: 16px; color: magenta"> 1 Fully Connected Layer</p>

In [28]:
# Steps
# Step 1: Load Dataset
# Step 2: Make Dataset Iterable
# Step 3: Create Model Class
# Step 4: Instantiate Model Class
# Step 5: Instantiate Loss Class
# Step 6: Instantiate Optimizer Class
# Step 7: Train Model

> 2 Conv + 2 Max Pool + 1 FC (Valid Padding, No Padding)

In [29]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets

'''
STEP 1: LOADING DATASET
'''

train_dataset = dsets.MNIST(root='./data', 
                            train=True, 
                            transform=transforms.ToTensor(),
                            download=True)

test_dataset = dsets.MNIST(root='./data', 
                           train=False, 
                           transform=transforms.ToTensor())

'''
STEP 2: MAKING DATASET ITERABLE
'''

batch_size = 100
n_iters = 3000
num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 
                                          batch_size=batch_size, 
                                          shuffle=False)

'''
STEP 3: CREATE MODEL CLASS
'''
class CNNModel(nn.Module):
    def __init__(self):
        super(CNNModel, self).__init__()

        # Convolution 1
        self.cnn1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, stride=1, padding=0)
        self.relu1 = nn.ReLU()

        # Max pool 1
        self.maxpool1 = nn.MaxPool2d(kernel_size=2)

        # Convolution 2
        self.cnn2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, stride=1, padding=0)
        self.relu2 = nn.ReLU()

        # Max pool 2
        self.maxpool2 = nn.MaxPool2d(kernel_size=2)

        # Fully connected 1 (readout)
        self.fc1 = nn.Linear(32 * 4 * 4, 10) 

    def forward(self, x):
        # Convolution 1
        out = self.cnn1(x)
        out = self.relu1(out)

        # Max pool 1
        out = self.maxpool1(out)

        # Convolution 2 
        out = self.cnn2(out)
        out = self.relu2(out)

        # Max pool 2 
        out = self.maxpool2(out)

        # Resize
        # Original size: (100, 32, 4, 4)
        # out.size(0): 100
        # New out size: (100, 32*4*4)
        out = out.view(out.size(0), -1)

        # Linear function (readout)
        out = self.fc1(out)

        return out

'''
STEP 4: INSTANTIATE MODEL CLASS
'''

model = CNNModel()

'''
STEP 5: INSTANTIATE LOSS CLASS
'''
criterion = nn.CrossEntropyLoss()


'''
STEP 6: INSTANTIATE OPTIMIZER CLASS
'''
learning_rate = 0.01

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

'''
STEP 7: TRAIN THE MODEL
'''
iter = 0
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Load images as tensors with gradient accumulation abilities
        images = images.requires_grad_()

        # Clear gradients w.r.t. parameters
        optimizer.zero_grad()

        # Forward pass to get output/logits
        outputs = model(images)

        # Calculate Loss: softmax --> cross entropy loss
        loss = criterion(outputs, labels)

        # Getting gradients w.r.t. parameters
        loss.backward()

        # Updating parameters
        optimizer.step()

        iter += 1

        if iter % 500 == 0:
            # Calculate Accuracy         
            correct = 0
            total = 0
            # Iterate through test dataset (no gradients are needed for evaluation)
            with torch.no_grad():
                for images, labels in test_loader:
                    # Forward pass only to get logits/output
                    outputs = model(images)

                    # Get predictions from the maximum value
                    _, predicted = torch.max(outputs, 1)

                    # Total number of labels
                    total += labels.size(0)

                    # Total correct predictions
                    correct += (predicted == labels).sum()

            accuracy = 100 * correct / total

            # Print Loss
            print('Iteration: {}. Loss: {}. Accuracy: {}'.format(iter, loss.item(), accuracy))

Iteration: 500. Loss: 0.4874335825443268. Accuracy: 88.18000030517578
Iteration: 1000. Loss: 0.47496795654296875. Accuracy: 91.7699966430664
Iteration: 1500. Loss: 0.18802450597286224. Accuracy: 93.86000061035156
Iteration: 2000. Loss: 0.17608784139156342. Accuracy: 95.05999755859375
Iteration: 2500. Loss: 0.19334834814071655. Accuracy: 95.68000030517578
Iteration: 3000. Loss: 0.155306875705719. Accuracy: 96.05000305175781
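
Valid padding (`padding=0`) shrinks every feature map, which is why Model C's readout layer takes `32 * 4 * 4` inputs instead of `32 * 7 * 7`. The sketch below traces the feature-map width through both variants. `trace_sizes` is a hypothetical helper that assumes stride-1 convolutions and pooling with stride equal to the kernel size, as in the models above.

```python
def trace_sizes(size, padding, kernel=5, pool=2, n_blocks=2):
    """Return the feature-map width at the input and after each conv and pool layer."""
    sizes = [size]
    for _ in range(n_blocks):
        size = size - kernel + 2 * padding + 1   # convolution with stride 1
        sizes.append(size)
        size = size // pool                      # pooling with stride == kernel size
        sizes.append(size)
    return sizes


same = trace_sizes(28, padding=2)    # Models A and B
valid = trace_sizes(28, padding=0)   # Model C

print(same, 32 * same[-1] ** 2)
print(valid, 32 * valid[-1] ** 2)
```

With same padding the widths are 28, 28, 14, 14, 7 (1,568 features); with valid padding they are 28, 24, 12, 8, 4 (512 features).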


In [None]:
#                  Model A         Model B            Model C
# Pooling          Max             Average            Max
# Padding          Same            Same               Valid
# Test accuracy    96.64%          93.25%             96.05%

In [31]:
# General Deep Learning Notes on CNN and FNN
    # 3 ways to expand a convolutional neural network
        # More convolutional layers
        # Less aggressive downsampling
            # Smaller kernel size for pooling (gradually downsampling)
        # More fully connected layers

    # Cons
        # Need a larger dataset
        # Curse of dimensionality
        # Does not necessarily mean higher accuracy
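
One reason a larger network needs more data is that its parameter count can grow quickly. As a rough illustration, the sketch below compares the readout of Model A with a variant that has one extra fully connected layer. The hidden size of 100 and the helper `linear_params` are assumptions made for this example; no such model is trained in this notebook.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


flat = 32 * 7 * 7    # flattened feature size of Model A

# Readout used in this notebook: a single fully connected layer.
one_fc = linear_params(flat, 10)

# Hypothetical variant with one extra hidden layer of 100 units.
hidden = 100
two_fc = linear_params(flat, hidden) + linear_params(hidden, 10)

print(one_fc, two_fc)
```

Under these assumptions the fully connected part grows from 15,690 to 157,910 parameters, about ten times as many, with no guarantee of higher accuracy.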