# Simple CNN with PyTorch

In this notebook example, we will walk through how to train a simple CNN to classify images.

We will rely on the following modules, including torch and torchvision.

In [1]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision import transforms

## 1. Data Loader

The first step is to create a data loader.

A data loader can be treated like a list (technically, it is an iterable). Each iteration yields one minibatch of (img, label) pairs.
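
As a quick sanity check on what "minibatch" means here, the arithmetic below estimates how many batches one pass over the data produces. It is a minimal sketch that assumes the MNIST training split holds 60,000 images, the batch size is 256 (as in the cell below), and the loader keeps the final, smaller batch (the default `drop_last=False`).

```python
import math

# Assumptions: 60,000 training images, batch_size=256, drop_last=False.
num_samples = 60_000
batch_size = 256

# Number of minibatches the loader yields in one epoch.
num_batches = math.ceil(num_samples / batch_size)

# The final batch holds whatever is left over, so it is smaller than 256.
last_batch_size = num_samples - (num_batches - 1) * batch_size

print(num_batches)      # 235
print(last_batch_size)  # 96
```

The count of 235 matches the `batch 0/235` lines in the training log further down. The smaller final batch also means the model output is not always exactly 256 rows.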

Please wait until the number "2" appears in the "In [ ]" label on the left, which means the data has been fully downloaded. Alternatively, execute this cell again to see "Files already downloaded and verified".

In [2]:
# Choose a dataset -- MNIST for example
dataset = datasets.MNIST(root='data', train=True, download=True)

# Set how the input images will be transformed
dataset.transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.1307,], std=[0.3081,])
])

# Create a data loader
train_loader = DataLoader(dataset, batch_size=256, shuffle=True, num_workers=2)

# Show the shape of a batch.

print("The shape of one batch is {}".format((next(iter(train_loader)))[0].size()))

The shape of one batch is torch.Size([256, 1, 28, 28])
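
The `Normalize` transform above computes `(x - mean) / std` for each pixel, after `ToTensor` has scaled pixel values to the range [0, 1]. The sketch below reproduces that arithmetic in plain Python for the two extreme pixel values. The helper name `normalize` is hypothetical and only serves this illustration.

```python
# Mean and standard deviation taken from the transform in the cell above.
mean, std = 0.1307, 0.3081

def normalize(pixel):
    """Apply the same per-pixel formula as transforms.Normalize: (x - mean) / std."""
    return (pixel - mean) / std

black = normalize(0.0)  # darkest possible pixel after ToTensor
white = normalize(1.0)  # brightest possible pixel after ToTensor

print(round(black, 4), round(white, 4))
```

After normalization the inputs are no longer confined to [0, 1]: background pixels become slightly negative and bright strokes become values well above 1.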


## 2. Model

The second step is to define our model. This part is left for you to fill in.

The way to define a model can be found at [this tutorial](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py).

The __requirements__ are as follows:

* Define the first convolutional layer with 5 output channels, kernel size = 3, stride = 2, and padding = 1.
* Define the second convolutional layer with 8 output channels, kernel size = 3, stride = 1, and padding = 1.
* Use a max pooling layer with stride = 2 between the two convolutional layers.
* Define the FC layer(s) and finally return a tensor with shape torch.Size([256, 10]) for a full batch. (Use Tensor.view, i.e. x.view(...), to reshape tensors. You can try any number of FC layers.)
* Use ReLU activation between any two layers.
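
To size the first FC layer, it helps to trace the spatial dimensions through the layers listed above. The sketch below uses the standard output-size formula `floor((n + 2p - k) / s) + 1`. It assumes a 28x28 input and a pooling kernel size of 2 (the requirements only fix the pooling stride). The helper `conv_out` is a hypothetical name for this illustration.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output length along one spatial axis of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28                                             # MNIST images are 28x28
size = conv_out(size, kernel=3, stride=2, padding=1)  # first conv layer -> 14
size = conv_out(size, kernel=2, stride=2)             # max pooling      -> 7
size = conv_out(size, kernel=3, stride=1, padding=1)  # second conv layer -> 7

# The second conv layer has 8 output channels, so the flattened feature count is:
flat_features = 8 * size * size
print(size, flat_features)  # 7 392
```

Under these assumptions, the first FC layer needs 392 input features.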

In [3]:
import torch.nn as nn
import torch.nn.functional as F

"""
Define the Model. The requirements are:
    Define the first convolutional layer with 5 output channels, kernel size = 3, stride = 2, and padding = 1.
    Define the second convolutional layer with 8 output channels, kernel size = 3, stride = 1, and padding = 1.
    Use a max pooling layer with stride = 2 between the two convolutional layers.
    Define the FC layer(s) and finally return a tensor with shape torch.Size([256, 10]) for a full batch. (Use Tensor.view to reshape tensors. You can try any number of FC layers.)
    Use ReLU activation between any two layers.
"""

class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()  # Call parent class's constructor

        # Define the first convolutional layer
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=5, kernel_size=3, stride=2, padding=1)
        # Define the max pooling layer
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        # Define the second convolutional layer
        self.conv2 = nn.Conv2d(in_channels=5, out_channels=8, kernel_size=3, stride=1, padding=1)
        # Fully connected layers (392 = 8 channels * 7 * 7 after conv2)
        self.fc1 = nn.Linear(392, 256)  # an intermediate FC layer
        self.fc2 = nn.Linear(256, 512)  # an intermediate FC layer
        self.fc3 = nn.Linear(512, 256)  # an intermediate FC layer
        self.fc4 = nn.Linear(256, 10)   # output layer: one logit per digit class



    def forward(self, x):
        x = self.conv1(x)        # First convolutional layer: 28x28 -> 14x14
        x = F.relu(x)
        x = self.pool(x)         # Pooling layer between the two convolutional layers: 14x14 -> 7x7
        x = self.conv2(x)        # Second convolutional layer: keeps 7x7
        x = F.relu(x)
        x = x.view(-1, 392)      # Flatten to (batch_size, 8 * 7 * 7)
        x = F.relu(self.fc1(x))  # Apply ReLU after every FC layer except the last
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)          # Raw logits; CrossEntropyLoss applies log-softmax itself

        return x


model = SimpleCNN()  # You may change 'device' here !!!
print(model)

# use cuda device if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 
model = model.to(device) 
print("the device is:",device)

SimpleCNN(
  (conv1): Conv2d(1, 5, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(5, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (fc1): Linear(in_features=392, out_features=256, bias=True)
  (fc2): Linear(in_features=256, out_features=512, bias=True)
  (fc3): Linear(in_features=512, out_features=256, bias=True)
  (fc4): Linear(in_features=256, out_features=10, bias=True)
)
the device is: cpu


## 3. Loss and Optimizer

The third step is to define the loss function and the optimization algorithm.

* Define the __criterion__ to be Cross Entropy Loss.
* Define the __optimizer__ to be SGD with momentum factor 0.9 and weight_decay 5e-4.
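
For intuition about the criterion, cross-entropy loss for one sample is the negative log of the softmax probability assigned to the correct class. The plain-Python sketch below computes it for a single vector of logits. The function name `cross_entropy` is hypothetical and stands in for what `nn.CrossEntropyLoss` computes per sample before averaging over the batch.

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax of the target class, computed in a numerically stable way."""
    m = max(logits)  # subtract the max before exponentiating to avoid overflow
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# An untrained model that scores all 10 digits equally has loss ln(10).
uniform_loss = cross_entropy([0.0] * 10, target=3)

# A model that strongly favors the correct class has a much smaller loss.
confident_logits = [0.0] * 10
confident_logits[3] = 10.0
confident_loss = cross_entropy(confident_logits, target=3)

print(round(uniform_loss, 3))
print(confident_loss < uniform_loss)
```

The uniform case gives ln(10), about 2.303, which is close to the loss of 2.302 logged for the very first batch in the training output further down.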

Details can be found in the PyTorch documentation.
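
To make the roles of the momentum factor and weight_decay concrete, here is a minimal scalar sketch of one SGD update in the form PyTorch documents for `torch.optim.SGD` (with dampening 0 and no Nesterov momentum). The function `sgd_step` and the toy values are hypothetical; the real optimizer applies the same rule to every parameter tensor.

```python
def sgd_step(param, grad, buf, lr=0.03, momentum=0.9, weight_decay=5e-4):
    """One SGD update with momentum and L2 weight decay on a single scalar."""
    grad = grad + weight_decay * param                     # weight decay adds wd * param to the gradient
    buf = grad if buf is None else momentum * buf + grad   # momentum buffer accumulates past gradients
    param = param - lr * buf                               # move against the buffered direction
    return param, buf

# Two steps with the same raw gradient: the second step is larger
# because the momentum buffer has accumulated the first gradient.
p1, b1 = sgd_step(1.0, grad=0.5, buf=None)
p2, b2 = sgd_step(p1, grad=0.5, buf=b1)

print(p1, b1)
print(p2, b2)
```

With a constant gradient, the buffer grows toward `grad / (1 - momentum)`, which is why momentum 0.9 can speed up progress along consistent directions.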

In [4]:
#############################################################################
# TODO: Define the criterion to be Cross Entropy Loss.                      #
#       Define the optimizer to be SGD with momentum factor 0.9             #
#       and weight_decay 5e-4.                                              #
#       You may change the learning rate.                                   #
#############################################################################

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.03, momentum=0.9, weight_decay=5e-4)
# Reduce the learning rate by 10x when the monitored loss stops improving
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1, patience=5, verbose=True)

#############################################################################
#                          END OF YOUR CODE                                 #
#############################################################################

## 4. Start training

The next step is to start the training process.

In [5]:
def train(epoch):
    running_loss = 0.0
    model.train()  # Set the model to be in training mode
    for batch_index, (inputs, targets) in enumerate(train_loader):
        # Move the batch to the same device as the model
        inputs, targets = inputs.to(device), targets.to(device)

        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, targets)

        # Backward pass and parameter update
        loss.backward()        # Compute (or accumulate, actually) parameter gradients
        optimizer.step()       # Update the parameters
        optimizer.zero_grad()  # Reset parameter gradients to zero for the next batch
        running_loss += loss.item()
        
        if batch_index % 10 == 0:
            print('epoch {}  batch {}/{}  loss {:.3f}'.format(
                epoch, batch_index, len(train_loader), loss.item()))
            
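    # Step the scheduler on the mean epoch loss rather than the noisy last-batch loss
    scheduler.step(running_loss / len(train_loader))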


In [6]:
# Load the MNIST test split
test_dataset = datasets.MNIST(root='data', train=False, download=True)

# Apply the same transform as for the training data
test_dataset.transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.1307, ], std=[0.3081, ])
])

# Create a data loader (no shuffling is needed for evaluation)
test_dataloader = DataLoader(test_dataset, batch_size=64, shuffle=False, num_workers=1)

def test(dataloader):
    model.eval()

    # Evaluate your model on the test dataset
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in dataloader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs, -1)  # Index of the largest logit is the predicted class
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    # Print the accuracy
    accuracy = correct / total
    print('Accuracy:', accuracy)
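
The evaluation above boils down to taking the index of the largest logit per sample and counting matches with the labels. Here is a minimal plain-Python sketch of that logic on made-up values. The function `accuracy` and the toy logits are hypothetical and only mirror what `test` does with tensors.

```python
def accuracy(batch_logits, labels):
    """Fraction of samples whose largest logit sits at the labeled class index."""
    # Argmax per row: the index with the highest score is the prediction.
    predicted = [max(range(len(row)), key=row.__getitem__) for row in batch_logits]
    correct = sum(int(p == y) for p, y in zip(predicted, labels))
    return correct / len(labels)

# Four made-up samples with three classes each.
logits = [
    [0.1, 2.0, -1.0],  # predicts class 1
    [1.5, 0.2, 0.3],   # predicts class 0
    [0.0, 0.1, 0.9],   # predicts class 2
    [0.4, 0.3, 0.2],   # predicts class 0 (label is 1, so this one is wrong)
]
labels = [1, 0, 2, 1]

acc = accuracy(logits, labels)
print(acc)  # 0.75
```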

In [7]:
for epoch in range(0, 100):
    train(epoch)
    # You may validate model here
    test(test_dataloader)

epoch 0  batch 0/235  loss 2.302
epoch 0  batch 10/235  loss 2.298
epoch 0  batch 20/235  loss 2.287
epoch 0  batch 30/235  loss 2.268
epoch 0  batch 40/235  loss 2.156
epoch 0  batch 50/235  loss 1.441
epoch 0  batch 60/235  loss 1.782
epoch 0  batch 70/235  loss 0.718
epoch 0  batch 80/235  loss 0.612
epoch 0  batch 90/235  loss 0.487
epoch 0  batch 100/235  loss 0.459
epoch 0  batch 110/235  loss 0.250
epoch 0  batch 120/235  loss 0.312
epoch 0  batch 130/235  loss 0.338
epoch 0  batch 140/235  loss 0.201
epoch 0  batch 150/235  loss 0.259
epoch 0  batch 160/235  loss 0.273
epoch 0  batch 170/235  loss 0.258
epoch 0  batch 180/235  loss 0.184
epoch 0  batch 190/235  loss 0.226
epoch 0  batch 200/235  loss 0.224
epoch 0  batch 210/235  loss 0.195
epoch 0  batch 220/235  loss 0.290
epoch 0  batch 230/235  loss 0.172
Accuracy: 0.9496
epoch 1  batch 0/235  loss 0.181
epoch 1  batch 10/235  loss 0.088
epoch 1  batch 20/235  loss 0.232
epoch 1  batch 30/235  loss 0.163
epoch 1  batch 40/2

## 5. What's next?

We have sketched a simple framework for training CNNs. A few more features are yet to be completed:

  - Use GPU and cuDNN
  - Do validation after each epoch
  - Adjust the learning rate
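
On adjusting the learning rate: the notebook uses `ReduceLROnPlateau` with factor 0.1 and patience 5. The class below is a simplified plain-Python sketch of that idea, not the library implementation. It ignores the real scheduler's threshold, cooldown, and minimum learning rate options, and the name `PlateauScheduler` is hypothetical.

```python
class PlateauScheduler:
    """Simplified plateau rule: multiply lr by `factor` after more than
    `patience` consecutive epochs without a new best (lowest) metric."""

    def __init__(self, lr, factor=0.1, patience=5):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, metric):
        if metric < self.best:
            # New best value: remember it and reset the counter.
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        if self.bad_epochs > self.patience:
            # Plateau detected: shrink the learning rate and start counting again.
            self.lr *= self.factor
            self.bad_epochs = 0
        return self.lr

# The loss improves twice, then stalls for six epochs in a row.
sched = PlateauScheduler(lr=0.03)
history = [sched.step(loss) for loss in [0.5, 0.4] + [0.4] * 6]
print(history)
```

In this toy run the learning rate stays at 0.03 through five stalled epochs and drops to roughly 0.003 on the sixth.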

Please read the lecture slides carefully in class.

The learning rate is initialized to 0.03, and the program automatically uses a CUDA device when one is available.