# A Simple CNN on CIFAR10

In this notebook we build a simple CNN and apply it to the CIFAR10 dataset. We start by loading the necessary libraries.

In [9]:
import numpy as np
import torch
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import torch.optim as optim

CIFAR10 comes with the dataset collection of PyTorch. The data are downloaded to the `data` folder. We then create a data loader for the training set and the test set.

In [10]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# Creation of data loaders for training and test set.
trainset = torchvision.datasets.CIFAR10(
    root='../data', train=True,
    download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=20, shuffle=True, num_workers=4)

testset = torchvision.datasets.CIFAR10(
    root='../data', train=False,
    download=True, transform=transform)
testloader = torch.utils.data.DataLoader(
    testset, batch_size=20, shuffle=False, num_workers=4)

Files already downloaded and verified
Files already downloaded and verified


We then create a simple model consisting of two consecutive pairs of convolutional and pooling layers. More precisely, the first convolutional layer creates 16 filters with a square kernel of 5. The second layer creates 32 filters, with the same kernel size.

In [11]:
class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5)
        self.fc1 = nn.Linear(in_features=32 * 5 * 5, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=84)
        self.fc3 = nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()
net.cuda()

Net(
  (conv1): Conv2d (3, 16, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1))
  (conv2): Conv2d (6, 32, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=800, out_features=120)
  (fc2): Linear(in_features=120, out_features=84)
  (fc3): Linear(in_features=84, out_features=10)
)

We now create a loss function based on cross-entropy. Note that the output of the last dense layer of the model is *not* a softmax. The [documentation](http://pytorch.org/docs/master/nn.html#crossentropyloss) says: "This criterion combines LogSoftMax and NLLLoss in one single class". We then use SGD with momentum for the optimizer.

In [12]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

Note also that the batch size for the training and the test sets are selected when creating the data iterator. Note also that we send the `inputs` and the `labels` to the GPU before calling `Variable`.

In [13]:
for epoch in range(3):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs, labels = Variable(inputs.cuda()), Variable(labels.cuda())
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.data[0]
    print('[%d] loss: %.3f' %
          (epoch + 1, running_loss / (i + 1)))
    running_loss = 0.0

RuntimeError: Given groups=1, weight[32, 6, 5, 5], so expected input[20, 16, 14, 14] to have 6 channels, but got 16 channels instead

We now create a test set iterator, initialize the number of correct prediction, and compute the accuracy. Note that we must put the `images` and the `labels` on the GPU as well, as the model is on the GPU. If we don't, an exception will be raised, due to the incompatible types.

In [None]:
dataiter = iter(testloader)
correct = 0
total = 0

for data in testloader:
    # ipdb.set_trace()
    images, labels = data
    outputs = net(Variable(images.cuda()))
    _, predicted = torch.max(outputs.data, 1)
    total += labels.size(0)
    correct += (predicted == labels.cuda()).sum()

print('Accuracy: {}'.format(100 * correct / total))