We are going to build a ResNet-based neural network to classify the CIFAR-10 dataset. Before we begin, note that the purpose of this tutorial is not to achieve the best possible accuracy on the task, but to show you how to use PyTorch.

In [1]:
import torch
import torch.nn as nn
import torch.utils.data
import torch.optim as optim
import numpy as np
import pickle
import os
from PIL import Image
import random
import time
import torchvision

cuda_available = torch.cuda.is_available()

While PyTorch provides many layers out of the box in its `torch.nn` module, we have to implement the residual block ourselves. We do that first, before defining the full network.

In [2]:
class ResidualBlock(nn.Module):
    expansion = 1

    def __init__(self, in_channels, out_channels, stride=1):
        super(ResidualBlock, self).__init__()
        
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
    
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels)
            )

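    def forward(self, x):
        # First conv -> batch norm -> ReLU
        out = torch.relu(self.bn1(self.conv1(x)))
        # Second conv -> batch norm (activation is applied after the addition)
        out = self.bn2(self.conv2(out))
        # Add the shortcut: identity, or a 1x1 projection when shapes differ
        out += self.shortcut(x)
        out = torch.relu(out)
        return out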
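To make the shortcut rule concrete, here is a small pure-Python sketch that counts the learnable parameters of one block. The helper names (`conv_params`, `bn_params`, `residual_block_params`) are hypothetical and not part of PyTorch. The sketch assumes bias-free convolutions, as in the block above, and two learnable values (scale and shift) per `BatchNorm2d` channel.

```python
def conv_params(in_ch, out_ch, kernel):
    # Weights only: the convolutions in the block use bias=False.
    return in_ch * out_ch * kernel * kernel


def bn_params(channels):
    # BatchNorm2d learns one scale and one shift per channel.
    return 2 * channels


def residual_block_params(in_ch, out_ch, stride=1):
    """Return (parameter count, whether a projection shortcut is needed)."""
    total = conv_params(in_ch, out_ch, 3) + bn_params(out_ch)
    total += conv_params(out_ch, out_ch, 3) + bn_params(out_ch)
    # Same condition as in ResidualBlock.__init__.
    needs_projection = stride != 1 or in_ch != out_ch
    if needs_projection:
        total += conv_params(in_ch, out_ch, 1) + bn_params(out_ch)
    return total, needs_projection


print(residual_block_params(64, 64, stride=1))   # identity shortcut
print(residual_block_params(64, 128, stride=2))  # projection shortcut
```

The identity shortcut adds no parameters. The 1x1 projection is only created when the stride or the channel count changes, since otherwise `x` and the block output already have the same shape.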

Now, we can define our full network.



In [3]:
class ResNet(nn.Module):
    def __init__(self, num_classes=10):
        super(ResNet, self).__init__()
        
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=(3, 3), stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.block1 = self._create_block(64, 64, stride=1)
        self.block2 = self._create_block(64, 128, stride=2)
        self.block3 = self._create_block(128, 256, stride=2)
        self.block4 = self._create_block(256, 512, stride=2)
        self.linear = nn.Linear(512, num_classes)
    
    def _create_block(self, in_channels, out_channels, stride):
        return nn.Sequential(
            ResidualBlock(in_channels, out_channels, stride),
            ResidualBlock(out_channels, out_channels, 1)
        )

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.block1(out)
        out = self.block2(out)
        out = self.block3(out)
        out = self.block4(out)
        out = nn.functional.avg_pool2d(out, 4)
        out = out.view(out.size(0), -1)
        out = self.linear(out)
        return out
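The final `nn.Linear(512, num_classes)` layer only works if the pooled feature map is 1x1. As a rough check, the sketch below traces the spatial size through the network using the standard convolution output-size formula. The helpers `conv_out` and `trace_sizes` are hypothetical names introduced for illustration.

```python
def conv_out(size, kernel, stride, padding):
    """Output width/height of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(size=32):
    """Return (stage, spatial size) pairs for the ResNet defined above."""
    sizes = [("input", size)]
    size = conv_out(size, kernel=3, stride=1, padding=1)
    sizes.append(("conv1", size))
    # Only the first 3x3 convolution of each stage uses the stage's stride.
    for name, stride in [("block1", 1), ("block2", 2), ("block3", 2), ("block4", 2)]:
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        sizes.append((name, size))
    # Average pooling with kernel 4; the stride defaults to the kernel size.
    size = conv_out(size, kernel=4, stride=4, padding=0)
    sizes.append(("avgpool", size))
    return sizes


for stage, size in trace_sizes(32):
    print(f"{stage:8s} {size}x{size}")
```

With a 32x32 input the pooled map is 1x1, so flattening yields 512 features, which matches the linear layer. A 224x224 input would leave a 7x7 map (512 * 49 features), and the linear layer would reject it.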

### Download CIFAR-10 Dataset and Preprocess
This code snippet demonstrates how to download the CIFAR-10 dataset and preprocess it for training and testing using PyTorch.
- **CIFAR-10 Dataset**: CIFAR-10 is a popular benchmark dataset consisting of 60,000 32x32 color images in 10 classes, with 6,000 images per class. It is commonly used for image classification tasks.
- **Data Preparation**:
  - **Training Data**:
    - Downloads the CIFAR-10 training dataset to the specified directory.
    - Applies a series of transformations: a resize to 32x32 (a no-op here, since CIFAR-10 images are already 32x32), random horizontal and vertical flips for data augmentation, conversion to PyTorch tensors, and per-channel normalization of the pixel values.
  - **Testing Data**:
    - Downloads the CIFAR-10 testing dataset to the specified directory.
    - Applies transformations similar to the training data, except for data augmentation.
- **Transformations**:
  - `Resize`: Resizes the images to 32x32 pixels. CIFAR-10 images already have this size, so `Resize(32)` leaves them unchanged.
  - `RandomHorizontalFlip` and `RandomVerticalFlip`: Randomly flips the images horizontally and vertically during training for data augmentation.
  - `ToTensor`: Converts the images to PyTorch tensors.
  - `Normalize`: Normalizes each channel by subtracting a mean and dividing by a standard deviation. The values used here are the ones commonly used for ImageNet, not statistics computed on CIFAR-10.
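To see what `Normalize` does to a single pixel, the following pure-Python sketch applies the same per-channel arithmetic, `(value - mean) / std`, to values already scaled to `[0, 1]` (which is what `ToTensor` produces from 8-bit images). The function `normalize_pixel` is a hypothetical helper, not a torchvision API.

```python
# Per-channel statistics passed to Normalize in the cell that follows.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Apply (value - mean) / std independently to each channel."""
    return [(v - m) / s for v, m, s in zip(rgb, mean, std)]


black = normalize_pixel([0.0, 0.0, 0.0])
white = normalize_pixel([1.0, 1.0, 1.0])
print("black:", [round(v, 3) for v in black])
print("white:", [round(v, 3) for v in white])
```

After normalization the inputs are roughly centered around zero rather than confined to `[0, 1]`.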


In [4]:
# Download CIFAR10 dataset
data_train = torchvision.datasets.CIFAR10(
    "./data/cifar", download=True, 
    transform=torchvision.transforms.Compose([
        torchvision.transforms.Resize(32),
        torchvision.transforms.RandomHorizontalFlip(),
        torchvision.transforms.RandomVerticalFlip(),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
)
        
data_test = torchvision.datasets.CIFAR10(
    "./data/cifar", download=True, train=False,
    transform=torchvision.transforms.Compose([
        torchvision.transforms.Resize(32),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
)

Files already downloaded and verified
Files already downloaded and verified


In [5]:
N_CLASSES = 10
EPOCHS = 10
BATCH_SIZE = 64

### Define Train and Test Dataloadersw
The code snippet defines PyTorch dataloaders for the training and testing datasets, which are essential components for iterating over the data during training and evaluation processestion
- **Train Dataloader**:
  - Constructs a dataloader for the training dataset (`data_train`).
  - Batch size is set to `BATCH_SIZE` (previously defined as 64).
  - Utilizes a random sampler (`torch.utils.data.sampler.RandomSampler`) to shuffle the data before each epoch.
  - `pin_memory` is set to `True` to optimize data transfer to CUDA-compatible devices.
- **Test Dataloader**:
  - Constructs a dataloader for the testing dataset (`data_test`).
  - Batch size is set to `BATCH_SIZE`.
  - Utilizes a sequential sampler (`torch.utils.data.sampler.SequentialSampler`) to iterate over the data in a sequential manner without shuffling.
  - `pin_memory` is set to `True` for efficient data transfer during evaluation.
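As a quick sanity check on what these dataloaders yield, this sketch computes the number of batches per epoch and the size of the last batch. It assumes the standard CIFAR-10 split of 50,000 training and 10,000 test images, and the `DataLoader` default of keeping the final partial batch (`drop_last=False`). The helper names are hypothetical.

```python
import math


def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches a dataloader yields in one pass over the data."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


def last_batch_size(num_samples, batch_size):
    """Size of the final batch when the partial batch is kept."""
    remainder = num_samples % batch_size
    return remainder if remainder else batch_size


BATCH_SIZE = 64
for name, n in [("train", 50_000), ("test", 10_000)]:
    print(name, batches_per_epoch(n, BATCH_SIZE), last_batch_size(n, BATCH_SIZE))
```

Because the last batch is smaller than `BATCH_SIZE`, code that consumes the batches should read the batch size from the tensor (as the evaluation loop does with `targets.size(0)`) rather than assume 64.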


In [6]:
# Define train dataloader
train_dataloader = torch.utils.data.DataLoader(data_train, batch_size=BATCH_SIZE, sampler=torch.utils.data.sampler.RandomSampler(data_train), pin_memory=True)

# Define test dataloader
test_dataloader = torch.utils.data.DataLoader(data_test, batch_size=BATCH_SIZE, sampler=torch.utils.data.sampler.SequentialSampler(data_test), pin_memory=True)

### Model Initialization and Training Setupw
This code snippet initializes a ResNet model, sets up the training configuration including the loss function, optimizer, and learning rate schedulertion
- **Model Initialization**:
  - Creates an instance of the ResNet model (`clf`).
  - If CUDA is available (`cuda_available`), moves the model to the GPU for accelerated computation.

- **Loss Function**:
  - Defines the loss function as the Cross Entropy Loss (`nn.CrossEntropyLoss()`). This loss function is commonly used for multi-class classification tasks.

- **Optimizer**:
  - Configures the Stochastic Gradient Descent (SGD) optimizer (`optim.SGD`) to optimize the parameters of the ResNet model.
  - Learning rate is set to 0.1, momentum to 0.9, and weight decay to 5e-4.

- **Learning Rate Scheduler**:
  - Sets up a MultiStepLR learning rate scheduler (`torch.optim.lr_scheduler.MultiStepLR`) to adjust the learning rate during training.
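  - The learning rate is multiplied by 0.1 at epochs 150 and 200. Since this tutorial trains for only 10 epochs, neither milestone is reached and the learning rate stays at 0.1 throughout.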
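The step schedule described above can be written in closed form: the learning rate at a given epoch is the base rate times `gamma` raised to the number of milestones already passed. The sketch below is a pure-Python illustration of that rule, not the PyTorch implementation itself. The function name `multistep_lr` is hypothetical, and `epoch` is assumed to count completed calls to `scheduler.step()`.

```python
from bisect import bisect_right


def multistep_lr(epoch, base_lr=0.1, milestones=(150, 200), gamma=0.1):
    """Learning rate after `epoch` scheduler steps under a multi-step schedule."""
    # bisect_right counts how many milestones are <= epoch.
    passed = bisect_right(milestones, epoch)
    return base_lr * gamma ** passed


for epoch in (0, 9, 149, 150, 199, 200):
    print(epoch, multistep_lr(epoch))
```

For epochs 0 through 9 the function returns the base rate of 0.1, so in a 10-epoch run the scheduler has no visible effect. Milestones such as `(5, 8)` would be one way to see the decay within a short run, though that is a suggestion and not what the tutorial uses.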


In [7]:
clf = ResNet(num_classes=N_CLASSES)
if cuda_available:
    clf = clf.cuda()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(clf.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[150, 200], gamma=0.1)

In [None]:
for epoch in range(EPOCHS):
    losses = []
    
    # Train
    start = time.time()
    for inputs, targets in train_dataloader:
        if cuda_available:
            inputs, targets = inputs.cuda(), targets.cuda()
        
        optimizer.zero_grad()
        outputs = clf(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
      
    # Evaluate
    clf.eval()
    total = 0
    correct = 0
    
    with torch.no_grad():
        for inputs, targets in test_dataloader:
            if cuda_available:
                inputs, targets = inputs.cuda(), targets.cuda()

            outputs = clf(inputs)
            _, predicted = torch.max(outputs, 1)
            total += targets.size(0)
            correct += predicted.eq(targets).sum().item()

        print('Epoch : %d Test Acc : %.3f' % (epoch, 100.*correct/total))
        print('--------------------------------------------------------------')
    clf.train()

    scheduler.step()



Epoch : 0 Test Acc : 40.370
--------------------------------------------------------------
Epoch : 1 Test Acc : 50.600
--------------------------------------------------------------
Epoch : 2 Test Acc : 51.660
--------------------------------------------------------------
Epoch : 3 Test Acc : 64.460
--------------------------------------------------------------
Epoch : 4 Test Acc : 67.910
--------------------------------------------------------------
Epoch : 5 Test Acc : 66.280
--------------------------------------------------------------
Epoch : 6 Test Acc : 66.100
--------------------------------------------------------------
Epoch : 7 Test Acc : 67.800
--------------------------------------------------------------
Epoch : 8 Test Acc : 71.260
--------------------------------------------------------------
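The test accuracy printed in the loop above is top-1 accuracy: for each image, take the class with the highest score and compare it with the label. Here is a minimal pure-Python sketch of the same computation on a handful of made-up score rows. The function name `top1_accuracy` and the example numbers are illustrative assumptions, not outputs of the model.

```python
def top1_accuracy(scores, targets):
    """Percentage of rows whose highest-scoring class equals the target."""
    correct = 0
    for row, target in zip(scores, targets):
        # Index of the largest score, the same role torch.max(outputs, 1) plays.
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == target)
    return 100.0 * correct / len(targets)


# Four made-up examples with three classes each.
scores = [
    [2.0, 0.5, 0.1],
    [0.2, 0.1, 3.0],
    [1.0, 1.5, 0.3],
    [0.9, 0.2, 0.1],
]
targets = [0, 2, 0, 0]
print("Test Acc : %.3f" % top1_accuracy(scores, targets))
```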
