## Load and Prepare CIFAR-10 Dataset

Here, we load the CIFAR-10 dataset for both training and testing. The `torchvision.datasets` module downloads the dataset automatically and exposes it through a simple API. Each split is wrapped in a `DataLoader`, which yields batches of images with their labels, shuffles the training data each epoch, and loads batches in parallel using worker processes (`num_workers=2`).


In [7]:
import torch
import torchvision
import torchvision.transforms as transforms

# Set up transformations: convert images to tensors and normalize them
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# Load and transform the CIFAR-10 training data
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

# Load and transform the CIFAR-10 testing data
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

# Define the classes in the CIFAR-10 dataset
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


Files already downloaded and verified
Files already downloaded and verified
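
To make the `Normalize` step concrete: `ToTensor` scales pixel values to the range [0, 1], and `Normalize` with mean 0.5 and standard deviation 0.5 then computes `(x - 0.5) / 0.5` per channel, which maps that range to [-1, 1]. The sketch below only reproduces this arithmetic; the helper name `normalize_pixel` is illustrative and not part of torchvision.

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Same per-channel arithmetic as transforms.Normalize: (x - mean) / std."""
    return (value - mean) / std

# ToTensor yields values in [0, 1]; check both ends and the midpoint.
normalized = [normalize_pixel(v) for v in (0.0, 0.5, 1.0)]
print(normalized)  # [-1.0, 0.0, 1.0]
```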


## Define the Convolutional Neural Network

This cell defines our convolutional neural network as a subclass of PyTorch's `nn.Module`. The network has two convolutional layers, each followed by a ReLU activation and 2x2 max pooling, and then three fully connected layers. This kind of architecture is common for image classification tasks. The ReLU activations introduce non-linearity, and the pooling layers halve the spatial dimensions of the convolutional feature maps.


In [8]:
# Import necessary modules for building the network
import torch.nn as nn
import torch.nn.functional as F

# Define a simple convolutional neural network
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
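
The input size of `fc1`, `16 * 5 * 5`, follows from how each layer shrinks a 32x32 CIFAR-10 image. The sketch below works through that arithmetic; the helper names `conv_out` and `pool_out` are illustrative, and the formulas assume square inputs, stride-1 convolutions without padding, and 2x2 pooling, as in `Net`.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Output side length of a square convolution (floor division, as in nn.Conv2d)."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output side length of a square max-pool without padding."""
    return (size - kernel) // stride + 1

size = 32                             # CIFAR-10 images are 32x32
size = pool_out(conv_out(size, 5))    # conv1 (5x5): 32 -> 28, pool: 28 -> 14
size = pool_out(conv_out(size, 5))    # conv2 (5x5): 14 -> 10, pool: 10 -> 5
flat_features = 16 * size * size      # conv2 produces 16 channels
print(size, flat_features)  # 5 400
```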

## Setup Loss Function and Optimizer

We define the loss function and the optimizer in this segment. `CrossEntropyLoss` is suitable for multi-class classification problems like this one. We choose the SGD (stochastic gradient descent) optimizer with momentum, which accumulates past gradients so that updates keep moving in consistently useful directions; this typically leads to faster convergence.


In [9]:
# Set up the loss function and optimizer
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
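
As a rough illustration of what momentum does, the sketch below applies the update in the form PyTorch's `SGD` uses with its default `dampening=0` and no Nesterov term: the velocity accumulates gradients, and the weight moves along the velocity. The toy objective `f(w) = w**2` and the names `sgd_momentum_step` and `run` are assumptions made for this example and are not part of the notebook.

```python
def sgd_momentum_step(w, grad, velocity, lr=0.001, momentum=0.9):
    """One update: velocity <- momentum * velocity + grad;  w <- w - lr * velocity."""
    velocity = momentum * velocity + grad
    return w - lr * velocity, velocity

def run(momentum, steps=100, lr=0.001):
    """Minimize the toy objective f(w) = w**2 (gradient 2*w), starting from w = 1."""
    w, velocity = 1.0, 0.0
    for _ in range(steps):
        w, velocity = sgd_momentum_step(w, 2 * w, velocity, lr=lr, momentum=momentum)
    return w

w_plain = run(momentum=0.0)       # plain SGD
w_momentum = run(momentum=0.9)    # same learning rate, with momentum
print(f"plain SGD: {w_plain:.3f}, with momentum: {w_momentum:.3f}")
```

With the same learning rate and number of steps, the run with momentum ends closer to the minimum at `w = 0` than the plain run.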

## Training the Network

This segment contains the training loop, where the network learns from the training data. For each mini-batch it runs the forward and backward passes, updates the weights with the optimizer, and accumulates the loss, printing the average every 2,000 mini-batches. The loop runs for two epochs, so the model sees the full training set twice.


In [5]:
# Train the network
for epoch in range(2):  # run the training loop twice
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0
print('Finished Training')

[1,  2000] loss: 2.181
[1,  4000] loss: 1.828
[1,  6000] loss: 1.666
[1,  8000] loss: 1.578
[1, 10000] loss: 1.521
[1, 12000] loss: 1.471
[2,  2000] loss: 1.399
[2,  4000] loss: 1.402
[2,  6000] loss: 1.343
[2,  8000] loss: 1.340
[2, 10000] loss: 1.310
[2, 12000] loss: 1.295
Finished Training
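
Two numbers help when reading this log. With 50,000 training images and a batch size of 4 there are 12,500 mini-batches per epoch, so the last report in each epoch comes at batch 12,000 and the final 500 batches are accumulated but never printed. A model that guesses uniformly among the 10 classes has a cross-entropy of ln(10), so the first logged value of 2.181 is only slightly better than chance. The short calculation below checks both figures.

```python
import math

num_train_images = 50_000   # size of the CIFAR-10 training split
batch_size = 4              # as passed to the DataLoader
num_classes = 10

batches_per_epoch = math.ceil(num_train_images / batch_size)
# Cross-entropy of a uniform guess over 10 classes: -ln(1/10) = ln(10)
chance_loss = math.log(num_classes)

print(batches_per_epoch)        # 12500
print(round(chance_loss, 3))    # 2.303
```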


## Test the Network

After training, we evaluate the performance of the network on the test dataset. The code counts the total and the correct predictions to compute the accuracy of the model. Wrapping inference in `torch.no_grad()` tells PyTorch not to track gradients, which reduces memory consumption and speeds up computation.


In [6]:
# Test the network on the test data
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)  # index of the highest score per image
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

Accuracy of the network on the 10000 test images: 52 %
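
A single overall accuracy can hide large differences between classes. As a possible follow-up, the sketch below computes accuracy per class from a batch of scores and labels. The function name `per_class_accuracy` and the tiny synthetic batch are made up for illustration; in the notebook the inputs would be the network outputs and labels collected over `testloader`.

```python
import torch

def per_class_accuracy(logits, labels, num_classes):
    """Fraction of correct predictions for each class.
    Classes that never occur in `labels` get float('nan')."""
    predicted = logits.argmax(dim=1)
    accuracies = []
    for c in range(num_classes):
        mask = labels == c
        n = mask.sum().item()
        if n == 0:
            accuracies.append(float('nan'))
        else:
            accuracies.append((predicted[mask] == c).sum().item() / n)
    return accuracies

# Tiny synthetic batch: 4 samples, 3 classes (made-up scores).
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.9, 0.2, 0.1],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 1, 1, 2])
acc = per_class_accuracy(logits, labels, num_classes=3)
print(acc)  # [1.0, 0.5, 1.0]
```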


## Additional Imports and Model Enhancements

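In this cell, we redefine our neural network with added capacity (three convolutional layers with more channels) and dropout for regularization to reduce overfitting. No new imports are needed; the cell reuses `nn`, `F`, and `optim` from the earlier cells. We also switch the optimizer from SGD to Adam. Data augmentation, another common way to improve generalization to unseen data, is not applied here: the data loaders still use the original transform.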


In [10]:
# Redefine the CNN with additional complexity and dropout layers
class EnhancedNet(nn.Module):
    def __init__(self):
        super(EnhancedNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.conv3 = nn.Conv2d(64, 128, 3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(128 * 4 * 4, 512)
        self.fc2 = nn.Linear(512, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = x.view(-1, 128 * 4 * 4)
        x = self.dropout1(x)
        x = F.relu(self.fc1(x))
        x = self.dropout2(x)
        x = self.fc2(x)
        return x

net = EnhancedNet()

# Optimizer setup
optimizer = optim.Adam(net.parameters(), lr=0.001)

# Loss function
criterion = nn.CrossEntropyLoss()
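
The `128 * 4 * 4` input size of `fc1` can be checked empirically by passing a dummy image through the convolution and pooling stack. The sketch below recreates layers with the same settings as `EnhancedNet` (freshly initialized, standalone copies, not the trained network) and prints the shape after each stage: with `padding=1` the 3x3 convolutions keep the spatial size, and each pooling step halves it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Same convolution and pooling settings as EnhancedNet.
conv1 = nn.Conv2d(3, 32, 3, padding=1)
conv2 = nn.Conv2d(32, 64, 3, padding=1)
conv3 = nn.Conv2d(64, 128, 3, padding=1)
pool = nn.MaxPool2d(2, 2)

x = torch.zeros(1, 3, 32, 32)   # one dummy CIFAR-10-sized image
with torch.no_grad():
    for conv in (conv1, conv2, conv3):
        x = pool(F.relu(conv(x)))
        print(tuple(x.shape))
# (1, 32, 16, 16)
# (1, 64, 8, 8)
# (1, 128, 4, 4)
feature_shape = tuple(x.shape)
```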

## Training Loop with Enhanced Network

Here, we run the training loop using the enhanced model architecture. We've increased the complexity of the network and included dropout to improve the model's performance and generalization on unseen data from the CIFAR-10 dataset.


In [15]:
# Train the network
net.train()  # make sure dropout is active (matters if net.eval() was called earlier)
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
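        if i % 500 == 499:    # print every 500 mini-batches
            # Average over the 500 batches in this window. The log below was recorded
            # with a divisor of 1000, so its values are half the true per-batch average.
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 500:.3f}')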
            running_loss = 0.0

print('Finished Training')


[1,   500] loss: 0.498
[1,  1000] loss: 0.511
[1,  1500] loss: 0.494
[1,  2000] loss: 0.506
[1,  2500] loss: 0.527
[1,  3000] loss: 0.468
[1,  3500] loss: 0.516
[1,  4000] loss: 0.509
[1,  4500] loss: 0.499
[1,  5000] loss: 0.516
[1,  5500] loss: 0.526
[1,  6000] loss: 0.512
[1,  6500] loss: 0.529
[1,  7000] loss: 0.512
[1,  7500] loss: 0.542
[1,  8000] loss: 0.525
[1,  8500] loss: 0.521
[1,  9000] loss: 0.539
[1,  9500] loss: 0.524
[1, 10000] loss: 0.525
[1, 10500] loss: 0.525
[1, 11000] loss: 0.528
[1, 11500] loss: 0.538
[1, 12000] loss: 0.535
[1, 12500] loss: 0.518
[2,   500] loss: 0.499
[2,  1000] loss: 0.499
[2,  1500] loss: 0.499
[2,  2000] loss: 0.491
[2,  2500] loss: 0.500
[2,  3000] loss: 0.500
[2,  3500] loss: 0.516
[2,  4000] loss: 0.526
[2,  4500] loss: 0.520
[2,  5000] loss: 0.509
[2,  5500] loss: 0.508
[2,  6000] loss: 0.497
[2,  6500] loss: 0.514
[2,  7000] loss: 0.522
[2,  7500] loss: 0.515
[2,  8000] loss: 0.506
[2,  8500] loss: 0.500
[2,  9000] loss: 0.511
[2,  9500] 

## Testing the Enhanced Network

After training, we evaluate the performance of the enhanced network on the test dataset to see if our changes have improved accuracy.


In [16]:
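# Test the network on the test data
correct = 0
total = 0
net.eval()  # turn dropout off for evaluation (the result below was recorded without this call)
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)  # index of the highest score per image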
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))


Accuracy of the network on the 10000 test images: 61 %
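
Dropout behaves differently in training and evaluation mode, which matters when measuring test accuracy. In training mode, `nn.Dropout(p)` zeroes each element with probability `p` and scales the survivors by `1 / (1 - p)`; in evaluation mode it passes its input through unchanged. The standalone sketch below shows both modes on a vector of ones; the seed is set only to make the random mask repeatable.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)            # only to make the random mask repeatable
dropout = nn.Dropout(0.5)
x = torch.ones(8)

dropout.train()
train_out = dropout(x)          # each element is either 0 or 1 / (1 - 0.5) = 2

dropout.eval()
eval_out = dropout(x)           # identity in evaluation mode

print(train_out)
print(eval_out)
```

For a full model, calling `net.eval()` before inference and `net.train()` before further training switches every dropout layer at once.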
