# UNet Model on CIFAR-10 Dataset
*Author: Preetham Ramesh*

## Objective
The goal of this notebook is to train a UNet-style model from scratch on the CIFAR-10 dataset. No pretrained weights are loaded.

## Resources
* **About UNet**
The UNet model is a convolutional neural network commonly used for image segmentation tasks. Its architecture consists of a contracting path to capture context and a symmetric expanding path for precise localization. This U-shaped network structure enables the model to efficiently learn and segment images by combining high-resolution features from the contracting path with localization information from the expanding path.

* **About CIFAR-10**
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. http://www.cs.toronto.edu/~kriz/cifar.html

## 1.) Import packages and notebook settings

In [1]:
from collections import OrderedDict
import os
import random
import numpy as np
import torch
import torch.nn as nn

In [2]:
device = torch.device('mps' if torch.backends.mps.is_available() else ('cuda:0' if torch.cuda.is_available() else 'cpu'))
print(device)

cuda:0


**Seed Setting**:
Several sources of randomness contribute to the result of a neural network model. Nevertheless, a good model should depend on the data, architecture, and hyperparameters rather than on the seed. We fix a seed value so that our results are reproducible. We set `seed_value` to `42` for the following sources of randomness:
1. within the environment
2. within Python
3. within some packages like numpy and torch
4. and anywhere else where randomness is introduced like within architectures (some dropout layers introduce randomness)


In [3]:
seed_value = 42

# 1. Set `PYTHONHASHSEED` environment variable at a fixed value
os.environ['PYTHONHASHSEED']=str(seed_value)

# 2. Set `python` built-in pseudo-random generator at a fixed value
random.seed(seed_value)

# 3. Set `numpy` and `torch` pseudo-random generator at a fixed value
np.random.seed(seed_value)
torch.manual_seed(seed_value)

<torch._C.Generator at 0x7fce7410f2f0>
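
A quick way to check the seeding above is to reseed and confirm that the same values come out again. The following is a minimal sketch, not part of the original notebook: the helper names `set_seed` and `draw` are hypothetical, and the check only covers CPU draws from `random`, `numpy`, and `torch`.

```python
import os
import random

import numpy as np
import torch


def set_seed(seed_value: int) -> None:
    """Hypothetical helper bundling the seeding calls used in this notebook."""
    os.environ["PYTHONHASHSEED"] = str(seed_value)
    random.seed(seed_value)
    np.random.seed(seed_value)
    torch.manual_seed(seed_value)


def draw():
    """Draw one value from each seeded generator."""
    return random.random(), float(np.random.rand()), torch.rand(1).item()


set_seed(42)
first = draw()
set_seed(42)
second = draw()
print(first == second)  # True
```

Note that Python reads `PYTHONHASHSEED` at interpreter start-up, so setting it inside a running notebook affects child processes rather than the current kernel.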

**Variable setting**

In [43]:
BATCH_SIZE = 64
NUM_EPOCHS = 100
LR = 0.001

**Model**

In [44]:
import torch.nn as nn

class UNetCIFAR(nn.Module):

    def __init__(self, in_channels=3, out_channels=10, init_features=32):
        super(UNetCIFAR, self).__init__()

        features = init_features
        
        # Encoder
        self.encoder1 = UNetCIFAR._block(in_channels, features, name="enc1")
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.encoder2 = UNetCIFAR._block(features, features * 2, name="enc2")
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        
        # Bottleneck
        self.bottleneck = UNetCIFAR._block(features * 2, features * 4, name="bottleneck")
        
        # Decoder
        self.upconv1 = nn.ConvTranspose2d(
            features * 4, features * 2, kernel_size=2, stride=2
        )
        self.decoder1 = UNetCIFAR._block(features * 4, features * 2, name="dec1")
        self.upconv2 = nn.ConvTranspose2d(
            features * 2, features, kernel_size=2, stride=2
        )
        self.decoder2 = UNetCIFAR._block(features * 2, features, name="dec2")
        
        # Output convolution: 1x1 conv mapping `features` channels to `out_channels`
        self.conv = nn.Conv2d(
            in_channels=features, out_channels=out_channels, kernel_size=1
        )

    def forward(self, x):
        enc1 = self.encoder1(x)
        enc2 = self.encoder2(self.pool1(enc1))

        bottleneck = self.bottleneck(self.pool2(enc2))

        dec1 = self.upconv1(bottleneck)
        dec1 = torch.cat((dec1, enc2), dim=1)
        dec1 = self.decoder1(dec1)
        dec2 = self.upconv2(dec1)
        dec2 = torch.cat((dec2, enc1), dim=1)
        dec2 = self.decoder2(dec2)
        return self.conv(dec2)

    @staticmethod
    def _block(in_channels, features, name):
        return nn.Sequential(
            nn.Conv2d(
                in_channels=in_channels,
                out_channels=features,
                kernel_size=3,
                padding=1,
                bias=False,
            ),
            nn.BatchNorm2d(num_features=features),
            nn.ReLU(inplace=True),
            nn.Conv2d(
                in_channels=features,
                out_channels=features,
                kernel_size=3,
                padding=1,
                bias=False,
            ),
            nn.BatchNorm2d(num_features=features),
            nn.ReLU(inplace=True),
        )
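
To see why the decoder blocks take twice as many input channels as the preceding up-convolution produces (for example, `decoder2` takes `features * 2`), it can help to trace the tensor shapes through one pooling step, one up-convolution, and one skip connection. This is a minimal sketch using random stand-in tensors; the names `enc1_like` and `dec1_like` are hypothetical and are not the real activations of `UNetCIFAR`.

```python
import torch
import torch.nn as nn

features = 32  # same default as init_features in UNetCIFAR

# Stand-in for the encoder1 output on a batch of two 32x32 images
enc1_like = torch.randn(2, features, 32, 32)

# Max pooling halves height and width; channels are unchanged
pooled = nn.MaxPool2d(kernel_size=2, stride=2)(enc1_like)

# Stand-in for the decoder1 output at half resolution
dec1_like = torch.randn(2, features * 2, 16, 16)

# The transposed convolution doubles height and width and halves the channels
upconv = nn.ConvTranspose2d(features * 2, features, kernel_size=2, stride=2)
upsampled = upconv(dec1_like)

# The skip connection concatenates along the channel dimension
merged = torch.cat((upsampled, enc1_like), dim=1)

print(pooled.shape)     # torch.Size([2, 32, 16, 16])
print(upsampled.shape)  # torch.Size([2, 32, 32, 32])
print(merged.shape)     # torch.Size([2, 64, 32, 32])
```

Because every block uses `padding=1` with 3x3 kernels, the spatial size only changes at the pooling and up-convolution steps, so the final 1x1 convolution returns a `(batch, 10, 32, 32)` map for 32x32 inputs.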


**Transform and Load Data**

In [46]:
from torchvision import datasets, transforms
from torch.utils.data import random_split

# Define transformations for the dataset (optional but recommended)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # Map each channel from [0, 1] to [-1, 1]
])

# Download and load the CIFAR-10 dataset
train_dataset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)

val_size = 7500

train_size = len(train_dataset) - val_size
train_dataset, val_dataset = random_split(train_dataset, [train_size, val_size])

# DataLoaders to handle batches and shuffling
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=BATCH_SIZE)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False)

Files already downloaded and verified
Files already downloaded and verified
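
The split sizes and the number of batches per epoch follow directly from the values set in this notebook. The sketch below works through that arithmetic in plain Python, along with the per-channel formula that `transforms.Normalize` applies. The helper `normalize` is a hypothetical stand-in for illustration, and the batch counts assume the `DataLoader` default of keeping the last, smaller batch.

```python
import math

BATCH_SIZE = 64
full_train_size = 50_000  # CIFAR-10 training images
val_size = 7_500

train_size = full_train_size - val_size
train_batches = math.ceil(train_size / BATCH_SIZE)  # last batch may be smaller
val_batches = math.ceil(val_size / BATCH_SIZE)


def normalize(x, mean=0.5, std=0.5):
    """Per-channel formula applied by transforms.Normalize: (x - mean) / std."""
    return (x - mean) / std


print(train_size, train_batches, val_batches)  # 42500 665 118
print(normalize(0.0), normalize(1.0))          # -1.0 1.0
```

With 665 training batches and a log line every 100 batches, each epoch prints six lines, the last one at batch 600.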


**Training**

In [47]:
import torch.optim as optim

# Instantiate the model, loss function, and optimizer
model = UNetCIFAR().to(device)
criterion = nn.MSELoss()  # Mean squared error against the broadcast class index
optimizer = optim.Adam(model.parameters(), lr=LR)  # Adam with learning rate LR

# Training loop
for epoch in range(NUM_EPOCHS):
    model.train()
    running_loss = 0.0
    for i, data in enumerate(train_loader, 0):
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device).float()
        
        optimizer.zero_grad()
        outputs = model(inputs)  # shape: (batch, 10, 32, 32)

        # MSELoss needs a target with the same shape as the output, so each
        # image-level class index is broadcast to every channel and pixel
        labels = labels.view(-1, 1, 1, 1)
        labels = labels.repeat(1, 10, 32, 32)

        loss = criterion(outputs.float(), labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 100 == 99:
            print(f"Epoch: {epoch + 1}, Batch: {i + 1}, Loss: {running_loss / 100:.4f}")
            running_loss = 0.0


Epoch: 1, Batch: 100, Loss: 19.7130
Epoch: 1, Batch: 200, Loss: 10.1696
Epoch: 1, Batch: 300, Loss: 7.5505
Epoch: 1, Batch: 400, Loss: 7.4835
Epoch: 1, Batch: 500, Loss: 6.8185
Epoch: 1, Batch: 600, Loss: 6.6497
Epoch: 2, Batch: 100, Loss: 6.1528
Epoch: 2, Batch: 200, Loss: 5.7138
Epoch: 2, Batch: 300, Loss: 5.8116
Epoch: 2, Batch: 400, Loss: 5.4837
Epoch: 2, Batch: 500, Loss: 5.3434
Epoch: 2, Batch: 600, Loss: 5.3246
Epoch: 3, Batch: 100, Loss: 4.9010
Epoch: 3, Batch: 200, Loss: 4.9260
Epoch: 3, Batch: 300, Loss: 4.7508
Epoch: 3, Batch: 400, Loss: 4.5863
Epoch: 3, Batch: 500, Loss: 4.6473
Epoch: 3, Batch: 600, Loss: 4.5947
Epoch: 4, Batch: 100, Loss: 4.3219
Epoch: 4, Batch: 200, Loss: 4.0834
Epoch: 4, Batch: 300, Loss: 4.1170
Epoch: 4, Batch: 400, Loss: 4.2058
Epoch: 4, Batch: 500, Loss: 4.3716
Epoch: 4, Batch: 600, Loss: 4.2756
Epoch: 5, Batch: 100, Loss: 4.0920
Epoch: 5, Batch: 200, Loss: 3.9160
Epoch: 5, Batch: 300, Loss: 3.9722
Epoch: 5, Batch: 400, Loss: 3.8883
Epoch: 5, Batch: 5
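
The training loop turns each image-level class index into a target tensor with the same shape as the model output. A small sketch with made-up labels and a reduced 2x2 spatial size (both are assumptions chosen for readability) shows what the `view` and `repeat` steps produce:

```python
import torch

labels = torch.tensor([3, 7]).float()   # made-up class indices for two images

# Same two steps as in the training loop, with 2x2 instead of 32x32
targets = labels.view(-1, 1, 1, 1)      # shape: (2, 1, 1, 1)
targets = targets.repeat(1, 10, 2, 2)   # shape: (2, 10, 2, 2)

print(targets.shape)          # torch.Size([2, 10, 2, 2])
print(targets[0].unique())    # tensor([3.])
print(targets[1].unique())    # tensor([7.])
```

Every channel and pixel of an image's target holds that image's class index, so the MSE loss pushes all 10 output channels toward the same number; the channels are not trained as per-class scores. A classification loss such as `nn.CrossEntropyLoss` would instead expect integer targets of shape `(batch, 32, 32)` for an output of this shape.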

In [38]:
model.eval()
total_val_loss = 0.0
correct_val = 0
total_val = 0

with torch.no_grad():
    for i, data in enumerate(val_loader, 0):
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        # Keep the integer class indices for the accuracy computation. The
        # repeated MSE target below has shape (batch, 10, 32, 32), ten times as
        # many elements as `predicted`, so it cannot be reshaped with
        # `view_as(predicted)`.
        class_ids = labels.long()

        targets = labels.float().view(-1, 1, 1, 1)
        targets = targets.repeat(1, 10, 32, 32)

        outputs = model(inputs)
        val_loss = criterion(outputs.float(), targets)
        total_val_loss += val_loss.item()

        # Compute per-pixel validation accuracy: argmax over the 10 channels,
        # compared with the image's class index at every pixel
        _, predicted = outputs.max(1)  # shape: (batch, 32, 32)
        total_val += predicted.numel()  # Total number of pixels
        correct_val += predicted.eq(class_ids.view(-1, 1, 1)).sum().item()

# Calculate metrics
# `running_loss` is reset every 100 batches in the training loop, so it only
# covers the batches after the last reset, not the full epoch
train_loss = running_loss / len(train_loader)
val_loss = total_val_loss / len(val_loader)
val_accuracy = 100 * correct_val / total_val

print(f"Epoch: {epoch + 1}, Train Loss: {train_loss:.4f}, Val Loss: {val_loss:.4f}")
print(f"Val Accuracy: {val_accuracy:.2f}%")