<a href="https://colab.research.google.com/github/ishandahal/stats453-deep_learning_torch/blob/main/Conv/standardizing_images.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **Standardizing Images**

In this notebook we are going to calculate the mean and standard deviation of the training set and use it to standardize the training set. Transforms the images so that they have zero mean and unit variance accross channels.

### Imports

In [1]:
import time
import numpy as np
import torch
import torch.nn.functional as F
from torchvision import datasets
from torchvision import transforms
from torch.utils.data import DataLoader

if torch.cuda.is_available():
    torch.backends.cudnn.deterministic = True

### Settings and Dataset

In [2]:
## Settings 

# device 
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

## hyper-parameters
random_seed = 1
learning_rate = 0.05
num_epochs = 10
batch_size = 128

## architecture
num_classes = 10

### Compute the mean and standard deviation for normalization 

In [3]:
### preliminary dataloader 

train_dataset = datasets.CIFAR10(root='data',
                               train=True,
                               transform=transforms.ToTensor(),
                               download=True)

train_loader = DataLoader(dataset=train_dataset,
                          batch_size=batch_size,
                          shuffle=False)

train_mean = []
train_std = []

for i, image in enumerate(train_loader):
    numpy_images = image[0].numpy()

    batch_mean = np.mean(numpy_images, axis=(0, 2, 3))
    batch_std = np.std(numpy_images, axis=(0, 2, 3))

    train_mean.append(batch_mean)
    train_std.append(batch_std)

train_mean = torch.tensor(np.mean(train_mean, axis=0))
train_std = torch.tensor(np.mean(train_std, axis=0))

print('Mean: ', train_mean)
print('Std: ', train_std)

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to data/cifar-10-python.tar.gz


HBox(children=(FloatProgress(value=1.0, bar_style='info', max=1.0), HTML(value='')))

Extracting data/cifar-10-python.tar.gz to data
Mean:  tensor([0.4914, 0.4822, 0.4465])
Std:  tensor([0.2467, 0.2432, 0.2612])


torch.ToTensor() method converts the images so that the values are in the range [0, 1] which is why we see the values of mean and std below 1

### Standardizing Dataset Loader 
Now we can use the custom function to standardize the dataset according to the computed mean and standard deviation 

In [4]:
custom_transform = transforms.Compose([transforms.ToTensor(),
                                       transforms.Normalize(train_mean, train_std)])

In [5]:
## preparing the dataset 

train_dataset = datasets.CIFAR10(root='data',
                               train=True,
                               transform=custom_transform,
                               download=True)

test_dataset = datasets.CIFAR10(root='data',
                               train=False,
                               transform=custom_transform)

train_loader = DataLoader(dataset=train_dataset,
                          batch_size=batch_size,
                          shuffle=True,)

test_loader = DataLoader(dataset=test_dataset,
                         batch_size=batch_size,
                         shuffle=False)

Files already downloaded and verified


In [15]:
## checking the dataset 

for images, labels in train_loader:
    print(f"Feature batch dimensions: ", images.size())
    print(f"Target batch dimensions: ", labels.size())
    break

for images, labels in test_loader:
    print(f"Feature batch dimensions: ", images.size())
    print(f"Target batch dimensions: ", labels.size())
    break

Feature batch dimensions:  torch.Size([128, 3, 32, 32])
Target batch dimensions:  torch.Size([128])
Feature batch dimensions:  torch.Size([128, 3, 32, 32])
Target batch dimensions:  torch.Size([128])


For the above batch check to see that the mean is roughly 0 and std is 1. 

In [7]:
print('Channel mean for the batch: ', torch.mean(images, dim=(0, 2, 3)))
print('Channel std for the batch: ', torch.std(images, dim=(0, 2, 3)))

Channel mean for the batch:  tensor([-0.0189, -0.0161, -0.0251])
Channel std for the batch:  tensor([1.0145, 1.0248, 1.0157])


In [10]:
## model 

class ConvNet(torch.nn.Module):

    def __init__(self, num_classes):
        super(ConvNet, self).__init__()

        ## calculating same padding: (w - k + 2*p)/s + 1 = o
        ## => p = s (o - 1) - w + k / 2

        # 32x32x3 => 32x32x4
        self.conv_1 = torch.nn.Conv2d(in_channels=3,
                                      out_channels=4,
                                      kernel_size=(3, 3),
                                      stride=(1, 1),
                                      padding=(2)) # 1(32 - 1) - 32 + 3 / 2 = 2
        # 32x32x4 => 16x16x4
        self.pool_1 = torch.nn.MaxPool2d(kernel_size=(2, 2),
                                         stride=(2, 2),
                                         padding=0) # 2(16-1) - 32 + 2 / 2 = 0
        # 16x16x4 => 16x16x8
        self.conv_2 = torch.nn.Conv2d(in_channels=4,
                                      out_channels=8,
                                      kernel_size=(3, 3),
                                      stride=(1, 1),
                                      padding=1) # 1(16 - 1) - 16 + 3 / 2 = 1
        # 16x16x8 => 8x8x8
        self.pool_2 = torch.nn.MaxPool2d(kernel_size=(2, 2),
                                         stride=(2, 2),
                                         padding=0) # 2(8 - 1) - 16 + 2 / 2 = 0
        # 8x8x8 => 10x10
        self.linear_1 = torch.nn.Linear(in_features=8*8*8,
                                        out_features=num_classes,
                                        )
    
    def forward(self, x):
        out = self.conv_1(x)
        out = F.relu(out)
        out = self.pool_1(out)

        out = self.conv_2(out)
        out = F.relu(out)
        out = self.pool_2(out)

        out = torch.flatten(out, 1)

        logits = self.linear_1(out)
        probas = F.softmax(logits, dim=1)
        return logits, probas 

torch.manual_seed(random_seed)
model = ConvNet(num_classes)

model = model.to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

### Training

In [11]:
def compute_accuracy(model, data_loader, device):
    accuracy, num_examples = 0, 0

    for features, labels in data_loader:
        features = features.to(device)
        labels = labels.to(device)

        logits, probas = model(features)
        _, predicted_labels = torch.max(probas, dim=1)
        accuracy += (predicted_labels == labels).sum()
        num_examples += features.size(0)

    return accuracy.float() / num_examples * 100

start_time = time.time()

for epoch in range(num_epochs):
    model.train()
    for batch_idx, (features, labels) in enumerate(train_loader):

        features = features.to(device)
        labels = labels.to(device)

        ## forward and backward
        logit, probas = model(features)

        cost = F.cross_entropy(logit, labels)
        optimizer.zero_grad()

        cost.backward()

        ## update model parameters
        optimizer.step()

        ## logging 

        if not batch_idx % 50:
            print(f"Epoch: {epoch+1:03d}/{num_epochs:03d}  |  Batch no: {batch_idx:03d}/{len(train_loader):03d}  "
                  f"|  Cost: {cost:.4f}")
    
    model.eval()
    print(f"Training Accuracy: {compute_accuracy(model, train_loader, device=device):.2f}%", end="")
    print(f"Time elapsed: {(time.time() - start_time)/60:.2f} min")

print(f"Total time elapsed: {(time.time() - start_time) / 60:.2f} min")

Epoch: 001/010  |  Batch no: 000/391  |  Cost: 2.3050
Epoch: 001/010  |  Batch no: 050/391  |  Cost: 2.0531
Epoch: 001/010  |  Batch no: 100/391  |  Cost: 2.0734
Epoch: 001/010  |  Batch no: 150/391  |  Cost: 1.8774
Epoch: 001/010  |  Batch no: 200/391  |  Cost: 1.8091
Epoch: 001/010  |  Batch no: 250/391  |  Cost: 1.7566
Epoch: 001/010  |  Batch no: 300/391  |  Cost: 1.5912
Epoch: 001/010  |  Batch no: 350/391  |  Cost: 1.8087
Training Accuracy: 42.41%Time elapsed: 0.34 min
Epoch: 002/010  |  Batch no: 000/391  |  Cost: 1.5954
Epoch: 002/010  |  Batch no: 050/391  |  Cost: 1.5798
Epoch: 002/010  |  Batch no: 100/391  |  Cost: 1.5831
Epoch: 002/010  |  Batch no: 150/391  |  Cost: 1.5584
Epoch: 002/010  |  Batch no: 200/391  |  Cost: 1.5754
Epoch: 002/010  |  Batch no: 250/391  |  Cost: 1.4309
Epoch: 002/010  |  Batch no: 300/391  |  Cost: 1.3717
Epoch: 002/010  |  Batch no: 350/391  |  Cost: 1.5754
Training Accuracy: 49.53%Time elapsed: 0.69 min
Epoch: 003/010  |  Batch no: 000/391  | 

### Evaluation

In [13]:
print("Test accuracy: %.2f%%" % compute_accuracy(model, test_loader, device))

Test accuracy: 55.06%
