## 8. The Impact of Applying Batch Normalization

Batch normalization normalizes the value (z) at each hidden unit of a hidden layer, using the mean and variance computed over the current mini-batch, much as we scale the input data before training.
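
As a minimal sketch of what this normalization computes, the snippet below applies the batch-norm formula by hand to a small random batch and compares the result with `nn.BatchNorm1d` in training mode. The batch size, feature count, and the names `manual_batch_norm`, `z`, and `z_hat` are illustrative assumptions and do not come from the notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def manual_batch_norm(z, eps=1e-5):
    """Normalize each feature (column) of z over the batch dimension."""
    mean = z.mean(dim=0, keepdim=True)
    # Biased variance (divide by N), which is what training mode uses to normalize
    var = z.var(dim=0, unbiased=False, keepdim=True)
    return (z - mean) / torch.sqrt(var + eps)

# Hypothetical batch: 32 samples, 4 hidden units, deliberately off-center and wide
z = torch.randn(32, 4) * 10 + 5
z_hat = manual_batch_norm(z)

# A freshly created BatchNorm1d has scale (weight) 1 and shift (bias) 0,
# so in training mode it should match the manual computation
bn = nn.BatchNorm1d(4)
bn.train()
reference = bn(z)

max_diff = (z_hat - reference).abs().max().item()
print("per-feature mean after normalization:", z_hat.mean(dim=0))
print("largest difference from nn.BatchNorm1d:", max_diff)
```

In a trained model the layer also learns a per-feature scale and shift, so the output is not forced to stay at zero mean and unit variance.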

In [1]:
import torch 
import torch.nn as nn 
from torch.utils.data import Dataset, DataLoader
from torch.optim import SGD, Adam
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")
device = 'cuda' if torch.cuda.is_available() else 'cpu'

In [2]:
from torchvision import datasets
data_folder = "Datasets"

In [3]:
fmnist = datasets.FashionMNIST(data_folder, download=True, train=True)
train_images = fmnist.data                                
train_targets = fmnist.targets

In [4]:
validation_fmnist = datasets.FashionMNIST(data_folder, download=True, train=False)
validation_images = validation_fmnist.data                                
validation_targets = validation_fmnist.targets 

In [5]:
from common_functions import get_data,train_with_validation,display_train_validation_results

In [6]:
train_data_loader, validation_data_loader = get_data(32,train_images,train_targets,validation_images,validation_targets)

In [7]:
class NeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.input_to_hidden_layer = nn.Linear(28*28, 1000)
        # Batch normalization over the 1000 hidden units, applied before the activation
        self.batch_norm = nn.BatchNorm1d(1000)
        self.hidden_layer_activation = nn.ReLU()
        self.hidden_layer_to_output = nn.Linear(1000,10)
    
    def forward(self,x):
        x = self.input_to_hidden_layer(x)
        x0 = self.batch_norm(x)
        x1 = self.hidden_layer_activation(x0)
        x2 = self.hidden_layer_to_output(x1)
        
        return x2 

In [8]:
def build_model(optimizer = Adam , lr = 1e-3):
    model = NeuralNet().to(device)
    loss_function = nn.CrossEntropyLoss()
    optimizer = optimizer(model.parameters(), lr = lr)
    
    return model , loss_function , optimizer

In [9]:
model,loss_function, optimizer = build_model(optimizer = Adam , lr=1e-2)

In [None]:
train_losses, train_accuracies, validation_losses, validation_accuracies = train_with_validation(20, train_data_loader,
                                                                                                 validation_data_loader,
                                                                                                 model,
                                                                                                 loss_function,
                                                                                                 optimizer)

Epoch: 1
Train Loss: 0.513
Train Accuracy: 88%
Validation Loss: 0.400
Validation Accuracy: 85%
<--------------------------------------------------------->
Epoch: 2
Train Loss: 0.384
Train Accuracy: 88%
Validation Loss: 0.384
Validation Accuracy: 86%
<--------------------------------------------------------->
Epoch: 3
Train Loss: 0.350
Train Accuracy: 90%
Validation Loss: 0.356
Validation Accuracy: 87%
<--------------------------------------------------------->
Epoch: 4
Train Loss: 0.324
Train Accuracy: 90%
Validation Loss: 0.355
Validation Accuracy: 87%
<--------------------------------------------------------->
Epoch: 5
Train Loss: 0.309
Train Accuracy: 91%
Validation Loss: 0.350
Validation Accuracy: 88%
<--------------------------------------------------------->
Epoch: 6
Train Loss: 0.295
Train Accuracy: 91%
Validation Loss: 0.365
Validation Accuracy: 87%
<--------------------------------------------------------->
Epoch: 7
Train Loss: 0.282
Train Accuracy: 92%
Validation Loss: 0.335


In [None]:
display_train_validation_results(20,train_losses, train_accuracies, validation_losses, validation_accuracies)

Batch normalization achieves a better validation accuracy and generally helps when training deep neural networks. By keeping the hidden-layer values in a stable range, it helps us avoid gradients becoming so small that the weights are barely updated.
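
The effect on gradient size can be illustrated with a small sketch. It assumes a sigmoid activation, which saturates for large inputs and is not the ReLU used in the model above. It also measures only the local derivative of the activation with respect to its input, not a full backward pass through a network. The input scale and the helper `mean_abs_input_grad` are hypothetical choices made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical pre-activations with a large scale, so most values saturate a sigmoid
z = torch.randn(256, 8) * 20

def mean_abs_input_grad(pre_activation):
    """Mean absolute gradient of sum(sigmoid(v)) with respect to v."""
    v = pre_activation.clone().detach().requires_grad_(True)
    torch.sigmoid(v).sum().backward()
    return v.grad.abs().mean().item()

bn = nn.BatchNorm1d(8)
bn.train()

grad_raw = mean_abs_input_grad(z)             # sigmoid applied to the raw values
grad_normalized = mean_abs_input_grad(bn(z))  # sigmoid applied after batch norm

print("mean |gradient| without batch norm:", grad_raw)
print("mean |gradient| with batch norm:   ", grad_normalized)
```

Under these assumptions the normalized values sit near the centre of the sigmoid, where its slope is largest, so the average gradient is noticeably larger than for the raw values.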