<a href="https://colab.research.google.com/github/Dhanasree-Rajamani/Deep_Learning_Assignments/blob/main/Assignment_3/Pytorch_Lightning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This colab covers:

- Designing the model with PyTorch Lightning

- The neural network model consists of 4 layers: 1 input, 2 hidden and 1 output

Install PyTorch Lightning

In [None]:
!pip install pytorch_lightning

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
import torch
from torch import nn
import pytorch_lightning as pl

We define the model here:

The model consists of 4 layers: 1 input layer, 2 hidden layers and 1 output layer.

The input is a 28*28 handwritten digit image, flattened into a 784-element vector.

Since the digits are 0-9, there are 10 output labels.


In [None]:
import torch
from torch import nn

class MNISTClassifier(nn.Module):

  def __init__(self):
    super(MNISTClassifier, self).__init__()

    # mnist images are (1, 28, 28) (channels, height, width)
    self.layer_1 = torch.nn.Linear(28 * 28, 128)
    self.layer_2 = torch.nn.Linear(128, 128)
    self.layer_3 = torch.nn.Linear(128, 256)
    self.layer_4 = torch.nn.Linear(256, 10)

  def forward(self, x):
    batch_size, channels, height, width = x.size()

    # (b, 1, 28, 28) -> (b, 1*28*28)
    x = x.view(batch_size, -1)

    # layer 1
    x = self.layer_1(x)
    x = torch.relu(x)

    # layer 2
    x = self.layer_2(x)
    x = torch.relu(x)

    # layer 3
    x = self.layer_3(x)
    x = torch.relu(x)

    # layer 4
    x = self.layer_4(x)

    # log-probabilities over the 10 labels
    x = torch.log_softmax(x, dim=1)

    return x

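Designing the model with PyTorch Lightning

This neural network model consists of 4 layers: 1 input, 2 hidden and 1 output. The layers are the same as in the plain PyTorch version; only the base class changes to pl.LightningModule.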
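As a rough size check, the number of trainable parameters can be worked out by hand. The sketch below assumes the layer widths used in the classifier above (784, 128, 128, 256, 10); the helper name `linear_params` is made up for illustration.

```python
# Layer widths, assumed from the Linear layers in the classifier above.
layer_sizes = [28 * 28, 128, 128, 256, 10]

def linear_params(n_in, n_out):
    # A Linear layer stores an (n_out x n_in) weight matrix plus n_out biases.
    return n_in * n_out + n_out

per_layer = [linear_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer)     # [100480, 16512, 33024, 2570]
print(total_params)  # 152586
```

Most of the parameters sit in the first layer, because it maps the 784 flattened pixels to 128 units.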

In [None]:
import torch
from torch import nn
import pytorch_lightning as pl

class LightningMNISTClassifier(pl.LightningModule):

  def __init__(self):
    super(LightningMNISTClassifier, self).__init__()

    # mnist images are (1, 28, 28) (channels, height, width)
    self.layer_1 = torch.nn.Linear(28 * 28, 128)
    self.layer_2 = torch.nn.Linear(128, 128)
    self.layer_3 = torch.nn.Linear(128, 256)
    self.layer_4 = torch.nn.Linear(256, 10)

  def forward(self, x):
    batch_size, channels, height, width = x.size()

    # (b, 1, 28, 28) -> (b, 1*28*28)
    x = x.view(batch_size, -1)

    # layer 1
    x = self.layer_1(x)
    x = torch.relu(x)

    # layer 2
    x = self.layer_2(x)
    x = torch.relu(x)

    # layer 3
    x = self.layer_3(x)
    x = torch.relu(x)

    # layer 4
    x = self.layer_4(x)

    # log-probabilities over the 10 labels
    x = torch.log_softmax(x, dim=1)

    return x

Getting The Data

We split the MNIST data into training, validation and test sets.

Each dataset is wrapped in a DataLoader, which handles loading and batching (and shuffling, when shuffle=True is passed).
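To get a feel for what batching means here, the following sketch counts the batches each DataLoader would yield per epoch. It assumes a batch size of 64, the 55,000/5,000 train/validation split, the standard 10,000-image MNIST test set, and the DataLoader default of keeping the final partial batch (drop_last=False).

```python
import math

batch_size = 64
split_sizes = {"train": 55_000, "val": 5_000, "test": 10_000}

# Number of batches per epoch when the last, smaller batch is kept.
batches = {name: math.ceil(n / batch_size) for name, n in split_sizes.items()}

# Size of the final batch in each split (a full batch if n divides evenly).
last_batch = {name: n % batch_size or batch_size for name, n in split_sizes.items()}

print(batches)     # {'train': 860, 'val': 79, 'test': 157}
print(last_batch)  # {'train': 24, 'val': 8, 'test': 16}
```

None of the splits divides evenly by 64, so every loader ends each epoch with a smaller batch.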

Data preparation:
- Image transforms 
- Generate training, validation and test dataset splits.
- Wrap each dataset split in a DataLoader
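The image transform can be illustrated with plain arithmetic. ToTensor scales pixel values to the range [0, 1], and Normalize then subtracts a mean and divides by a standard deviation. The sketch below assumes the MNIST constants used in this notebook (mean 0.1307, standard deviation 0.3081); the function name `normalize` is only for illustration.

```python
MEAN, STD = 0.1307, 0.3081  # constants assumed from the notebook's transform

def normalize(pixel):
    # Same per-pixel formula that transforms.Normalize applies: (x - mean) / std
    return (pixel - MEAN) / STD

black = normalize(0.0)  # darkest pixel after ToTensor
white = normalize(1.0)  # brightest pixel after ToTensor

print(round(black, 4), round(white, 4))  # -0.4242 2.8215
```

After normalization the inputs are roughly zero-centered, which tends to make optimization better behaved.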

In [None]:
from torch.utils.data import DataLoader, random_split
from torchvision.datasets import MNIST
import os
from torchvision import datasets, transforms


# ----------------
# TRANSFORMS
# ----------------
# prepare transforms standard to MNIST
transform=transforms.Compose([transforms.ToTensor(), 
                              transforms.Normalize((0.1307,), (0.3081,))])

# ----------------
# TRAINING, VAL DATA
# ----------------
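# pass the transform so the DataLoader receives tensors rather than PIL images
mnist_train = MNIST(os.getcwd(), train=True, download=True, transform=transform)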

# train (55,000 images), val split (5,000 images)
mnist_train, mnist_val = random_split(mnist_train, [55000, 5000])

# ----------------
# TEST DATA
# ----------------
mnist_test = MNIST(os.getcwd(), train=False, download=True, transform=transform)

# ----------------
# DATALOADERS
# ----------------
# The dataloaders handle shuffling, batching, etc...
mnist_train = DataLoader(mnist_train, batch_size=64)
mnist_val = DataLoader(mnist_val, batch_size=64)
mnist_test = DataLoader(mnist_test, batch_size=64)

In a LightningDataModule, the data is loaded using the following methods:
- train_dataloader()
- val_dataloader()
- test_dataloader()

Method for data preparation: 
- prepare_data()

In [None]:
from torch.utils.data import DataLoader, random_split
from torchvision.datasets import MNIST
import os
from torchvision import datasets, transforms

class MNISTDataModule(pl.LightningDataModule):

  def prepare_data(self):
    # download the data once; no state is assigned here
    MNIST(os.getcwd(), train=True, download=True)
    MNIST(os.getcwd(), train=False, download=True)

  def setup(self, stage=None):
    # transforms standard to MNIST
    transform=transforms.Compose([transforms.ToTensor(), 
                                  transforms.Normalize((0.1307,), (0.3081,))])
      
    mnist_train = MNIST(os.getcwd(), train=True, transform=transform)
    # keep the test set on self so test_dataloader() can reach it
    self.mnist_test = MNIST(os.getcwd(), train=False, transform=transform)
    
    # train (55,000 images), val split (5,000 images)
    self.mnist_train, self.mnist_val = random_split(mnist_train, [55000, 5000])

  def train_dataloader(self):
    return DataLoader(self.mnist_train, batch_size=64)

  def val_dataloader(self):
    return DataLoader(self.mnist_val, batch_size=64)

  def test_dataloader(self):
    return DataLoader(self.mnist_test, batch_size=64)

The Optimizer

We use the Adam optimizer.

The optimizer is given the weights it should update when it is constructed.

In a LightningModule, the optimizer code goes in the configure_optimizers() method.
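For comparison, here is a minimal sketch of what a single Adam update looks like in plain PyTorch. The tiny `toy_model` and the squared-output loss are hypothetical stand-ins, not the MNIST classifier; only the way the optimizer is constructed from `parameters()` mirrors the notebook.

```python
import torch
from torch import nn

torch.manual_seed(0)

toy_model = nn.Linear(4, 2)  # hypothetical stand-in for the classifier
optimizer = torch.optim.Adam(toy_model.parameters(), lr=1e-3)

before = toy_model.weight.detach().clone()

x = torch.randn(8, 4)                # made-up input batch
loss = toy_model(x).pow(2).mean()    # made-up loss, just to get gradients

optimizer.zero_grad()  # clear any stale gradients
loss.backward()        # compute gradients for every parameter
optimizer.step()       # Adam updates the parameters it was given

# Largest absolute change of any weight after one step.
max_change = (toy_model.weight.detach() - before).abs().max().item()
print(max_change)
```

On the very first step Adam moves each weight by at most about the learning rate, so the printed change should be close to 1e-3. In Lightning, the Trainer runs the zero_grad/backward/step sequence for you once configure_optimizers() returns the optimizer.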

In [None]:
class LightningMNISTClassifier(pl.LightningModule):

    def configure_optimizers(self):
      optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
      return optimizer

The loss

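In this case, we want to compute the cross-entropy loss from the model outputs. Cross entropy is the same as the negative log-likelihood applied to log_softmax outputs, and the model's forward() already ends with log_softmax, so we only need nll_loss. (The argument is named logits in the code, but it actually holds log-probabilities.)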
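The equivalence can be checked numerically. The sketch below uses made-up raw scores and labels (not MNIST data) and compares F.nll_loss on log_softmax outputs with F.cross_entropy on the raw scores.

```python
import torch
from torch.nn import functional as F

torch.manual_seed(0)

raw_scores = torch.randn(5, 10)         # made-up unnormalized outputs, 5 samples x 10 classes
labels = torch.tensor([3, 1, 4, 1, 5])  # made-up target classes

log_probs = F.log_softmax(raw_scores, dim=1)

loss_nll = F.nll_loss(log_probs, labels)       # what the notebook computes
loss_ce = F.cross_entropy(raw_scores, labels)  # log_softmax + nll_loss in one call

same = torch.allclose(loss_nll, loss_ce)
print(loss_nll.item(), loss_ce.item(), same)
```

This also shows why F.cross_entropy should not be applied to this model's outputs: they are already log-probabilities, so log_softmax would effectively be applied twice.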


In [None]:
from torch.nn import functional as F

def cross_entropy_loss(logits, labels):
  return F.nll_loss(logits, labels)

In PyTorch Lightning, the loss is computed as follows.

In [None]:
from torch.nn import functional as F

class LightningMNISTClassifier(pl.LightningModule):

  def cross_entropy_loss(self, logits, labels):
    return F.nll_loss(logits, labels)

Full Training loop 

Compute the training and validation loss with a plain PyTorch loop:
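One detail worth knowing about the validation loop below: it averages the per-batch mean losses, which is slightly different from the per-sample mean when the last batch is smaller. The sketch below assumes 5,000 validation images and a batch size of 64 (78 full batches plus one batch of 8); the loss values are made up purely for illustration.

```python
batch_size = 64
n_val = 5_000

full_batches, last_size = divmod(n_val, batch_size)  # 78 full batches, 8 left over

# Made-up per-batch mean losses: 0.5 for full batches, 1.3 for the small last batch.
batch_sizes = [batch_size] * full_batches + [last_size]
batch_losses = [0.5] * full_batches + [1.3]

# Unweighted: every batch counts equally (what averaging a list of batch losses does).
mean_of_batches = sum(batch_losses) / len(batch_losses)

# Weighted: every sample counts equally.
per_sample_mean = sum(l * s for l, s in zip(batch_losses, batch_sizes)) / n_val

print(round(mean_of_batches, 4), round(per_sample_mean, 4))  # 0.5101 0.5013
```

The gap is small with these sizes, so the simple average is usually an acceptable approximation.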

In [None]:
import torch
from torch import nn
import pytorch_lightning as pl
from torch.utils.data import DataLoader, random_split
from torch.nn import functional as F
from torchvision.datasets import MNIST
from torchvision import datasets, transforms
import os

# -----------------
# MODEL
# -----------------
class LightningMNISTClassifier(pl.LightningModule):

  def __init__(self):
    super(LightningMNISTClassifier, self).__init__()

    # mnist images are (1, 28, 28) (channels, height, width)
    self.layer_1 = torch.nn.Linear(28 * 28, 128)
    self.layer_2 = torch.nn.Linear(128, 128)
    self.layer_3 = torch.nn.Linear(128, 256)
    self.layer_4 = torch.nn.Linear(256, 10)

  def forward(self, x):
    batch_size, channels, height, width = x.size()

    # (b, 1, 28, 28) -> (b, 1*28*28)
    x = x.view(batch_size, -1)

    # layer 1
    x = self.layer_1(x)
    x = torch.relu(x)

    # layer 2
    x = self.layer_2(x)
    x = torch.relu(x)

    # layer 3
    x = self.layer_3(x)
    x = torch.relu(x)

    # layer 4
    x = self.layer_4(x)

    # log-probabilities over the 10 labels
    x = torch.log_softmax(x, dim=1)

    return x


# ----------------
# DATA
# ----------------
transform=transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
mnist_train = MNIST(os.getcwd(), train=True, download=True, transform=transform)
mnist_test = MNIST(os.getcwd(), train=False, download=True, transform=transform)

# train (55,000 images), val split (5,000 images)
mnist_train, mnist_val = random_split(mnist_train, [55000, 5000])

# The dataloaders handle shuffling, batching, etc...
mnist_train = DataLoader(mnist_train, batch_size=64)
mnist_val = DataLoader(mnist_val, batch_size=64)
mnist_test = DataLoader(mnist_test, batch_size=64)

# ----------------
# OPTIMIZER
# ----------------
pytorch_model = MNISTClassifier()
optimizer = torch.optim.Adam(pytorch_model.parameters(), lr=1e-3)

# ----------------
# LOSS
# ----------------
def cross_entropy_loss(logits, labels):
  return F.nll_loss(logits, labels)

# ----------------
# TRAINING LOOP
# ----------------
num_epochs = 1
for epoch in range(num_epochs):

  # TRAINING LOOP
  for train_batch in mnist_train:
    x, y = train_batch

    logits = pytorch_model(x)
    loss = cross_entropy_loss(logits, y)
    print('train loss: ', loss.item())

    loss.backward()

    optimizer.step()
    optimizer.zero_grad()

  # VALIDATION LOOP
  with torch.no_grad():
    val_loss = []
    for val_batch in mnist_val:
      x, y = val_batch
      logits = pytorch_model(x)
      val_loss.append(cross_entropy_loss(logits, y).item())

    val_loss = torch.mean(torch.tensor(val_loss))
    print('val_loss: ', val_loss.item())



train loss:  2.2945244312286377
train loss:  2.268749475479126
train loss:  2.2636992931365967
train loss:  2.19991397857666
train loss:  2.224827766418457
train loss:  2.097792625427246
train loss:  2.0646939277648926
train loss:  2.069551467895508
train loss:  1.9330109357833862
train loss:  1.8593388795852661
train loss:  1.7185806035995483
train loss:  1.673564076423645
train loss:  1.474409580230713
train loss:  1.4042366743087769
train loss:  1.3478622436523438
train loss:  1.2908762693405151
train loss:  1.3241838216781616
train loss:  0.964457094669342
train loss:  1.1541366577148438
train loss:  1.035294771194458
train loss:  0.9673970937728882
train loss:  0.9156351685523987
train loss:  0.7617364525794983
train loss:  0.8678506016731262
train loss:  0.9399540424346924
train loss:  0.5769922733306885
train loss:  1.0428754091262817
train loss:  0.7471843957901001
train loss:  0.6398500204086304
train loss:  0.8257083296775818
train loss:  0.6847691535949707
train loss:  0.590