# PyTorch Lightning
<img src="https://raw.githubusercontent.com/yousraaa92/pytorch-deep-learning-workshop/main/Images/lightning.png" alt="" width=600 height=250/>

## What is PyTorch Lightning?


PyTorch Lightning is a high-level, open-source framework built on PyTorch that aims to simplify the training and deployment of models by providing a lightweight, standardized interface. It was designed with academics in mind: by abstracting away boilerplate code and repetitive tasks, it lets them experiment with novel deep learning and machine learning models while encouraging a more structured and organized approach to development.

## Advantages of PyTorch Lightning

* It is easy to install using pip.

* Code written with the framework tends to be simple, clean, and easy to reproduce, because the engineering code (training loops, device handling) is kept separate from the model code.

* It supports 16-bit precision, which helps speed up model training.

* It supports distributed training, including training across multiple machines at the same time.

* It integrates easily with other popular machine learning tools. For example, it supports Google’s TensorBoard.

* Compared to plain PyTorch, it adds only a small runtime overhead (reported at about 300 ms per epoch), so it remains fast.

* Its models are hardware agnostic: the same code can run on a CPU, GPU, or TPU machine.
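Several of the advantages above correspond to arguments of Lightning's `Trainer`. The following is a minimal sketch, assuming the Lightning 2.x argument names (`accelerator`, `devices`, `num_nodes`, `precision`, `max_epochs`). The helper `build_trainer_kwargs` is hypothetical and only assembles a dictionary, so it runs without Lightning or any special hardware.

```python
def build_trainer_kwargs(use_gpu: bool, num_nodes: int = 1, max_epochs: int = 10) -> dict:
    """Assemble keyword arguments for pl.Trainer (hypothetical helper)."""
    return {
        # Hardware agnostic: the same model code runs on CPU or GPU.
        "accelerator": "gpu" if use_gpu else "cpu",
        "devices": "auto",
        # Distributed training: number of machines taking part.
        "num_nodes": num_nodes,
        # 16-bit mixed precision on GPU, full 32-bit precision on CPU.
        "precision": "16-mixed" if use_gpu else "32-true",
        "max_epochs": max_epochs,
    }


cpu_kwargs = build_trainer_kwargs(use_gpu=False)
gpu_kwargs = build_trainer_kwargs(use_gpu=True, num_nodes=2)
print(cpu_kwargs)
print(gpu_kwargs)

# With Lightning installed, the dictionary would be passed on as:
#   import pytorch_lightning as pl
#   trainer = pl.Trainer(**gpu_kwargs)
```

Keeping these settings in one place means the model class itself never has to mention devices or precision.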


## Is PyTorch Lightning Better Than PyTorch?
If your project values code organization, reproducibility for experimentation, and a high degree of scalability, PyTorch Lightning can significantly simplify the development process. If your project values flexibility and fine-grained control, you may want to stick with PyTorch.

## PyTorch Lightning Example Walkthrough

### Installing PyTorch Lightning

In [None]:
! pip install pytorch-lightning

Collecting pytorch-lightning
  Downloading pytorch_lightning-2.1.1-py3-none-any.whl (776 kB)
Collecting torchmetrics>=0.7.0 (from pytorch-lightning)
  Downloading torchmetrics-1.2.0-py3-none-any.whl (805 kB)
Collecting lightning-utilities>=0.8.0 (from pytorch-lightning)
  Downloading lightning_utilities-0.9.0-py3-none-any.whl (23 kB)
Installing collected packages: 

### Necessary Imports

In [None]:
import pytorch_lightning as pl
import torch
import torch.nn as nn
import torch.nn.functional as F
import os

### PyTorch Model

In [None]:
# Setup device agnostic code
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Define the model
class MNISTClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 10) # a single linear layer: 784 input pixels -> 10 class logits

    def forward(self, x):
        x = x.view(-1, 784) # flatten each 28x28 image into a 784-element vector
        x = self.fc1(x)
        return x

# Define the training data
train_dataloader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST(
        root='./data',
        train=True,
        download=True,
        transform=transforms.ToTensor()
    ),
    batch_size=32,
    shuffle=True
)

# Define the test data
test_dataloader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST(
        root='./data',
        train=False,
        download=True,
        transform=transforms.ToTensor()
    ),
    batch_size=32,
    shuffle=False
)

# Create an instance of the model
model = MNISTClassifier()

# Define the optimizer
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Define the loss function
loss_fn = nn.CrossEntropyLoss()



Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 9912422/9912422 [00:00<00:00, 112105611.37it/s]


Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 28881/28881 [00:00<00:00, 32174155.07it/s]

Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz



100%|██████████| 1648877/1648877 [00:00<00:00, 23785239.56it/s]


Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 4542/4542 [00:00<00:00, 2237290.52it/s]


Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw
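Before training, it can help to check the shapes that flow through the model. The sketch below re-declares the same single-layer classifier and feeds it a random batch shaped like MNIST images (the random tensor `fake_batch` is an assumption used only for illustration, so no download is needed).

```python
import torch
import torch.nn as nn


class MNISTClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 10)

    def forward(self, x):
        x = x.view(-1, 784)  # (batch, 1, 28, 28) -> (batch, 784)
        return self.fc1(x)


shape_check_model = MNISTClassifier()

# A fake batch of 32 grayscale 28x28 images
fake_batch = torch.rand(32, 1, 28, 28)
logits = shape_check_model(fake_batch)
print(logits.shape)  # torch.Size([32, 10])

# 784 * 10 weights + 10 biases
num_params = sum(p.numel() for p in shape_check_model.parameters())
print(num_params)  # 7850
```

The parameter count of 7,850 is the figure that Lightning's model summary later rounds to 7.9 K.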



#### Training the Model

In [None]:
def train_step(model: torch.nn.Module,
               data_loader: torch.utils.data.DataLoader,
               loss_fn: torch.nn.Module,
               optimizer: torch.optim.Optimizer,
               accuracy_fn,
               device: torch.device = device):
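    train_loss, train_acc = 0, 0
    model.to(device)
    model.train() # put model in training mode (test_step switches it to eval mode)
    for batch, (X, y) in enumerate(data_loader):
        # Send data to the target device (GPU if available, otherwise CPU)
        X, y = X.to(device), y.to(device)

        # 1. Forward pass
        y_pred = model(X)

        # 2. Calculate loss
        loss = loss_fn(y_pred, y)
        train_loss += loss.item() # accumulate a plain float so the autograd graph is not kept alive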
        train_acc += accuracy_fn(y_true=y,
                                 y_pred=y_pred.argmax(dim=1)) # Go from logits -> pred labels

        # 3. Optimizer zero grad
        optimizer.zero_grad()

        # 4. Loss backward
        loss.backward()

        # 5. Optimizer step
        optimizer.step()

    # Calculate loss and accuracy per epoch and print out what's happening
    train_loss /= len(data_loader)
    train_acc /= len(data_loader)
    print(f"Train loss: {train_loss:.5f} | Train accuracy: {train_acc:.2f}%")
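The five numbered steps in `train_step` can be tried in isolation on a toy problem. This is a minimal sketch, assuming a small synthetic dataset (`X`, `y`) and a tiny model (`toy_model`) that are not part of the workshop. It only illustrates that repeating forward pass, loss, `zero_grad`, `backward`, and `step` drives the loss down.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data (an assumption for illustration): 64 samples, 4 features, 2 classes
X = torch.randn(64, 4)
y = (X.sum(dim=1) > 0).long()

toy_model = nn.Linear(4, 2)
toy_loss_fn = nn.CrossEntropyLoss()
toy_optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.1)

losses = []
for step in range(50):
    y_pred = toy_model(X)            # 1. forward pass
    loss = toy_loss_fn(y_pred, y)    # 2. calculate loss
    toy_optimizer.zero_grad()        # 3. clear gradients from the previous step
    loss.backward()                  # 4. backpropagate
    toy_optimizer.step()             # 5. update the weights
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f} | last loss: {losses[-1]:.4f}")
```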

In [None]:
def test_step(data_loader: torch.utils.data.DataLoader,
              model: torch.nn.Module,
              loss_fn: torch.nn.Module,
              accuracy_fn,
              device: torch.device = device):
    test_loss, test_acc = 0, 0
    model.to(device)
    model.eval() # put model in eval mode
    # Turn on inference context manager
    with torch.inference_mode():
        for X, y in data_loader:
            # Send data to the target device (GPU if available, otherwise CPU)
            X, y = X.to(device), y.to(device)

            # 1. Forward pass
            test_pred = model(X)

            # 2. Calculate loss and accuracy
            test_loss += loss_fn(test_pred, y).item()
            test_acc += accuracy_fn(y_true=y,
                y_pred=test_pred.argmax(dim=1) # Go from logits -> pred labels
            )

        # Adjust metrics and print out
        test_loss /= len(data_loader)
        test_acc /= len(data_loader)
        print(f"Test loss: {test_loss:.5f} | Test accuracy: {test_acc:.2f}%\n")
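The effect of the `torch.inference_mode()` context manager used in `test_step` can be seen on a single layer. This is a small sketch with a stand-alone `nn.Linear` (the names `layer`, `out_train`, and `out_eval` are illustrative only): outputs computed inside the context do not track gradients, which saves memory and time during evaluation.

```python
import torch
import torch.nn as nn

layer = nn.Linear(784, 10)
x = torch.rand(4, 784)

# Normal forward pass: the output is part of the autograd graph
out_train = layer(x)

# Forward pass inside inference mode: no graph is recorded
with torch.inference_mode():
    out_eval = layer(x)

print(out_train.requires_grad)  # True
print(out_eval.requires_grad)   # False
```

Note that `model.eval()` is a separate switch: it changes the behavior of layers such as dropout and batch normalization, and has no visible effect on a plain linear layer.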

In [None]:
# Calculate accuracy (a classification metric)
def accuracy_fn(y_true, y_pred):
    correct = torch.eq(y_true, y_pred).sum().item() # torch.eq() calculates where two tensors are equal
    acc = (correct / len(y_pred)) * 100
    return acc
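A quick worked example shows what `accuracy_fn` returns. The logits and labels below are made up for illustration: three of the four predictions match the labels, so the result is 75%.

```python
import torch


def accuracy_fn(y_true, y_pred):
    correct = torch.eq(y_true, y_pred).sum().item()
    return (correct / len(y_pred)) * 100


# Made-up logits for 4 samples and 3 classes
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.9, 0.8, 0.1],
                       [0.1, 0.2, 3.0]])
y_true = torch.tensor([0, 1, 1, 2])

y_pred = logits.argmax(dim=1)  # logits -> predicted labels: tensor([0, 1, 0, 2])
acc = accuracy_fn(y_true, y_pred)
print(acc)  # 75.0
```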

In [None]:
from tqdm import tqdm

torch.manual_seed(42)


# Train and test model
epochs = 10
for epoch in tqdm(range(epochs)):
    print(f"Epoch: {epoch}\n---------")
    train_step(data_loader=train_dataloader,
        model=model,
        loss_fn=loss_fn,
        optimizer=optimizer,
        accuracy_fn=accuracy_fn,
        device=device
    )
    test_step(data_loader=test_dataloader,
        model=model,
        loss_fn=loss_fn,
        accuracy_fn=accuracy_fn,
        device=device
    )

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 0
---------
Train loss: 0.46445 | Train accuracy: 88.17%


 10%|█         | 1/10 [00:11<01:44, 11.58s/it]

Test loss: 0.30565 | Test accuracy: 91.46%

Epoch: 1
---------
Train loss: 0.30265 | Train accuracy: 91.53%


 20%|██        | 2/10 [00:22<01:31, 11.45s/it]

Test loss: 0.28035 | Test accuracy: 92.12%

Epoch: 2
---------
Train loss: 0.28275 | Train accuracy: 92.14%


 30%|███       | 3/10 [00:34<01:19, 11.43s/it]

Test loss: 0.27088 | Test accuracy: 92.35%

Epoch: 3
---------
Train loss: 0.27282 | Train accuracy: 92.38%


 40%|████      | 4/10 [00:44<01:05, 10.87s/it]

Test loss: 0.26963 | Test accuracy: 92.41%

Epoch: 4
---------
Train loss: 0.26630 | Train accuracy: 92.60%


 50%|█████     | 5/10 [00:55<00:55, 11.02s/it]

Test loss: 0.26322 | Test accuracy: 92.61%

Epoch: 5
---------
Train loss: 0.26159 | Train accuracy: 92.69%


 60%|██████    | 6/10 [01:07<00:44, 11.14s/it]

Test loss: 0.26850 | Test accuracy: 92.43%

Epoch: 6
---------
Train loss: 0.25786 | Train accuracy: 92.76%


 70%|███████   | 7/10 [01:18<00:33, 11.25s/it]

Test loss: 0.26540 | Test accuracy: 92.71%

Epoch: 7
---------
Train loss: 0.25492 | Train accuracy: 92.96%


 80%|████████  | 8/10 [01:28<00:21, 10.93s/it]

Test loss: 0.26678 | Test accuracy: 92.61%

Epoch: 8
---------
Train loss: 0.25280 | Train accuracy: 93.02%


 90%|█████████ | 9/10 [01:40<00:11, 11.04s/it]

Test loss: 0.26387 | Test accuracy: 92.60%

Epoch: 9
---------
Train loss: 0.25086 | Train accuracy: 93.10%


100%|██████████| 10/10 [01:51<00:00, 11.14s/it]

Test loss: 0.26918 | Test accuracy: 92.53%






### PyTorch Lightning Model

In [27]:
import torch.optim as optim
import pytorch_lightning as pl

# Define the LightningClassifier
class MNISTLightningClassifier(pl.LightningModule):

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 10)

    def forward(self, x):
        x = x.view(-1, 784) # flatten each 28x28 image into a 784-element vector
        x = self.fc1(x)
        return x

    def training_step(self, batch, batch_idx):
        x, y = batch
        output = self(x) # forward pass
        loss = F.cross_entropy(output, y) # calculate the loss
        self.log('train_loss', loss)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        output = self(x)
        loss = F.cross_entropy(output, y)
        self.log('val_loss', loss)
        return loss

    def test_step(self, batch, batch_idx):
        x, y = batch
        output = self(x)
        loss = F.cross_entropy(output, y) #calculate the loss
        self.log('test_loss', loss)
        return loss


    def configure_optimizers(self):
        # Lightning calls this to obtain the optimizer; learning-rate schedulers can also be returned here
        optimizer = optim.Adam(self.parameters(), lr=0.001)
        return {"optimizer": optimizer}
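The Lightning module calls `F.cross_entropy`, while the plain PyTorch version uses `nn.CrossEntropyLoss()`. The sketch below, using random logits and targets as stand-ins, suggests that both compute the same quantity, so the two versions optimize the same objective. The manual computation from `log_softmax` is included only to make the formula explicit.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Random stand-ins for a batch of 8 predictions over 10 classes
logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))

loss_module = nn.CrossEntropyLoss()(logits, targets)
loss_functional = F.cross_entropy(logits, targets)

# Cross-entropy by hand: mean negative log-probability of the correct class
log_probs = F.log_softmax(logits, dim=1)
loss_manual = -log_probs[torch.arange(8), targets].mean()

print(torch.allclose(loss_module, loss_functional))  # True
print(torch.allclose(loss_module, loss_manual))      # True
```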

#### DataModule in PyTorch Lightning


In [28]:
import os
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
from torchvision.datasets import MNIST

class MNISTDataModule(pl.LightningDataModule):

    def setup(self, stage=None):
        # transforms standard to MNIST: convert to tensor, then normalize with the dataset mean and std
        transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])

        # download (if needed) and load the training and test sets
        mnist_train = MNIST(os.getcwd(), train=True, download=True, transform=transform)
        self.mnist_test = MNIST(os.getcwd(), train=False, download=True, transform=transform)

        # hold out 5,000 of the 60,000 training images for validation
        self.mnist_train, self.mnist_val = random_split(mnist_train, [55000, 5000])

    def train_dataloader(self):
        return DataLoader(self.mnist_train, batch_size=64)

    def val_dataloader(self):
        return DataLoader(self.mnist_val, batch_size=64)

    def test_dataloader(self):
        return DataLoader(self.mnist_test, batch_size=64)
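The `random_split` call in `setup` can be explored on a small stand-in dataset. This is a minimal sketch, assuming a 60-sample `TensorDataset` in place of the 60,000 MNIST images. Passing a seeded `torch.Generator` (an optional extra, not used in the DataModule above) makes the split reproducible.

```python
import math

import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset (an assumption for illustration): 60 samples instead of 60,000
full = TensorDataset(torch.arange(60).float().unsqueeze(1),
                     torch.zeros(60, dtype=torch.long))

gen = torch.Generator().manual_seed(42)
train_set, val_set = random_split(full, [55, 5], generator=gen)

train_idx, val_idx = set(train_set.indices), set(val_set.indices)
print(len(train_set), len(val_set))    # 55 5
print(train_idx.isdisjoint(val_idx))   # True: no sample lands in both splits

# The same seed reproduces the same split
gen_again = torch.Generator().manual_seed(42)
train_again, _ = random_split(full, [55, 5], generator=gen_again)
print(train_set.indices == train_again.indices)  # True

# With 55,000 training images and batch_size=64, the last batch is partial
batches_per_epoch = math.ceil(55000 / 64)
print(batches_per_epoch, batches_per_epoch * 10)  # 860 8600
```

The final figure, 8,600 steps over 10 epochs, matches the `step=8600` that appears in the checkpoint filename in the training log.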

#### The Trainer Class

In [29]:
dm = MNISTDataModule()

model = MNISTLightningClassifier()

trainer = pl.Trainer(max_epochs=10)

# Train the model
trainer.fit(model, dm)

# Evaluate the model
trainer.test(dataloaders=dm.test_dataloader())


INFO:pytorch_lightning.utilities.rank_zero:GPU available: False, used: False
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.callbacks.model_summary:
  | Name | Type   | Params
--------------------------------
0 | fc1  | Linear | 7.9 K 
--------------------------------
7.9 K     Trainable params
0         Non-trainable params
7.9 K     Total params
0.031     Total estimated model params size (MB)


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=10` reached.
INFO:pytorch_lightning.utilities.rank_zero:Restoring states from the checkpoint path at /content/lightning_logs/version_7/checkpoints/epoch=9-step=8600.ckpt
INFO:pytorch_lightning.utilities.rank_zero:Loaded model weights from the checkpoint at /content/lightning_logs/version_7/checkpoints/epoch=9-step=8600.ckpt


Testing: |          | 0/? [00:00<?, ?it/s]

[{'test_loss': 0.28838077187538147}]