# **Going Deeper -- the Mechanics of PyTorch (Part 3/3)**

### **Higher-level PyTorch APIs: a short introduction to PyTorch Lightning**

- This section explores PyTorch Lightning.
- Lightning makes training deep neural networks simpler by removing much of the boilerplate code.
- Its focus lies in simplicity and flexibility.
- It also gives us access to many advanced features, such as multi-GPU support and fast low-precision training.

#### **Setting up the PyTorch Lightning model**

- All that is required to implement a Lightning model is to subclass `LightningModule` instead of the regular PyTorch `nn.Module`.

In [1]:
import pytorch_lightning as pl
import torch
import torch.nn as nn

from torchmetrics import __version__ as torchmetrics_version
from pkg_resources import parse_version

from torchmetrics import Accuracy

In [2]:
print(torchmetrics_version)

1.8.2


In [3]:
parse_version(pl.__version__)

<Version('2.5.5')>

In [4]:
class MultiLayerPerceptron(pl.LightningModule):
    def __init__(self, image_shape=(1, 28, 28), hidden_units=(32, 16)):
        super().__init__()
        
        # new PL attributes:
        if parse_version(torchmetrics_version) > parse_version("0.8"):
            self.train_acc = Accuracy(task="multiclass", num_classes=10)
            self.valid_acc = Accuracy(task="multiclass", num_classes=10)
            self.test_acc = Accuracy(task="multiclass", num_classes=10)
        else:
            self.train_acc = Accuracy() # track accuracy during training
            self.valid_acc = Accuracy() # track accuracy during validation
            self.test_acc = Accuracy() # track accuracy during testing
        
        # Model similar to previous section:
        input_size = image_shape[0] * image_shape[1] * image_shape[2] 
        all_layers = [nn.Flatten()]
        for hidden_unit in hidden_units: 
            layer = nn.Linear(input_size, hidden_unit) 
            all_layers.append(layer) 
            all_layers.append(nn.ReLU()) 
            input_size = hidden_unit 
 
        all_layers.append(nn.Linear(hidden_units[-1], 10)) 
        self.model = nn.Sequential(*all_layers)
    
    # forward pass (returns the logits)
    def forward(self, x):
        x = self.model(x)
        return x

    # defines a single forward pass during training
    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = nn.functional.cross_entropy(logits, y)
        preds = torch.argmax(logits, dim=1)
        self.train_acc.update(preds, y)
        self.log("train_loss", loss, prog_bar=True)
        return loss

    # Conditionally define the epoch-end hooks based on the PyTorch Lightning version
    if parse_version(pl.__version__) >= parse_version("2.0"):
        # For PyTorch Lightning 2.0 and above
        def on_train_epoch_end(self):
            self.log("train_acc", self.train_acc.compute())
            self.train_acc.reset()

        def on_validation_epoch_end(self):
            self.log("valid_acc", self.valid_acc.compute())
            self.valid_acc.reset()

        def on_test_epoch_end(self):
            self.log("test_acc", self.test_acc.compute())
            self.test_acc.reset()

    else:
        # For PyTorch Lightning < 2.0
        def training_epoch_end(self, outs):
            self.log("train_acc", self.train_acc.compute())
            self.train_acc.reset()

        def validation_epoch_end(self, outs):
            self.log("valid_acc", self.valid_acc.compute())
            self.valid_acc.reset()

        def test_epoch_end(self, outs):
            self.log("test_acc", self.test_acc.compute())
            self.test_acc.reset()
    
    
    def validation_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = nn.functional.cross_entropy(logits, y)  # reuse logits; avoids a second forward pass
        preds = torch.argmax(logits, dim=1)
        self.valid_acc.update(preds, y)
        self.log("valid_loss", loss, prog_bar=True)
        self.log("valid_acc", self.valid_acc.compute(), prog_bar=True)
        return loss


    def test_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = nn.functional.cross_entropy(logits, y)
        preds = torch.argmax(logits, dim=1)
        self.test_acc.update(preds, y)
        self.log("test_loss", loss, prog_bar=True)
        self.log("test_acc", self.test_acc.compute(), prog_bar=True)
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        return optimizer

Let’s now discuss the different methods one by one. As you can see, the `__init__` constructor contains the same model code that we used in a previous subsection. What’s new is that we added accuracy attributes such as `self.train_acc = Accuracy()` (or `Accuracy(task="multiclass", num_classes=10)` for newer `torchmetrics` versions). These allow us to track the accuracy during training, validation, and testing. `Accuracy` is imported from the `torchmetrics` package, which should be installed automatically with Lightning. More information can be found at https://torchmetrics.readthedocs.io/en/latest/pages/quickstart.html.
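As a quick sanity check on the layer stack built in `__init__`, the following sketch rebuilds the same `nn.Sequential` outside of Lightning and counts its parameters. The helper name `build_mlp` is an illustrative assumption and not part of the original code; the default sizes are the ones used above.

```python
import torch
import torch.nn as nn


def build_mlp(image_shape=(1, 28, 28), hidden_units=(32, 16), num_classes=10):
    """Rebuild the layer stack from MultiLayerPerceptron.__init__ (hypothetical helper)."""
    input_size = image_shape[0] * image_shape[1] * image_shape[2]
    layers = [nn.Flatten()]
    for hidden_unit in hidden_units:
        layers.append(nn.Linear(input_size, hidden_unit))
        layers.append(nn.ReLU())
        input_size = hidden_unit
    layers.append(nn.Linear(hidden_units[-1], num_classes))
    return nn.Sequential(*layers)


mlp = build_mlp()

# Each nn.Linear(in, out) holds in * out weights plus out biases.
num_params = sum(p.numel() for p in mlp.parameters())
expected = (784 * 32 + 32) + (32 * 16 + 16) + (16 * 10 + 10)
print(num_params, expected)

# A dummy batch of 4 "images" yields one row of 10 logits per example.
logits = mlp(torch.zeros(4, 1, 28, 28))
print(logits.shape)
```

The total of 25,818 parameters matches the 25.8 K that Lightning reports in the model summary when training starts.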


The `forward` method implements a simple forward pass that returns the logits (the outputs of the last fully connected layer of our network, before the softmax function is applied) when we call our model on the input data. The logits, computed via the `forward` method by calling `self(x)`, are used in the training, validation, and test steps, which we’ll describe next.
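To make the role of the logits concrete, here is a minimal sketch with made-up logit values (the numbers are arbitrary assumptions). It shows that `torch.argmax` picks the same class whether it is applied to the logits or to the softmax probabilities, and that `cross_entropy` expects raw logits because it applies the log-softmax internally.

```python
import torch
import torch.nn.functional as F

# Made-up logits for a batch of 2 examples and 3 classes (illustrative values).
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

probas = torch.softmax(logits, dim=1)           # each row sums to 1
preds_from_logits = torch.argmax(logits, dim=1)
preds_from_probas = torch.argmax(probas, dim=1)

# cross_entropy takes logits directly; it equals the NLL of the log-softmax.
loss = F.cross_entropy(logits, targets)
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(preds_from_logits, preds_from_probas)
print(loss.item(), loss_manual.item())
```

This is why the step methods above can pass `logits` to both `cross_entropy` and `torch.argmax` without an explicit softmax layer in the model.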


The `training_step`, `training_epoch_end` (named `on_train_epoch_end` in Lightning 2.0 and later), `validation_step`, `test_step`, and `configure_optimizers` methods are specifically recognized by Lightning. For instance, `training_step` defines a single forward pass during training, where we also keep track of the accuracy and loss so that we can analyze these later. Note that we update the accuracy state via `self.train_acc.update(preds, y)` but don’t log it yet. The `training_step` method is executed on each individual batch during training. The epoch-end method, which is executed at the end of each training epoch, then computes the training set accuracy from the values accumulated over the batches and resets the metric.
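The `update` / `compute` / `reset` pattern can be illustrated with a small stand-in class. This is a simplified sketch written for these notes, not the `torchmetrics` implementation; the class name `RunningAccuracy` and the toy batches are assumptions.

```python
import torch


class RunningAccuracy:
    """Toy stand-in for a stateful accuracy metric (not the torchmetrics class)."""

    def __init__(self):
        self.reset()

    def update(self, preds, target):
        # Accumulate counts only; nothing is reported yet.
        self.correct += int((preds == target).sum())
        self.total += target.numel()

    def compute(self):
        # Accuracy over everything seen since the last reset.
        return self.correct / self.total

    def reset(self):
        self.correct = 0
        self.total = 0


metric = RunningAccuracy()

# Two toy "batches" standing in for the calls made in training_step.
metric.update(torch.tensor([0, 1, 2, 2]), torch.tensor([0, 1, 2, 1]))  # 3 of 4 correct
metric.update(torch.tensor([1, 1]), torch.tensor([1, 0]))              # 1 of 2 correct

epoch_acc = metric.compute()   # what the epoch-end hook would log
metric.reset()                 # start the next epoch from zero
print(epoch_acc)
```

Note that the epoch accuracy (4 of 6 correct) comes from the accumulated counts. Averaging the two per-batch accuracies (0.75 and 0.5) would give a different value because the batches differ in size.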


The `validation_step` and `test_step` methods define, analogous to the `training_step` method, how the validation and test evaluation should be computed. Similar to `training_step`, each `validation_step` and `test_step` receives a single batch, which is why we track the accuracy via the respective attributes derived from `Accuracy` of `torchmetrics`. However, note that `validation_step` is only called at certain intervals, for example, after each training epoch. This is why we log the validation accuracy inside the validation step, whereas we log the training accuracy only once after each training epoch; otherwise, the accuracy plot that we inspect later would look too noisy.

Finally, via the `configure_optimizers` method, we specify the optimizer used for training.

#### **Setting up the data loaders for Lightning**

There are three ways to prepare the dataset for Lightning:

- Make the dataset part of the model.
- Set up the data loaders as usual and feed them to the `fit` method of a Lightning `Trainer`.
- Create a `LightningDataModule`.
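For comparison, the second option (plain data loaders) might look like the sketch below. The random tensors stand in for MNIST so that the snippet runs without downloading anything; the variable names are assumptions. The commented-out final line shows how such loaders would be handed to `fit`. The rest of this section uses the third option.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(1)

# Random stand-in data shaped like MNIST images (assumption: 1x28x28, 10 classes).
train_ds = TensorDataset(torch.rand(256, 1, 28, 28), torch.randint(0, 10, (256,)))
val_ds = TensorDataset(torch.rand(64, 1, 28, 28), torch.randint(0, 10, (64,)))

train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=64)

x, y = next(iter(train_loader))
print(x.shape, y.shape, len(train_loader), len(val_loader))

# With a Trainer and a LightningModule at hand, the loaders are passed to fit:
# trainer.fit(model, train_dataloaders=train_loader, val_dataloaders=val_loader)
```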

In [5]:
from torch.utils.data import DataLoader
from torch.utils.data import random_split
 
from torchvision.datasets import MNIST
from torchvision import transforms

In [6]:
class MnistDataModule(pl.LightningDataModule):
    def __init__(self, data_path='../data'):
        super().__init__()
        self.data_path = data_path
        self.transform = transforms.Compose([transforms.ToTensor()])
        
    def prepare_data(self):
        MNIST(root=self.data_path, download=True) 

    def setup(self, stage=None):
        # stage is either 'fit', 'validate', 'test', or 'predict';
        # it is not relevant here
        mnist_all = MNIST( 
            root=self.data_path,
            train=True,
            transform=self.transform,  
            download=False
        ) 

        self.train, self.val = random_split(
            mnist_all, [55000, 5000], generator=torch.Generator().manual_seed(1)
        )

        self.test = MNIST( 
            root=self.data_path,
            train=False,
            transform=self.transform,  
            download=False
        ) 

    def train_dataloader(self):
        return DataLoader(self.train, batch_size=64, num_workers=4, 
                          persistent_workers=True)

    def val_dataloader(self):
        return DataLoader(self.val, batch_size=64, num_workers=4,
                          persistent_workers=True)

    def test_dataloader(self):
        return DataLoader(self.test, batch_size=64, num_workers=4,
                          persistent_workers=True)
    

torch.manual_seed(1) 
mnist_dm = MnistDataModule()

- In the `prepare_data` method, we define general steps, such as downloading the dataset.
- In the `setup` method, we define the datasets used for training, validation, and testing.
- The `random_split` call in `setup` reserves `55000` examples for training and `5000` examples for validation.
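The seeded generator passed to `random_split` is what makes the train/validation split reproducible across runs. The sketch below demonstrates this on a tiny stand-in dataset; the dataset, the 90/10 sizes, and the helper name `split` are assumptions for illustration.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Tiny stand-in dataset: 100 examples instead of the MNIST training set.
dataset = TensorDataset(torch.arange(100))


def split(seed):
    # Same call pattern as in setup(), scaled down to a 90/10 split.
    return random_split(dataset, [90, 10],
                        generator=torch.Generator().manual_seed(seed))


train_a, val_a = split(seed=1)
train_b, val_b = split(seed=1)

# Subset.indices holds the positions drawn from the original dataset.
same_split = val_a.indices == val_b.indices
no_overlap = set(train_a.indices).isdisjoint(val_a.indices)
print(len(train_a), len(val_a), same_split, no_overlap)
```

Because the generator is created inside the call, the split does not depend on the global random state, so `setup` yields the same subsets every time it runs.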

#### **Training the model using the PyTorch Lightning Trainer class**

- Lightning implements a `Trainer` class that makes training the model convenient by taking care of all the intermediate steps, such as calling `zero_grad()`, `backward()`, and `optimizer.step()` for us.
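To see what is being automated, here is a rough sketch of the equivalent hand-written loop in plain PyTorch. The toy regression data, the model, and the hyperparameters are assumptions chosen so the snippet runs quickly; Lightning's `Trainer` additionally handles things like device placement, logging, and checkpointing.

```python
import torch
import torch.nn as nn

torch.manual_seed(1)

# Toy regression data and model (illustrative assumptions, not MNIST).
X = torch.rand(32, 4)
y = X.sum(dim=1, keepdim=True)
model = nn.Linear(4, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

losses = []
for step in range(200):
    loss = loss_fn(model(X), y)   # forward pass (what training_step returns)
    optimizer.zero_grad()         # clear gradients from the previous step
    loss.backward()               # backpropagate
    optimizer.step()              # update the parameters
    losses.append(loss.item())

print(losses[0], losses[-1])
```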

In [7]:
from pytorch_lightning.callbacks import ModelCheckpoint

mnistclassifier = MultiLayerPerceptron()

callbacks = [ModelCheckpoint(save_top_k=1, mode='max', monitor="valid_acc")] # save top 1 model

if torch.backends.mps.is_available():          # Apple Silicon GPU
    trainer = pl.Trainer(
        max_epochs=10,
        callbacks=callbacks,
        accelerator="mps",
        devices=1
    )
elif torch.cuda.is_available():                 # NVIDIA GPU
    trainer = pl.Trainer(
        max_epochs=10,
        callbacks=callbacks,
        accelerator="gpu",
        devices=1
    )
else:                                           # CPU fallback
    trainer = pl.Trainer(
        max_epochs=10,
        callbacks=callbacks,
        accelerator="cpu"
    )

trainer.fit(model=mnistclassifier, datamodule=mnist_dm)


GPU available: True (mps), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs

  | Name      | Type               | Params | Mode 
---------------------------------------------------------
0 | train_acc | MulticlassAccuracy | 0      | train
1 | valid_acc | MulticlassAccuracy | 0      | train
2 | test_acc  | MulticlassAccuracy | 0      | train
3 | model     | Sequential         | 25.8 K | train
---------------------------------------------------------
25.8 K    Trainable params
0         Non-trainable params
25.8 K    Total params
0.103     Total estimated model params size (MB)
10        Modules in train mode
0         Modules in eval mode


Epoch 9: 100%|██████████| 860/860 [00:07<00:00, 120.50it/s, v_num=4, train_loss=0.248, valid_loss=0.166, valid_acc=0.951]  

`Trainer.fit` stopped: `max_epochs=10` reached.


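The three `Trainer` branches above differ only in the `accelerator` argument, so the device check could be factored into a small helper. The function name `pick_accelerator` is hypothetical. Lightning's `Trainer` also accepts `accelerator="auto"`, which may make such a helper unnecessary.

```python
import torch


def pick_accelerator():
    """Return the accelerator string used in the if/elif/else block above."""
    # getattr guards against older PyTorch builds without the mps backend.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"   # Apple Silicon GPU
    if torch.cuda.is_available():
        return "gpu"   # NVIDIA GPU
    return "cpu"       # CPU fallback


accelerator = pick_accelerator()
print(accelerator)

# The Trainer then only needs to be constructed once, for example:
# trainer = pl.Trainer(max_epochs=10, callbacks=callbacks,
#                      accelerator=accelerator, devices=1)
```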


#### **Evaluating the model using TensorBoard**

- We can visualize the metrics that we logged in our Lightning model earlier in TensorBoard.
- Lightning supports other loggers as well.


- By default, Lightning tracks the training in a subfolder named `lightning_logs`.
- To visualize the training runs, you can execute the following code in the command-line terminal, which will open `TensorBoard` in your browser:

```bash
tensorboard --logdir lightning_logs/
```

In [13]:
import tensorboard

In [15]:
# Start tensorboard
# %load_ext tensorboard
# %tensorboard --logdir lightning_logs/

- By looking at the training and validation accuracies in TensorBoard, we can hypothesize that training the model for a few additional epochs can improve performance.
- Lightning allows us to load a trained model and train it for additional epochs conveniently.
- Lightning tracks the individual training runs via subfolders.


- We can use the following code to load the latest model checkpoint from this folder and continue training the model via `fit`:

In [17]:
path = 'lightning_logs/version_4/checkpoints/epoch=9-step=8600.ckpt'


if torch.backends.mps.is_available():          # Apple Silicon GPU
    trainer = pl.Trainer(
        max_epochs=15,
        callbacks=callbacks,
        accelerator="mps",
        devices=1
    )
elif torch.cuda.is_available():                 # NVIDIA GPU
    trainer = pl.Trainer(
        max_epochs=15,
        callbacks=callbacks,
        accelerator="gpu",
        devices=1
    )
    
else:                                           # CPU fallback
    trainer = pl.Trainer(
        max_epochs=15,
        callbacks=callbacks,
        accelerator="cpu"
    )


trainer.fit(model=mnistclassifier, datamodule=mnist_dm, ckpt_path=path)

Trainer already configured with model summary callbacks: [<class 'pytorch_lightning.callbacks.model_summary.ModelSummary'>]. Skipping setting a default `ModelSummary` callback.
GPU available: True (mps), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
/Users/user/Projects/ML-PyTorch-SkLearn/venv/lib/python3.9/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:751: Checkpoint directory /Users/user/Projects/ML-PyTorch-SkLearn/chp13_Going_Deeper_The_Mechanics_of_PyTorch/lightning_logs/version_4/checkpoints exists and is not empty.
Restoring states from the checkpoint path at lightning_logs/version_4/checkpoints/epoch=9-step=8600.ckpt

  | Name      | Type               | Params | Mode 
---------------------------------------------------------
0 | train_acc | MulticlassAccuracy | 0      | train
1 | valid_acc | MulticlassAccuracy | 0      | train
2 | test_acc  | MulticlassAccuracy | 0      | train
3 | model     | Sequential         | 25.8 K

Epoch 14: 100%|██████████| 860/860 [00:08<00:00, 106.86it/s, v_num=5, train_loss=0.161, valid_loss=0.165, valid_acc=0.952]  

`Trainer.fit` stopped: `max_epochs=15` reached.


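The numbers in the progress bars and in the checkpoint file name above follow directly from the dataset and batch sizes. The short calculation below reproduces them; the test set size of 10,000 is the standard MNIST test split, and the variable names are assumptions.

```python
import math

train_size, test_size, batch_size = 55000, 10000, 64

# A DataLoader without drop_last yields ceil(N / batch_size) batches.
train_batches = math.ceil(train_size / batch_size)
test_batches = math.ceil(test_size / batch_size)

# Global step stored in the checkpoint name after 10 full epochs.
steps_after_10_epochs = 10 * train_batches

# Epochs are counted from 0, so "epoch=9" means 10 epochs are done.
completed_epochs = 9 + 1
remaining_epochs = 15 - completed_epochs

print(train_batches, test_batches, steps_after_10_epochs, remaining_epochs)
# prints: 860 157 8600 5
```

This explains the `860/860` in the training progress bar, the `step=8600` in the checkpoint name, and why `max_epochs=15` results in five additional epochs. The 157 batches match what `trainer.test` reports later in this section.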


- `TensorBoard` allows us to show the results from the additional training epochs (`version_5` in the run above) next to the previous ones (`version_4`), which is very convenient. In the logs above, training for five more epochs improved the validation accuracy only slightly, from 0.951 to 0.952.

- Once we are finished with training, we can evaluate the model on the test set using the following code:

In [9]:
trainer.test(model=mnistclassifier, datamodule=mnist_dm)

Testing DataLoader 0: 100%|██████████| 157/157 [00:01<00:00, 138.13it/s]
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
        test_acc            0.9567000269889832
        test_loss           0.14652474224567413
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────


[{'test_loss': 0.14652474224567413, 'test_acc': 0.9567000269889832}]

- Note that PyTorch Lightning also saves the model automatically for us.
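
The checkpoints written during training can be restored with the `load_from_checkpoint` class method of a `LightningModule`. The underlying idea, saving and restoring a `state_dict`, can be sketched in plain PyTorch as follows. This is a simplified analog and not Lightning's checkpoint format; the small model and the in-memory buffer are assumptions chosen so the snippet runs without touching any files.

```python
import io

import torch
import torch.nn as nn

torch.manual_seed(1)


def make_model():
    # Small stand-in network (assumption), not the MNIST classifier above.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))


model = make_model()
x = torch.rand(5, 4)

# Save only the parameters; an in-memory buffer replaces a .ckpt file here.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)

# A freshly initialized model has different weights until the state is loaded.
restored = make_model()
restored.load_state_dict(torch.load(buffer))

with torch.no_grad():
    same_outputs = torch.allclose(model(x), restored(x))
print(same_outputs)
```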