Key Differences and Benefits

Simplified Training Loop

PyTorch Lightning eliminates the manual training loop. Instead of writing the epoch loop, batch iteration, and gradient management yourself, you define:

training_step() - what happens for one batch: the forward pass and the loss computation
configure_optimizers() - which optimizer to use

Automatic Features

Lightning automatically handles:

Gradient zeroing: No need for optimizer.zero_grad()
Backward pass: No need for loss.backward()
Optimizer step: No need for optimizer.step()
Device placement: Automatically moves tensors to GPU/CPU
Logging: Built-in integration with TensorBoard, Weights & Biases, etc.
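
For comparison, the steps listed above correspond roughly to the following hand-written loop in plain PyTorch. This is a minimal sketch, not what Lightning executes internally. The random data, the inline nn.Sequential model, and the names device, loader, and epoch_losses are assumptions chosen only to make the example self-contained.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Device placement is manual here; Lightning picks the device for you.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Illustrative model and random data (assumptions, not a real dataset).
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

data = TensorDataset(torch.rand(100, 1, 28, 28), torch.randint(0, 10, (100,)))
loader = DataLoader(data, batch_size=32, shuffle=True)

epoch_losses = []
for epoch in range(5):                       # epoch loop
    running, seen = 0.0, 0
    for inputs, labels in loader:            # batch iteration
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()                # gradient zeroing
        loss = criterion(model(inputs), labels)
        loss.backward()                      # backward pass
        optimizer.step()                     # optimizer step
        running += loss.item() * inputs.size(0)
        seen += inputs.size(0)
    epoch_losses.append(running / seen)      # sample-weighted mean loss
    print(f"epoch {epoch}: train_loss={epoch_losses[-1]:.3f}")  # manual logging
```

Everything inside the two for loops except the loss computation is what Lightning takes over once training_step() returns the loss.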

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
import torchvision.transforms as transforms
import pytorch_lightning as pl

In [2]:
class CustomDataset(Dataset):
    def __init__(self, data, labels, transform=None):
        self.data = data
        self.labels = labels
        self.transform = transform
    
    def __len__(self):
        return len(self.data)
    
    def __getitem__(self, idx):
        sample = self.data[idx]
        label = self.labels[idx]

        if self.transform:
            sample = self.transform(sample)
        return sample, label

In [3]:
# standard MNIST-style layout: (N, C, H, W)
X = torch.rand(100, 1, 28, 28) # features, values in [0, 1)
# flat vector of 100 labels with classes from 0 to 9
y = torch.randint(0, 10, (100,)) # targets

In [4]:
# mean and std for the single grey channel: maps [0, 1] -> [-1, 1]
dataset = CustomDataset(X, y, transform=transforms.Normalize((0.5,), (0.5,)))
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
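
As a quick sanity check on the cell above, the sketch below reproduces the normalization arithmetic by hand and counts the batches the loader yields. It uses TensorDataset as a stand-in for CustomDataset so that it runs on its own. The names X_norm, loader, and batch_sizes are illustrative assumptions.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

X = torch.rand(100, 1, 28, 28)        # values in [0, 1)
y = torch.randint(0, 10, (100,))

# Normalize with mean=0.5 and std=0.5 computes (x - 0.5) / 0.5 per channel,
# which maps 0 -> -1 and 1 -> 1.
mean, std = 0.5, 0.5
X_norm = (X - mean) / std

loader = DataLoader(TensorDataset(X_norm, y), batch_size=32, shuffle=True)
batch_sizes = [inputs.shape[0] for inputs, _ in loader]

print(batch_sizes)                         # → [32, 32, 32, 4]
print(len(loader), math.ceil(100 / 32))    # → 4 4
print(bool(X_norm.min() >= -1), bool(X_norm.max() <= 1))  # → True True
```

With 100 samples and batch_size=32, each epoch has four steps, and the last batch holds only the remaining 4 samples.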

In [None]:
class SimpleLightningModel(pl.LightningModule):
    def __init__(self, learning_rate=0.001):
        super().__init__()
        self.learning_rate = learning_rate

        # same architecture
        self.flatten = nn.Flatten()
        self.fc = nn.Sequential(
            nn.Linear(28*28, 128),
            nn.ReLU(),
            nn.Linear(128,10),
        )

        # Loss function
        self.criterion = nn.CrossEntropyLoss()
        self.save_hyperparameters()

    def forward(self, x):
        x = self.flatten(x)
        x = self.fc(x)
        return x
    
    def training_step(self, batch, batch_idx):
        inputs, labels = batch
        outputs = self(inputs)
        loss = self.criterion(outputs, labels)

        self.log('train_loss', loss, on_step=True, on_epoch=True, prog_bar=True)
        return loss
    
    def configure_optimizers(self):
        return optim.Adam(self.parameters(), lr=self.learning_rate)
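
Before training, it can help to confirm that the layers produce the expected shapes. The sketch below rebuilds the same layers as a plain nn.Sequential (a stand-in, so it runs without Lightning) and pushes one random batch through it. The names net, logits, and n_params are illustrative assumptions.

```python
import math

import torch
import torch.nn as nn

# Same layer stack as the model above, without the LightningModule wrapper.
net = nn.Sequential(
    nn.Flatten(),              # (N, 1, 28, 28) -> (N, 784)
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),        # one logit per class
)

batch = torch.rand(32, 1, 28, 28)
labels = torch.randint(0, 10, (32,))

logits = net(batch)
print(logits.shape)            # → torch.Size([32, 10])

# 784*128 + 128 weights and biases in the first layer, 128*10 + 10 in the second.
n_params = sum(p.numel() for p in net.parameters())
print(n_params)                # → 101770

# CrossEntropyLoss expects raw logits and integer class labels.
loss = nn.CrossEntropyLoss()(logits, labels)
print(loss.item(), math.log(10))
```

An untrained 10-class classifier usually starts with a loss close to ln(10) ≈ 2.30, so a first-step value far from that is a hint that something is wired incorrectly.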

In [9]:
model = SimpleLightningModel(learning_rate=0.001)
trainer = pl.Trainer(
    max_epochs=5,
    accelerator='auto',
    devices='auto',
    log_every_n_steps=1,
)

trainer.fit(model, dataloader)

💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
/home/office/miniforge3/envs/ml1/lib/python3.12/site-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py:76: Starting from v1.9.0, `tensorboardX` has been removed as a dependency of the `pytorch_lightning` package, due to potential conflicts with other packages in the ML ecosystem. For this reason, `logger=True` will use `CSVLogger` as the default logger, unless the `tensorboard` or `tensorboardX` packages are found. Please `pip install lightning[extra]` or one of them to enable TensorBoard support by default
  return _C._get_float32_matmul_precision()
You are using a CUDA device ('NVIDIA GeForce RTX 5070 Laptop GPU') that has Tensor Cores. To properly utilize them, you sh

Epoch 4: 100%|██████████| 4/4 [00:00<00:00, 156.53it/s, v_num=0, train_loss_step=1.320, train_loss_epoch=1.300]

`Trainer.fit` stopped: `max_epochs=5` reached.


Epoch 4: 100%|██████████| 4/4 [00:00<00:00, 111.42it/s, v_num=0, train_loss_step=1.320, train_loss_epoch=1.300]
