PyTorch Lightning
- Simple Tutorial:
https://github.com/Lightning-AI/lightning/tree/36aa4e2ebb66fc718c17bfde0ec244f66aa0851f
- How to use GPU: 
https://pytorch-lightning.readthedocs.io/en/latest/accelerators/gpu_basic.html
- Notes: 
This IPython kernel cannot be interrupted, either with the interrupt button or via Kernel > Interrupt Kernel.
The connected terminal also shows many repeated error messages:
   `tornado.iostream.StreamClosedError: Stream is closed`

In [1]:
import sys
print(sys.prefix)

/home/P76114511/main


In [2]:
# These packages should go into requirements.txt
!pip install -q torch 
!pip install -q pytorch_lightning
!pip install -q torchvision

In [3]:
import os
import torch
from torch import nn
import torch.nn.functional as F
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
import pytorch_lightning as pl

In [4]:
!pip install -q ipywidgets

In [5]:
class LitAutoEncoder(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 3))
        self.decoder = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 28 * 28))

    def forward(self, x):
        # in lightning, forward defines the prediction/inference actions
        embedding = self.encoder(x)
        return embedding

    def training_step(self, batch, batch_idx):
        # training_step defines the train loop. It is independent of forward
        x, y = batch
        x = x.view(x.size(0), -1)
        z = self.encoder(x)
        x_hat = self.decoder(z)
        loss = F.mse_loss(x_hat, x)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        return optimizer

In [8]:
NW = 12  # number of DataLoader worker processes
MAX_EPOCHS = 5
DEVICE = 1  # number of devices to use, not a device index
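
As the comment on `DEVICE` notes, an integer passed to `devices` is a count. To pick a specific GPU by index, Lightning's `Trainer` also accepts a list of indices, for example `devices=[1]`. The sketch below illustrates the difference with a hypothetical helper, `gpu_trainer_kwargs`, that only builds the keyword arguments, so it runs without a GPU. It is deliberately narrower than what `Trainer` accepts (it rejects values such as `-1` or `"auto"`).

```python
def gpu_trainer_kwargs(devices):
    """Return keyword arguments for pl.Trainer that select GPUs (hypothetical helper).

    devices: an int means "use this many GPUs";
             a list of ints means "use exactly these GPU indices".
    """
    if isinstance(devices, bool):
        raise TypeError("devices must be an int or a list of ints")
    if isinstance(devices, int):
        if devices < 1:
            raise ValueError("device count must be at least 1")
        return {"accelerator": "gpu", "devices": devices}
    if isinstance(devices, (list, tuple)) and devices and all(
        isinstance(i, int) and not isinstance(i, bool) and i >= 0 for i in devices
    ):
        return {"accelerator": "gpu", "devices": list(devices)}
    raise TypeError("devices must be an int or a non-empty list of non-negative ints")


print(gpu_trainer_kwargs(1))    # one GPU, chosen by Lightning
print(gpu_trainer_kwargs([1]))  # specifically the GPU with index 1
```

On a CUDA machine the result could be used as `pl.Trainer(max_epochs=MAX_EPOCHS, **gpu_trainer_kwargs([1]))`.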

In [None]:
# Download MNIST into the current directory and convert images to tensors
dataset = MNIST(os.getcwd(), download=True, transform=transforms.ToTensor())
# Split the 60,000 training images into 55,000 for training and 5,000 for validation
train, val = random_split(dataset, [55000, 5000])
autoencoder = LitAutoEncoder()
trainer = pl.Trainer(max_epochs=MAX_EPOCHS, accelerator="gpu", devices=DEVICE)
trainer.fit(autoencoder, DataLoader(train, num_workers=NW), DataLoader(val, num_workers=NW))

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
  rank_zero_warn("You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.")
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

  | Name    | Type       | Params
---------------------------------------
0 | encoder | Sequential | 100 K 
1 | decoder | Sequential | 101 K 
---------------------------------------
202 K     Trainable params
0         Non-trainable params
202 K     Total params
0.810     Total estimated model params size (MB)


Training: 0it [00:00, ?it/s]
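
The parameter counts in the model summary above can be checked by hand. A sketch of the arithmetic follows, assuming each `nn.Linear(in, out)` layer holds `in * out` weights plus `out` biases, that parameters are stored as 32-bit floats (4 bytes each), and that the size estimate uses 1 MB = 10^6 bytes. The helper name `linear_params` is introduced here for illustration.

```python
def linear_params(in_features, out_features):
    # A fully connected layer has one weight per (input, output) pair plus one bias per output
    return in_features * out_features + out_features


# Encoder: Linear(784, 128) -> ReLU -> Linear(128, 3); ReLU has no parameters
encoder = linear_params(28 * 28, 128) + linear_params(128, 3)
# Decoder: Linear(3, 128) -> ReLU -> Linear(128, 784)
decoder = linear_params(3, 128) + linear_params(128, 28 * 28)
total = encoder + decoder

# Assumption: 4 bytes per parameter (float32), 1 MB = 1e6 bytes
size_mb = total * 4 / 1e6

print(encoder)            # 100867, shown as "100 K" in the summary
print(decoder)            # 101648, shown as "101 K"
print(total)              # 202515, shown as "202 K"
print(f"{size_mb:.3f}")   # 0.810
```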