# Toy code for PyTorch Lightning

So far I have run into many technical difficulties with PyTorch Lightning and TensorBoard. In this notebook, I use a simple model and dataset to experiment with several features, including:

<ul>
    <li>TensorBoard logging</li>
    <li>Callbacks</li>
    <li>Freezing parameters</li>
</ul>

Other things I still need to experiment with:

<ul>
    <li>ROUGE and other metrics</li>
    <li>Optimizer and scheduler</li>
</ul>

## Resources 

<ul>
    <li><a href="https://pytorch-lightning.readthedocs.io/en/stable/">PyTorch Lightning documentation</a></li>
    <li><a href="https://pytorch-lightning.readthedocs.io/en/stable/rapid_prototyping_templates.html">Rapid prototyping</a></li>
</ul>
    

In [1]:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
import pytorch_lightning as pl
from torch.optim import AdamW  # transformers' own AdamW is deprecated; the PyTorch one is a drop-in replacement here
import time

In [2]:
nSamp = 400000

# Generate `nSamp` points
X = (np.random.rand(nSamp, 5) ** 2 * 5).tolist()
# Labels are the norms of `nSamp` points
y = (np.sqrt(np.sum(np.square(X), axis = 1))).tolist()

In [3]:
class MyDataset(Dataset): 
    def __init__(self, X, y): 
        super().__init__()
        self.X = X
        self.y = y
        
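    def __len__(self): 
        return len(self.y)
    
    def __getitem__(self, idx):
        # Index this dataset's own split, not the global X and y
        return {
            'source': torch.tensor(self.X[idx]), 
            'target': torch.tensor([self.y[idx]])  # Even though the target is a single number, wrap it in a vector so it matches the model's output shape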
        }
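
The reason for wrapping the target in a vector is easiest to see from the shape of one collated batch. The sketch below is only an illustration: `TinyDataset`, `X_demo` and `y_demo` are hypothetical stand-ins for `MyDataset` and the real data, and `pred` is a zero tensor standing in for the model output.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class TinyDataset(Dataset):
    """Hypothetical stand-in for MyDataset, holding four hand-written points."""

    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __len__(self):
        return len(self.y)

    def __getitem__(self, idx):
        return {
            'source': torch.tensor(self.X[idx]),
            'target': torch.tensor([self.y[idx]]),  # wrapped as a 1-element vector
        }


X_demo = [[1., 1., 2., 2., 3.], [0., 0., 0., 3., 4.],
          [1., 0., 0., 0., 0.], [2., 2., 2., 2., 2.]]
y_demo = [sum(v * v for v in row) ** 0.5 for row in X_demo]

loader = DataLoader(TinyDataset(X_demo, y_demo), batch_size=2)
batch = next(iter(loader))

source_shape = tuple(batch['source'].shape)  # (batch, features)
target_shape = tuple(batch['target'].shape)  # (batch, 1), same as the model output

# An unwrapped scalar target would collate to shape (batch,). Subtracting it
# from a (batch, 1) prediction broadcasts to (batch, batch) instead of pairing
# each prediction with its own label.
pred = torch.zeros(2, 1)
flat_target = batch['target'].squeeze(1)
broadcast_shape = tuple((pred - flat_target).shape)

print(source_shape, target_shape, broadcast_shape)
```

With the wrapped target, `y_hat` and `y` have the same shape, so `F.mse_loss` compares them element by element.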

In [4]:
class MyModel(pl.LightningModule): 
    ''' Part 1: Define the architecture of model in init '''
    def __init__(self, hparams):
        super(MyModel, self).__init__()
        self.layer1 = nn.Linear(5, 10)
        self.layer2 = nn.Linear(10, 8)
        self.layer3 = nn.Linear(8, 1)
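        self.save_hyperparameters(hparams)  # populates self.hparams; newer Lightning releases do not support assigning to self.hparams directly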
        
    ''' Part 2: Define the forward propagation '''
    def forward(self, x): 
        x = F.relu(self.layer1(x))
        x = F.relu(self.layer2(x))
        x = self.layer3(x)
        return x
    
    ''' Part 3: Prepare optimizer and scheduler '''
    def configure_optimizers(self): 
        optimizer = AdamW(self.parameters(), lr = self.hparams['learning_rate'])
        return optimizer
    
    ''' Part 4.1: Training logic '''
    def training_step(self, batch, batch_idx): 
        X = batch['source']
        y = batch['target']
        y_hat = self(X)    # Calls forward function 
        loss = F.mse_loss(y_hat, y)
        self.log('train_loss', loss)
        return loss
    
    ''' Part 4.2: Validation logic '''
    def validation_step(self, batch, batch_idx): 
        X = batch['source']
        y = batch['target']
        y_hat = self(X)
        loss = F.mse_loss(y_hat, y)
        self.log('val_loss', loss)
        
    ''' Part 4.3: Test logic '''
    def test_step(self, batch, batch_idx): 
        X = batch['source']
        y = batch['target']
        y_hat = self(X)
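        loss = F.mse_loss(y_hat, y)
        # The loss is computed but never logged, so trainer.test() reports no metrics;
        # add self.log('test_loss', loss) here to have it reported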
        
    ''' Part 5: Data loaders '''
    def train_dataloader(self): 
        # First 70% of the samples; batch size comes from the model's own hparams
        dataset = MyDataset(X[:int(0.7 * nSamp)], y[:int(0.7 * nSamp)])
        return DataLoader(dataset, batch_size = self.hparams['batch_size'])
    
    def val_dataloader(self): 
        # Next 20% of the samples
        dataset = MyDataset(X[int(0.7 * nSamp):int(0.9 * nSamp)], y[int(0.7 * nSamp):int(0.9 * nSamp)])
        return DataLoader(dataset, batch_size = self.hparams['batch_size'])
    
    def test_dataloader(self): 
        # Last 10% of the samples
        dataset = MyDataset(X[int(0.9 * nSamp):], y[int(0.9 * nSamp):])
        return DataLoader(dataset, batch_size = self.hparams['batch_size'])
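
Freezing parameters is on the list at the top but is not exercised by `MyModel` itself. Here is a minimal sketch of the idea in plain PyTorch, assuming a stand-in `nn.Sequential` with the same three linear layers. `net` and `count_params` are hypothetical names, not part of the notebook.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in with the same layer sizes as MyModel
net = nn.Sequential(
    nn.Linear(5, 10), nn.ReLU(),
    nn.Linear(10, 8), nn.ReLU(),
    nn.Linear(8, 1),
)


def count_params(module, trainable_only=False):
    """Count parameters, optionally only those that still receive gradients."""
    return sum(
        p.numel() for p in module.parameters()
        if p.requires_grad or not trainable_only
    )


total_before = count_params(net)  # (5*10+10) + (10*8+8) + (8*1+1)

# Freeze the first linear layer: its weights and biases stop receiving gradients
for p in net[0].parameters():
    p.requires_grad = False

trainable_after = count_params(net, trainable_only=True)

# A backward pass leaves the frozen layer without gradients
loss = net(torch.rand(4, 5)).pow(2).mean()
loss.backward()
frozen_has_grad = net[0].weight.grad is not None
last_has_grad = net[4].weight.grad is not None

print(total_before, trainable_after, frozen_has_grad, last_has_grad)
```

Inside a `LightningModule`, the same loop could run in `__init__` over `self.layer1.parameters()`. Passing only the parameters with `requires_grad` set to the optimizer is a common companion step, though I have not tried it in this notebook.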

In [5]:
hparams = {
    'learning_rate': 3e-4, 
    'batch_size': 16
}

In [6]:
model = MyModel(hparams)
trainer = pl.Trainer(gpus = 1, max_epochs = 2, progress_bar_refresh_rate = 20)

start = time.time()
trainer.fit(model)
end = time.time()
print(f'Total time: {end - start}s')

GPU available: True, used: True
TPU available: False, using: 0 TPU cores
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name   | Type   | Params
----------------------------------
0 | layer1 | Linear | 60    
1 | layer2 | Linear | 88    
2 | layer3 | Linear | 9     



Total time: 137.08406615257263s


In [7]:
trainer.test()

--------------------------------------------------------------------------------



1

In [11]:
x1 = [1.,1.,2.,2.,3.]
x2 = X[0]

print(f'{x1} has actual norm {np.sqrt(np.sum(np.square(x1)))}, predicted {model(torch.tensor(x1).to("cuda"))[0]}')
print(f'{x2} has actual norm {np.sqrt(np.sum(np.square(x2)))}, predicted {model(torch.tensor(x2).to("cuda"))[0]}')

[1.0, 1.0, 2.0, 2.0, 3.0] has actual norm 4.358898943540674, predicted 4.561748027801514
[2.1738989610701736, 2.1518761822841026, 0.5708100791147508, 0.6935205395830435, 0.4567646764676432] has actual norm 3.2205335973615803, predicted 3.343106746673584
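
The check above hard-codes `"cuda"` and runs the forward pass with gradient tracking on. A slightly more defensive sketch of the same comparison follows, assuming an untrained stand-in network `net` (hypothetical), so its predictions are meaningless. It switches to eval mode, disables gradients, and moves the inputs to whatever device the parameters live on.

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical, untrained stand-in with the same layer sizes as MyModel
net = nn.Sequential(
    nn.Linear(5, 10), nn.ReLU(),
    nn.Linear(10, 8), nn.ReLU(),
    nn.Linear(8, 1),
)
net.eval()

# Follow the model's device instead of hard-coding "cuda"
device = next(net.parameters()).device
points = torch.tensor([[1., 1., 2., 2., 3.], [0., 0., 0., 3., 4.]], device=device)

with torch.no_grad():       # no autograd graph is built during inference
    preds = net(points)     # shape (2, 1)

actual = np.linalg.norm(points.cpu().numpy(), axis=1)  # same as sqrt(sum(square(x)))
for a, p in zip(actual, preds.squeeze(1).tolist()):
    print(f'actual norm {a}, predicted {p}')
```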


In [13]:
# Start tensorboard.
#%load_ext tensorboard
%reload_ext tensorboard
%tensorboard --logdir lightning_logs/

Reusing TensorBoard on port 6006 (pid 29172), started 0:02:38 ago. (Use '!kill 29172' to kill it.)