<a href="https://www.kaggle.com/code/ayushs9020/multiple-models-pytorch-lightning-w-b?scriptVersionId=139324862" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#FF0000; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #FF0000">Pytorch Lightning + WandB ✅</p>

In [1]:
import warnings
warnings.filterwarnings("ignore")

<div style="border-radius:10px; border:#FF0000 solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">
    
<img src = "https://media.tenor.com/pfcqgFEp2OsAAAAM/welcome.gif">
    
## $Pytorch$ $Lightning$

<img src = 'https://learnopencv.com/wp-content/uploads/2020/05/PTL.png' width = 400>

$PyTorch$ $Lightning$ is a `library` `built` on top of $PyTorch$ that `provides` a `simpler` and `more organized way` to `build machine learning models`. It was created by $William$ $Falcon$ and is now maintained by $Lightning$ $AI$ (it is $PyTorch$ itself that came out of $Facebook's$ $AI$ $Research$ $Lab$). It is `designed` to `streamline` the `machine learning workflow`, making it `easier` for `researchers`/`practitioners` to `build`/`train`/`deploy` `models`.

* $Automatic$ $Optimization$ - PyTorch's autograd computes the gradients; PyTorch Lightning calls `backward()` and the optimizer step for us, which saves time and reduces the risk of errors compared to a hand-written training loop.
* $Faster$ $Training$ - PyTorch Lightning's modular structure lets us switch on features such as mixed precision or multi-device training through `Trainer` arguments, which can shorten training times.
* $Better$ $Support$ $for$ $Multi-GPU$ $Training$ - PyTorch Lightning handles the distributed setup, so the `devices` argument of the `Trainer` is enough to use multiple GPUs within a single node.
* $Integrated$ $Visualizations$ - PyTorch Lightning plugs into loggers such as TensorBoard and W&B, allowing us to monitor our models' performance and understand how they behave during training.
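
As a rough illustration of the boilerplate that Lightning takes off our hands, here is a minimal sketch of a hand-written training loop in plain PyTorch. The toy data, the `nn.Linear` model and the learning rate are assumptions chosen only for this example; they are not part of this notebook's pipeline.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression data (assumed for illustration): y = 2x + 1
x = torch.linspace(-1, 1, 32).unsqueeze(1)
y = 2 * x + 1

model = nn.Linear(1, 1)
loss_func = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for step in range(200):
    optimizer.zero_grad()            # clear gradients from the previous step
    loss = loss_func(model(x), y)    # forward pass and loss
    loss.backward()                  # autograd computes the gradients
    optimizer.step()                 # update the parameters
    losses.append(loss.item())

print(losses[0], losses[-1])
```

With Lightning, only the forward pass and the loss live in `training_step`; the `zero_grad` / `backward` / `step` calls are issued by the `Trainer`.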

## $WandB$

<img src = 'https://i.imgur.com/1sm6x8P.png' width = 400>

$Wandb$ ($Weights$ $\&$ $Biases$) is a hosted platform, with an `open-source` Python client, for `managing` and `monitoring` `machine learning experiments`. It provides a `simple` and `intuitive interface` for `tracking` and `analyzing` the `performance of machine learning` models during `training` and `deployment`.

* $Real-Time$ $Monitoring$ - Wandb provides real-time monitoring of training metrics, such as loss and accuracy, allowing us to quickly identify issues and optimize our model.
* $Security$ - Wandb encrypts sensitive data at rest and in transit, ensuring that our intellectual property remains secure.

# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#00FFFF; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #00FFFF">2 | Data 📊</p>

In [2]:
import pandas as pd 

<div style="border-radius:10px; border:#00FFFF solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

<img src = 'https://crawlbase.com/blog/best-data-memes/where-is-my-data-meme.jpg'>

<div style="border-radius:10px; border:#00FFFF solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

Let's focus on the `Summaries Data` only $--->$ `/kaggle/input/commonlit-evaluate-student-summaries/summaries_train.csv`

In [3]:
train = pd.read_csv("/kaggle/input/commonlit-evaluate-student-summaries/summaries_train.csv")

print('Training Samples -----------------------------> ' , train.shape[0] , '\n')
train.head()

Training Samples ----------------------------->  7165 



| | student_id | prompt_id | text | content | wording |
|---|---|---|---|---|---|
| 0 | 000e8c3c7ddb | 814d6b | The third wave was an experimentto see how peo... | 0.205683 | 0.380538 |
| 1 | 0020ae56ffbf | ebad26 | They would rub it up with soda to make the sme... | -0.548304 | 0.506755 |
| 2 | 004e978e639e | 3b9047 | In Egypt, there were many occupations and soci... | 3.128928 | 4.231226 |
| 3 | 005ab0199905 | 3b9047 | The highest class was Pharaohs these people we... | -0.210614 | -0.471415 |
| 4 | 0070c9e7af47 | 814d6b | The Third Wave developed rapidly because the ... | 3.272894 | 3.219757 |


# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#800080; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #800080">3 | Tokenization 📄️</p>

In [4]:
import numpy as np

from transformers import AutoTokenizer
import os 
import tqdm

<div style="border-radius:10px; border:#800080 solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

<img src = "https://kratikal.com/blog/wp-content/uploads/2022/01/tokens-image.png" width = 400>

<div style="border-radius:10px; border:#800080 solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

To start, we will use $RoBERTa$ $Base$.

In [5]:
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
tokenizer

RobertaTokenizerFast(name_or_path='roberta-base', vocab_size=50265, model_max_length=512, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'bos_token': '<s>', 'eos_token': '</s>', 'unk_token': '<unk>', 'sep_token': '</s>', 'pad_token': '<pad>', 'cls_token': '<s>', 'mask_token': AddedToken("<mask>", rstrip=False, lstrip=True, single_word=False, normalized=False)}, clean_up_tokenization_spaces=True)

<div style="border-radius:10px; border:#800080 solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

For example, if we pass the tokenizer a sample text like the one below, it returns the `token IDs` of that text:
```
Your dreams are shining in the darkness. There is a intoxication in your eyes, You are in my dreams, in the answers, in the questions, Every day, I steal you in my thoughts
```

In [6]:
tokenizer("Your dreams are shining in the darkness. There is a intoxication in your eyes, You are in my dreams, in the answers, in the questions, Every day, I steal you in my thoughts" , return_tensors = "np")["input_ids"]

array([[    0, 12861,  7416,    32, 21003,    11,     5, 15073,     4,
          345,    16,    10, 34205,    11,   110,  2473,     6,   370,
           32,    11,   127,  7416,     6,    11,     5,  5274,     6,
           11,     5,  1142,     6,  4337,   183,     6,    38,  8052,
           47,    11,   127,  4312,     2]])

<div style="border-radius:10px; border:#800080 solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

Now we will do the same thing for every row of the `text` column.

In [7]:
# exist_ok avoids a FileExistsError when the cell is run more than once
os.makedirs("/kaggle/working/Pseudo Dir/Embeds" , exist_ok = True)

In [8]:
tokens = [
    tokenizer(train['text'][index])['input_ids']
    for index 
    in tqdm.tqdm(range(train.shape[0]) , total = train.shape[0] , desc = 'Tokenizing Input --->')
]

Tokenizing Input --->:  25%|██▍       | 1784/7165 [00:00<00:01, 3004.54it/s]
Token indices sequence length is longer than the specified maximum sequence length for this model (598 > 512). Running this sequence through the model will result in indexing errors
Tokenizing Input --->: 100%|██████████| 7165/7165 [00:02<00:00, 3001.17it/s]
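
The warning above means that at least one summary produces more token IDs than the 512 positions RoBERTa accepts. One way to deal with this is to cut the ID list while keeping the closing `</s>` token. The sketch below is plain Python; `truncate_keep_eos` is a hypothetical helper written for this example, and the ID `2` for `</s>` is an assumption read off the sample output above, where the sequence ends with `2`.

```python
def truncate_keep_eos(ids, max_len=512, eos_id=2):
    """Cut `ids` to at most `max_len` entries, keeping `eos_id` as the last one."""
    if len(ids) <= max_len:
        return list(ids)
    return list(ids[: max_len - 1]) + [eos_id]

# A made-up 598-token sequence: <s> (0), 596 filler IDs, </s> (2)
long_ids = [0] + [100] * 596 + [2]
short_ids = truncate_keep_eos(long_ids)

print(len(long_ids), "->", len(short_ids))
```

In practice the tokenizer can do this itself: calling it with `truncation = True` and `max_length = 512` truncates the text and still adds the special tokens.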


<div style="border-radius:10px; border:#800080 solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

We will also save them in our `output directory` to use them later 

In [9]:
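# The token lists have different lengths, so store them as an object array
# (recent NumPy versions raise an error for ragged lists without dtype = object)
np.save('/kaggle/working/Pseudo Dir/Embeds/Hui Hui' , np.array(tokens , dtype = object))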
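
To see why the saved file has to be loaded with `allow_pickle = True` later on, here is a small self-contained sketch of the same round trip. The toy token lists and the temporary file name are made up for this example.

```python
import os
import tempfile

import numpy as np

# Toy token lists of different lengths (made up for illustration)
toy_tokens = [[0, 11, 2], [0, 11, 12, 13, 2]]

# dtype=object stores each list as one element instead of forcing a rectangle
ragged = np.array(toy_tokens, dtype=object)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_tokens.npy")
    np.save(path, ragged)                         # object arrays are pickled
    restored = np.load(path, allow_pickle=True)   # so loading needs allow_pickle

print(restored.shape, [len(t) for t in restored])
```

Each element of `restored` is still a Python list, which is why the dataset can later turn one row at a time into a tensor.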

# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#F2C464; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #F2C464">4 | DataSet 📊</p>

In [10]:
from torch.utils.data import Dataset

import torch

<div style="border-radius:10px; border:#F2C464 solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

## $---------------Init---------------$
```
def __init__(self):

    super().__init__()

    self.data = pd.read_csv('/kaggle/input/commonlit-evaluate-student-summaries/summaries_train.csv')
    self.tokens = np.load('/kaggle/working/Pseudo Dir/Embeds/Hui Hui.npy' , allow_pickle = True)

    self.wordings = self.data['wording']
```
`Initializes` the `object` by `reading` the $CSV$ file `containing` the `summaries` and `loading` the `token IDs` we saved earlier (`np.load`) into memory. It then `extracts` the `['wording']` column from the DataFrame to use as the `target`. The conversion to `tensors` happens later, in `__getitem__`.

## $---------------Len---------------$
```
def __len__(self) : return self.data.shape[0]
```

`Returns` the `number of rows` in the $DataFrame$

## $--------------GetItem--------------$

It takes an `integer` `index` as input and `returns` a `tuple` of two tensors - `r_tokens`/`r_wordings`. These tensors contain the `token IDs` (as `long`) and the `wording score` (as `float32`) of that row.
```
def __getitem__(self , index):

    r_tokens = torch.tensor(self.tokens[index] , dtype = torch.long)
    r_wordings = torch.tensor(self.wordings[index] , dtype = torch.float32)

    return r_tokens , r_wordings
```
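
To check what `__getitem__` hands back without touching the Kaggle files, here is a minimal sketch of the same pattern on in-memory data. `ToySummaries` and its values are hypothetical stand-ins for the `data` class and the real columns.

```python
import torch
from torch.utils.data import Dataset


class ToySummaries(Dataset):
    """Hypothetical stand-in for the `data` class, using in-memory toy values."""

    def __init__(self):
        super().__init__()
        self.tokens = [[0, 713, 2], [0, 713, 16, 2]]   # made-up token IDs
        self.wordings = [0.38, -0.47]                  # made-up wording scores

    def __len__(self):
        return len(self.tokens)

    def __getitem__(self, index):
        r_tokens = torch.tensor(self.tokens[index], dtype=torch.long)
        r_wordings = torch.tensor(self.wordings[index], dtype=torch.float32)
        return r_tokens, r_wordings


toy = ToySummaries()
r_tokens, r_wordings = toy[1]
print(len(toy), r_tokens.shape, r_wordings.shape)
```

Note that the token tensor has a different length for every row, while the target is a scalar tensor.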

In [11]:
class data(Dataset):
    
    def __init__(self):
        
        super().__init__()
        
        self.data = pd.read_csv('/kaggle/input/commonlit-evaluate-student-summaries/summaries_train.csv')
        self.tokens = np.load('/kaggle/working/Pseudo Dir/Embeds/Hui Hui.npy' , allow_pickle = True)
        
        self.wordings = self.data['wording']
        
    def __len__(self) : return self.data.shape[0]
    
    def __getitem__(self , index):
        
        r_tokens = torch.tensor(self.tokens[index] , dtype = torch.long)
        r_wordings = torch.tensor(self.wordings[index] , dtype = torch.float32)
        
        return r_tokens , r_wordings

# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#FF69B4; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #FF69B4">5 | DataLoader 💻‍✈️</p>

In [12]:
from torch.utils.data import DataLoader
from pytorch_lightning import LightningDataModule as LDM

<div style="border-radius:10px; border:#FF69B4 solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

## $---------------Init---------------$
```
def __init__(self , batch_size = 1):

    super().__init__()

    self.batch_size = batch_size
```

`Initialize` the `object` by `setting` the `batch_size` attribute to the specified value 


## $---------------Setup---------------$
```
def setup(self , stage = None):self.train = data()
```

`Set up` the `data` for `training` by creating a `new instance` of a class `data()` 

## $------------Train$ $DataLoader------------$
```
def train_dataloader(self) : return DataLoader(self.train , batch_size = self.batch_size)
```

`Create` a `DataLoader` instance that `loads data` from the `train attribute` in `mini-batches` of the specified batch_size. It returns the `DataLoader` instance, which can be used to iterate over the training data in batches.
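
Because every summary has a different number of tokens, the default batching only works with `batch_size = 1`. A possible way to use larger batches is a custom `collate_fn` that pads each batch. The sketch below is one option under stated assumptions, not what this notebook does: `pad_collate` is a hypothetical helper, `PAD_ID = 1` assumes RoBERTa's `<pad>` ID, and the samples are made up.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

PAD_ID = 1  # assumption: ID of RoBERTa's <pad> token


def pad_collate(batch):
    """Pad variable-length token tensors so they can be stacked into one batch."""
    tokens, targets = zip(*batch)
    input_ids = pad_sequence(list(tokens), batch_first=True, padding_value=PAD_ID)
    attention_mask = (input_ids != PAD_ID).long()   # 1 = real token, 0 = padding
    return input_ids, attention_mask, torch.stack(targets)


# Toy samples shaped like the output of __getitem__ (values are made up)
samples = [
    (torch.tensor([0, 713, 2]), torch.tensor(0.38)),
    (torch.tensor([0, 713, 16, 2]), torch.tensor(-0.47)),
]

loader = DataLoader(samples, batch_size=2, collate_fn=pad_collate)
input_ids, attention_mask, targets = next(iter(loader))
print(input_ids.shape, attention_mask.shape, targets.shape)
```

If this route is taken, the model's `forward` would also have to receive the `attention_mask` and pass it to the RoBERTa model so that the padding is ignored.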

In [13]:
class Data(LDM):
    
    def __init__(self , batch_size = 1):
        
        super().__init__()
        
        self.batch_size = batch_size
        
    def setup(self , stage = None):self.train = data()
       
    def train_dataloader(self) : return DataLoader(self.train , batch_size = self.batch_size)

In [14]:
data_module = Data()

# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#ACADAC; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #ACADAC">6 | Model 🤖</p>

In [15]:
from transformers import AutoModelForSequenceClassification
from pytorch_lightning import LightningModule as LM

import torch.nn as nn

<div style="border-radius:10px; border:#ACADAC solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

## $---------------Init---------------$
```
def __init__(self):

    super().__init__()

    self.model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels = 1)
    self.loss_func = nn.MSELoss()
```

`Define` $2$ attributes:

* $Model$ `model` - An instance of the `AutoModelForSequenceClassification` class, which is a pre-trained RoBERTa model with a custom number of labels. Setting it to 1 gives a head with a single output, which we use as a regression output for the `wording` score.
* $Loss$ $Function$ `loss_func` - An instance of the `nn.MSELoss` class, which implements the mean squared error loss function.

## $--------------Forward--------------$
```
def forward(self , inps): 

    if inps.shape[1] > 512 : inps = inps[: , :512]

    return self.model(inps).logits
```

`Take` an `input tensor` `inps` and `pass` it through the `model instance` to obtain the `logits`. If a sequence in the input tensor has more than $512$ tokens, it is truncated to its first $512$ tokens (the maximum sequence length of RoBERTa) before passing it through the model.

## $-------------Training$ $Step-------------$
```
def training_step(self , batch , batch_idx):

    inputs , labels = batch
    outputs = self(inputs)

    # logits have shape (batch, 1); squeeze to (batch,) so they match the labels
    loss = self.loss_func(outputs.squeeze(-1) , labels)
    self.log('train_loss' , loss)

    return loss
```

`Called` by the `Trainer` for every `training batch`. It `takes` the `batch` (a pair of `input` and `label` tensors) and the `batch index` `batch_idx` as arguments, `passes` the `inputs` through the `model instance` to obtain the `output logits`, `computes` the `loss` with the `loss_func` instance, `logs` it as `train_loss`, and `returns` it.
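
The shapes passed to `nn.MSELoss` deserve a quick check, because the model returns logits of shape `(batch, 1)` while the labels have shape `(batch,)`. The sketch below uses made-up values to show what happens when the two are compared directly and when the logits are squeezed first.

```python
import warnings

import torch
import torch.nn as nn

loss_func = nn.MSELoss()

outputs = torch.tensor([[1.0], [2.0]])   # logits with shape (batch, 1)
labels = torch.tensor([1.0, 2.0])        # targets with shape (batch,)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")      # PyTorch warns about the size mismatch
    # (2, 1) against (2,) broadcasts to (2, 2): every output meets every label
    broadcast_loss = loss_func(outputs, labels).item()

# Squeezing the last dimension compares each output with its own label only
squeezed_loss = loss_func(outputs.squeeze(-1), labels).item()

print(broadcast_loss, squeezed_loss)
```

Here the predictions are perfect, yet the broadcast version reports a non-zero loss. With `batch_size = 1` both versions agree, so the difference only shows up once the batch size grows.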

## $-----------Configure$ $Optimizers-----------$
```
def configure_optimizers(self): return torch.optim.Adam(self.parameters())
```
`Return` an `instance` of the `torch.optim.Adam` `optimizer`, which is used to `update` the `model parameters` during training.
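
Since no arguments are passed, `Adam` runs with its default learning rate. The sketch below prints that default next to a smaller value of the kind often used when fine-tuning pre-trained transformers; `2e-5` is an assumed example value, not a setting taken from this notebook, and `toy_model` is only a stand-in for the RoBERTa model.

```python
import torch
import torch.nn as nn

toy_model = nn.Linear(4, 1)  # stand-in for the RoBERTa model

default_opt = torch.optim.Adam(toy_model.parameters())
tuned_opt = torch.optim.Adam(toy_model.parameters(), lr=2e-5)  # assumed value

print(default_opt.param_groups[0]["lr"], tuned_opt.param_groups[0]["lr"])
```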

In [16]:
class lightning(LM):
    
    def __init__(self):
        
        super().__init__()
        
        self.model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels = 1)
        self.loss_func = nn.MSELoss()
        
    def forward(self , inps): 
        
        if inps.shape[1] > 512 : inps = inps[: , :512]
            
        return self.model(inps).logits
    
    def training_step(self , batch , batch_idx):
        
        inputs , labels = batch
        outputs = self(inputs)
        
        # logits have shape (batch, 1); squeeze to (batch,) so they match the labels
        loss = self.loss_func(outputs.squeeze(-1) , labels)
        self.log('train_loss' , loss)
        
        return loss
    
    def configure_optimizers(self): return torch.optim.Adam(self.parameters())

In [17]:
model = lightning()

Some weights of the model checkpoint at roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.dense.weight', 'lm_head.layer_norm.weight', 'lm_head.bias', 'lm_head.layer_norm.bias', 'lm_head.dense.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.dense.bias', 'classifier.out_proj.weight']
You should pr

# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#FF3E3E; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #FF3E3E">7 | Trainer 🏋️‍♂️</p>

In [18]:
from pytorch_lightning import Trainer

from pytorch_lightning.loggers import WandbLogger

<div style="border-radius:10px; border:#FF3E3E solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

Now we will define our Trainer. This trainer will use the $2$ $T4$ $GPUs$ that $Kaggle$ provides free of charge. $YAYYYYYY$

In [19]:
from kaggle_secrets import UserSecretsClient
import wandb
user_secrets = UserSecretsClient()
api_key = user_secrets.get_secret("API LOGIN KEY")

wandb.login(key = api_key)

wandb: W&B API key is configured. Use `wandb login --relogin` to force relogin
wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True

In [20]:
trainer = Trainer(
    max_epochs = 1 , 
    logger = WandbLogger(
        name = "PL | Roberta_Base | SC | ComonLit") , 
    accelerator = 'gpu' , devices = 2
)

wandb: Currently logged in as: ayushsinghal659. Use `wandb login --relogin` to force relogin
wandb: wandb version 0.15.8 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.15.5
wandb: Run data is saved locally in ./wandb/run-20230808_173858-joy7e71v
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run PL | Roberta_Base | SC | ComonLit
wandb: ⭐️ View project at https://wandb.ai/ayushsinghal659/lightning_logs
wandb: 🚀 View run at https://wandb.ai/ayushsinghal659/lightning_logs/runs/joy7e71v


# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#7A288A; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #7A288A">8 | Training 📚</p>

<div style="border-radius:10px; border:#7A288A solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

Now we will start our training 

In [21]:
trainer.fit(model , data_module)

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#FFFF00; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #FFFF00">9 | Results 📈</p>

<div style="border-radius:10px; border:#FFFF00 solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

Now let's see how our model performed.

In [22]:
wandb.finish()

wandb: Waiting for W&B process to finish... (success).
wandb: 
wandb: Run history:
wandb:               epoch ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:          train_loss ▂▁▁▁▁▁▁▁▂▁▁▁▁▁▅▂▄▂▆▂▁▄▁▂▄▁▁▁▂▃▃█▃▁▁▁▁▂▁▁
wandb: trainer/global_step ▁▁▁▁▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
wandb: 
wandb: Run summary:
wandb:               epoch 0
wandb:          train_loss 0.01078
wandb: trainer/global_step 3549
wandb: 
wandb: 🚀 View run PL | Roberta_Base | SC | ComonLit at: https://wandb.ai/ayushsinghal659/lightning_logs/runs/joy7e71v
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20230808_173858-joy7e71v/logs

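The `trainer/global_step` value of 3549 in the run summary can be sanity-checked with a little arithmetic. The sketch below assumes that the 7165 samples were split across the 2 GPUs with a batch size of 1 (rounded up per GPU), and that Lightning logged every 50 steps using zero-based step indices, which as far as I know is its default `log_every_n_steps`. Neither assumption is confirmed by the output itself.

```python
import math

num_samples = 7165      # from the "Training Samples" printout
num_devices = 2         # devices = 2 in the Trainer
batch_size = 1          # default batch_size of the Data module
log_every_n_steps = 50  # assumption: Lightning's default logging interval

# Each GPU sees its share of the samples, rounded up
steps_per_epoch = math.ceil(num_samples / (num_devices * batch_size))

# With zero-based steps, logging happens at steps 49, 99, 149, ...
last_logged_step = (steps_per_epoch // log_every_n_steps) * log_every_n_steps - 1

print(steps_per_epoch, last_logged_step)
```

Under these assumptions the last logged step matches the summary, and the reported `train_loss` is the loss of the single batch logged at that step rather than an average over the epoch.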

# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#FFC0CB; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #FFC0CB">10 | TO DO LIST 📝</p>

<div style="border-radius:10px; border:#FFC0CB solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">
    
<img src = "https://i.imgflip.com/43iacv.jpg" width = 400>

* $TO$ $DO$ $1$ $:$ $ADD$ $MORE$ $MODELS$
* $TO$ $DO$ $2$ $:$ $DANCE$

# <p style="font-family:JetBrains Mono; font-weight:bold; letter-spacing: 2px; color:#FFA500; font-size:140%; text-align:left;padding: 0px; border-bottom: 3px solid #FFA500">11 | Ending 🏁</p>

<div style="border-radius:10px; border:#FFA500 solid; padding: 15px; background-color: #F3f9ed; font-size:100%; text-align:left">

**THIS IS NOT THE FULL IMPLEMENTATION. IT STILL LACKS MANY FUNCTIONALITIES AND IS VULNERABLE TO MANY EDGE CASES. WE WILL IMPROVE THIS IN THE UPCOMING VERSIONS**

**PLEASE COMMENT BELOW IF I MADE ANY MISTAKES, IF I CAN MAKE THIS MORE DOWN TO EARTH, OR IF YOU HAVE SUGGESTIONS. YOUR HELP IS HIGHLY APPRECIATED**

**THAT'S IT FOR TODAY GUYS**

**HOPE YOU UNDERSTOOD AND LIKED MY WORK**

**DON'T FORGET TO UPVOTE $:)$**
    
<img src = "https://i.imgflip.com/19aadg.jpg">
   
**PEACE OUT**