<div style="background-color:#5D73F2; color:#19180F; font-size:40px; font-family:Arial; padding:10px; border: 5px solid #19180F; border-radius:10px"> Beginner-friendly approach </div>
<div style="background-color:#D5D9F2; color:#19180F; font-size:15px; font-family:Arial; padding:10px; border: 5px solid #19180F; border-radius:10px">
<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
A BERT-based approach. To learn more about the BERT architecture, see this <a href="https://www.kaggle.com/code/suraj520/bert-know-fit-infer">kernel</a>.   </div>
</div>


<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
Importing modules
    </div>

In [1]:
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer
import pandas as pd
from sklearn.model_selection import train_test_split


<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
Loading the data    </div>

In [2]:
# Kaggle paths (uncomment when running on Kaggle):
# train_data = pd.read_csv('/kaggle/input/commonlit-evaluate-student-summaries/summaries_train.csv')
# test_data = pd.read_csv('/kaggle/input/commonlit-evaluate-student-summaries/summaries_test.csv')

train_data = pd.read_csv('./Data/summaries_train.csv')
test_data = pd.read_csv('./Data/summaries_test.csv')


<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
Preprocessing the data    </div>

In [3]:
# tokenizer = BertTokenizer.from_pretrained('/kaggle/input/hugging-face-models-safe-tensors/bert-base-uncased')
tokenizer = BertTokenizer.from_pretrained('./Models/bert-base-uncased')

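# Tokenize all summaries at once. padding=True pads to the longest sequence in the call;
# truncation=True cuts sequences that exceed the model's maximum input length.
train_encodings = tokenizer.batch_encode_plus(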
    train_data['text'].tolist(),
    truncation=True,
    padding=True
)

test_encodings = tokenizer.batch_encode_plus(
    test_data['text'].tolist(),
    truncation=True,
    padding=True
)

# Bundle inputs and the two regression targets (content, wording) into one dataset
train_dataset = torch.utils.data.TensorDataset(
    torch.tensor(train_encodings['input_ids']),
    torch.tensor(train_encodings['attention_mask']),
    torch.tensor(train_data['content'].tolist()),
    torch.tensor(train_data['wording'].tolist())
)

test_dataset = torch.utils.data.TensorDataset(
    torch.tensor(test_encodings['input_ids']),
    torch.tensor(test_encodings['attention_mask'])
)
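
<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
What padding and truncation produce (illustrative sketch). The pure-Python snippet below mimics the shape of <code>input_ids</code> and <code>attention_mask</code> without loading the tokenizer. The token IDs, <code>MAX_LEN</code>, and the helper <code>pad_and_truncate</code> are hypothetical and exist only for illustration. It is also a simplification: the real tokenizer keeps the final [SEP] token when it truncates.    </div>

```python
# Hypothetical token-ID sequences of different lengths (not real tokenizer output).
sequences = [[101, 7592, 102], [101, 7592, 2088, 999, 102]]
PAD_ID = 0    # bert-base-uncased uses 0 for [PAD]
MAX_LEN = 4   # hypothetical cap, far smaller than BERT's real limit


def pad_and_truncate(seqs, max_len, pad_id):
    """Truncate each sequence to max_len, then pad to the longest remaining one."""
    truncated = [s[:max_len] for s in seqs]
    width = max(len(s) for s in truncated)
    input_ids = [s + [pad_id] * (width - len(s)) for s in truncated]
    # 1 marks a real token, 0 marks padding that the model should ignore.
    attention_mask = [[1] * len(s) + [0] * (width - len(s)) for s in truncated]
    return input_ids, attention_mask


input_ids, attention_mask = pad_and_truncate(sequences, MAX_LEN, PAD_ID)
print(input_ids)
print(attention_mask)
```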



<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
Defining the BERT model    </div>

In [4]:
class BERTModel(nn.Module):
    def __init__(self):
        super().__init__()
        # self.bert = BertModel.from_pretrained('/kaggle/input/hugging-face-models-safe-tensors/bert-base-uncased')
        self.bert = BertModel.from_pretrained('./Models/bert-base-uncased')

        # Freeze BERT so that only the regression head below is trained
        for param in self.bert.parameters():
            param.requires_grad = False

        # self.dropout = nn.Dropout(0.1)
        # Regression head: 768-dim pooled output -> 256 -> 2 scores (content, wording)
        self.linear1 = nn.Linear(768, 256)
        self.linear2 = nn.Linear(256, 2)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        pooled_output = outputs.pooler_output
        # pooled_output = self.dropout(pooled_output)
        output = self.linear1(pooled_output)
        output = torch.relu(output)
        output = self.linear2(output)
        return output
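
<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
How many parameters are actually trained? Because every BERT parameter has <code>requires_grad = False</code>, only the two linear layers receive gradient updates. The short calculation below counts them; <code>linear_param_count</code> is a hypothetical helper written for this illustration, not part of PyTorch.    </div>

```python
def linear_param_count(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


# Layer sizes taken from the model above: 768 -> 256 -> 2
head_params = linear_param_count(768, 256) + linear_param_count(256, 2)
print(head_params)  # 197378
```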


<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
Training the BERT model    </div>

In [5]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = BERTModel().to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
criterion = nn.MSELoss()


<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
Creating data loader and performing sanity check    </div>

In [6]:
batch_size = 16

In [7]:
# Split the training data into train (80%) and validation (20%) sets
train_dataset, val_dataset = train_test_split(train_dataset, test_size=0.2, random_state=0)

# Create the train loader (shuffled every epoch)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

# Create the validation loader (fixed order)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
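
<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
A quick sanity check on sizes (sketch). The arithmetic below estimates how many samples and batches to expect after the split, which can be compared against <code>len(train_loader)</code> and <code>len(val_loader)</code>. The dataset size <code>n_samples</code> is hypothetical, and the sketch assumes that the validation share is rounded up and that the last, smaller batch is kept.    </div>

```python
import math

n_samples = 1000   # hypothetical dataset size, chosen for round numbers
test_size = 0.2    # same fraction as in the split above
batch_size = 16

# Assumption: the validation share is rounded up; the remainder is used for training.
n_val = math.ceil(test_size * n_samples)
n_train = n_samples - n_val

# A loader that keeps the final partial batch yields ceil(n / batch_size) batches.
train_batches = math.ceil(n_train / batch_size)
val_batches = math.ceil(n_val / batch_size)

print(n_train, n_val, train_batches, val_batches)
```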




<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
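Training the model (the loop is configured for 80 epochs; the run shown below stopped with a KeyboardInterrupt after 9 completed epochs)    </div>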

In [8]:
# Training loop
model.train()
for epoch in range(80):
    running_loss = 0.0
    for step, (input_ids, attention_mask, content, wording) in enumerate(train_loader):
        input_ids = input_ids.to(device)
        attention_mask = attention_mask.to(device)
        content = content.to(device)
        wording = wording.to(device)

        optimizer.zero_grad()

        outputs = model(input_ids, attention_mask)
        loss = criterion(outputs[:, 0], content) + criterion(outputs[:, 1], wording)
        loss.backward()
        optimizer.step()
        if step % 500 == 0:
            print(f"Epoch {epoch+1}, Step {step}, Loss: {loss.item()}")

        running_loss += loss.item()

    print(f"Epoch {epoch+1} Loss: {running_loss / len(train_loader)}")

    # Validation loop: switch to eval mode and disable gradient tracking
    model.eval()
    val_loss = 0.0
    with torch.no_grad():
        for input_ids, attention_mask, content, wording in val_loader:
            input_ids = input_ids.to(device)
            attention_mask = attention_mask.to(device)
            content = content.to(device)
            wording = wording.to(device)

            val_outputs = model(input_ids, attention_mask)
            batch_loss = criterion(val_outputs[:, 0], content) + criterion(val_outputs[:, 1], wording)
            val_loss += batch_loss.item()  # accumulate a Python float rather than a tensor

    print(f"Validation Loss: {val_loss / len(val_loader)}")
    model.train()  # back to training mode for the next epoch



Epoch 1, Step 0, Loss: 1.934661865234375
Epoch 1 Loss: 1.8913553687523335
Validation Loss: 1.560971975326538
Epoch 2, Step 0, Loss: 2.0942630767822266
Epoch 2 Loss: 1.561480257836557
Validation Loss: 1.3814074993133545
Epoch 3, Step 0, Loss: 1.3874781131744385
Epoch 3 Loss: 1.4435649223646414
Validation Loss: 1.3647301197052002
Epoch 4, Step 0, Loss: 1.5011587142944336
Epoch 4 Loss: 1.3965491476497278
Validation Loss: 1.3562963008880615
Epoch 5, Step 0, Loss: 0.7546758651733398
Epoch 5 Loss: 1.3804464217348018
Validation Loss: 1.3446253538131714
Epoch 6, Step 0, Loss: 0.9566013216972351
Epoch 6 Loss: 1.3460948241454314
Validation Loss: 1.2784602642059326
Epoch 7, Step 0, Loss: 1.7036848068237305
Epoch 7 Loss: 1.3205303080732775
Validation Loss: 1.248953104019165
Epoch 8, Step 0, Loss: 1.4160575866699219
Epoch 8 Loss: 1.2807072738917094
Validation Loss: 1.2307137250900269
Epoch 9, Step 0, Loss: 0.9908875823020935
Epoch 9 Loss: 1.2746793061577841
Validation Loss: 1.200814962387085
Epoch 

KeyboardInterrupt: 
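
<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
How the reported loss is composed (worked example). Each loss value above is the sum of two mean-squared errors, one for the <code>content</code> column and one for the <code>wording</code> column. The sketch below reproduces that arithmetic in plain Python; the numbers are hypothetical and <code>mse</code> is a hand-written stand-in for <code>nn.MSELoss</code> with its default mean reduction.    </div>

```python
def mse(predictions, targets):
    """Mean of squared differences, as nn.MSELoss computes by default."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Hypothetical model outputs for a batch of two summaries: [content, wording] per row.
outputs = [[0.5, -0.5], [1.0, 0.0]]
content = [0.0, 1.0]    # hypothetical content targets
wording = [-0.5, 1.0]   # hypothetical wording targets

content_loss = mse([row[0] for row in outputs], content)  # (0.25 + 0.0) / 2
wording_loss = mse([row[1] for row in outputs], wording)  # (0.0 + 1.0) / 2
loss = content_loss + wording_loss

print(content_loss, wording_loss, loss)
```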


<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
Creating test loader    </div>

In [None]:
# shuffle=False keeps predictions in the same order as the rows of test_data
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)


<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
Generating predictions on test set    </div>

In [None]:
model.eval()
predictions = []
with torch.no_grad():
    for input_ids, attention_mask in test_loader:
        input_ids = input_ids.to(device)
        attention_mask = attention_mask.to(device)

        outputs = model(input_ids, attention_mask)
        predictions.extend(outputs.cpu().numpy())



<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
Generating submission
    </div>

In [None]:
submission_df = pd.DataFrame({
    'student_id': test_data['student_id'],
    'content': [pred[0] for pred in predictions],
    'wording': [pred[1] for pred in predictions]
})

submission_df.to_csv('submission.csv', index=False)

In [None]:
submission_df
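
<div style="background-color:#F0E3D2; color:#19180F; font-size:15px; font-family:Verdana; padding:10px; border: 2px solid #19180F; border-radius:10px"> 
📌
What the submission file looks like (sketch). Pairing each prediction with a <code>student_id</code> is only valid because the test loader does not shuffle, so row order is preserved. The snippet below builds the same three-column layout with the standard library; the IDs and prediction values are hypothetical.    </div>

```python
import csv
import io

# Hypothetical IDs and predictions; each prediction is [content, wording].
student_ids = ["id_a", "id_b"]
predictions = [[0.25, -0.5], [1.0, 0.75]]

# Write to an in-memory buffer instead of a file so the result can be inspected.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["student_id", "content", "wording"])
for student_id, (content, wording) in zip(student_ids, predictions):
    writer.writerow([student_id, content, wording])

csv_text = buffer.getvalue()
print(csv_text)
```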