Important: Run the code cell below to download the trained model from Google Drive so that it can be loaded into this environment to make predictions.

In [1]:
# 1) Install gdown if necessary
!pip install gdown

# 2) Download your model file from Google Drive
!gdown "https://drive.google.com/uc?id=1RuBncwHrCyTw1ct686zWZWc3e7w6NYX7" -O best_deberta_bilstm_model.pt

Downloading...
From (original): https://drive.google.com/uc?id=1RuBncwHrCyTw1ct686zWZWc3e7w6NYX7
From (redirected): https://drive.google.com/uc?id=1RuBncwHrCyTw1ct686zWZWc3e7w6NYX7&confirm=t&uuid=9f9dcadf-1af3-4633-b239-56c1cc07fdbe
To: /content/best_deberta_bilstm_model.pt
100% 750M/750M [00:23<00:00, 32.2MB/s]


## 🧠 DeBERTa + BiLSTM + Linear Head

This notebook demonstrates inference with a **hybrid DeBERTa + BiLSTM model** trained for a **binary Natural Language Inference (NLI)** task.

We aim to determine whether a given **hypothesis** is logically entailed by a **premise**, using a pretrained transformer backbone (DeBERTa) followed by a BiLSTM layer and a linear classification head.

This demo performs the following:

- Loads and preprocesses the `test.csv` dataset, which must contain `premise` and `hypothesis` columns.
- Uses the DeBERTa tokenizer for encoding inputs.
- Defines a custom PyTorch Dataset and DataLoader.
- Loads our trained DeBERTa + BiLSTM model from disk.
- Runs inference to predict labels.
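- Evaluates model performance using accuracy, precision, recall, and F1-score when gold labels are available (the `test.csv` file used below is treated as unlabeled, so this run only produces predictions).
- Visualizes the confusion matrix under the same condition.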
- Saves predictions to `Group_18_C.csv`.
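
If a labeled split is available (an assumption; the test file read in this notebook is handled as unlabeled), the metrics named in the list above can be computed directly from the gold and predicted labels. The sketch below uses plain Python and toy labels, not results from the model; `binary_metrics` is a hypothetical helper name, and the confusion matrix follows the rows-are-true, columns-are-predicted layout.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion": [[tn, fp], [fn, tp]],  # rows: true label, columns: predicted label
    }

# Toy labels for illustration only
gold = [1, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(gold, pred)
print(metrics)
```

In the notebook itself, `classification_report` and `confusion_matrix` from scikit-learn (already imported) would be the more natural choice; the hand-rolled version only makes the definitions explicit.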


In [2]:
# 1. Install dependencies (only needed for Colab)
!pip install transformers -q


In [12]:
# 2. Imports
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from transformers import AutoTokenizer, AutoModel
import pandas as pd
from sklearn.metrics import classification_report, confusion_matrix, ConfusionMatrixDisplay
import matplotlib.pyplot as plt


In [13]:
# 3. Device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


## Load DeBERTa Tokenizer

We initialize the `microsoft/deberta-v3-base` tokenizer for use in encoding premise-hypothesis pairs.

In [14]:
# 4. Tokenizer
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")

def tokenize_premise_hypothesis(premises, hypotheses, max_length=128):
    return tokenizer(
        premises,
        hypotheses,
        padding='max_length',
        truncation=True,
        max_length=max_length,
        return_tensors='pt'
    )
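
To make the effect of `padding='max_length'` and `truncation=True` concrete, here is a toy sketch of how a premise-hypothesis pair ends up as one fixed-length sequence with an attention mask. It does not call the real tokenizer: the special-token IDs, the `pack_pair` helper, and the trimming rule (drop tokens from whichever segment is currently longer) are assumptions chosen for illustration, and real subword IDs will differ.

```python
CLS, SEP, PAD = 1, 2, 0  # placeholder IDs, assumed for illustration only

def pack_pair(premise_ids, hypothesis_ids, max_length=16):
    """Build [CLS] premise [SEP] hypothesis [SEP], then truncate and pad."""
    if max_length < 3:
        raise ValueError("max_length must leave room for three special tokens")
    premise_ids, hypothesis_ids = list(premise_ids), list(hypothesis_ids)
    budget = max_length - 3  # room left after the three special tokens
    # Trim one token at a time from whichever segment is currently longer
    while len(premise_ids) + len(hypothesis_ids) > budget:
        if len(premise_ids) >= len(hypothesis_ids):
            premise_ids.pop()
        else:
            hypothesis_ids.pop()
    ids = [CLS] + premise_ids + [SEP] + hypothesis_ids + [SEP]
    mask = [1] * len(ids)          # 1 marks real tokens
    pad = max_length - len(ids)
    return ids + [PAD] * pad, mask + [0] * pad  # 0 marks padding

ids, mask = pack_pair([11, 12, 13], [21, 22], max_length=10)
print(ids)
print(mask)

# A pair that is too long gets trimmed down to exactly max_length
long_ids, long_mask = pack_pair(range(100, 120), range(200, 205), max_length=10)
print(len(long_ids), sum(long_mask))
```

The attention mask is what lets the encoder ignore the padded positions, which is why the dataset below passes it alongside `input_ids`.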




## Define Custom NLI Dataset

We create a custom PyTorch Dataset class that wraps the already tokenized (premise, hypothesis) encodings together with their labels and returns the tensors for one example per index.


In [15]:
# 5. Dataset class
class NLIDataset(Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        return {
            'input_ids': self.encodings['input_ids'][idx],
            'attention_mask': self.encodings['attention_mask'][idx],
            'labels': self.labels[idx]
        }

    def __len__(self):
        return len(self.labels)


## Define the DeBERTa + BiLSTM Model

Here we define our hybrid architecture:
- The **DeBERTa encoder** provides contextualized embeddings.
- A **BiLSTM layer** captures sequential information.
- A **linear head** classifies the BiLSTM output at the first token position, after dropout.


In [16]:
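# 6. Model class
class DeBERTaWithBiLSTM(nn.Module):
    def __init__(self, hidden_dim=384, dropout=0.3892):
        super().__init__()
        # Pretrained DeBERTa-v3 encoder; its hidden size is 768
        self.base_model = AutoModel.from_pretrained("microsoft/deberta-v3-base")
        # Single-layer BiLSTM over the encoder's token embeddings
        self.bilstm = nn.LSTM(768, hidden_dim, num_layers=1, bidirectional=True, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        # Forward and backward states are concatenated, hence hidden_dim * 2 inputs; 2 output classes
        self.classifier = nn.Linear(hidden_dim * 2, 2)

    def forward(self, input_ids, attention_mask, labels=None):
        outputs = self.base_model(input_ids=input_ids, attention_mask=attention_mask)
        sequence_output = outputs.last_hidden_state  # (batch, seq_len, 768)
        lstm_out, _ = self.bilstm(sequence_output)   # (batch, seq_len, hidden_dim * 2)
        # Pool by taking the BiLSTM output at the first token position
        pooled_output = lstm_out[:, 0]
        out = self.dropout(pooled_output)
        logits = self.classifier(out)                # (batch, 2)
        # Return a loss only when labels are supplied (training or labeled evaluation)
        if labels is not None:
            loss_fn = nn.CrossEntropyLoss()
            loss = loss_fn(logits, labels)
            return {'loss': loss, 'logits': logits}
        return {'logits': logits}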
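
The tensor shapes through the BiLSTM and the linear head can be traced without downloading DeBERTa. The sketch below substitutes a random tensor for `outputs.last_hidden_state` (an assumption made only to keep the example light) and reuses the layer sizes from the class above; the layers here are freshly initialized, not the trained ones.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, seq_len, encoder_dim, hidden_dim = 2, 128, 768, 384

# Random stand-in for the encoder output, shape (batch, seq_len, 768)
sequence_output = torch.randn(batch_size, seq_len, encoder_dim)

bilstm = nn.LSTM(encoder_dim, hidden_dim, num_layers=1, bidirectional=True, batch_first=True)
classifier = nn.Linear(hidden_dim * 2, 2)

with torch.no_grad():
    lstm_out, _ = bilstm(sequence_output)  # forward and backward states concatenated per token
    pooled = lstm_out[:, 0]                # BiLSTM output at the first position only
    logits = classifier(pooled)            # one score per class

print(lstm_out.shape, pooled.shape, logits.shape)
```

Note that at position 0 the backward direction has already read the whole sequence, while the forward direction has seen only the first token, so most of the sequence-level signal in this pooling comes from the backward half.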


## Load the Trained Model

We load the fine-tuned weights downloaded earlier (`best_deberta_bilstm_model.pt`) from disk and move the model to the appropriate device (CPU/GPU). We also set it to evaluation mode, which disables dropout.


In [17]:
# 7. Load model
model = DeBERTaWithBiLSTM().to(device)
model.load_state_dict(torch.load("best_deberta_bilstm_model.pt", map_location=device))
model.eval()
print("Model loaded")


Model loaded
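
The loading pattern above (rebuild the architecture, load a `state_dict` into it, switch to evaluation mode) can be tried on a small scale. This sketch uses a hypothetical `TinyHead` module and an in-memory buffer instead of the real model and `.pt` file, so it is only an illustration of the mechanism.

```python
import io
import torch
import torch.nn as nn

class TinyHead(nn.Module):  # hypothetical stand-in for the full model
    def __init__(self):
        super().__init__()
        self.dropout = nn.Dropout(0.5)
        self.classifier = nn.Linear(4, 2)

    def forward(self, x):
        return self.classifier(self.dropout(x))

torch.manual_seed(0)
trained = TinyHead()

# Save only the parameters (a state_dict), here to an in-memory buffer
buffer = io.BytesIO()
torch.save(trained.state_dict(), buffer)
buffer.seek(0)

# Rebuild the architecture first, then load the parameters into it
restored = TinyHead()
restored.load_state_dict(torch.load(buffer, map_location="cpu"))
restored.eval()  # disables dropout, so repeated calls give identical outputs

x = torch.ones(1, 4)
with torch.no_grad():
    same = torch.equal(restored(x), restored(x))
print(same)
```

Because a `state_dict` stores parameters only, the class definition and its constructor arguments must match those used at training time, otherwise `load_state_dict` reports missing or unexpected keys.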


## Preprocess Test Data and Make Predictions

We tokenize the test data, pass each batch through the model, and collect the predicted labels.


In [21]:
# Load test data
test_df = pd.read_csv("test.csv")  # Make sure 'test.csv' has 'premise' and 'hypothesis' columns

# Tokenize premise-hypothesis pairs with the helper defined in step 4
test_encodings = tokenize_premise_hypothesis(
    test_df['premise'].tolist(),
    test_df['hypothesis'].tolist(),
    max_length=128
)

# Create test dataset and loader.
# The zero labels are placeholders required by NLIDataset; the inference loop never reads them.
test_dataset = NLIDataset(test_encodings, labels=torch.zeros(len(test_df), dtype=torch.long))
test_loader = DataLoader(test_dataset, batch_size=32)

# Run model inference
test_preds = []

model.eval()
with torch.no_grad():
    for batch in test_loader:
        input_ids = batch["input_ids"].to(device)
        attention_mask = batch["attention_mask"].to(device)

        outputs = model(input_ids=input_ids, attention_mask=attention_mask)
        preds = torch.argmax(outputs["logits"], dim=1)
        test_preds.extend(preds.cpu().numpy())
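
The `torch.argmax(..., dim=1)` step picks, for each example, the index of the larger of the two logits. The plain-Python sketch below shows the same decision on made-up logits and adds a softmax to show that converting logits to probabilities does not change which class wins. The `softmax` and `argmax` helpers are hypothetical names written for this illustration.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(row):
    """Index of the largest value in the row."""
    return max(range(len(row)), key=row.__getitem__)

# Made-up logits for three examples, two classes each
logits = [[2.0, -1.0], [0.3, 0.9], [-0.5, -0.4]]

preds = [argmax(row) for row in logits]
probs = [softmax(row) for row in logits]

print(preds)
print([[round(p, 3) for p in row] for row in probs])
```

The probabilities can be useful for inspecting low-confidence predictions such as the third example, where the two logits are nearly equal.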



## Save Predictions

Finally, we write the model’s predictions to `Group_18_C.csv` as a single `prediction` column, with one row per test example in the original order.


In [22]:
# Save predictions to CSV for test data (only the 'prediction' column)
test_predictions_df = pd.DataFrame({'prediction': test_preds})
test_predictions_df.to_csv("Group_18_C.csv", index=False)
print("test predictions saved!")


test predictions saved!
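
Before submitting, it can help to read the file back and confirm its shape. The sketch below is a minimal check under the assumptions that the expected format is a single `prediction` header followed by one `0` or `1` per row; `check_predictions_file` is a hypothetical helper, and the demo runs on a temporary file rather than the real output.

```python
import csv
import os
import tempfile

def check_predictions_file(path, expected_rows, allowed=("0", "1")):
    """Return True if the file has a single 'prediction' header and valid rows."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    if not rows or rows[0] != ["prediction"]:
        return False
    body = rows[1:]
    return len(body) == expected_rows and all(
        len(r) == 1 and r[0] in allowed for r in body
    )

# Demo on a temporary file with toy predictions
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_predictions.csv")
    with open(demo_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["prediction"])
        writer.writerows([[1], [0], [1]])
    ok = check_predictions_file(demo_path, expected_rows=3)
    too_short = check_predictions_file(demo_path, expected_rows=4)

print(ok, too_short)
```

In the notebook, the equivalent call would pass `"Group_18_C.csv"` and `len(test_df)` as the expected row count.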
