# Lab 2: Adapter Layers - Fine-Tuning a BERT Model for Classification
---
## Notebook 3: Inference

**Goal:** In this notebook, you will load the trained Adapter weights and use the fine-tuned model to make predictions on new sentence pairs.

**You will learn to:**
-   Reload the base model and tokenizer.
-   Load the trained Adapter weights from a checkpoint using `peft.PeftModel`.
-   Write a function to perform inference on a pair of sentences.
-   Interpret the model's output.


### Step 1: Reload Model and Adapter

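First, we load the base model (`bert-base-uncased`) and the tokenizer, just as we did in the training notebook. Then, we use `PeftModel.from_pretrained` to load our saved adapter weights from the most recently modified checkpoint in the output directory. Note that the latest checkpoint is not necessarily the best-scoring one.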

#### Key Hugging Face Components:

-   `transformers.AutoModelForSequenceClassification`: Loads the pretrained base model together with a classification head sized by `num_labels`.
-   `peft.PeftModel.from_pretrained`: Loads the trained adapter weights and correctly attaches them to the base model.


In [None]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel
import torch
import os

# --- Load Base Model and Tokenizer ---
model_checkpoint = "bert-base-uncased"
num_labels = 2

base_model = AutoModelForSequenceClassification.from_pretrained(
    model_checkpoint, 
    num_labels=num_labels
)
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, use_fast=True)

# --- Load PEFT Adapter ---
output_dir = "./bert-adapters-mrpc"
# Find the most recently modified checkpoint-* entry (by file modification time)
latest_checkpoint = max(
    [os.path.join(output_dir, d) for d in os.listdir(output_dir) if d.startswith("checkpoint-")],
    key=os.path.getmtime
)
print(f"Loading adapter from: {latest_checkpoint}")

inference_model = PeftModel.from_pretrained(base_model, latest_checkpoint)

# Move model to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
inference_model.to(device)
inference_model.eval() # Set the model to evaluation mode

print("✅ Inference model loaded successfully!")
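
The cell above picks the checkpoint by file modification time, which can give a surprising result if checkpoint directories were copied or touched after training. As an alternative, here is a minimal sketch that selects the checkpoint with the highest step number instead. It assumes the directories are named `checkpoint-<step>`, which is what the `checkpoint-` prefix check above suggests; `find_latest_checkpoint` is a hypothetical helper name, not part of `peft` or `transformers`.

```python
import os
import re
import tempfile

def find_latest_checkpoint(output_dir):
    """Return the path of the checkpoint-<step> directory with the highest step.

    Hypothetical helper: assumes checkpoints are named 'checkpoint-<integer>'.
    """
    pattern = re.compile(r"^checkpoint-(\d+)$")
    candidates = []
    for name in os.listdir(output_dir):
        match = pattern.match(name)
        path = os.path.join(output_dir, name)
        # Keep only directories whose name ends in an integer step
        if match and os.path.isdir(path):
            candidates.append((int(match.group(1)), path))
    if not candidates:
        raise FileNotFoundError(f"No checkpoint-* directories found in {output_dir!r}")
    # Tuples compare by step number first, so max() picks the highest step
    return max(candidates)[1]

# Demo on a throwaway directory so the sketch runs without a trained model
with tempfile.TemporaryDirectory() as tmp:
    for step in (500, 1000, 90):
        os.makedirs(os.path.join(tmp, f"checkpoint-{step}"))
    demo_latest = os.path.basename(find_latest_checkpoint(tmp))
    print(demo_latest)
```

Comparing step numbers as integers matters here: a plain string sort would rank `checkpoint-90` above `checkpoint-1000`.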


### Step 2: Perform Inference

Now, let's create a simple function to test our model. The function takes two sentences, tokenizes them as a pair, passes them through the model, and prints the predicted label along with the probability of each class.

The MRPC dataset labels are:
-   `0`: Not a paraphrase
-   `1`: Is a paraphrase


In [None]:
import torch.nn.functional as F

# Define the labels
id2label = {0: "Not a Paraphrase", 1: "Is a Paraphrase"}

def predict_paraphrase(sentence1, sentence2):
    """
    Predicts whether two sentences are paraphrases and prints the
    predicted label together with the probability of each class.
    """
    # Tokenize the input
    inputs = tokenizer(sentence1, sentence2, return_tensors="pt", truncation=True, padding=True)
    
    # Move inputs to the correct device
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    # Get model output
    with torch.no_grad():
        outputs = inference_model(**inputs)
    
    # Get probabilities and prediction
    logits = outputs.logits
    probabilities = F.softmax(logits, dim=1).cpu().numpy()[0]
    prediction = torch.argmax(logits, dim=-1).cpu().item()
    
    print(f"Sentence 1: '{sentence1}'")
    print(f"Sentence 2: '{sentence2}'")
    print(f"Prediction: {id2label[prediction]}")
    print("Probabilities:")
    print(f"  - {id2label[0]}: {probabilities[0]:.4f}")
    print(f"  - {id2label[1]}: {probabilities[1]:.4f}")

# --- Test Cases ---
print("--- Test Case 1 (Should be a paraphrase) ---")
predict_paraphrase(
    "The company said the merger was subject to the approval of its shareholders.",
    "The company said the deal was subject to the approval of its shareholders."
)

print("\n--- Test Case 2 (Should NOT be a paraphrase) ---")
predict_paraphrase(
    "The cat sat on the mat.",
    "The dog played in the garden."
)
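
To make the interpretation step more concrete, the sketch below reproduces by hand what `F.softmax` and `torch.argmax` do to a single row of logits. The logit values are made up for illustration and are not output from the fine-tuned model; `softmax` here is a small pure-Python helper written for this example.

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

id2label = {0: "Not a Paraphrase", 1: "Is a Paraphrase"}

# Made-up logits for one sentence pair: index 0 and index 1 follow id2label
example_logits = [-1.2, 2.3]

probs = softmax(example_logits)
# The predicted class is the index of the largest probability (argmax)
predicted = max(range(len(probs)), key=probs.__getitem__)

print(f"Prediction: {id2label[predicted]}")
for idx, p in enumerate(probs):
    print(f"  - {id2label[idx]}: {p:.4f}")
```

Because softmax preserves the ordering of the logits, taking the argmax of the logits or of the probabilities gives the same label. The probabilities are mainly useful as a rough confidence signal, for example to flag pairs where the two classes score close to each other.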
