### Base Model

In [1]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "gpt2"
device = "cuda" if torch.cuda.is_available() else "cpu"

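# Base model: float16 on GPU, float32 on CPU. The model is moved once with
# .to(device), so device_map="auto" is not needed as well.
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)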

tokenizer = AutoTokenizer.from_pretrained(model_name)

# Input text
prompt = "Explain why the sky is blue"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Generate without LoRA
with torch.no_grad():
    base_out = base_model.generate(**inputs, max_new_tokens=20)

print("=== Base model output ===")
print(tokenizer.decode(base_out[0], skip_special_tokens=True))


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


=== Base model output ===
Explain why the sky is blue.

The sky is blue.

The sky is blue.

The sky is

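The repetition in the base output is typical of greedy decoding: called without sampling arguments, `generate` by default picks the highest-scoring token at every step, so once the model enters a loop it has no way out. The sketch below illustrates the effect with a hypothetical three-token transition table. `toy_logits`, `greedy_decode`, and `sample_decode` are illustrative names, not part of this notebook or of `transformers`, and the toy is not a reproduction of GPT-2's behavior.

```python
import torch

# Hypothetical next-token logits: row i scores the token that follows token i.
toy_logits = torch.tensor([
    [0.0, 2.0, 1.0],   # after token 0, token 1 scores highest
    [1.0, 0.0, 2.0],   # after token 1, token 2 scores highest
    [2.0, 1.0, 0.0],   # after token 2, token 0 scores highest
])

def greedy_decode(start: int, steps: int) -> list[int]:
    """Always pick the highest-scoring next token (argmax)."""
    tokens = [start]
    for _ in range(steps):
        tokens.append(int(torch.argmax(toy_logits[tokens[-1]])))
    return tokens

def sample_decode(start: int, steps: int, seed: int = 0) -> list[int]:
    """Draw each next token from the softmax distribution instead."""
    gen = torch.Generator().manual_seed(seed)
    tokens = [start]
    for _ in range(steps):
        probs = torch.softmax(toy_logits[tokens[-1]], dim=-1)
        tokens.append(int(torch.multinomial(probs, 1, generator=gen)))
    return tokens

greedy_tokens = greedy_decode(start=0, steps=6)
sampled_tokens = sample_decode(start=0, steps=6)
print("greedy: ", greedy_tokens)   # deterministic: cycles 0 -> 1 -> 2 -> 0 ...
print("sampled:", sampled_tokens)  # depends on the seed
```

With the real model, sampling can be requested through `generate` arguments such as `do_sample=True`, `top_p`, or `repetition_penalty`. Whether that improves the text for this model and adapter would need to be checked.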

### LoRA Finetuned Model

In [2]:
from peft import PeftModel

# Load a fresh copy of the base model for the adapter to wrap
model_with_lora = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
)

# Attach the LoRA adapter saved under outputs/lora_adapter
model_with_lora = PeftModel.from_pretrained(model_with_lora, "outputs/lora_adapter")
model_with_lora.to(device)

# Generate with LoRA
with torch.no_grad():
    lora_out = model_with_lora.generate(**inputs, max_new_tokens=20)

print("\n=== Model with LoRA output ===")
print(tokenizer.decode(lora_out[0], skip_special_tokens=True))



=== Model with LoRA output ===
Explain why the sky is blue and why it's blue.

The sky is blue because it is a complex system of colors

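For context on what attaching the adapter changes: LoRA keeps each targeted weight matrix `W` frozen and adds a trainable low-rank update, so the effective weight is `W + (alpha / r) * B @ A`. The sketch below is a minimal, self-contained illustration of that idea. `LoRALinear`, `r`, and `alpha` are illustrative names and values; this is not PEFT's implementation, and the numbers are not the settings of the adapter in `outputs/lora_adapter`.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a low-rank update (illustrative, not PEFT's class)."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # base weights stay frozen
        # A starts small and random, B starts at zero, so the update is zero at init.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    def merged_weight(self) -> torch.Tensor:
        """Effective weight W + (alpha / r) * B @ A."""
        return self.base.weight + self.scale * (self.B @ self.A)

torch.manual_seed(0)
base = nn.Linear(16, 8)
layer = LoRALinear(base, r=4, alpha=8.0)
x = torch.randn(2, 16)

with torch.no_grad():
    # With B = 0 the adapted layer reproduces the base layer.
    same_at_init = torch.allclose(layer(x), base(x))

    # Set B by hand (standing in for training); the merged weight then gives
    # the same result as the two-branch forward pass.
    layer.B.normal_()
    merged_out = x @ layer.merged_weight().T + base.bias
    matches_merged = torch.allclose(layer(x), merged_out, atol=1e-5)

# Only A and B are trainable: 4*16 + 8*4 parameters.
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(same_at_init, matches_merged, trainable)
```

In PEFT, the corresponding merge step for a LoRA `PeftModel` is exposed as `merge_and_unload()`, which folds the adapter into the base weights for inference.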

In [7]:
import torch
import torch.nn.functional as F

prompts = [
    "Explain why the sky is blue",
    "List the steps for making a peanut butter and jelly sandwich",
    "Write a short story about a robot who learns to paint.",
    "Compare cats and dogs as pets in a few sentences.",
    "Create a story about a robot that falls in love with a human."
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt").to(device)

    with torch.no_grad():
        # Get raw logits for base and LoRA models
        base_logits = base_model(**inputs).logits[:, -1, :]   # [batch, vocab]
        lora_logits = model_with_lora(**inputs).logits[:, -1, :]

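    # Base distribution P, and log-probabilities of the LoRA distribution Q
    base_probs = F.softmax(base_logits, dim=-1)
    lora_log_probs = F.log_softmax(lora_logits, dim=-1)

    # Compute metrics.
    # F.kl_div(input, target) takes log-probabilities as `input` and
    # probabilities as `target`, and returns KL(target || input),
    # so this is KL(base || LoRA).
    kl_div = F.kl_div(
        lora_log_probs,  # log Q (LoRA model)
        base_probs,      # P (base model)
        reduction="batchmean"
    ).item()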

    l2_dist = torch.norm(base_logits - lora_logits, p=2).item()

    print(f"\nPrompt: {prompt}")
    print(f"KL divergence: {kl_div:.6f}")
    print(f"L2 distance:   {l2_dist:.6f}")

    # (Optional) also print text generations like before
    base_out = base_model.generate(**inputs, max_new_tokens=50)
    lora_out = model_with_lora.generate(**inputs, max_new_tokens=50)
    print("\n=== Base model output ===")
    print(tokenizer.decode(base_out[0], skip_special_tokens=True))
    print("\n=== LoRA model output ===")
    print(tokenizer.decode(lora_out[0], skip_special_tokens=True))



Prompt: Explain why the sky is blue
KL divergence: 0.228882
L2 distance:   3722.000000



=== Base model output ===
Explain why the sky is blue.

The sky is blue.

The sky is blue.

The sky is blue.

The sky is blue.

The sky is blue.

The sky is blue.

The sky is blue.

=== LoRA model output ===
Explain why the sky is blue and why it's blue.

The sky is blue because it is a complex system of colors, which means that it is composed of many different shades of blue, including red, green, and blue. The colors of the sky are also complex

Prompt: List the steps for making a peanut butter and jelly sandwich
KL divergence: 0.120178
L2 distance:   5072.000000



=== Base model output ===
List the steps for making a peanut butter and jelly sandwich.

Step 1:

1. In a large bowl, combine the butter, sugar, and salt.

2. Add the eggs, milk, and vanilla.

3. Add the flour and mix well.



=== LoRA model output ===
List the steps for making a peanut butter and jelly sandwich.

1. Preheat oven to 350 degrees F.

2. In a large bowl, whisk together the butter, sugar, and eggs.

3. Pour the mixture into a large bowl and whisk until smooth.



Prompt: Write a short story about a robot who learns to paint.
KL divergence: 0.407227
L2 distance:   8680.000000



=== Base model output ===
Write a short story about a robot who learns to paint.

The story is about a robot who learns to paint.

The story is about a robot who learns to paint.

The story is about a robot who learns to paint.

The story is about a robot who learns to

=== LoRA model output ===
Write a short story about a robot who learns to paint.

"I'm a robot, and I'm not a painter," says the robot, who is wearing a white suit and a blue tie. "I'm a robot, and I'm not a painter."

The robot is a robot

Prompt: Compare cats and dogs as pets in a few sentences.
KL divergence: 0.207520
L2 distance:   8080.000000



=== Base model output ===
Compare cats and dogs as pets in a few sentences.

"I'm not going to say that I'm a cat lover, but I'm not going to say that I'm a dog lover either," he said. "I'm not going to say that I'm a dog lover either."


=== LoRA model output ===
Compare cats and dogs as pets in a few sentences.

"I'm sorry, but I'm sorry, but I'm sorry, but I'm sorry, but I'm sorry, but I'm sorry, but I'm sorry, but I'm sorry, but I'm sorry, but I'm

Prompt: Create a story about a robot that falls in love with a human.
KL divergence: 0.347168
L2 distance:   11000.000000



=== Base model output ===
Create a story about a robot that falls in love with a human.

The story is about a robot that falls in love with a human.

The story is about a robot that falls in love with a human.

The story is about a robot that falls in love with a human.



=== LoRA model output ===
Create a story about a robot that falls in love with a human.

The robot, named "Bumblebee," is a robotic robot that has been programmed to perform tasks such as picking up objects and moving them around. It's also capable of making complex calculations and making complex calculations on its own.


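Two properties of the metrics above are worth keeping in mind when reading these numbers. First, `F.kl_div(input, target)` takes log-probabilities as `input` and returns KL(target || input), so the call in the loop measures KL(base || LoRA). Second, softmax is unchanged when a constant is added to every logit, so an L2 distance on raw logits can be large even when the two distributions are close. That is one possible reason the L2 values here are in the thousands while the KL values stay below 0.5. The sketch below checks both properties on made-up four-token logits. `p_logits`, `q_logits`, and the shift of 10 are illustrative values, not taken from the models.

```python
import torch
import torch.nn.functional as F

# Made-up logits over a 4-token vocabulary (batch of 1).
p_logits = torch.tensor([[2.0, 1.0, 0.5, -1.0]])   # plays the role of the base model
q_logits = torch.tensor([[1.5, 1.2, 0.0, -0.5]])   # plays the role of the LoRA model

p = F.softmax(p_logits, dim=-1)
q = F.softmax(q_logits, dim=-1)
log_p = F.log_softmax(p_logits, dim=-1)
log_q = F.log_softmax(q_logits, dim=-1)

# F.kl_div(input=log Q, target=P) computes sum P * (log P - log Q) = KL(P || Q).
kl_from_api = F.kl_div(log_q, p, reduction="batchmean").item()
kl_p_q = (p * (log_p - log_q)).sum().item()   # KL(P || Q) written out
kl_q_p = (q * (log_q - log_p)).sum().item()   # KL(Q || P), the other direction

# Adding a constant to every logit leaves the softmax distribution unchanged...
shifted_q_logits = q_logits + 10.0
shifted_log_q = F.log_softmax(shifted_q_logits, dim=-1)
kl_shifted = F.kl_div(shifted_log_q, p, reduction="batchmean").item()

# ...but it changes the L2 distance between the raw logits.
l2_raw = torch.norm(p_logits - q_logits, p=2).item()
l2_shifted = torch.norm(p_logits - shifted_q_logits, p=2).item()

# A shift-insensitive alternative: compare log-probabilities instead of logits.
l2_logprob = torch.norm(log_p - log_q, p=2).item()
l2_logprob_shifted = torch.norm(log_p - shifted_log_q, p=2).item()

print(f"F.kl_div(log_q, p): {kl_from_api:.6f}")
print(f"KL(P || Q):         {kl_p_q:.6f}")
print(f"KL(Q || P):         {kl_q_p:.6f}")
print(f"KL after shift:     {kl_shifted:.6f}")
print(f"L2 on logits:       {l2_raw:.4f} -> {l2_shifted:.4f} after shift")
print(f"L2 on log-probs:    {l2_logprob:.4f} -> {l2_logprob_shifted:.4f} after shift")
```

If the goal is to compare distributions rather than raw scores, computing the distance on `log_softmax` outputs (or relying on KL alone) avoids the sensitivity to a constant offset. Casting the logits to float32 with `.float()` before computing the metrics is also a reasonable precaution when the models run in float16.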
