# Module 5: Testing the Fine-Tuned Model

**Goal**: Verify that our training actually changed the model's behavior.

**Steps**:
1.  Load the base model.
2.  Load our trained LoRA adapter from the `final_adapter` folder.
3.  Ask it a query it hasn't seen, and check whether the reply follows the JSON format.

In [None]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import gc

gc.collect()
torch.cuda.empty_cache()

## 1. Load Base Model
Just like before, we load the base model first, quantized to 4-bit. At this point it carries none of our fine-tuning.

In [None]:
model_id = "Qwen/Qwen2.5-1.5B-Instruct"

# 4-bit NF4 quantization keeps the base model small enough for a modest GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16
)

# device_map="auto" places the weights on the available GPU(s) automatically.
base_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

## 2. Load and Attach Adapters
This is where we attach the LoRA weights we trained on top of the frozen base model.

In [None]:
adapter_path = "./final_adapter" 

print(f"Loading adapter from {adapter_path}...")
model = PeftModel.from_pretrained(base_model, adapter_path)
print("Adapter loaded successfully!")
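
If loading fails, a common cause is an incomplete adapter folder. The sketch below is a minimal pre-flight check, assuming the adapter was saved with PEFT's `save_pretrained`, which typically writes `adapter_config.json` plus a weights file named `adapter_model.safetensors` (or `adapter_model.bin` in older versions). The helper name `missing_adapter_files` is hypothetical and not part of PEFT.

```python
from pathlib import Path

# File names assumed from PEFT's usual save_pretrained output; adjust if your version differs.
CONFIG_NAME = "adapter_config.json"
WEIGHT_NAMES = ("adapter_model.safetensors", "adapter_model.bin")


def missing_adapter_files(adapter_dir):
    """Return a list describing expected adapter files that are absent from adapter_dir."""
    folder = Path(adapter_dir)
    problems = []
    if not (folder / CONFIG_NAME).is_file():
        problems.append(CONFIG_NAME)
    # Either weights format is acceptable, so only one of them needs to exist.
    if not any((folder / name).is_file() for name in WEIGHT_NAMES):
        problems.append(" or ".join(WEIGHT_NAMES))
    return problems


# An empty list means the folder looks complete.
print(missing_adapter_files("./final_adapter"))
```

Running this before `PeftModel.from_pretrained` gives a clearer message than the loader's own error when the training notebook saved to a different path.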

## 3. Rate Some Sentiment!
We try new sentences the model hasn't seen and look at the format of its replies.

In [None]:
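def get_response(input_text):
    """Generate the model's reply to a single user message."""
    messages = [
        {"role": "user", "content": input_text}
    ]
    # Wrap the message in the model's chat template and append the assistant prompt.
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Inference only, so gradient tracking is not needed.
    with torch.no_grad():
        generated_ids = model.generate(**model_inputs, max_new_tokens=50)

    # Drop the prompt tokens so only the newly generated reply is decoded.
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]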

# Test Case 1: Positive
print("User: The workshop was absolutely brilliant!")
print("Assistant:", get_response("The workshop was absolutely brilliant!"))

# Test Case 2: Negative
print("\nUser: I am very disappointed with the delay.")
print("Assistant:", get_response("I am very disappointed with the delay."))
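
Reading the printed replies by eye works for two examples, but the format check can also be automated. The sketch below is one possible way to do it, using only the standard library. The helper name `parse_json_response` and the `"sentiment"` key are assumptions for illustration; substitute whatever keys your training data actually used. The sample strings stand in for `get_response(...)` output so the block runs without a GPU.

```python
import json


def parse_json_response(response, required_keys=()):
    """Return (ok, detail). ok is True when response is a JSON object
    that contains every key in required_keys."""
    try:
        parsed = json.loads(response.strip())
    except json.JSONDecodeError as err:
        return False, f"not valid JSON: {err.msg}"
    if not isinstance(parsed, dict):
        return False, f"expected a JSON object, got {type(parsed).__name__}"
    missing = [key for key in required_keys if key not in parsed]
    if missing:
        return False, f"missing keys: {missing}"
    return True, parsed


# Hypothetical replies standing in for get_response(...) output.
good_reply = '{"sentiment": "positive"}'
bad_reply = "The sentiment is positive."

print(parse_json_response(good_reply, required_keys=("sentiment",)))
print(parse_json_response(bad_reply))
```

In the notebook, the same helper can be applied directly, for example `parse_json_response(get_response("The workshop was absolutely brilliant!"))`. Note that this strict check rejects replies that wrap the JSON in extra text or a code fence.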