[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Showmick119/Fine-Tuning-Open-Source-LLM/blob/main/notebooks/test_finetuned.ipynb)

# Testing the Fine-tuned Model

This notebook shows how to generate text with your fine-tuned model. We'll load the base model together with the trained LoRA adapter and test it on several prompts.

## Setup

First, let's install the required packages and set up our environment.

In [None]:
%pip install -q torch transformers peft bitsandbytes accelerate

### Import Dependencies

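Let's import the repository's `TextGenerator` class. The import assumes the notebook's working directory contains the `inference` package, so run the notebook from the directory where that package lives.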

In [None]:
import sys
sys.path.append('.')  # Make packages in the current working directory importable

from inference.generate_text import TextGenerator

## Initialize Text Generator

Now let's initialize our text generator with the fine-tuned model. Make sure to update the `adapter_path` to point to your trained LoRA adapter.

In [None]:
# Initialize the text generator
generator = TextGenerator(
    base_model_name="mistralai/Mistral-7B-v0.1",
    adapter_path="outputs/checkpoints",  # Update this path
    device="auto",
    load_8bit=True,  # 8-bit loading relies on bitsandbytes and typically needs a CUDA GPU; set to False on CPU
    temperature=0.7
)
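
If initialization fails with a missing-file error, it can help to confirm that `adapter_path` actually points at a saved adapter. PEFT's `save_pretrained` writes an `adapter_config.json` next to the weight file (`adapter_model.safetensors`, or `adapter_model.bin` in older versions). The helper below is a minimal sketch based on that layout; `looks_like_lora_adapter` is a hypothetical name and is not part of this repository.

```python
from pathlib import Path

# File names written by PEFT's save_pretrained(); the weight file name depends on the PEFT version.
ADAPTER_CONFIG = "adapter_config.json"
ADAPTER_WEIGHTS = ("adapter_model.safetensors", "adapter_model.bin")


def looks_like_lora_adapter(adapter_path):
    """Return True if the directory holds an adapter config and a weight file.

    Hypothetical helper for illustration; it checks file names only, not contents.
    """
    path = Path(adapter_path)
    if not path.is_dir():
        return False
    has_config = (path / ADAPTER_CONFIG).is_file()
    has_weights = any((path / name).is_file() for name in ADAPTER_WEIGHTS)
    return has_config and has_weights


# Same path as in the cell above; update it to match your own output directory.
print(looks_like_lora_adapter("outputs/checkpoints"))
```

If this prints `False`, the adapter files may sit in a subdirectory of the output folder rather than at its top level.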

## Test Single Generation

Let's try generating text with a single instruction.

In [None]:
# Generate text with a single instruction
instruction = "Write a poem about artificial intelligence"
response = generator.generate(instruction=instruction)

print("Generated Response:")
print("-" * 40)
print(response)
print("-" * 40)

## Test Batch Generation

We can also pass a list of instructions to `generate_batch`, which returns one response per instruction.

In [None]:
# Generate text for multiple instructions
instructions = [
    "Explain quantum computing in simple terms",
    "Write a short story about space exploration",
    "Describe the concept of machine learning"
]

responses = generator.generate_batch(instructions=instructions)

for i, (instruction, response) in enumerate(zip(instructions, responses), 1):
    print(f"\nExample {i}:")
    print("-" * 40)
    print(f"Instruction: {instruction}")
    print("\nResponse:")
    print(response)
    print("-" * 40)

## Test with Input Context

Let's try generating text with both an instruction and additional input context.

In [None]:
# Generate text with instruction and input
instruction = "Summarize the following text"
input_text = """
Artificial intelligence has transformed various industries, from healthcare to transportation.
Machine learning algorithms can now diagnose diseases, drive cars, and even create art.
However, these advancements also raise important ethical questions about privacy, bias,
and the future of human work.
"""

response = generator.generate(
    instruction=instruction,
    input_text=input_text,
    max_new_tokens=200
)

print("Generated Response:")
print("-" * 40)
print(response)
print("-" * 40)
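
How `TextGenerator` combines `instruction` and `input_text` into a single prompt is not shown in this notebook. Many instruction-tuning setups use an Alpaca-style template, so the sketch below assumes that format purely for illustration. `build_prompt` and the section headers are assumptions and may differ from what `inference/generate_text.py` actually does. Whatever the template is, the prompt used at inference should match the one used during fine-tuning.

```python
def build_prompt(instruction, input_text=None):
    """Format an instruction and optional input as an Alpaca-style prompt.

    Hypothetical helper: the template is an assumption, not this repository's code.
    """
    instruction = instruction.strip()
    # Only add the Input section when there is non-whitespace context to include.
    if input_text and input_text.strip():
        return (
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text.strip()}\n\n"
            "### Response:\n"
        )
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


print(build_prompt("Summarize the following text", "AI raises ethical questions."))
```

Stripping `input_text` removes the leading and trailing newlines that a triple-quoted string like the one above carries.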

## Experiment with Generation Parameters

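Finally, let's compare two sampling configurations. Lower `temperature`, `top_p`, and `top_k` values concentrate sampling on the most likely tokens, while higher values allow more varied output.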

In [None]:
# Test with different generation parameters
instruction = "Write a creative story about a time traveler"

# More focused, less varied output: lower temperature and tighter top_p/top_k
response_deterministic = generator.generate(
    instruction=instruction,
    temperature=0.3,
    top_p=0.85,
    top_k=40,
    max_new_tokens=200
)

# More varied output: higher temperature and looser top_p/top_k
response_creative = generator.generate(
    instruction=instruction,
    temperature=0.9,
    top_p=0.95,
    top_k=60,
    max_new_tokens=200
)

print("Focused Response (temperature=0.3):")
print("-" * 40)
print(response_deterministic)
print("\nCreative Response (temperature=0.9):")
print("-" * 40)
print(response_creative)
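
To see why these settings change the output, it helps to look at what they do to the next-token distribution. The sketch below applies temperature scaling, top-k filtering, and top-p (nucleus) filtering to a toy set of logits in plain Python. It is a simplified illustration of the general technique, not the code path that `TextGenerator` or `transformers` uses; `filtered_probs` and the logit values are made up for the example.

```python
import math


def filtered_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return next-token probabilities after temperature, top-k, and top-p filtering.

    Illustrative only; temperature must be greater than 0.
    """
    # Temperature: dividing logits by T < 1 sharpens the distribution, T > 1 flattens it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]  # subtract the max for numerical stability
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        order = order[:top_k]
    mass = sum(probs[i] for i in order)

    # Top-p: keep the smallest prefix whose renormalized cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i] / mass
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens; every other token gets probability 0.
    kept_total = sum(probs[i] for i in kept)
    return [probs[i] / kept_total if i in kept else 0.0 for i in range(len(probs))]


toy_logits = [2.0, 1.5, 1.0, 0.5, 0.0, -1.0]  # made-up scores for six candidate tokens
focused = filtered_probs(toy_logits, temperature=0.3, top_k=40, top_p=0.85)
varied = filtered_probs(toy_logits, temperature=0.9, top_k=60, top_p=0.95)
print("temperature=0.3:", [round(p, 3) for p in focused])
print("temperature=0.9:", [round(p, 3) for p in varied])
```

With the lower-temperature settings, most of the probability mass lands on the top token and fewer candidates survive the top-p cutoff, which is why that response tends to be more predictable.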