# AI Me - Model Testing & Comparison

This notebook compares the base Llama 3.1 8B Instruct model against your fine-tuned LoRA adapter.
Use it to see how the fine-tuned model rewrites text in your personal style.

## What This Notebook Does

1. **Loads both models**: Base Llama and your fine-tuned version
2. **Compares responses** to the same prompts
3. **Tests rewrite capabilities**: both models rewrite the same formal texts so you can compare the results
4. **Interactive testing** with custom prompts

## Use Cases
- **Style transfer**: Take formal text and make it conversational
- **Tone adjustment**: Convert professional text to your personal voice
- **Content refinement**: Improve and personalize existing text
- **A/B testing**: See the difference between base and fine-tuned models

## 1. Check GPU and Install Dependencies

In [None]:
# Check GPU
!nvidia-smi

# Install required packages
!pip -q install -U "transformers>=4.43.3" "accelerate>=0.32.0" "peft>=0.11.1" \
  "datasets>=2.20.0" "tokenizers>=0.19.1" "bitsandbytes>=0.43.2" sentencepiece huggingface_hub

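After the install finishes, it can help to confirm which versions actually ended up in the runtime. The following is a minimal sketch that uses only the standard library; the `PACKAGES` list and the `installed_versions` helper are illustrative names, not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names taken from the pip command above.
PACKAGES = ["transformers", "accelerate", "peft", "datasets", "tokenizers", "bitsandbytes"]


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:>14}: {ver or 'NOT INSTALLED'}")
```
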
## 2. Import Libraries and Setup

In [None]:
from huggingface_hub import login, InferenceClient
from google.colab import userdata, files
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel, PeftConfig
import json
from pathlib import Path

# Check PyTorch version
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name()}")

## 3. Login to Hugging Face

**Important**: Set your HF key in Colab's userdata first:
1. Open the Secrets panel (the key icon in Colab's left sidebar)
2. Add a new secret with key: `hf_key`
3. Value: Your Hugging Face API token
4. Click 'Add secret'

In [None]:
# Login to Hugging Face
try:
    hf_key = userdata.get('hf_key')
    if hf_key:
        login(hf_key)
        print("✅ Successfully logged in to Hugging Face")
    else:
        print("❌ HF key not found in userdata. Please add it in the Secrets panel (key icon in the left sidebar)")
        print("Key name should be: hf_key")
except Exception as e:
    print(f"❌ Error logging in: {e}")
    print("Please check your HF key in userdata")

## 4. Clone Repository

In [None]:
%cd /content
!rm -rf ai_me
!git clone https://github.com/sandronatchkebia/ai_me.git
%cd ai_me
print("Repository cloned to:")
!pwd

## 5. Upload Your Fine-tuned Model

**Upload the zip file generated by the training script** (e.g., `ai_me_lora_latest.zip`)

In [None]:
# Upload fine-tuned model
print("📁 Please upload your fine-tuned model zip file:")
uploaded = files.upload()

# Extract model
print("📂 Extracting model...")
for filename in uploaded.keys():
    if filename.endswith('.zip'):
        !unzip -o "{filename}" -d /content/ai_me/fine_tuning/out/ >/dev/null
        print(f"✅ Extracted {filename}")
        break

# List extracted models
print("\nAvailable models:")
!ls -la /content/ai_me/fine_tuning/out/
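
Before loading, it may be worth checking that the archive really contained a LoRA adapter. PEFT normally saves an `adapter_config.json` next to an `adapter_model.safetensors` (or `adapter_model.bin` in older versions) file. The sketch below looks for those files; `find_adapter_dirs` and `has_adapter_weights` are hypothetical helper names, and the output path is the one used in the cell above.

```python
from pathlib import Path


def find_adapter_dirs(root):
    """Return sorted directories under `root` that contain adapter_config.json."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(p.parent for p in root.rglob("adapter_config.json"))


def has_adapter_weights(adapter_dir):
    """True if the directory holds one of the usual PEFT weight files."""
    names = ("adapter_model.safetensors", "adapter_model.bin")
    return any((Path(adapter_dir) / name).is_file() for name in names)


out_dir = Path("/content/ai_me/fine_tuning/out")
for adapter_dir in find_adapter_dirs(out_dir):
    status = "ok" if has_adapter_weights(adapter_dir) else "config found, weights missing"
    print(f"{adapter_dir}: {status}")
```

If nothing is printed, the zip was probably extracted to a different folder or has an extra directory level, and `LORA_MODEL_PATH` in the next section will need adjusting.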

## 6. Load Both Models

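Load the base model and your fine-tuned LoRA model for comparison. Note that this keeps two full copies of the 8B model in bfloat16 (about 16 GB of weights each), so on a GPU with less memory `device_map="auto"` will offload layers to the CPU and generation will be slow.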

In [None]:
# Model configuration
BASE_MODEL_ID = "meta-llama/Meta-Llama-3.1-8B-Instruct"
LORA_MODEL_PATH = "/content/ai_me/fine_tuning/out/ai_me_lora_llama3p1_8b"

print("🔧 Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

print("🔧 Loading base model...")
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

print("🔧 Loading fine-tuned LoRA model...")
lora_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Load LoRA weights
lora_model = PeftModel.from_pretrained(lora_model, LORA_MODEL_PATH)

print("✅ Both models loaded successfully!")
print(f"Base model: {BASE_MODEL_ID}")
print(f"LoRA model: {LORA_MODEL_PATH}")
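
If memory is tight, one alternative is to keep a single copy of the weights and switch the adapter off to get base-model behaviour. PEFT models provide a `disable_adapter()` context manager for this. The sketch below assumes `peft_model` is a `PeftModel` such as `lora_model`; `compare_with_single_copy` and `generate_fn` are illustrative names rather than part of the notebook.

```python
def compare_with_single_copy(peft_model, generate_fn, prompt):
    """Run `prompt` with the adapter off (base behaviour) and on (fine-tuned).

    `generate_fn(model, prompt)` is any callable that returns generated text.
    """
    # Inside this block the LoRA layers are bypassed, so the output
    # comes from the underlying base weights only.
    with peft_model.disable_adapter():
        base_output = generate_fn(peft_model, prompt)

    # Outside the block the adapter is active again.
    tuned_output = generate_fn(peft_model, prompt)
    return base_output, tuned_output


# Possible usage once the generation helper from section 7 is defined:
# base_out, tuned_out = compare_with_single_copy(
#     lora_model,
#     lambda m, p: generate_text(m, tokenizer, p, max_length=200),
#     "Tell me about yourself in a casual way.",
# )
```

With this approach the separate `base_model` load can be skipped, which roughly halves the weight memory.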

## 7. Text Generation Function

Helper functions for generating text from either model and for formatting chat prompts.

In [None]:
def generate_text(model, tokenizer, prompt, max_length=512, temperature=0.7):
    """Generate text from a model with the given prompt."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_length,
            temperature=temperature,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
            eos_token_id=tokenizer.eos_token_id
        )
    
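    # Decode only the newly generated tokens. Slicing the decoded string by
    # len(prompt) is unreliable because decoding may not reproduce the prompt
    # character for character (for example, when special tokens are skipped).
    prompt_length = inputs["input_ids"].shape[1]
    response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    return response.strip()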

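def format_chat_prompt(user_message, system_message=""):
    """Format a single-turn chat prompt using the Llama 3.1 header tokens.

    The tokenizer prepends <|begin_of_text|> itself, so it is omitted here.
    """
    parts = []
    if system_message:
        parts.append(f"<|start_header_id|>system<|end_header_id|>\n\n{system_message}<|eot_id|>")
    parts.append(f"<|start_header_id|>user<|end_header_id|>\n\n{user_message}<|eot_id|>")
    # Leave the assistant header open so the model writes the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)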

print("✅ Generation functions ready!")
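
Instead of writing the special tokens by hand, the prompt can also be rendered with the chat template that ships with the tokenizer. This is a sketch under the assumption that the tokenizer defines a chat template (the Llama 3.1 Instruct tokenizer does); `build_messages` and `render_chat_prompt` are hypothetical helper names.

```python
def build_messages(user_message, system_message=""):
    """Build the role/content message list that chat templates expect."""
    messages = []
    if system_message:
        messages.append({"role": "system", "content": system_message})
    messages.append({"role": "user", "content": user_message})
    return messages


def render_chat_prompt(tokenizer, user_message, system_message=""):
    """Render the messages to a prompt string with the tokenizer's own template."""
    return tokenizer.apply_chat_template(
        build_messages(user_message, system_message),
        tokenize=False,              # return a string rather than token ids
        add_generation_prompt=True,  # end with the assistant header
    )
```

The rendered string already begins with the beginning-of-text token, so when it is tokenized afterwards it is safer to pass `add_special_tokens=False` to avoid adding that token twice.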

## 8. Basic Model Comparison

Test both models with simple prompts to see the difference.

In [None]:
# Test prompts
test_prompts = [
    "Tell me about yourself in a casual way.",
    "Explain machine learning in simple terms.",
    "Write a short story about a robot."
]

print("🔄 Testing basic prompts...\n")

for i, prompt in enumerate(test_prompts, 1):
    print(f"\n{'='*60}")
    print(f"PROMPT {i}: {prompt}")
    print(f"{'='*60}")
    
    # Generate with base model
    print("\n🤖 BASE MODEL:")
    base_response = generate_text(base_model, tokenizer, prompt, max_length=200)
    print(base_response)
    
    # Generate with fine-tuned model
    print("\n🎯 FINE-TUNED MODEL:")
    lora_response = generate_text(lora_model, tokenizer, prompt, max_length=200)
    print(lora_response)
    
    print("\n" + "-"*60)
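
Printed output is lost when the runtime disconnects, so it may be useful to keep the side-by-side responses in a file. The sketch below stores them as JSON with the standard library; `make_record`, `save_records`, and the output filename are assumptions made for illustration.

```python
import json
from pathlib import Path


def make_record(prompt, base_response, lora_response):
    """Bundle one prompt with both model responses."""
    return {"prompt": prompt, "base": base_response, "fine_tuned": lora_response}


def save_records(records, path):
    """Write the list of records to `path` as pretty-printed UTF-8 JSON."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


# Possible usage inside the comparison loop above:
# records.append(make_record(prompt, base_response, lora_response))
# ...and after the loop:
# save_records(records, "/content/comparison_results.json")
```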

## 9. Rewrite Tool Testing

This is the notebook's main use case: rewriting text in your style with the fine-tuned model.

In [None]:
# Sample texts to rewrite
texts_to_rewrite = [
    "The implementation of machine learning algorithms requires careful consideration of data preprocessing techniques and hyperparameter optimization strategies to achieve optimal performance metrics.",
    
    "In accordance with company policy, all employees must submit their expense reports by the 15th of each month. Failure to comply with this requirement may result in delayed reimbursement processing.",
    
    "The weather forecast indicates a 60% probability of precipitation during the afternoon hours, with temperatures ranging from 18 to 22 degrees Celsius."
]

print("🔄 Testing rewrite capabilities...\n")

for i, formal_text in enumerate(texts_to_rewrite, 1):
    print(f"\n{'='*80}")
    print(f"ORIGINAL TEXT {i}:")
    print(f"{formal_text}")
    print(f"{'='*80}")
    
    # Generate with base model
    print("\n🤖 BASE MODEL REWRITE:")
    base_prompt = f"Rewrite this text in a more casual, conversational style: {formal_text}"
    base_response = generate_text(base_model, tokenizer, base_prompt, max_length=150)
    print(base_response)
    
    # Generate with fine-tuned model (your style)
    print("\n🎯 YOUR STYLE REWRITE:")
    lora_prompt = f"Rewrite this text in my personal style: {formal_text}"
    lora_response = generate_text(lora_model, tokenizer, lora_prompt, max_length=150)
    print(lora_response)
    
    print("\n" + "-"*80)

## 10. Interactive Testing

Test your own prompts and see how both models respond.

In [None]:
def interactive_test():
    """Interactive testing function."""
    print("🎯 Interactive Testing Mode")
    print("Type 'quit' to exit\n")
    
    while True:
        user_input = input("\nEnter your prompt (or 'quit'): ")
        if user_input.strip().lower() == 'quit':
            break
        
        if not user_input.strip():
            continue
        
        print(f"\n{'='*60}")
        print(f"PROMPT: {user_input}")
        print(f"{'='*60}")
        
        # Base model
        print("\n🤖 BASE MODEL:")
        try:
            base_response = generate_text(base_model, tokenizer, user_input, max_length=300)
            print(base_response)
        except Exception as e:
            print(f"Error: {e}")
        
        # Fine-tuned model
        print("\n🎯 YOUR STYLE:")
        try:
            lora_response = generate_text(lora_model, tokenizer, user_input, max_length=300)
            print(lora_response)
        except Exception as e:
            print(f"Error: {e}")
        
        print("\n" + "-"*60)

# Run interactive testing
interactive_test()

## 11. Style Analysis

Analyze the differences between base and fine-tuned model outputs.

In [None]:
# Style comparison prompts
style_prompts = [
    "Write a short email to a colleague about a project update.",
    "Explain a technical concept to a beginner.",
    "Give advice to someone starting their career."
]

print("🔍 Style Analysis - Comparing Writing Styles\n")

for prompt in style_prompts:
    print(f"\n{'='*70}")
    print(f"PROMPT: {prompt}")
    print(f"{'='*70}")
    
    # Generate responses
    base_response = generate_text(base_model, tokenizer, prompt, max_length=200)
    lora_response = generate_text(lora_model, tokenizer, prompt, max_length=200)
    
    print("\n🤖 BASE MODEL STYLE:")
    print(base_response)
    
    print("\n🎯 YOUR PERSONAL STYLE:")
    print(lora_response)
    
    print("\n💡 WHAT TO LOOK FOR:")
    print("• Base model: typically more formal and generic")
    print("• Your model: should sound closer to your own tone and phrasing")
    
    print("\n" + "-"*70)
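
To put rough numbers on the style differences, a few surface statistics can be computed for each response. This is only a sketch: the metrics (average sentence length, contraction rate, type-token ratio) are simple proxies chosen for illustration, `style_metrics` is a hypothetical helper, and the regular expressions handle straight apostrophes only.

```python
import re

# Words, optionally with one internal apostrophe (e.g. "can't").
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")
# Common English contraction endings.
CONTRACTION = re.compile(r"\b\w+'(?:s|t|re|ve|ll|d|m)\b", re.IGNORECASE)


def style_metrics(text):
    """Return simple surface statistics describing the writing style of `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = WORD.findall(text)
    n_words = len(words)
    if n_words == 0:
        return {"words": 0, "avg_sentence_len": 0.0,
                "contractions_per_100_words": 0.0, "type_token_ratio": 0.0}
    return {
        "words": n_words,
        "avg_sentence_len": n_words / len(sentences) if sentences else 0.0,
        "contractions_per_100_words": 100 * len(CONTRACTION.findall(text)) / n_words,
        "type_token_ratio": len({w.lower() for w in words}) / n_words,
    }


# Possible usage inside the loop above:
# print(style_metrics(base_response))
# print(style_metrics(lora_response))
```

Shorter sentences and more contractions usually indicate a more conversational tone, but these numbers are noisy on short samples and work best averaged over many prompts.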

## 12. Summary and Next Steps

Your fine-tuned model is now ready to use as a personal rewrite tool!

In [None]:
print("🎉 Testing Complete!\n")
print("📊 WHAT YOU'VE ACCOMPLISHED:")
print("✅ Loaded both base and fine-tuned models")
print("✅ Compared responses side-by-side")
print("✅ Tested rewrite capabilities")
print("✅ Analyzed style differences")
print("✅ Ran interactive tests")

print("\n🚀 NEXT STEPS:")
print("1. Use your model for personal content creation")
print("2. Rewrite formal text in your conversational style")
print("3. Generate content that sounds like you wrote it")
print("4. Share your fine-tuned model with others")
print("5. Continue training with more data to improve further")

print("\n💡 KEY BENEFIT:")
print("Your model now captures your personal writing style and can be used")
print("as a powerful rewrite tool to make any text sound like you wrote it!")

# Clean up memory: drop the references, collect garbage, then release cached GPU memory
import gc
del base_model, lora_model
gc.collect()
torch.cuda.empty_cache()
print("\n🧹 Memory cleaned up!")