# Llama 2 Therapy AI Training on Google Colab

This notebook fine-tunes a Llama 2 chat model for therapeutic conversations on a free Google Colab GPU, using 4-bit quantization and LoRA adapters.

## Setup Requirements

1. **Google Account** for Colab access
2. **Hugging Face Account** with approved access to the gated Llama 2 model repositories
3. **Training Data** uploaded to Colab or Google Drive

In [None]:
# Install required packages
!pip install transformers torch accelerate peft bitsandbytes datasets trl
!pip install huggingface_hub
!pip install jsonlines pandas

In [None]:
# Mount Google Drive (optional, for saving models)
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Login to Hugging Face
from huggingface_hub import notebook_login
notebook_login()

## Step 1: Upload Training Data

Upload your training data files to Colab.

In [None]:
# Upload training data
from google.colab import files
uploaded = files.upload()

# The files should include:
# - therapy_training_data.jsonl (your training data)
# - intents.json (optional, for reference)
# - university_student_knowledge.json (optional, for reference)
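
The data preparation step reads `instruction`, `input`, and `output` from each record, so it can help to check the uploaded file before training. The following is a minimal sketch, assuming one JSON object per line; the helper name `find_bad_records` and the sample lines are hypothetical and not part of the original notebook.

```python
import json

REQUIRED_KEYS = ("instruction", "input", "output")  # fields read by format_instruction

def find_bad_records(lines):
    """Return (line_number, problem) pairs for records that may break formatting."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            problems.append((number, f"missing keys: {', '.join(missing)}"))
        elif not record["instruction"] or not record["output"]:
            problems.append((number, "empty instruction or output"))
    return problems

# Hypothetical sample lines, for illustration only
sample_lines = [
    '{"instruction": "Respond supportively.", "input": "", "output": "I hear you."}',
    '{"instruction": "Respond supportively.", "output": "I hear you."}',
    'not json',
]
print(find_bad_records(sample_lines))
```

In the notebook, the same helper could be applied to the uploaded file by passing an open file handle, for example `find_bad_records(open("therapy_training_data.jsonl"))`.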

In [None]:
# Alternative: Download from your backend
# !wget https://your-backend-url/training_data/therapy_training_data.jsonl
# !wget https://your-backend-url/training_data/intents.json
# !wget https://your-backend-url/training_data/university_student_knowledge.json

## Step 2: Data Preparation

Load and format the training data.

In [None]:
import json
import pandas as pd
from datasets import Dataset
from transformers import AutoTokenizer
import torch

def format_instruction(example):
    """Format data for Llama 2 instruction tuning"""
    if example["input"]:
        return f"<s>[INST] {example['instruction']}\n\n{example['input']} [/INST] {example['output']}</s>"
    else:
        return f"<s>[INST] {example['instruction']} [/INST] {example['output']}</s>"

# Load dataset
dataset = Dataset.from_json("therapy_training_data.jsonl")

# Format dataset
dataset = dataset.map(lambda x: {"text": format_instruction(x)})

print(f"Dataset loaded with {len(dataset)} examples")
print("Sample formatted example:")
print(dataset[0]['text'][:500] + "...")

## Step 3: Model Setup

Load the tokenizer, tokenize the formatted examples, and split them into training and evaluation sets.

In [None]:
# Choose model size
model_size = "7b"  # @param ["7b", "13b"]
model_name = f"meta-llama/Llama-2-{model_size}-chat-hf"

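# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Llama 2 ships without a pad token. Use <unk> rather than EOS: the language-modeling
# collator masks pad tokens in the labels, which would also hide the closing </s>.
tokenizer.pad_token = tokenizer.unk_token

# Tokenize dataset
def tokenize_function(examples):
    # The text already contains <s> and </s>, so skip automatic special tokens to
    # avoid a duplicated BOS. Padding is left to the data collator, which pads each
    # batch to its longest example instead of always to 2048 tokens.
    return tokenizer(examples["text"], truncation=True, max_length=2048, add_special_tokens=False)

# Drop every original column so only the tokenizer outputs reach the Trainer
tokenized_dataset = dataset.map(tokenize_function, batched=True, remove_columns=dataset.column_names)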

# Split dataset (fixed seed so the split is reproducible)
train_test_split = tokenized_dataset.train_test_split(test_size=0.1, seed=42)
train_dataset = train_test_split["train"]
eval_dataset = train_test_split["test"]

print(f"Training examples: {len(train_dataset)}")
print(f"Evaluation examples: {len(eval_dataset)}")

## Step 4: Load Model with Quantization

Load Llama 2 with 4-bit quantization so it fits in Colab's GPU memory, then attach LoRA adapters so that only a small set of added weights is trained.

In [None]:
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Check GPU
if torch.cuda.is_available():
    print(f"GPU available: {torch.cuda.get_device_name(0)}")
    print(f"GPU memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
else:
    print("No GPU available!")

# Quantization config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True
)

# Load model (Llama 2 is supported natively, so trust_remote_code is not needed)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    torch_dtype=torch.float16
)

# Prepare for training
model = prepare_model_for_kbit_training(model)

# LoRA config
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
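
As a rough cross-check on what `print_trainable_parameters()` reports, the LoRA parameter count can be estimated by hand. This is a sketch under stated assumptions: the four targeted projections are treated as square `hidden_size x hidden_size` matrices, and the hidden sizes and layer counts are taken from the published Llama 2 7B and 13B configurations. The helper name `lora_trainable_params` is hypothetical.

```python
def lora_trainable_params(hidden_size, num_layers, rank, modules_per_layer=4):
    """Count LoRA parameters when every targeted projection is hidden_size x hidden_size."""
    # Each adapted matrix gains A (rank x in_features) and B (out_features x rank).
    per_module = rank * (hidden_size + hidden_size)
    return per_module * modules_per_layer * num_layers

# Assumed dimensions: 7B has hidden size 4096 and 32 layers,
# 13B has hidden size 5120 and 40 layers.
print(lora_trainable_params(4096, 32, rank=16))  # 16777216
print(lora_trainable_params(5120, 40, rank=16))  # 26214400
```

With `r=16` this comes to roughly 16.8 million trainable parameters for the 7B model, a small fraction of the base model's weights.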

## Step 5: Training Configuration

Set up the training parameters.

In [None]:
from transformers import TrainingArguments, Trainer, DataCollatorForLanguageModeling

# Training arguments
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,  # Ignored while max_steps is set; max_steps takes precedence
    per_device_train_batch_size=2,  # Adjust based on GPU memory
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,
    warmup_steps=100,
    max_steps=500,  # Caps the run for the Colab free tier
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    save_steps=250,
    eval_strategy="steps",
    eval_steps=250,
    save_strategy="steps",
    load_best_model_at_end=True,
    metric_for_best_model="loss",
    greater_is_better=False,
    fp16=True,
    push_to_hub=True,
    hub_model_id=f"your-username/therapy-llama2-{model_size}-colab",  # Replace your-username with your Hub username
    hub_token=None  # Will use logged-in token
)

# Data collator
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False
)

# Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    data_collator=data_collator,
)
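
The batch size, gradient accumulation, and step limit above together determine how much data the run actually sees. The sketch below works through that arithmetic for a single GPU; the helper name `training_budget` and the figure of 1,800 training examples are hypothetical.

```python
def training_budget(num_examples, per_device_batch, grad_accum, max_steps, num_devices=1):
    """Summarize how many examples a step-limited run processes."""
    # One optimizer step consumes this many examples
    effective_batch = per_device_batch * grad_accum * num_devices
    examples_seen = effective_batch * max_steps
    return {
        "effective_batch": effective_batch,
        "examples_seen": examples_seen,
        "epochs_covered": examples_seen / num_examples,
    }

# 1,800 training examples is a hypothetical figure
budget = training_budget(num_examples=1800, per_device_batch=2, grad_accum=4, max_steps=500)
print(budget)
```

With the settings in this notebook, each optimizer step covers 8 examples and 500 steps cover 4,000 examples, so a much larger dataset would not be seen in full within the step limit.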

## Step 6: Start Training

Run the training process. Expect roughly 1-2 hours on Colab's free GPU; the actual time depends on the GPU assigned, the dataset size, and the sequence lengths.

In [None]:
# Start training
print("Starting training...")
trainer.train()

print("Training completed!")

# Save locally (for a PEFT model this writes the LoRA adapter weights, not the full base model)
trainer.save_model(f"./therapy-llama2-{model_size}")
tokenizer.save_pretrained(f"./therapy-llama2-{model_size}")

print(f"Model saved to ./therapy-llama2-{model_size}")
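
The evaluation loss logged during training is a mean cross-entropy, which is often easier to interpret as perplexity. A minimal sketch, assuming a metrics dictionary in the shape returned by `trainer.evaluate()`, which includes an `eval_loss` entry; the helper name and the sample value are hypothetical.

```python
import math

def perplexity_from_metrics(metrics, loss_key="eval_loss"):
    """Convert a mean cross-entropy loss (in nats) into perplexity."""
    return math.exp(metrics[loss_key])

# Hypothetical metrics dict, standing in for the output of trainer.evaluate()
example_metrics = {"eval_loss": 1.25}
print(f"Perplexity: {perplexity_from_metrics(example_metrics):.2f}")
```

In the notebook this could be called as `perplexity_from_metrics(trainer.evaluate())` after training finishes.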

## Step 7: Test the Model

Test your trained model with a sample conversation.

In [None]:
# Test the model
from transformers import pipeline

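# Reuse the fine-tuned model that is already on the GPU. The saved directory holds
# only the LoRA adapter, and reloading the base model in float16 next to the 4-bit
# training copy can exhaust the free-tier GPU's memory.
model.eval()
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)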

# Test prompt (no leading <s>: the pipeline's tokenizer adds the BOS token itself)
test_prompt = """[INST] You are Zensui AI, a compassionate therapy assistant designed to provide supportive, evidence-based therapeutic conversations. You specialize in cognitive behavioral therapy (CBT), trauma-informed care, and mindfulness-based interventions.

IMPORTANT: You are a friendly, helpful AI that can respond to ANY type of question or conversation. While you specialize in therapeutic support, you should:
- Respond warmly and naturally to casual greetings and general questions
- Be conversational and approachable in all interactions
- When users ask non-wellness questions, answer them helpfully while maintaining your caring personality
- Only apply therapeutic techniques when users are discussing emotional, mental health, or wellness topics
- For general questions, be informative and friendly without forcing therapeutic responses

THERAPEUTIC APPROACH:
- Use Cognitive Behavioral Therapy (CBT) techniques to help identify thought patterns
- Apply trauma-informed principles with sensitivity and care
- Incorporate mindfulness and grounding techniques when appropriate
- Use solution-focused brief therapy for practical problem-solving
- Apply dialectical behavior therapy (DBT) skills for emotional regulation

SAFETY PROTOCOLS:
- If someone appears to be in crisis, suicidal, or homicidal, respond with: "I'm very concerned about your safety and well-being. I'm immediately connecting you with a qualified therapist on our platform who can provide the urgent support you need. You can access our therapists through the 'My Therapist' section of the app, or I can help you book an emergency session right now."
- For domestic violence: Provide safety planning and direct to platform therapists
- For substance abuse: Offer harm reduction strategies and direct to platform therapists
- Never provide medical diagnosis or medication advice
- Always maintain professional boundaries

User: Hello, I'm feeling stressed about my studies. Can you help me? [/INST]"""

# Generate response
outputs = pipe(
    test_prompt,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id
)

response = outputs[0]['generated_text']
# Extract just the AI response (after [/INST])
ai_response = response.split("[/INST]")[1].strip()

print("AI Response:")
print(ai_response)
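
For repeated testing, the prompt layout and the reply extraction can be wrapped in small helpers. This is a sketch, not the notebook's own method: `build_prompt` and `extract_reply` are hypothetical names, the prompt mirrors the `[INST] ... [/INST]` layout used in `format_instruction`, and the leading `<s>` is omitted on the assumption that the tokenizer adds the BOS token.

```python
def build_prompt(instruction, user_input=""):
    """Build an inference prompt in the same [INST] layout used for training."""
    if user_input:
        return f"[INST] {instruction}\n\n{user_input} [/INST]"
    return f"[INST] {instruction} [/INST]"

def extract_reply(generated_text, prompt):
    """Return only the newly generated text that follows the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fall back to the text after the last [/INST] marker
    return generated_text.rsplit("[/INST]", 1)[-1].strip()

prompt = build_prompt("You are a supportive assistant.", "I feel stressed about exams.")
# Stand-in for model output, for illustration only
fake_output = prompt + " That sounds hard. What is weighing on you most?"
print(extract_reply(fake_output, prompt))
```

Stripping the prompt by prefix avoids returning the wrong segment if the model itself generates another `[/INST]` marker.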

## Step 8: Download and Deploy

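Download your trained model and update your backend. The archive contains the LoRA adapter and tokenizer files, so the backend also needs access to the base Llama 2 model.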

In [None]:
# Zip the model for download
!zip -r therapy_llama2_model.zip therapy-llama2-{model_size}/

# Download the model
from google.colab import files
files.download('therapy_llama2_model.zip')

print("\nModel downloaded! Now:")
print("1. Update your backend/.env with the Hugging Face model URL")
print("2. Test the integration with: node test_llama2_integration.js")
print("3. Your custom Llama 2 therapy AI is ready! 🎉")