# Product Owner LoRA Fine-tuning on Google Colab

**Goal**: Train a Product Owner model using Qwen2.5-7B + LoRA

**Requirements**: Google Colab Free with a T4 GPU (15GB VRAM)

**Estimated time**: 3-5 hours for 3 epochs with ~200 samples

**Setup**: You only need to clone the repo (the dataset is included)

---

## 1️⃣ Verify GPU and CUDA

In [None]:
# Check the available GPU
!nvidia-smi

# Check PyTorch and CUDA
import torch
print(f"\n{'='*60}")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
    print(f"\n✅ GPU is ready!")
else:
    print(f"\n❌ No GPU detected!")
    print(f"   Runtime → Change runtime type → Hardware accelerator: T4 GPU")
print(f"{'='*60}")

## 2️⃣ Install Dependencies

In [None]:
# Install all required libraries (compatible with CUDA 12.x)
print("📦 Installing dependencies...\n")

# Do NOT reinstall PyTorch: Colab already ships PyTorch built for CUDA 12.x
# Quote each requirement so the shell does not treat ">=" as an output redirect
!pip install -q "transformers>=4.40.0" "peft>=0.10.0" "bitsandbytes>=0.43.0" "accelerate>=0.28.0" "datasets>=2.18.0" typer

# Verify installation
print("\n🔍 Verifying installation...\n")
import torch
import transformers
import peft
import bitsandbytes
import datasets
import typer

print(f"✅ torch: {torch.__version__}")
print(f"✅ CUDA available: {torch.cuda.is_available()}")
print(f"✅ CUDA version: {torch.version.cuda}")
print(f"✅ transformers: {transformers.__version__}")
print(f"✅ peft: {peft.__version__}")
print(f"✅ bitsandbytes: {bitsandbytes.__version__}")
print(f"✅ datasets: {datasets.__version__}")
print(f"\n✅ All packages installed successfully!")
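
The `>=` pins in the install command can also be checked programmatically. The following is a minimal sketch, assuming plain dotted version strings (local or pre-release suffixes such as `+cu121` are ignored); `MINIMUMS`, `meets_minimum`, and `check_installed` are illustrative names, not part of the repo.

```python
import re
from importlib import metadata

# Minimum versions copied from the pip command above.
MINIMUMS = {
    "transformers": "4.40.0",
    "peft": "0.10.0",
    "bitsandbytes": "0.43.0",
    "accelerate": "0.28.0",
    "datasets": "2.18.0",
}


def version_tuple(version: str) -> tuple:
    """Turn '2.3.0+cu121' into (2, 3, 0); anything after the numeric prefix is dropped."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unparseable version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare numeric components, padding the shorter version with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_installed(minimums: dict = MINIMUMS) -> dict:
    """Return {package: True/False/None}; None means the package is not installed."""
    report = {}
    for name, minimum in minimums.items():
        try:
            report[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report
```

Calling `check_installed()` after the install cell returns one entry per package; any `False` or `None` value points at the package to reinstall.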

## 2.5️⃣ Configure environment (W&B / CUDA memory)

Set these global variables before training to avoid Weights & Biases prompts and reduce VRAM fragmentation.

In [None]:
import os

os.environ["WANDB_DISABLED"] = "true"
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

print(f"WANDB_DISABLED = {os.environ['WANDB_DISABLED']}")
print(f"PYTORCH_CUDA_ALLOC_CONF = {os.environ['PYTORCH_CUDA_ALLOC_CONF']}")

## 3️⃣ Clone the Repository from GitHub

In [None]:
# Clone the repository from the correct branch (includes the dataset)
print("📥 Cloning repository from GitHub...\n")

!git clone -b dspy-multi-role https://github.com/krukmat/agnostic-ai-pipeline.git
%cd agnostic-ai-pipeline

# Verify branch and dataset
print("\n🔍 Verifying branch and dataset...\n")
!git branch --show-current
!ls -lh artifacts/distillation/po_teacher_supervised.jsonl

print("\n✅ Repository cloned successfully on branch dspy-multi-role!")

## 3.5️⃣ Mount Google Drive (IMPORTANT - prevents data loss)

**Why**: If Colab disconnects, everything stored locally under `/content/` is deleted. Saving to Drive lets the model survive.

In [None]:
# Mount Google Drive so the model is saved automatically
from google.colab import drive
drive.mount('/content/drive')

# Create a directory for trained models
!mkdir -p /content/drive/MyDrive/colab_models

print("✅ Google Drive mounted successfully!")
print("📁 Backup location: /content/drive/MyDrive/colab_models/")
print("\n⚠️ IMPORTANT: The model will be saved DIRECTLY to Drive")
print("   If Colab disconnects, your checkpoints will be safe in Drive")
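
After a disconnect, it helps to locate the most recent checkpoint in Drive. The following is a minimal sketch, assuming the training script writes Hugging Face Trainer-style `checkpoint-<step>` subdirectories (not confirmed by this notebook); `latest_checkpoint` is an illustrative helper, not part of the repo.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming scheme: checkpoint-<global step>, e.g. checkpoint-25.
CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(output_dir: str) -> Optional[Path]:
    """Return the checkpoint directory with the highest step number, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    candidates = []
    for child in root.iterdir():
        match = CHECKPOINT_RE.match(child.name)
        if child.is_dir() and match:
            # Sort numerically so checkpoint-100 ranks above checkpoint-25.
            candidates.append((int(match.group(1)), child))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```

For example, `latest_checkpoint("/content/drive/MyDrive/colab_models/po_student_v1")` returns `None` until the first checkpoint has been written.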

## 4️⃣ Verify Dataset and Script

In [None]:
import json
import os
from pathlib import Path

print("🔍 Checking required files...\n")

# Paths
script_path = Path("/content/agnostic-ai-pipeline/scripts/train_po_lora.py")
dataset_path = Path("/content/agnostic-ai-pipeline/artifacts/distillation/po_teacher_supervised.jsonl")

# Check the script
if script_path.exists():
    print(f"✅ Script found: {script_path}")
else:
    print(f"❌ Script NOT found: {script_path}")

# Check the dataset
if dataset_path.exists():
    print(f"✅ Dataset found: {dataset_path}")

    # Count records (non-empty lines)
    with open(dataset_path) as f:
        samples = [line for line in f if line.strip()]
        print(f"✅ Dataset has {len(samples)} records")

    # Inspect the first sample
    with open(dataset_path) as f:
        first = json.loads(f.readline())
        print(f"\n📝 Keys: {list(first.keys())}")
        print(f"📏 Prompt length: {len(first['prompt'])} chars")
        print(f"📏 Response length: {len(first['response'])} chars")
        print(f"\n🔍 Preview prompt:")
        print(first['prompt'][:300] + "...")
        print(f"\n✅ Dataset format is correct!")
else:
    print(f"❌ Dataset NOT found: {dataset_path}")
    print(f"\n🔍 Searching for the dataset in the repo...")
    !find /content/agnostic-ai-pipeline -name "po_teacher_supervised.jsonl" 2>/dev/null
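
The cell above inspects only the first record. To check every line, something like the sketch below could be used. It assumes each record must be a JSON object with non-empty string `prompt` and `response` fields (the two keys the cell reads); `find_bad_records` is an illustrative helper, not part of the repo.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("prompt", "response")


def find_bad_records(path, required_keys=REQUIRED_KEYS):
    """Return (line_number, reason) for each non-empty line that fails the check."""
    problems = []
    with Path(path).open(encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # blank lines are skipped, as in the counting step above
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((line_number, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((line_number, "not a JSON object"))
                continue
            missing = [
                key for key in required_keys
                if not isinstance(record.get(key), str) or not record[key].strip()
            ]
            if missing:
                problems.append((line_number, f"missing or empty: {', '.join(missing)}"))
    return problems
```

An empty list from `find_bad_records(dataset_path)` means every record passed; otherwise the line numbers point at the rows to fix before training.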

In [None]:
# Run training with AUTOMATIC BACKUP to Google Drive
# ⚠️ IMPORTANT: This command saves the model directly to Drive
# If Colab disconnects, your checkpoints will be safe

!python scripts/train_po_lora.py \
    --data-path /content/agnostic-ai-pipeline/artifacts/distillation/po_teacher_supervised.jsonl \
    --base-model Qwen/Qwen2.5-7B-Instruct \
    --output-dir /content/drive/MyDrive/colab_models/po_student_v1 \
    --rank 32 \
    --alpha 64 \
    --dropout 0.05 \
    --epochs 3 \
    --batch-size 1 \
    --gradient-accumulation-steps 8 \
    --lr 1e-4 \
    --max-length 2048 \
    --load-4bit \
    --bnb-compute-dtype float16 \
    --gradient-checkpointing

# Also copy to /content in case you want to use the model right away
print("\n📋 Copying model to /content for quick access...")
# Create the destination first; cp fails silently here if it does not exist
!mkdir -p /content/agnostic-ai-pipeline/artifacts/models
!cp -r /content/drive/MyDrive/colab_models/po_student_v1 /content/agnostic-ai-pipeline/artifacts/models/ 2>/dev/null || true
print("✅ Training finished! Model saved to Drive and /content")

In [None]:
# Run training (this variant writes only to /content, which is lost if Colab disconnects)
!python scripts/train_po_lora.py \
    --data-path /content/agnostic-ai-pipeline/artifacts/distillation/po_teacher_supervised.jsonl \
    --base-model Qwen/Qwen2.5-7B-Instruct \
    --output-dir /content/agnostic-ai-pipeline/artifacts/models/po_student_v1 \
    --rank 32 \
    --alpha 64 \
    --dropout 0.05 \
    --epochs 3 \
    --batch-size 1 \
    --gradient-accumulation-steps 8 \
    --lr 1e-4 \
    --max-length 2048 \
    --load-4bit \
    --bnb-compute-dtype float16 \
    --gradient-checkpointing

In [None]:
# ALTERNATIVE command with more conservative parameters
# Uncomment and run if the previous command failed with OOM

# !python scripts/train_po_lora.py \
#     --data-path /content/agnostic-ai-pipeline/artifacts/distillation/po_teacher_supervised.jsonl \
#     --base-model Qwen/Qwen2.5-7B-Instruct \
#     --output-dir /content/drive/MyDrive/colab_models/po_student_v1 \
#     --rank 16 \
#     --alpha 32 \
#     --dropout 0.05 \
#     --epochs 3 \
#     --batch-size 1 \
#     --gradient-accumulation-steps 16 \
#     --lr 1e-4 \
#     --max-length 1536 \
#     --load-4bit \
#     --bnb-compute-dtype float16 \
#     --gradient-checkpointing

# # Also copy to /content
# !cp -r /content/drive/MyDrive/colab_models/po_student_v1 /content/agnostic-ai-pipeline/artifacts/models/ 2>/dev/null || true

In [None]:
# ALTERNATIVE command with more conservative parameters
# Uncomment and run if the previous command failed with OOM

# !python scripts/train_po_lora.py \
#     --data-path /content/agnostic-ai-pipeline/artifacts/distillation/po_teacher_supervised.jsonl \
#     --base-model Qwen/Qwen2.5-7B-Instruct \
#     --output-dir /content/agnostic-ai-pipeline/artifacts/models/po_student_v1 \
#     --rank 16 \
#     --alpha 32 \
#     --dropout 0.05 \
#     --epochs 3 \
#     --batch-size 1 \
#     --gradient-accumulation-steps 16 \
#     --lr 1e-4 \
#     --max-length 1536 \
#     --load-4bit \
#     --bnb-compute-dtype float16 \
#     --gradient-checkpointing
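
The default and conservative commands keep the same alpha-to-rank ratio and differ in rank, sequence length, and effective batch size. The arithmetic can be sketched as follows, assuming the script applies standard LoRA scaling (`alpha / rank`) and that the effective batch size on a single GPU is `batch_size * gradient_accumulation_steps`; `RunConfig` is an illustrative container, not a class from the repo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    rank: int
    alpha: int
    batch_size: int
    grad_accum_steps: int
    max_length: int

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA multiplies the adapter update by alpha / rank.
        return self.alpha / self.rank

    @property
    def effective_batch_size(self) -> int:
        # Samples contributing to each optimizer step on one GPU.
        return self.batch_size * self.grad_accum_steps


# Values copied from the two commands above.
default = RunConfig(rank=32, alpha=64, batch_size=1, grad_accum_steps=8, max_length=2048)
conservative = RunConfig(rank=16, alpha=32, batch_size=1, grad_accum_steps=16, max_length=1536)

for name, cfg in (("default", default), ("conservative", conservative)):
    print(f"{name}: scaling={cfg.lora_scaling}, "
          f"effective batch={cfg.effective_batch_size}, max_length={cfg.max_length}")
```

Under these assumptions the conservative variant halves the adapter rank and shortens sequences to save memory, while the doubled accumulation means each optimizer step covers 16 samples instead of 8.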

## 6️⃣ Monitor the GPU (run in another cell while training)

In [None]:
# Watch GPU usage in real time (refreshes every 5 seconds)
!nvidia-smi --query-gpu=utilization.gpu,utilization.memory,memory.used,memory.total --format=csv -l 5

In [None]:
# List the checkpoints saved after training
print("📁 Saved checkpoints:\n")

# Primary location in Drive (persistent)
print("🔐 DRIVE (primary location - survives disconnects):")
!ls -lh /content/drive/MyDrive/colab_models/po_student_v1/ 2>/dev/null || echo "   ⚠️ Not found in Drive"

# Temporary location in /content (only if it was copied)
print("\n💾 /content (temporary location):")
!ls -lh /content/agnostic-ai-pipeline/artifacts/models/po_student_v1/ 2>/dev/null || echo "   ⚠️ Not found in /content"

print("\n✅ If the Drive listing above succeeded, your model is stored persistently in Drive")

In [None]:
# List the checkpoints saved after training
print("📁 Saved checkpoints:\n")
!ls -lh /content/agnostic-ai-pipeline/artifacts/models/po_student_v1/

## 8️⃣ Quick Inference Test

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

print("🔄 Loading model...\n")

base_model = "Qwen/Qwen2.5-7B-Instruct"
adapter_path = "/content/agnostic-ai-pipeline/artifacts/models/po_student_v1"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)

# Load the LoRA adapter
model = PeftModel.from_pretrained(model, adapter_path)

print("✅ Model loaded!\n")

# Test prompt
test_prompt = """[INSTRUCTIONS]
You are a Product Owner AI agent validating business requirements.

[REQUIREMENTS]
business_domain: Blog Platform
primary_features:
  - Users can create blog posts with title and content
  - Posts must support markdown formatting
  - User authentication is required

[YOUR RESPONSE]"""

print("💭 Test prompt:")
print(test_prompt)
print("\n" + "="*60 + "\n")

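# Generate
inputs = tokenizer(test_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens. Slicing the decoded string by
# len(test_prompt) can misalign if the tokenizer does not round-trip the prompt exactly.
prompt_len = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print("🤖 Model response:")
print(response)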

## 9️⃣ Evaluate baseline vs student (`scripts/eval_po_student.py`)

These cells run the new evaluator with the same supervised prompt, evaluate 20 samples per run (`--max-samples 20`), and save the results to Google Drive (`/content/drive/MyDrive/colab_models/inference_results`).

In [None]:
import os
import subprocess

repo_dir = "/content/agnostic-ai-pipeline"
output_dir = "/content/drive/MyDrive/colab_models/inference_results"
os.makedirs(output_dir, exist_ok=True)

env = os.environ.copy()
env["PYTHONPATH"] = repo_dir

cmd = [
    "python",
    "scripts/eval_po_student.py",
    "--dataset-path", "artifacts/synthetic/product_owner/product_owner_val.jsonl",
    "--output-dir", output_dir,
    "--tag", "baseline",
    "--base-model", "Qwen/Qwen2.5-7B-Instruct",
    "--max-samples", "20",
    "--temperature", "0.2",
    "--top-p", "0.9",
    "--max-new-tokens", "900",
    "--retries", "1",
    "--load-4bit",
    "--bnb-compute-dtype", "float16"
]

print("\n🚀 Running baseline evaluation...\n")
subprocess.run(cmd, cwd=repo_dir, check=True, env=env)

In [None]:
import os
import subprocess

repo_dir = "/content/agnostic-ai-pipeline"
output_dir = "/content/drive/MyDrive/colab_models/inference_results"
os.makedirs(output_dir, exist_ok=True)

env = os.environ.copy()
env["PYTHONPATH"] = repo_dir

cmd = [
    "python",
    "scripts/eval_po_student.py",
    "--dataset-path", "artifacts/synthetic/product_owner/product_owner_val.jsonl",
    "--output-dir", output_dir,
    "--tag", "student",
    "--base-model", "Qwen/Qwen2.5-7B-Instruct",
    "--adapter-path", "artifacts/models/po_student_v1",
    "--max-samples", "20",
    "--temperature", "0.2",
    "--top-p", "0.9",
    "--max-new-tokens", "900",
    "--retries", "1",
    "--load-4bit",
    "--bnb-compute-dtype", "float16"
]

print("\n🚀 Running student evaluation...\n")
subprocess.run(cmd, cwd=repo_dir, check=True, env=env)

In [None]:
import json
from pathlib import Path

results_dir = Path("/content/drive/MyDrive/colab_models/inference_results")
print(f"📁 Results saved in: {results_dir}")
for path in sorted(results_dir.glob("*.json")):
    data = json.loads(path.read_text())
    metrics = data.get("metrics", {})
    mean = metrics.get("mean")
    status = data.get("valid_samples", 0)
    print(f"- {path.name}: valid={status} mean={mean}")
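
To compare the two runs directly, the same fields can be grouped by tag. The following is a sketch, assuming each result file stores `metrics.mean` as a number (as the cell above reads it) and that the `--tag` value appears in the file name, which this notebook does not confirm; `mean_by_tag` and `student_delta` are illustrative helpers.

```python
import json
from pathlib import Path


def mean_by_tag(results_dir, tags=("baseline", "student")) -> dict:
    """Map each tag to metrics.mean of the last matching file (sorted by name)."""
    means = {}
    for path in sorted(Path(results_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        mean = data.get("metrics", {}).get("mean")
        if not isinstance(mean, (int, float)):
            continue  # skip files without a numeric mean
        for tag in tags:
            # Assumption: the tag passed via --tag is part of the file name.
            if tag in path.name:
                means[tag] = float(mean)
    return means


def student_delta(means: dict):
    """Student mean minus baseline mean, or None if either run is missing."""
    if "baseline" in means and "student" in means:
        return means["student"] - means["baseline"]
    return None
```

A positive `student_delta(mean_by_tag(results_dir))` would indicate that the fine-tuned adapter scored higher than the base model on whatever metric the evaluator reports.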

## 🔽 Download inference results (optional)

Zip them from Drive to upload to the repo or share them.

In [None]:
import os
import subprocess
from google.colab import files

results_root = "/content/drive/MyDrive/colab_models"
results_dir = os.path.join(results_root, "inference_results")
zip_path = os.path.join(results_root, "inference_results.zip")

if not os.path.isdir(results_dir):
    raise SystemExit(f"⚠️ {results_dir} not found. Run the evaluation cells first.")

print("📦 Packaging inference_results in Drive...")
subprocess.run(["zip", "-r", "inference_results.zip", "inference_results"], cwd=results_root, check=True)

print("⬇️ Downloading zip...")
files.download(zip_path)

In [None]:
from google.colab import files

print("📦 Compressing model from Drive...\n")

# Compress from Drive (persistent location)
!cd /content/drive/MyDrive/colab_models && zip -r po_student_v1.zip po_student_v1

print("\n⬇️ Downloading...")
files.download('/content/drive/MyDrive/colab_models/po_student_v1.zip')

print("\n✅ Download complete!")
print("💡 TIP: You can access the model directly from Drive at any time")

## 🔁 Push results to the repo (optional)

If you want to keep the JSON/zip files directly in your branch, use these cells. They require a PAT or valid GitHub credentials.

In [None]:
# Configure your Git identity (run once per session)
!git config --global user.name "Your Name"
!git config --global user.email "your.email@example.com"

In [None]:
import os
from pathlib import Path

repo_dir = Path("/content/agnostic-ai-pipeline")
results_dir = Path("/content/drive/MyDrive/colab_models/inference_results")
repo_results = repo_dir / "inference_results"
repo_results.mkdir(exist_ok=True)

# Copy every JSON generated in Drive into the repo so it can be versioned
for src in results_dir.glob("*.json"):
    dst = repo_results / src.name
    print(f"📄 Copying {src.name} -> {dst}")
    dst.write_text(src.read_text())

# Also copy the zip if it exists
zip_src = results_dir.parent / "inference_results.zip"
if zip_src.exists():
    dst = repo_results / zip_src.name
    print(f"📦 Copying {zip_src.name} -> {dst}")
    dst.write_bytes(zip_src.read_bytes())

print("✅ Files are ready in the repo. Now commit/push with the next cell.")

In [None]:
%%bash
set -e
cd /content/agnostic-ai-pipeline

# Make sure you are on the correct branch
BRANCH=$(git branch --show-current)
echo "📌 Current branch: $BRANCH"

git status -sb

git add inference_results

git commit -m "chore(po): add latest evaluation artifacts" || echo "⚠️ Nothing to commit"

git push origin "$BRANCH"

In [None]:
from google.colab import files

print("📦 Compressing model...\n")
!zip -r po_student_v1.zip /content/agnostic-ai-pipeline/artifacts/models/po_student_v1

print("\n⬇️ Downloading...")
files.download('po_student_v1.zip')

print("\n✅ Download complete!")

---

## 🔧 Troubleshooting

### Error: CUDA out of memory
**Fix**: Run the alternative cell (Cell 5.2, the commented-out command with more conservative parameters)

### Error: No module named 'bitsandbytes'
**Fix**: Reinstall bitsandbytes, keeping the same minimum version as the install cell
```python
!pip uninstall -y bitsandbytes
!pip install "bitsandbytes>=0.43.0" --no-cache-dir
```

### Error: No GPU detected
**Fix**:
1. Runtime → Change runtime type
2. Hardware accelerator: **T4 GPU**
3. Save and reconnect

### Training very slow (well below the ~0.03-0.05 it/s expected on a T4)
**Fix**: Use the alternative cell with `--max-length 1536` or `--gradient-accumulation-steps 4`

---

## 📊 Expected Metrics

```
Epoch 1/3: 100%|██████████| 25/25 [12:34<00:00,  0.05it/s]
{'loss': 1.2345, 'learning_rate': 9.5e-05, 'epoch': 1.0}

Epoch 2/3: 100%|██████████| 25/25 [12:31<00:00,  0.05it/s]
{'loss': 0.8912, 'learning_rate': 5.0e-05, 'epoch': 2.0}

Epoch 3/3: 100%|██████████| 25/25 [12:29<00:00,  0.05it/s]
{'loss': 0.6234, 'learning_rate': 5.0e-06, 'epoch': 3.0}
```

**Success indicators**:
- ✅ Decreasing loss (1.2 → 0.6)
- ✅ No OOM crashes
- ✅ ~0.03-0.05 it/s on a T4
- ✅ Checkpoints saved every epoch
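
The `25/25` in the sample log is consistent with the training flags. A worked check, assuming exactly 200 samples (the notebook says ~200), a single GPU, and that each progress-bar step is one optimizer step; trainers differ in how they round a partial final accumulation window, which does not matter when the division is exact:

```python
import math


def optimizer_steps_per_epoch(num_samples: int, batch_size: int, grad_accum_steps: int) -> int:
    """Optimizer updates per epoch, rounding partial batches and windows up."""
    micro_batches = math.ceil(num_samples / batch_size)
    return math.ceil(micro_batches / grad_accum_steps)


# Flags from the default training command: --batch-size 1, --gradient-accumulation-steps 8
steps = optimizer_steps_per_epoch(num_samples=200, batch_size=1, grad_accum_steps=8)
print(steps)      # 25 steps per epoch
print(steps * 3)  # 75 steps over 3 epochs
```

With the conservative command (`--gradient-accumulation-steps 16`) the same function gives 13 steps per epoch under the round-up assumption.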

---

## 📚 References

- **Setup Guide**: `docs/COLAB_TRAINING_SETUP.md`
- **Training Script**: `scripts/train_po_lora.py`
- **HuggingFace Qwen2.5**: https://huggingface.co/Qwen/Qwen2.5-7B-Instruct
- **PEFT Docs**: https://huggingface.co/docs/peft/

---

**Last updated**: 2025-11-13