# Product Owner LoRA Fine-tuning on Google Colab

**Goal**: Fine-tune a Product Owner model from Qwen2.5-7B-Instruct with LoRA

**Requirements**: Google Colab Free with a T4 GPU (15 GB VRAM)

**Estimated time**: 3-5 hours for 3 epochs with ~200 samples

**Setup**: You only need to clone the repo (the dataset is included)

---

## 1️⃣ Verify GPU and CUDA

In [None]:
# Check the available GPU
!nvidia-smi

# Check PyTorch and CUDA
import torch
print(f"\n{'='*60}")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version: {torch.version.cuda}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
    print(f"\n✅ GPU is ready!")
else:
    print(f"\n❌ No GPU detected!")
    print(f"   Runtime → Change runtime type → Hardware accelerator: T4 GPU")
print(f"{'='*60}")

## 2️⃣ Install Dependencies

In [None]:
# Install all required libraries
print("📦 Installing dependencies...\n")

!pip install -q torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# Qwen2.5 uses the `qwen2` architecture, which transformers only supports from 4.37.0 onward
!pip install -q transformers==4.37.2 peft==0.7.0 bitsandbytes==0.41.0 accelerate==0.25.0 datasets==2.16.0 typer

# Verify the installation
print("\n🔍 Verifying installation...\n")
import transformers
import peft
import bitsandbytes
import datasets
import typer

print(f"✅ transformers: {transformers.__version__}")
print(f"✅ peft: {peft.__version__}")
print(f"✅ bitsandbytes: {bitsandbytes.__version__}")
print(f"✅ datasets: {datasets.__version__}")
print(f"\n✅ All packages installed successfully!")

## 3️⃣ Clone the Repository from GitHub

In [None]:
# Clone the repository from the correct branch (includes the dataset)
print("📥 Cloning repository from GitHub...\n")

!git clone -b dspy-multi-role https://github.com/krukmat/agnostic-ai-pipeline.git
%cd agnostic-ai-pipeline

# Verify branch and dataset
print("\n🔍 Verifying branch and dataset...\n")
!git branch --show-current
!ls -lh artifacts/distillation/po_teacher_supervised.jsonl

print("\n✅ Repository cloned successfully on branch dspy-multi-role!")

## 4️⃣ Verify Dataset and Script

In [None]:
import json
import os
from pathlib import Path

print("🔍 Checking required files...\n")

# Paths
script_path = Path("/content/agnostic-ai-pipeline/scripts/train_po_lora.py")
dataset_path = Path("/content/agnostic-ai-pipeline/artifacts/distillation/po_teacher_supervised.jsonl")

# Check the training script
if script_path.exists():
    print(f"✅ Script found: {script_path}")
else:
    print(f"❌ Script NOT found: {script_path}")

# Check the dataset
if dataset_path.exists():
    print(f"✅ Dataset found: {dataset_path}")
    
    # Count records (non-empty lines)
    with open(dataset_path) as f:
        samples = [line for line in f if line.strip()]
        print(f"✅ Dataset has {len(samples)} records")
    
    # Inspect the first sample
    with open(dataset_path) as f:
        first = json.loads(f.readline())
        print(f"\n📝 Keys: {list(first.keys())}")
        print(f"📏 Prompt length: {len(first['prompt'])} chars")
        print(f"📏 Response length: {len(first['response'])} chars")
        print(f"\n🔍 Preview prompt:")
        print(first['prompt'][:300] + "...")
        print(f"\n✅ Dataset format is correct!")
else:
    print(f"❌ Dataset NOT found: {dataset_path}")
    print(f"\n🔍 Searching for the dataset in the repo...")
    !find /content/agnostic-ai-pipeline -name "po_teacher_supervised.jsonl" 2>/dev/null
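
The check above only inspects the first record. A minimal sketch of a stricter pass over the whole file follows, assuming every record should carry non-empty `prompt` and `response` strings (the two keys the preview reads). `validate_jsonl` is a hypothetical helper for illustration, not part of the repo.

```python
import json

REQUIRED_KEYS = ("prompt", "response")  # keys read by the preview cell


def validate_jsonl(lines, required_keys=REQUIRED_KEYS):
    """Return (valid_count, problems) for an iterable of JSONL lines.

    `problems` is a list of (line_number, message) tuples; blank lines are skipped.
    """
    valid = 0
    problems = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((lineno, "record is not a JSON object"))
            continue
        # A key counts as missing if it is absent, not a string, or only whitespace
        missing = [
            key for key in required_keys
            if not isinstance(record.get(key), str) or not record[key].strip()
        ]
        if missing:
            problems.append((lineno, f"missing or empty: {', '.join(missing)}"))
        else:
            valid += 1
    return valid, problems
```

To run it on the real file, pass the open handle: `with open(dataset_path) as f: valid, problems = validate_jsonl(f)`, then print `problems` if it is not empty.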

## 5️⃣ Run Training

**Parameters tuned for the T4 (15 GB VRAM)**:
- 4-bit quantization to reduce memory usage
- Gradient checkpointing enabled
- Batch size 1 × gradient accumulation 8 = effective batch size of 8
- Max length 2048 (lower it to 1536 if you hit OOM)

**Estimated time**: 3-5 hours
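
The effective batch size and the number of optimizer steps follow from simple arithmetic. A small sketch, assuming exactly 200 samples (the guide only says ~200) and that a partial final batch still counts as one step; the trainer's exact rounding may differ. `training_steps` is a hypothetical helper.

```python
import math


def training_steps(num_samples, batch_size, grad_accum_steps, epochs):
    """Return (effective_batch, steps_per_epoch, total_steps)."""
    # Gradients from `grad_accum_steps` micro-batches are summed before each update
    effective_batch = batch_size * grad_accum_steps
    # Assumption: a trailing partial batch is rounded up to a full step
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    return effective_batch, steps_per_epoch, steps_per_epoch * epochs


# Main configuration: batch size 1, accumulation 8, 3 epochs
print(training_steps(200, 1, 8, 3))   # (8, 25, 75)
# Conservative configuration: accumulation 16
print(training_steps(200, 1, 16, 3))  # (16, 13, 39)
```

Doubling the accumulation steps halves the number of optimizer updates per epoch, so the learning rate schedule sees fewer steps even though the same samples are processed.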

In [None]:
# Run training
!python scripts/train_po_lora.py \
    --data-path /content/agnostic-ai-pipeline/artifacts/distillation/po_teacher_supervised.jsonl \
    --base-model Qwen/Qwen2.5-7B-Instruct \
    --output-dir /content/agnostic-ai-pipeline/artifacts/models/po_student_v1 \
    --rank 32 \
    --alpha 64 \
    --dropout 0.05 \
    --epochs 3 \
    --batch-size 1 \
    --gradient-accumulation-steps 8 \
    --lr 1e-4 \
    --max-length 2048 \
    --load-4bit \
    --bnb-compute-dtype float16 \
    --gradient-checkpointing

### 🚨 If training fails with "CUDA out of memory", run this alternative cell:

In [None]:
# ALTERNATIVE command with more conservative parameters
# Uncomment and run if the previous command failed with OOM

# !python scripts/train_po_lora.py \
#     --data-path /content/agnostic-ai-pipeline/artifacts/distillation/po_teacher_supervised.jsonl \
#     --base-model Qwen/Qwen2.5-7B-Instruct \
#     --output-dir /content/agnostic-ai-pipeline/artifacts/models/po_student_v1 \
#     --rank 16 \
#     --alpha 32 \
#     --dropout 0.05 \
#     --epochs 3 \
#     --batch-size 1 \
#     --gradient-accumulation-steps 16 \
#     --lr 1e-4 \
#     --max-length 1536 \
#     --load-4bit \
#     --bnb-compute-dtype float16 \
#     --gradient-checkpointing
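
Both commands keep the same `alpha / rank` ratio, so halving the rank shrinks the adapter without changing how strongly its update is scaled. A rough sketch of that arithmetic follows, assuming the script's `--rank` and `--alpha` flags map to the usual LoRA `r` and `alpha` (update scaled by `alpha / r`, with `r * (d_in + d_out)` added parameters per adapted matrix). The 3584×3584 projection is a hypothetical example, with 3584 assumed as the model's hidden size; which modules the script actually adapts is not stated here.

```python
def lora_stats(d_in, d_out, rank, alpha):
    """Standard LoRA bookkeeping for one adapted weight matrix of shape (d_out, d_in)."""
    added_params = rank * (d_in + d_out)  # A is (rank x d_in), B is (d_out x rank)
    full_params = d_in * d_out            # size of the frozen base matrix
    return {
        "scaling": alpha / rank,
        "added_params": added_params,
        "fraction_of_full": added_params / full_params,
    }


HIDDEN = 3584  # assumption: hidden size of the base model

# (rank, alpha) pairs from the main and the conservative commands
for rank, alpha in [(32, 64), (16, 32)]:
    print(rank, alpha, lora_stats(HIDDEN, HIDDEN, rank, alpha))
```

Under these assumptions the conservative command trains half as many adapter parameters per matrix while keeping the scaling factor at 2.0.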

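## 6️⃣ Monitor the GPU During Training

In [None]:
# Show GPU usage in real time (refreshes every 5 seconds)
# Note: Colab runs one cell at a time, so this cell stays queued while the training cell is busy.
# The `-l 5` loop never exits on its own; stop it with the interrupt button.
!nvidia-smi --query-gpu=utilization.gpu,utilization.memory,memory.used,memory.total --format=csv -l 5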
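
Because Colab executes one cell at a time, one way to capture GPU usage while training runs is to start the same `nvidia-smi` query as a background process before the training cell and write its output to a file. A minimal sketch, where `start_gpu_logger`, `stop_gpu_logger`, and the `gpu_usage.csv` filename are hypothetical names chosen for illustration:

```python
import subprocess

# Same query as the monitoring cell, refreshed every 5 seconds
NVIDIA_SMI_CMD = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,utilization.memory,memory.used,memory.total",
    "--format=csv",
    "-l", "5",
]


def start_gpu_logger(log_path="gpu_usage.csv", cmd=NVIDIA_SMI_CMD):
    """Launch `cmd` in the background and stream its output to `log_path`."""
    log_file = open(log_path, "w")
    proc = subprocess.Popen(cmd, stdout=log_file, stderr=subprocess.STDOUT)
    return proc, log_file


def stop_gpu_logger(proc, log_file):
    """Stop the background process (if still running) and close the log file."""
    proc.terminate()
    proc.wait()
    log_file.close()
```

Call `proc, log_file = start_gpu_logger()` in a cell before training, `stop_gpu_logger(proc, log_file)` in a cell after it, and then inspect the file with `!tail gpu_usage.csv`.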

## 7️⃣ Verify Saved Checkpoints

In [None]:
# List the checkpoints saved after training
print("📁 Saved checkpoints:\n")
!ls -lh /content/agnostic-ai-pipeline/artifacts/models/po_student_v1/
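
Beyond listing the directory, it can help to confirm that the adapter files PEFT normally writes are present. A small sketch, assuming the training script saves the adapter with PEFT's `save_pretrained` directly into the output directory (an assumption about the script, not something this notebook confirms); `check_adapter_dir` is a hypothetical helper.

```python
from pathlib import Path

# PEFT writes the adapter weights as safetensors by default, or as .bin otherwise
WEIGHT_FILES = ("adapter_model.safetensors", "adapter_model.bin")


def check_adapter_dir(adapter_dir):
    """Report whether the usual PEFT adapter files exist in `adapter_dir`."""
    adapter_dir = Path(adapter_dir)
    return {
        "adapter_config": (adapter_dir / "adapter_config.json").is_file(),
        "adapter_weights": any((adapter_dir / name).is_file() for name in WEIGHT_FILES),
    }
```

For example, `check_adapter_dir("/content/agnostic-ai-pipeline/artifacts/models/po_student_v1")` should report both entries as `True` before the inference test is attempted.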

## 8️⃣ Quick Inference Test

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

print("🔄 Loading model...\n")

base_model = "Qwen/Qwen2.5-7B-Instruct"
adapter_path = "/content/agnostic-ai-pipeline/artifacts/models/po_student_v1"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)

# Load the LoRA adapter
model = PeftModel.from_pretrained(model, adapter_path)

print("✅ Model loaded!\n")

# Test prompt
test_prompt = """[INSTRUCTIONS]
You are a Product Owner AI agent validating business requirements.

[REQUIREMENTS]
business_domain: Blog Platform
primary_features:
  - Users can create blog posts with title and content
  - Posts must support markdown formatting
  - User authentication is required

[YOUR RESPONSE]"""

print("💭 Test prompt:")
print(test_prompt)
print("\n" + "="*60 + "\n")

# Generate
inputs = tokenizer(test_prompt, return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens; slicing the decoded text by
# len(test_prompt) breaks if decoding does not reproduce the prompt exactly
prompt_len = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print("🤖 Model response:")
print(response)

## 9️⃣ Download the Trained Model

**Recommended option**: Compress and download

In [None]:
from google.colab import files

print("📦 Compressing model...\n")
!zip -r po_student_v1.zip /content/agnostic-ai-pipeline/artifacts/models/po_student_v1

print("\n⬇️ Downloading...")
files.download('po_student_v1.zip')

print("\n✅ Download complete!")

---

## 🔧 Troubleshooting

### Error: CUDA out of memory
**Solution**: Run the alternative cell in section 5, which uses more conservative parameters

### Error: No module named 'bitsandbytes'
**Solution**: Reinstall bitsandbytes
```python
!pip uninstall -y bitsandbytes
!pip install bitsandbytes==0.41.0 --no-cache-dir
```

### Error: No GPU detected
**Solution**:
1. Runtime → Change runtime type
2. Hardware accelerator: **T4 GPU**
3. Save and reconnect

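### Training very slow (well below the expected ~0.03-0.05 it/s)
**Solution**: Use the alternative cell with `--max-length 1536`, or try `--gradient-accumulation-steps 4` (each optimizer step gets shorter, but the effective batch size drops to 4)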

---

## 📊 Expected Metrics

```
Epoch 1/3: 100%|██████████| 25/25 [12:34<00:00,  0.05it/s]
{'loss': 1.2345, 'learning_rate': 9.5e-05, 'epoch': 1.0}

Epoch 2/3: 100%|██████████| 25/25 [12:31<00:00,  0.05it/s]
{'loss': 0.8912, 'learning_rate': 5.0e-05, 'epoch': 2.0}

Epoch 3/3: 100%|██████████| 25/25 [12:29<00:00,  0.05it/s]
{'loss': 0.6234, 'learning_rate': 5.0e-06, 'epoch': 3.0}
```

**Success indicators**:
- ✅ Decreasing loss (1.2 → 0.6)
- ✅ No OOM crashes
- ✅ ~0.03-0.05 it/s on the T4
- ✅ Checkpoints saved every epoch
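
The "decreasing loss" indicator can be checked mechanically from the printed metric lines. A minimal sketch, assuming the trainer prints one Python-style dict per logging step as in the sample output above; `parse_metric_lines` and `loss_is_decreasing` are hypothetical helpers, and the sample text is copied from that illustrative log rather than from a real run.

```python
import ast


def parse_metric_lines(log_text):
    """Collect the dict-shaped lines (e.g. {'loss': ..., 'epoch': ...}) from a training log."""
    records = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("{") and line.endswith("}"):
            # literal_eval safely parses the single-quoted dict the trainer prints
            records.append(ast.literal_eval(line))
    return records


def loss_is_decreasing(records):
    """True if there are at least two loss values and each is lower than the previous one."""
    losses = [record["loss"] for record in records if "loss" in record]
    return len(losses) >= 2 and all(b < a for a, b in zip(losses, losses[1:]))


sample_log = """
{'loss': 1.2345, 'learning_rate': 9.5e-05, 'epoch': 1.0}
{'loss': 0.8912, 'learning_rate': 5.0e-05, 'epoch': 2.0}
{'loss': 0.6234, 'learning_rate': 5.0e-06, 'epoch': 3.0}
"""
records = parse_metric_lines(sample_log)
print(loss_is_decreasing(records))  # True
```

A strictly decreasing check is deliberately simple; with more frequent logging, small upward fluctuations are normal and a smoothed comparison would be more appropriate.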

---

## 📚 References

- **Setup Guide**: `docs/COLAB_TRAINING_SETUP.md`
- **Training Script**: `scripts/train_po_lora.py`
- **HuggingFace Qwen2.5**: https://huggingface.co/Qwen/Qwen2.5-7B-Instruct
- **PEFT Docs**: https://huggingface.co/docs/peft/

---

**Last updated**: 2025-11-13