# Product Owner LoRA Training v2
**Date**: 2025-11-15  
**Task**: Train Qwen2.5-7B-Instruct with LoRA using optimized hyperparameters

**Dataset**: 359 samples (aligned: 66.9%, needs: 30.4%, conflicts: 2.8%)  
**Changes from v1**:
- epochs: 3 → **4**
- learning_rate: 1e-4 → **8e-5**
- lr_scheduler: linear → **cosine**
- warmup_ratio: 0.1 → **0.05**
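- gradient_accumulation: 8 → **12** (with batch size 2; the training cell in section 7 runs the equivalent batch size 1 × 24 accumulation steps, so the effective batch size stays at 24)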
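
The schedule changes listed above can be pictured with a small sketch. It reproduces, in plain Python, the usual shape of a cosine schedule with linear warmup. The step counts are illustrative assumptions (359 samples, an effective batch size of 24, a partial final batch counted as one step, 3 warmup steps as roughly 5% of the total); the numbers the Trainer actually reports may differ slightly by version.

```python
import math

def cosine_lr(step: int, total_steps: int, warmup_steps: int, peak_lr: float) -> float:
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Illustrative step counts (assumptions, not Trainer output)
samples, effective_batch, epochs = 359, 24, 4
steps_per_epoch = math.ceil(samples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = 3  # roughly 5% of the total steps

for step in (0, 1, warmup_steps, total_steps // 2, total_steps):
    lr = cosine_lr(step, total_steps, warmup_steps, peak_lr=8e-5)
    print(f"step {step:>2}: lr = {lr:.2e}")
```

With a short run like this one, the warmup phase covers only a handful of optimizer steps, which is one reason a smaller `warmup_ratio` can be reasonable.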

## 1. Setup and Configuration

### 1.1 Clone the repository and mount Drive

In [None]:
# Clone the repository into the current environment
!git clone -b dspy-multi-role https://github.com/krukmat/agnostic-ai-pipeline.git
%cd agnostic-ai-pipeline

print("\n📁 Current directory:")
!pwd

try:
    from google.colab import drive  # type: ignore
    drive.mount('/content/drive')
    print("✅ Google Drive mounted")
except Exception:
    print("ℹ️ Google Drive not available (Lightning or another environment). Continuing without mounting it.")


In [None]:
from pathlib import Path

if Path('/content').exists():
    ENV_ROOT = Path('/content')
elif Path('/workspace').exists():
    ENV_ROOT = Path('/workspace')
else:
    ENV_ROOT = Path.cwd()

PROJECT_ROOT = Path.cwd()
DATASET_PATH = PROJECT_ROOT / 'artifacts/distillation/po_teacher_supervised.jsonl'
RUN_OUTPUT_DIR = ENV_ROOT / 'po_student_v2'
ADAPTER_DIR = ENV_ROOT / 'po_student_v2_adapter'

print(f"📁 Project: {PROJECT_ROOT}")
print(f"💾 Temporary folder: {ENV_ROOT}")
print(f"📄 Expected dataset: {DATASET_PATH}")


In [None]:
# Check GPU availability
!nvidia-smi

In [None]:
%%bash
# Install/update dependencies (compatible with Qwen2.5 models)
pip install -q --upgrade --no-cache-dir \
  "transformers>=4.38.0" \
  "peft>=0.11.1" \
  "bitsandbytes>=0.43.2" \
  "accelerate>=0.28.0" \
  "datasets>=2.19.0"
# Then upgrade transformers to the current development version
pip install -q --upgrade --no-cache-dir "transformers @ git+https://github.com/huggingface/transformers.git"

# Report the installed versions from a fresh Python process
python - <<'PYBLOCK'
import importlib
importlib.invalidate_caches()
import transformers, peft, datasets
print(f'✅ transformers {transformers.__version__}')
print(f'✅ peft {peft.__version__}')
print(f'✅ datasets {datasets.__version__}')
PYBLOCK

In [None]:
# Confirm the notebook kernel itself can import the upgraded packages
import importlib
importlib.invalidate_caches()

try:
    import transformers, peft, datasets
    print(f'✅ transformers {transformers.__version__}')
    print(f'✅ peft {peft.__version__}')
    print(f'✅ datasets {datasets.__version__}')
except Exception:
    print('⚠️ If an error persists after installing, restart the runtime and run this cell again.')
    raise


## 2. Upload Dataset

**Option A**: Upload the file directly with the first code cell below (Colab only)  
**Option B**: Mount Google Drive and copy the file from there (Colab only)  
**Option C**: Upload `po_teacher_supervised.jsonl` through the file browser (folder icon on the left) and move it to `DATASET_PATH`
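
Whichever option is used, it can help to validate the file before training. The sketch below is one possible check, not part of the original pipeline: it assumes each non-blank line is a JSON object with non-empty string fields `prompt` and `response` (the two keys read in section 3). The helper name `validate_jsonl` is hypothetical.

```python
import json
from pathlib import Path

def validate_jsonl(path, required=("prompt", "response")):
    """Return (valid_count, problems); problems holds (line_number, reason) pairs."""
    valid, problems = 0, []
    with Path(path).open("r", encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # blank lines are skipped, as in the loading cell
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((lineno, f"invalid JSON: {exc.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((lineno, "record is not a JSON object"))
                continue
            # A field counts as missing if absent, not a string, or only whitespace
            missing = [
                key for key in required
                if not isinstance(record.get(key), str) or not record[key].strip()
            ]
            if missing:
                problems.append((lineno, f"missing or empty: {', '.join(missing)}"))
            else:
                valid += 1
    return valid, problems
```

Calling `validate_jsonl(DATASET_PATH)` once the file is in place would return the number of usable records and a list of problem lines to inspect.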

In [None]:
from pathlib import Path

if 'ENV_ROOT' not in globals():
    if Path('/content').exists():
        ENV_ROOT = Path('/content')
    elif Path('/workspace').exists():
        ENV_ROOT = Path('/workspace')
    else:
        ENV_ROOT = Path.cwd()

PROJECT_ROOT = Path.cwd()
DATASET_PATH = PROJECT_ROOT / 'artifacts/distillation/po_teacher_supervised.jsonl'
RUN_OUTPUT_DIR = ENV_ROOT / 'po_student_v2'
ADAPTER_DIR = ENV_ROOT / 'po_student_v2_adapter'

print(f"📄 Expected dataset: {DATASET_PATH}")


In [None]:
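# Option A: Direct upload (Colab only)
try:
    from google.colab import files  # type: ignore
    print('Upload po_teacher_supervised.jsonl; it will be saved to DATASET_PATH.')
    uploaded = files.upload()  # returns {filename: file contents as bytes}
    for name, content in uploaded.items():
        DATASET_PATH.parent.mkdir(parents=True, exist_ok=True)
        DATASET_PATH.write_bytes(content)
        print(f'✅ Saved {name} to {DATASET_PATH}')
except ImportError:
    print('⚠️ This option is only available in Colab. Use Option B or place the file at DATASET_PATH manually.')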


In [None]:
# Option B: From Google Drive (Colab only; skip on Lightning)
try:
    import google.colab  # type: ignore
    from google.colab import drive  # type: ignore
    drive.mount('/content/drive')
    # Copy to the location the rest of the notebook reads from
    !mkdir -p {DATASET_PATH.parent}
    !cp /content/drive/MyDrive/path/to/po_teacher_supervised.jsonl {DATASET_PATH}
except Exception:
    print('Google Drive is not available. Use Option A or place the file at DATASET_PATH manually.')


In [None]:
from pathlib import Path

if 'DATASET_PATH' not in globals():
    raise RuntimeError('Define DATASET_PATH in the earlier cell before running this check.')

if DATASET_PATH.exists():
    print(f"✅ Dataset found: {DATASET_PATH}")
else:
    print(f"⚠️ Dataset not found at {DATASET_PATH}.")
    print('   Upload the file or update DATASET_PATH if you are using another location.')


## 3. Load and Prepare Dataset

In [None]:
import json
from datasets import Dataset

# Load supervised dataset
data = []
with open(DATASET_PATH, "r", encoding="utf-8") as f:
    for line in f:
        if line.strip():
            data.append(json.loads(line))

print(f"Loaded {len(data)} training examples")

# Convert to HuggingFace Dataset
train_dataset = Dataset.from_list(data)
print(train_dataset)

# Verify data structure
print("\nSample prompt (first 200 chars):")
print(train_dataset[0]["prompt"][:200])
print("\nSample response (first 200 chars):")
print(train_dataset[0]["response"][:200])


## 4. Load Base Model

In [None]:
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig
)

model_name = "Qwen/Qwen2.5-7B-Instruct"

# Configure 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

print(f"Loading model: {model_name}")
print("This will take 5-7 minutes...")

# Load model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True
)
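# Only fall back to EOS when no pad token is defined: the collator masks pad
# tokens in the labels, so reusing EOS as padding would also mask real EOS tokens.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token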
tokenizer.padding_side = "right"

print(f"\n✅ Model loaded: {model_name}")
print(f"Model device: {model.device}")

## 5. Configure LoRA

In [None]:
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Prepare model for training
model = prepare_model_for_kbit_training(model)

# LoRA configuration
lora_config = LoraConfig(
    r=32,                      # Rank
    lora_alpha=64,             # Alpha
    target_modules=[           # Target all linear layers
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
    ],
    lora_dropout=0.05,         # Dropout
    bias="none",
    task_type="CAUSAL_LM",
)

# Apply LoRA
model = get_peft_model(model, lora_config)
print("\n✅ LoRA configuration applied")
print("\nTrainable parameters:")
model.print_trainable_parameters()
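
As a cross-check on the printed count, the number of LoRA parameters can be estimated by hand: each adapted linear layer of shape `d_in × d_out` gains `r * (d_in + d_out)` trainable weights. The sketch below applies that formula with layer dimensions that are assumed for Qwen2.5-7B (hidden size 3584, MLP size 18944, key/value projection width 512, 28 layers); these values are not taken from this notebook and should be checked against the model's `config.json`.

```python
def lora_param_count(r: int, shapes: dict, num_layers: int) -> int:
    """Trainable LoRA weights: r * (d_in + d_out) per adapted matrix, per layer."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers

# Assumed Qwen2.5-7B dimensions (verify against the model config)
hidden, intermediate, kv_dim, layers = 3584, 18944, 512, 28

shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

total = lora_param_count(r=32, shapes=shapes, num_layers=layers)
scaling = 64 / 32  # lora_alpha / r, the factor applied to the adapter output
print(f"Estimated trainable parameters: {total:,} (scaling {scaling})")
```

If the estimate and the output of `print_trainable_parameters()` disagree, the assumed dimensions are the first thing to re-check.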

## 6. Tokenize Dataset

In [None]:
def tokenize_function(examples):
    full_texts = []
    for prompt, response in zip(examples['prompt'], examples['response']):
        messages = [
            {'role': 'user', 'content': prompt},
            {'role': 'assistant', 'content': response}
        ]
        text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=False)
        full_texts.append(text)

    tokenized = tokenizer(
        full_texts,
        truncation=True,
        padding='max_length',
        max_length=1536,
    )

    tokenized['labels'] = tokenized['input_ids'].copy()
    return tokenized

tokenized_dataset = train_dataset.map(
    tokenize_function,
    batched=True,
    remove_columns=train_dataset.column_names,
)

print(f'Tokenized dataset: {len(tokenized_dataset)} examples')
print(f'Sample token count: {len(tokenized_dataset[0]["input_ids"])}')
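
Because the tokenizer pads every example to `max_length`, the token count printed above is always 1536 and says nothing about the real sequence lengths. A minimal sketch of a more informative check follows; it counts the 1s in each attention mask and flags sequences that fill the whole window, which may have been truncated. The function name and the toy masks are illustrative only.

```python
def length_report(attention_masks, max_length):
    """Summarize un-padded token counts from 0/1 attention masks."""
    lengths = [sum(mask) for mask in attention_masks]
    return {
        "min": min(lengths),
        "max": max(lengths),
        "mean": sum(lengths) / len(lengths),
        # Sequences that fill the window completely may have been truncated
        "at_limit": sum(1 for n in lengths if n >= max_length),
    }

# Toy masks standing in for tokenized_dataset["attention_mask"]
toy_masks = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
]
report = length_report(toy_masks, max_length=6)
print(report)
```

In the notebook, the same function could be called as `length_report(tokenized_dataset["attention_mask"], max_length=1536)`.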


## 7. Configure Training Arguments (OPTIMIZED)

In [None]:
from pathlib import Path
globals_dict = globals()
if 'RUN_OUTPUT_DIR' not in globals_dict:
    if Path('/workspace').exists():
        RUN_OUTPUT_DIR = Path('/workspace/po_student_v2')
    elif Path('/content').exists():
        RUN_OUTPUT_DIR = Path('/content/po_student_v2')
    else:
        RUN_OUTPUT_DIR = Path.cwd() / 'po_student_v2'
if 'ADAPTER_DIR' not in globals_dict:
    ADAPTER_DIR = RUN_OUTPUT_DIR.with_name('po_student_v2_adapter')

from transformers import TrainingArguments, Trainer, DataCollatorForLanguageModeling

training_args = TrainingArguments(
    # Output
    output_dir=str(RUN_OUTPUT_DIR),

    # Training schedule (OPTIMIZED)
    num_train_epochs=4,                    # ← CHANGED from 3 to 4
    per_device_train_batch_size=1,
    gradient_accumulation_steps=24,        # ← CHANGED from 8; 24 × batch size 1 = effective batch of 24 (same as 2 × 12)

    # Learning rate (OPTIMIZED)
    learning_rate=8e-5,                    # ← CHANGED from 1e-4 to 8e-5
    lr_scheduler_type="cosine",            # ← CHANGED from "linear" to "cosine"
    warmup_ratio=0.05,                     # ← CHANGED from 0.1 to 0.05

    # Optimization
    optim="paged_adamw_8bit",
    fp16=True,
    max_grad_norm=1.0,

    # Logging and saving
    logging_steps=10,
    save_strategy="epoch",
    save_total_limit=2,
    torch_empty_cache_steps=10,

    # Evaluation: disabled (no eval dataset in this run)

    # Performance
    dataloader_num_workers=2,
    remove_unused_columns=False,
)

# Data collator
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False,
)

# Create trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    data_collator=data_collator,
)

print("\n✅ Trainer configured with OPTIMIZED hyperparameters:")
print("="*80)
print(f"  Epochs:                  {training_args.num_train_epochs}")
print(f"  Learning rate:           {training_args.learning_rate}")
print(f"  LR scheduler:            {training_args.lr_scheduler_type}")
print(f"  Warmup ratio:            {training_args.warmup_ratio}")
print(f"  Gradient accumulation:   {training_args.gradient_accumulation_steps}")
print(f"  Batch size per device:   {training_args.per_device_train_batch_size}")
print(f"  Effective batch size:    {training_args.per_device_train_batch_size * training_args.gradient_accumulation_steps}")
print("="*80)

## 8. Train Model

**Expected time**: 30-45 minutes on T4 GPU  
**Expected final loss**: < 0.7 (ideally ~0.5-0.6)

**What to watch for**:
- Loss should decrease steadily from ~1.5-2.0 to ~0.5-0.8
- Learning rate follows cosine curve (smooth decay)
- No OOM errors (the configuration above already uses batch size 1 with 24 gradient accumulation steps; if OOM persists, lower `max_length` in the tokenization step)
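
The checks above can also be done programmatically once training has run. The sketch below assumes log entries shaped like `trainer.state.log_history`, a list of dicts in which periodic entries carry `step` and `loss` and the final summary entry does not have a `loss` key. The example values are made up for illustration and are not results from this run.

```python
def summarize_loss(log_history):
    """Compare the first and last logged training loss."""
    curve = [(e["step"], e["loss"]) for e in log_history if "loss" in e and "step" in e]
    if not curve:
        return None
    first, last = curve[0][1], curve[-1][1]
    return {"points": len(curve), "first": first, "last": last, "decreased": last < first}

# Hypothetical entries (made-up values, not output from this notebook)
example_log = [
    {"step": 10, "loss": 1.80, "learning_rate": 7.8e-5},
    {"step": 20, "loss": 1.10, "learning_rate": 6.1e-5},
    {"step": 30, "loss": 0.75, "learning_rate": 4.0e-5},
    {"step": 60, "train_runtime": 1800.0, "train_loss": 1.0},  # summary entry, no "loss" key
]
summary = summarize_loss(example_log)
print(summary)
```

After `trainer.train()` finishes, calling `summarize_loss(trainer.state.log_history)` would give the same summary for the real run.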

In [None]:
# Start training
print("\n🚀 Starting training...\n")
print("="*80)
trainer.train()
print("="*80)
print("\n✅ Training completed!")

## 9. Save Adapter

In [None]:
# Save LoRA adapter
output_dir = str(ADAPTER_DIR)
trainer.model.save_pretrained(output_dir)
tokenizer.save_pretrained(output_dir)

print(f"\n✅ Adapter saved to {output_dir}")
print("\nFiles:")
!ls -lh {output_dir}

## 10. Download Adapter

Choose one of the methods below to download the adapter to your local machine.

### Optional: Save results and push changes to the repository

In [None]:
# Configure git identity (run once per session)
!git config --global user.name "YOUR_NAME"
!git config --global user.email "your.email@example.com"

# Check status
!git status -sb

In [None]:
# Copy results (adapter and logs) into the local repository
import shutil
from pathlib import Path

repo_dir = Path.cwd()
adapter_src = ADAPTER_DIR
adapter_dst = repo_dir / 'artifacts/models/po_student_v2_adapter'

if adapter_src.exists():
    # Replace any previous copy; copytree also creates missing parent folders
    shutil.rmtree(adapter_dst, ignore_errors=True)
    shutil.copytree(adapter_src, adapter_dst)
    print(f"✅ Adapter copied to {adapter_dst}")
else:
    print(f"⚠️ Adapter not found at {adapter_src}")

logs_dst = repo_dir / 'logs/distillation'
logs_dst.mkdir(parents=True, exist_ok=True)
print("✅ Logs folder ready")


In [None]:
# Option A: Direct download (Colab only)
import shutil

try:
    from google.colab import files  # type: ignore
    # Zip the adapter folder so it downloads as a single file
    zip_path = shutil.make_archive(
        str(ADAPTER_DIR), 'zip',
        root_dir=str(ADAPTER_DIR.parent), base_dir=ADAPTER_DIR.name,
    )
    files.download(zip_path)
    print(f'✅ Download started: {zip_path}')
except ImportError:
    print('⚠️ This option is only available in Colab. Use Option B or copy ADAPTER_DIR manually.')


In [None]:
# Option B: Save to Google Drive (uncomment if using this option)
# !cp -r /content/po_student_v2_adapter /content/drive/MyDrive/lora_adapters/
# print("✅ Adapter saved to Google Drive: /content/drive/MyDrive/lora_adapters/po_student_v2_adapter")

## 11. Verify Adapter (Optional)

Quick sanity check to ensure the adapter loads correctly.

In [None]:
from peft import AutoPeftModelForCausalLM

# Reload adapter to verify it works
print("Loading adapter for verification...")
test_model = AutoPeftModelForCausalLM.from_pretrained(
    str(ADAPTER_DIR),
    device_map="auto",
    quantization_config=bnb_config,  # reuse the 4-bit config from section 4
)

print("\n✅ Adapter loaded successfully!")
print("Model is ready for inference.")
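
A lighter check that does not load the model is to inspect the saved files. The sketch below assumes the adapter folder contains an `adapter_config.json` with `r` and `lora_alpha` fields plus a weights file named `adapter_model.safetensors` or `adapter_model.bin`, which is what PEFT normally writes. The helper name `check_adapter_dir` is hypothetical.

```python
import json
from pathlib import Path

def check_adapter_dir(adapter_dir, expected_r=32, expected_alpha=64):
    """Return a list of problems found in a saved LoRA adapter folder (empty if none)."""
    adapter_dir = Path(adapter_dir)
    config_path = adapter_dir / "adapter_config.json"
    if not config_path.exists():
        return ["adapter_config.json is missing"]

    issues = []
    config = json.loads(config_path.read_text(encoding="utf-8"))
    if config.get("r") != expected_r:
        issues.append(f"r is {config.get('r')}, expected {expected_r}")
    if config.get("lora_alpha") != expected_alpha:
        issues.append(f"lora_alpha is {config.get('lora_alpha')}, expected {expected_alpha}")

    # Either weights format is accepted
    weight_names = ("adapter_model.safetensors", "adapter_model.bin")
    if not any((adapter_dir / name).exists() for name in weight_names):
        issues.append("no adapter weights file found")
    return issues
```

Running `check_adapter_dir(ADAPTER_DIR)` after saving should return an empty list when the folder matches the configuration used in section 5.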

## Summary

**Training completed successfully!**

### Next Steps:

1. **Extract adapter** on local machine:
   ```bash
   unzip po_student_v2_adapter.zip -d artifacts/models/
   ```

2. **Run evaluation** (Step 4):
   ```bash
   PYTHONPATH=. .venv/bin/python scripts/eval_po_student.py \
     --adapter artifacts/models/po_student_v2_adapter \
     --max-samples 40 \
     --output inference_results/student_v2.json
   ```

3. **Compare results**:
   - Check mean ≥ 0.82
   - Check std ≤ 0.10
   - Check delta vs baseline ≤ 0.03
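
The three thresholds above can be checked with a few lines of code. The sketch below makes several assumptions, because the format of `student_v2.json` is not shown here: scores are passed as a plain list of floats, the standard deviation is the population standard deviation, and "delta vs baseline" means the baseline mean minus the student mean. The example scores are invented for illustration.

```python
import statistics

def passes_thresholds(scores, baseline_mean, min_mean=0.82, max_std=0.10, max_delta=0.03):
    """Apply the mean, std, and delta-vs-baseline checks to a list of scores."""
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)   # assumption: population standard deviation
    delta = baseline_mean - mean      # assumption: positive when the student is worse
    checks = {
        "mean": mean >= min_mean,
        "std": std <= max_std,
        "delta": delta <= max_delta,
    }
    return {"mean": mean, "std": std, "delta": delta, "checks": checks, "ok": all(checks.values())}

# Invented scores, only to show the call
result = passes_thresholds([0.90, 0.80, 0.85, 0.85], baseline_mean=0.86)
print(result["checks"], "->", "PASS" if result["ok"] else "FAIL")
```

If the evaluation script reports a different notion of delta or standard deviation, the two marked lines are the ones to adjust.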

### Training Configuration Used:

| Hyperparameter | Value |
|----------------|-------|
| Model | Qwen/Qwen2.5-7B-Instruct |
| LoRA rank | 32 |
| LoRA alpha | 64 |
| Epochs | 4 |
| Learning rate | 8e-5 |
| LR scheduler | cosine |
| Warmup ratio | 0.05 |
| Batch size (per device) | 1 |
| Gradient accumulation | 24 |
| Effective batch size | 24 |
| Dataset size | 359 samples |
| Conflict examples | 10 (2.8%) |