[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mmcmanus1/rlhf-canary/blob/main/notebooks/01_quickstart.ipynb)

# RLHF Canary - Quickstart

Get started with RLHF Canary in minutes. Learn the core workflow for detecting regressions in your RLHF/finetuning pipelines before they reach production.

**What you'll learn:**
1. Run a DPO canary training job
2. Save metrics as a baseline
3. Compare new runs against baseline
4. Interpret regression reports
5. Use custom thresholds
6. Understand root cause analysis

**Requirements:** GPU runtime (Runtime > Change runtime type > T4 GPU)

## 1. Setup

In [None]:
import os
import re
import sys

print("Starting Environment Setup...")

# --- 1. Clone or update the repo ---
if not os.path.exists("/content/rlhf-canary"):
    !git clone https://github.com/mmcmanus1/rlhf-canary.git /content/rlhf-canary
else:
    !cd /content/rlhf-canary && git pull --ff-only

%cd /content/rlhf-canary

# --- 2. Force-Install the "Safe Harbor" Stack ---
# These specific versions avoid the TRL 0.12+ dtype bug and Transformers 4.46+ generator bug
# We use --no-deps to prevent pip from trying to be "smart" and upgrading things
!pip install "trl==0.11.4" "transformers==4.44.2" "peft==0.12.0" "accelerate==0.34.2" "tokenizers==0.19.1" --force-reinstall --no-deps --quiet
!pip install -q datasets pydantic click PyYAML bitsandbytes
print("Libraries installed (TRL 0.11.4 / Transformers 4.44.2)")

# --- 3. Patch pyproject.toml (Prevent future drift) ---
project_file = "/content/rlhf-canary/pyproject.toml"
if os.path.exists(project_file):
    with open(project_file, "r") as f:
        content = f.read()
    
    # Ensure TRL is pinned to 0.11.4 in the config file
    if "trl==0.11.4" not in content:
        content = re.sub(r'trl[<>=!~]+[\d\.]+', 'trl==0.11.4', content)
        with open(project_file, "w") as f:
            f.write(content)
        print("Config file patched to lock TRL 0.11.4")

# --- 4. Patch Source Code (Compatibility Fix) ---
# TRL 0.11.4 uses 'tokenizer=' argument. Newer code uses 'processing_class='.
# We auto-detect and revert this change if found.
runner_file = "/content/rlhf-canary/canary/runner/local.py"
if os.path.exists(runner_file):
    with open(runner_file, "r") as f:
        code = f.read()
    
    if "processing_class=" in code:
        code = code.replace("processing_class=", "tokenizer=")
        with open(runner_file, "w") as f:
            f.write(code)
        print("Code patched: Reverted 'processing_class' to 'tokenizer'")
    else:
        print("Code is already compatible.")

# --- 5. Install the package ---
!pip install -e . --quiet

print("Environment Ready! You can run your training now.")

In [None]:
# Verify installation and paths
from pathlib import Path

print("=== Installation Verification ===\n")

# Check 1: Working directory
cwd = Path.cwd()
print(f"Working directory: {cwd}")

# Check 2: Config files accessible
config_path = cwd / "configs" / "dpo_smoke.yaml"
if config_path.exists():
    print(f"Config file: {config_path}")
else:
    print(f"ERROR: Config not found at {config_path}")
    print("Fix: Run '%cd /content/rlhf-canary' or verify git clone succeeded")
    raise FileNotFoundError(f"Config missing: {config_path}")

# Check 3: Canary module location
try:
    import canary
    canary_path = Path(canary.__file__).parent
    print(f"Canary module: {canary_path}")

    # Detect nested directory issue
    if "rlhf-canary/rlhf-canary" in str(canary_path):
        print("\nWARNING: Nested directory detected!")
        print("Python is importing from a nested path.")
        print("Fix: Run these commands:")
        print("  !pip uninstall rlhf-canary -y")
        print("  %cd /content/rlhf-canary")
        print("  !pip install -e .")
except ImportError as e:
    print(f"ERROR: Cannot import canary: {e}")
    print("Fix: Restart runtime and re-run installation cells")
    raise

# Check 4: TRL version compatibility
try:
    import trl
    trl_version = trl.__version__
    print(f"TRL version: {trl_version}")

    major, minor, *_ = trl_version.split(".")
    major, minor = int(major), int(minor)
    if (major, minor) >= (0, 12):  # tuple comparison also flags 1.x and later
        print(f"\nWARNING: TRL {trl_version} detected. This version has dtype bugs.")
        print("Run the setup cell again to install TRL 0.11.4")
    elif major == 0 and minor == 11:
        print("TRL version OK (0.11.x)")
except Exception as e:
    print(f"TRL check failed: {e}")

# Check 5: Transformers version
try:
    import transformers
    tf_version = transformers.__version__
    print(f"Transformers version: {tf_version}")
    
    major, minor, *_ = tf_version.split(".")
    major, minor = int(major), int(minor)
    if (major, minor) >= (4, 46):  # tuple comparison also catches 5.x and later
        print(f"\nWARNING: Transformers {tf_version} detected. This version has compatibility issues with TRL 0.11.x")
        print("Run the setup cell again to install Transformers 4.44.2")
    else:
        print("Transformers version OK")
except Exception as e:
    print(f"Transformers check failed: {e}")

print("\n=== Verification Complete ===")

In [None]:
# Verify GPU is available
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

In [None]:
# Show environment fingerprint
!python -m canary.cli env

## 2. Run Baseline Canary

In [None]:
# Run DPO smoke test (takes ~5-10 min on T4)
!python -m canary.cli run configs/dpo_smoke.yaml -o ./canary_output/baseline

In [None]:
# Find and display the metrics
import json
from pathlib import Path

baseline_path = next(Path('./canary_output/baseline').rglob('metrics.json'))
print(f"Baseline metrics: {baseline_path}")

with open(baseline_path) as f:
    metrics = json.load(f)

print(f"\nRun ID: {metrics['run_id']}")
print(f"Duration: {metrics['duration_seconds']:.1f}s")
print(f"Step time (mean): {metrics['perf']['step_time']['mean']:.4f}s")
print(f"Tokens/sec: {metrics['perf']['approx_tokens_per_sec']:.0f}")
print(f"Peak memory: {metrics['perf']['max_mem_mb']:.0f}MB")
print(f"NaN steps: {metrics['stability']['nan_steps']}")

In [None]:
# Save as baseline
!mkdir -p baselines
!cp {baseline_path} baselines/dpo_baseline.json
print("Baseline saved to baselines/dpo_baseline.json")

## 3. Run Again and Compare

Now let's run another canary and compare it to our baseline.
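
The comparison comes down to relative deltas between two metrics files. The following is a minimal sketch of that arithmetic, assuming the `metrics.json` layout printed in the previous section; `percent_deltas` and the sample numbers are hypothetical and are not part of the `canary` package, whose actual comparison logic may differ.

```python
def percent_deltas(current_metrics: dict, baseline_metrics: dict) -> dict:
    """Relative change of the current run versus the baseline (illustrative only)."""
    cur = current_metrics["perf"]
    base = baseline_metrics["perf"]
    step_cur = cur["step_time"]["mean"]
    step_base = base["step_time"]["mean"]
    tps_cur = cur["approx_tokens_per_sec"]
    tps_base = base["approx_tokens_per_sec"]
    return {
        # Positive value: steps got slower
        "step_time_increase_pct": 100.0 * (step_cur - step_base) / step_base,
        # Positive value: throughput dropped
        "tps_drop_pct": 100.0 * (tps_base - tps_cur) / tps_base,
        # Positive value: peak memory grew
        "mem_increase_mb": cur["max_mem_mb"] - base["max_mem_mb"],
    }


# Made-up numbers, not real canary output
baseline_example = {
    "perf": {"step_time": {"mean": 0.50}, "approx_tokens_per_sec": 2000.0, "max_mem_mb": 3000.0}
}
current_example = {
    "perf": {"step_time": {"mean": 0.60}, "approx_tokens_per_sec": 1700.0, "max_mem_mb": 3200.0}
}

deltas = percent_deltas(current_example, baseline_example)
for name, value in deltas.items():
    print(f"{name}: {value:.1f}")
```

A positive `step_time_increase_pct` or `tps_drop_pct` means the current run is slower than the baseline; the threshold tiers bound how large those values may get before the comparison fails.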

In [None]:
# Run another canary
!python -m canary.cli run configs/dpo_smoke.yaml -o ./canary_output/run2

In [None]:
# Find the new metrics file
current_path = next(Path('./canary_output/run2').rglob('metrics.json'))
print(f"Current metrics: {current_path}")

In [None]:
# Compare to baseline
!python -m canary.cli compare {current_path} baselines/dpo_baseline.json --threshold-tier smoke

## 4. Simulate a Regression

Let's intentionally create a slower config to see regression detection in action.

In [None]:
# Create a "slower" config (batch_size=1 with gradient accumulation, intended to lower throughput)
slow_config = """
name: dpo_slow
description: Intentionally slow config for regression demo

model_name: EleutherAI/pythia-70m
use_peft: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05

training_type: dpo
max_steps: 100
batch_size: 1  # Smaller batch = slower!
gradient_accumulation_steps: 8
learning_rate: 5.0e-5
max_length: 256
warmup_steps: 10

beta: 0.1
max_prompt_length: 64

dataset_name: Anthropic/hh-rlhf
dataset_split: train
dataset_size: 512
seed: 42

output_dir: ./canary_output
metrics_warmup_steps: 10
"""

with open('configs/dpo_slow.yaml', 'w') as f:
    f.write(slow_config)

print("Created slow config")

In [None]:
# Run the slow config
!python -m canary.cli run configs/dpo_slow.yaml -o ./canary_output/slow_run

In [None]:
# Compare slow run to baseline - should show regression!
slow_path = next(Path('./canary_output/slow_run').rglob('metrics.json'))
print(f"Comparing: {slow_path}")

!python -m canary.cli compare {slow_path} baselines/dpo_baseline.json --threshold-tier smoke

## 5. Custom Thresholds

You can customize regression thresholds using YAML files or inline in your config.
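
One way to read a threshold file: start from the values of `base_tier`, then let explicit keys override them. Below is a minimal sketch of that merge, assuming the tier values printed by the next cell map onto the three field names used in the YAML. `TIERS_SKETCH` and `resolve_thresholds` are hypothetical names, and the real loader in `canary.compare.thresholds` may behave differently.

```python
# Tier values as listed in this notebook; the field-name mapping is an assumption.
TIERS_SKETCH = {
    "smoke": {"max_step_time_increase_pct": 15.0, "max_tps_drop_pct": 12.0, "max_mem_increase_mb": 1000.0},
    "default": {"max_step_time_increase_pct": 10.0, "max_tps_drop_pct": 8.0, "max_mem_increase_mb": 500.0},
    "perf": {"max_step_time_increase_pct": 8.0, "max_tps_drop_pct": 5.0, "max_mem_increase_mb": 300.0},
    "nightly": {"max_step_time_increase_pct": 5.0, "max_tps_drop_pct": 3.0, "max_mem_increase_mb": 200.0},
}


def resolve_thresholds(spec: dict) -> dict:
    """Start from the base tier, then apply explicit overrides (illustrative only)."""
    overrides = dict(spec)  # copy so the caller's dict is not mutated
    base_tier = overrides.pop("base_tier", "default")  # fallback tier is an assumption
    resolved = dict(TIERS_SKETCH[base_tier])
    unknown = set(overrides) - set(resolved)
    if unknown:
        raise KeyError(f"Unknown threshold keys: {sorted(unknown)}")
    resolved.update(overrides)
    return resolved


resolved = resolve_thresholds({"base_tier": "smoke", "max_step_time_increase_pct": 50.0})
print(resolved)
```

Only the overridden key changes; the remaining limits keep their `smoke` values.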

In [None]:
# Create a custom threshold file
custom_thresholds = """
# Custom thresholds - start from smoke tier but customize
base_tier: smoke

# Override specific thresholds (lenient enough for the slow config demo)
max_step_time_increase_pct: 50.0  # Allow 50% step time increase
max_tps_drop_pct: 35.0            # Allow 35% throughput drop
max_mem_increase_mb: 750.0        # Allow 750MB memory increase
"""

with open('custom_thresholds.yaml', 'w') as f:
    f.write(custom_thresholds)

print("Created custom_thresholds.yaml")
print("\nAvailable threshold tiers:")
print("  - smoke: Lenient (15% step time, 12% TPS, 1000MB memory)")
print("  - default: Balanced (10% step time, 8% TPS, 500MB memory)")
print("  - perf: Strict (8% step time, 5% TPS, 300MB memory)")
print("  - nightly: Strictest (5% step time, 3% TPS, 200MB memory)")

In [None]:
# Compare using custom thresholds
# With these lenient thresholds, the slow run should pass despite the performance difference
!python -m canary.cli compare {slow_path} baselines/dpo_baseline.json --threshold-file custom_thresholds.yaml

## 6. Root Cause Analysis

When regressions are detected, the canary provides heuristic-based analysis to help identify the root cause. The system analyzes:

- **Gradient norms**: Detects gradient explosion and training instability
- **GPU utilization**: Identifies CPU bottlenecks (low GPU util = waiting on data)
- **Dataloader wait time**: Pinpoints data loading bottlenecks
- **Memory patterns**: Distinguishes memory leaks from fragmentation
- **Combined patterns**: Cross-checks signals against each other (e.g., a step-time increase together with a memory increase suggests fragmentation)
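
To make the idea concrete, here is a toy version of such cross-checks. Every rule, cutoff, and confidence value below is invented for illustration, as are the names `SuspectSketch` and `rank_suspects`; the actual logic lives in `canary.compare.heuristics` and may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class SuspectSketch:
    """Hypothetical stand-in for a root-cause suspect."""
    category: str
    confidence: float
    evidence: list = field(default_factory=list)


def rank_suspects(signals: dict) -> list:
    """Apply a few toy rules and return suspects, most confident first."""
    suspects = []
    step_pct = signals.get("step_time_increase_pct", 0.0)
    mem_mb = signals.get("mem_increase_mb", 0.0)
    gpu_util = signals.get("gpu_util_pct")  # None when the signal is unavailable

    # Toy rule 1: slower steps while the GPU sits idle points at the input pipeline
    if step_pct > 10.0 and gpu_util is not None and gpu_util < 50.0:
        suspects.append(SuspectSketch(
            "dataloader", 0.7,
            [f"step time +{step_pct:.0f}%", f"GPU util {gpu_util:.0f}%"],
        ))

    # Toy rule 2: slower steps together with higher memory hints at fragmentation
    if step_pct > 10.0 and mem_mb > 200.0:
        suspects.append(SuspectSketch(
            "memory_fragmentation", 0.5,
            [f"step time +{step_pct:.0f}%", f"memory +{mem_mb:.0f}MB"],
        ))

    return sorted(suspects, key=lambda s: s.confidence, reverse=True)


ranked = rank_suspects(
    {"step_time_increase_pct": 25.0, "mem_increase_mb": 400.0, "gpu_util_pct": 35.0}
)
for s in ranked:
    print(f"{s.category} ({s.confidence:.0%}): {', '.join(s.evidence)}")
```

Each rule contributes its own evidence strings, so a report can show why a suspect was raised rather than only naming it.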

In [None]:
# Programmatically access root cause analysis
from canary.compare.stats import compare_to_baseline, load_metrics
from canary.compare.thresholds import SMOKE_THRESHOLDS
from canary.compare.heuristics import analyze_regression, format_suspects_markdown

# Load metrics
current = load_metrics(str(slow_path))
baseline = load_metrics('baselines/dpo_baseline.json')

# Run comparison
report = compare_to_baseline(current, baseline, SMOKE_THRESHOLDS)

# Get root cause analysis
if not report.passed:
    analysis = analyze_regression(report, current, baseline)
    print("Root Cause Analysis")
    print("=" * 50)
    print(f"\nSummary: {analysis.summary}")
    print(f"\nTop suspects ({len(analysis.suspects)} found):")
    for i, suspect in enumerate(analysis.suspects[:3], 1):
        print(f"\n  #{i} {suspect.category.value.title()} (confidence: {suspect.confidence:.0%})")
        print(f"      {suspect.description}")
        for ev in suspect.evidence:
            print(f"      - {ev}")
else:
    print("No regressions detected!")

## 7. GitHub Integration

RLHF Canary includes built-in GitHub integration for CI/CD workflows.

### Available Commands

```bash
# Post comparison results to a PR (comment + commit status)
canary gh-report <current> <baseline> --threshold-tier smoke

# Options:
#   --post-comment/--no-comment   Post PR comment (default: yes)
#   --update-status/--no-status   Update commit status (default: yes)
#   --threshold-file PATH         Use custom thresholds
```

### Workflow Files

Two workflow files are included:
- `workflows/pr_canary.yml` - Runs on every PR (smoke tests)
- `workflows/nightly_canary.yml` - Runs daily at 2 AM UTC (soak tests)

### Test Configurations

| Config | Steps | Duration | Use Case |
|--------|-------|----------|----------|
| `dpo_smoke.yaml` | 100 | ~5-10 min | PR gating |
| `dpo_perf.yaml` | 500 | ~20-45 min | Performance analysis |
| `dpo_nightly.yaml` | 2000 | ~1-2 hr | Nightly soak tests |

## 8. Next Steps

Now that you've learned the basics, here are some ways to integrate RLHF Canary into your workflow:

### For Your Repository

1. **Copy workflow files** to `.github/workflows/`:
   - `pr_canary.yml` for PR gating
   - `nightly_canary.yml` for nightly soak tests

2. **Create a baseline** from your current main branch:
   ```bash
   canary run configs/dpo_smoke.yaml
   canary save-baseline ./canary_output/*/metrics.json ./baselines/main.json
   ```

3. **Customize thresholds** for your use case in a `custom_thresholds.yaml` file

### CLI Reference

```bash
canary --help                    # Show all commands
canary env                       # Show environment fingerprint
canary run <config>              # Run a canary job
canary compare <cur> <base>      # Compare metrics to baseline
canary gh-report <cur> <base>    # Post results to GitHub PR
canary save-baseline <src> <dst> # Save metrics as baseline
canary init-config <path>        # Generate sample config
```
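
For scripting, the `compare` invocation used throughout this notebook can be assembled in Python. This is a minimal sketch using only the flags shown above (`--threshold-tier`, `--threshold-file`). `build_compare_cmd` is a hypothetical helper, the metrics path is a placeholder, and preferring the threshold file over the tier when both are given is an assumption of this sketch, not documented CLI behavior.

```python
import sys


def build_compare_cmd(current, baseline, tier="smoke", threshold_file=None):
    """Return the argv list for `python -m canary.cli compare` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "canary.cli", "compare", str(current), str(baseline)]
    if threshold_file is not None:
        # Assumption: a threshold file takes the place of a named tier
        cmd += ["--threshold-file", str(threshold_file)]
    else:
        cmd += ["--threshold-tier", tier]
    return cmd


# Placeholder paths; substitute the metrics file located by your own run
cmd = build_compare_cmd("canary_output/run2/metrics.json", "baselines/dpo_baseline.json")
print(" ".join(cmd[1:]))
```

Pass the list to `subprocess.run(cmd)` to execute it. Whether the CLI signals a failed comparison through its exit code is not stated here, so verify that before gating a pipeline on `returncode`.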

### Learn More

See the [README](https://github.com/mmcmanus1/rlhf-canary) for complete documentation.