<a href="https://colab.research.google.com/github/RapidFireAI/rapidfireai/blob/main/tutorial_notebooks/COLAB_QUICKSTART.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🚀 RapidFire AI - Google Colab Quickstart

This notebook demonstrates how to run RapidFire AI in Google Colab with native port forwarding.

## What You'll Learn
- ✅ Install RapidFire AI in Colab
- ✅ Start RapidFire services with native port forwarding
- ✅ Access the RapidFire Dashboard, MLflow UI, and API
- ✅ Run a simple fine-tuning experiment

## Prerequisites
- Google Colab account (free tier works!)
- GPU runtime enabled (Runtime → Change runtime type → GPU)

---

## 1️⃣ Check GPU Availability

In [None]:
!nvidia-smi
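
If you would rather check for a GPU from Python (for example, to fail early in a script), the same information can be read programmatically. This is a minimal sketch using only the standard library; `gpu_summary` is a hypothetical helper for illustration and not part of RapidFire AI.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the GPU names and memory reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA tools on PATH, so this is likely a CPU runtime
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # the tool exists but could not talk to a GPU
    return result.stdout.strip() or None


summary = gpu_summary()
print(summary if summary else "No GPU detected: switch the runtime type to GPU.")
```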

## 2️⃣ Install RapidFire AI

The cell below installs RapidFire AI from PyPI, then runs `rapidfireai init` to install its additional dependencies.

In [None]:
# Install RapidFire AI from PyPI
!pip install -q rapidfireai

# Initialize RapidFire (installs additional dependencies)
!rapidfireai init
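
To confirm that the install step succeeded before moving on, you can look up the installed package version. The sketch below uses the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name, and the only assumption is that the distribution is named `rapidfireai`, as in the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string for a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


rf_version = installed_version("rapidfireai")
if rf_version:
    print(f"rapidfireai {rf_version} is installed")
else:
    print("rapidfireai is not installed in this runtime")
```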

## 3️⃣ Start RapidFire Services

This cell will:
- Start MLflow server (port 5002)
- Start Dispatcher API (port 8080)
- Start Frontend Dashboard (port 3000)
- Expose all services using Colab's native port forwarding

**Note**: This cell will keep running. Click the URLs that appear to open the services.

In [None]:
# Start RapidFire in Colab mode
# This will automatically expose ports using native Colab forwarding
!rapidfireai start --colab
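
If a URL does not load, it can help to check whether each service is actually listening. The following is a rough sketch, run from a separate cell or notebook, that probes the three ports listed above with a plain TCP connection. It assumes the services bind to `127.0.0.1`; `port_is_open` is a hypothetical helper, not a RapidFire AI function.

```python
import socket

# Ports as listed in this section
SERVICE_PORTS = {"MLflow": 5002, "Dispatcher API": 8080, "Frontend": 3000}


def port_is_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


for name, port in SERVICE_PORTS.items():
    status = "listening" if port_is_open(port) else "not reachable"
    print(f"{name} (port {port}): {status}")
```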

### Alternative: Using Cloudflare Tunnel

If you need public, shareable URLs, use Cloudflare Tunnel instead:

```python
!rapidfireai start --colab --tunnel cloudflare
```

This is useful if you want to:
- Share the dashboard with team members
- Access from multiple devices
- Keep URLs stable across restarts

## 4️⃣ Manual Port Forwarding (Optional)

If you prefer to start services manually and control port forwarding yourself:

In [None]:
# Alternative: Use Python API for more control
from rapidfireai.utils.colab_helper import expose_rapidfire_services, is_colab
import subprocess
import os

# Check if in Colab
if not is_colab():
    print("⚠️  This notebook is designed for Google Colab")
else:
    print("✅ Running in Google Colab")

# This cell only exposes ports; it does not launch anything.
# Start the RapidFire services yourself first (for example with start.sh),
# then expose them with native Colab forwarding:
urls = expose_rapidfire_services(
    method='native',  # or 'cloudflare' or 'ngrok'
    mlflow_port=5002,
    dispatcher_port=8080,
    frontend_port=3000
)

print("\n📋 Access your services at:")
for service, url in urls.items():
    print(f"  {service}: {url}")
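
For context, Colab's native forwarding maps a port on the runtime to a proxied HTTPS URL. The sketch below shows one way to ask Colab for that URL directly through `google.colab.output.eval_js`. It illustrates the mechanism only and is not necessarily what `expose_rapidfire_services` does internally; `colab_proxy_url` is a hypothetical helper, and it returns `None` outside Colab.

```python
def colab_proxy_url(port: int):
    """Return the Colab proxy URL for a local port, or None when not running in Colab."""
    try:
        from google.colab.output import eval_js  # only importable inside Colab
    except ImportError:
        return None
    # proxyPort resolves to the browser-reachable URL for the given kernel port
    return eval_js(f"google.colab.kernel.proxyPort({port})")


# Ports as listed in step 3
for name, port in {"mlflow": 5002, "dispatcher": 8080, "frontend": 3000}.items():
    print(f"{name}: {colab_proxy_url(port)}")
```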

## 5️⃣ Run a Simple Experiment

Now that the services are running, let's create a simple fine-tuning experiment.

**Note**: Run this in a new code cell or a separate notebook, because the cell that started the services is still running.

In [None]:
# This is a minimal example - see other tutorial notebooks for full examples
from rapidfireai import Experiment
from datasets import load_dataset

# Create experiment
exp = Experiment("colab_quickstart_test")

# Define your configuration
config = {
    'trainer_type': 'SFT',
    'training_args': {
        'learning_rate': 1e-5,
        'per_device_train_batch_size': 2,
        'num_train_epochs': 1,
        'max_steps': 10,  # Short for demo; in Hugging Face TrainingArguments, max_steps > 0 overrides num_train_epochs
    },
    'peft_params': {
        'r': 8,
        'lora_alpha': 32,
        'target_modules': ['c_attn'],  # GPT-2 fuses Q/K/V into c_attn; it has no q_proj/v_proj modules
    }
}

# Load a small dataset for testing
dataset = load_dataset('imdb', split='train[:100]')  # Just 100 samples for demo

# Define model creation function
def create_model(config):
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model = AutoModelForCausalLM.from_pretrained('gpt2')
    tokenizer = AutoTokenizer.from_pretrained('gpt2')
    tokenizer.pad_token = tokenizer.eos_token
    
    return model, tokenizer

# Run the experiment
print("🚀 Starting experiment...")
exp.run_fit(
    param_config=config,
    create_model_fn=create_model,
    train_dataset=dataset,
    eval_dataset=dataset.select(range(20)),  # Small eval set; it overlaps the training data, so use it for smoke tests only
    num_chunks=2,
    seed=42
)

# Get results
results = exp.get_results()
print("\n✅ Experiment complete!")
print(results)
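
To get a feel for what the `r` and `lora_alpha` values in the config control, the arithmetic below may help. It assumes standard LoRA, where the low-rank update is scaled by `lora_alpha / r` and each adapted linear layer gains two small matrices. The 768-wide layer is only an illustrative size, and both function names are hypothetical.

```python
def lora_scaling(r: int, lora_alpha: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / r


def lora_param_count(r: int, in_features: int, out_features: int) -> int:
    """Trainable parameters added to one linear layer: A is r x in, B is out x r."""
    return r * (in_features + out_features)


# Values from the config above: r=8, lora_alpha=32
print(lora_scaling(r=8, lora_alpha=32))  # 4.0

# Illustrative 768 -> 768 projection
print(lora_param_count(r=8, in_features=768, out_features=768))  # 12288
```

A larger `r` adds capacity (and parameters) linearly, while `lora_alpha` only changes how strongly the adapter's output is weighted.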

## 6️⃣ View Results

After running the experiment:

1. **RapidFire Dashboard**: Click the frontend URL from step 3 to see:
   - Real-time run status
   - Interactive Control Operations (stop, resume, clone)
   - Metrics visualization

2. **MLflow UI**: Click the MLflow URL to see:
   - Detailed metrics and parameters
   - Model artifacts
   - Comparison across runs

---

## 🎯 Next Steps

Now that you have RapidFire running in Colab, try:

1. **Explore Other Tutorials**:
   - `rf-tutorial-sft-chatqa.ipynb` - Supervised fine-tuning
   - `rf-tutorial-dpo-alignment.ipynb` - DPO training
   - `rf-tutorial-grpo-mathreasoning.ipynb` - GRPO for math

2. **Use AutoML**:
   ```python
   from rapidfireai.automl import GridSearch, RFModelConfig
   
   config = RFModelConfig(
       training_args={
           'learning_rate': [1e-4, 1e-5, 1e-6],
           'batch_size': [8, 16]
       }
   )
   
   grid = GridSearch(configs=[config])
   exp.run_fit(param_config=grid, ...)
   ```

3. **Interactive Control**: 
   - Stop/resume runs mid-training
   - Clone promising runs with modified configs
   - Compare results in real-time
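
To see how quickly a grid search grows, the sketch below expands the search space from the AutoML example into individual configurations with `itertools.product`. It illustrates the general idea of a grid and is not RapidFire AI's `GridSearch` implementation; `expand_grid` is a hypothetical helper.

```python
from itertools import product

# Same value lists as the AutoML example above
search_space = {
    "learning_rate": [1e-4, 1e-5, 1e-6],
    "batch_size": [8, 16],
}


def expand_grid(space):
    """Return one dict per combination of the listed values."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


combos = expand_grid(search_space)
print(len(combos))  # 3 learning rates x 2 batch sizes = 6
for combo in combos:
    print(combo)
```

Each extra hyperparameter multiplies the number of runs, which is worth keeping in mind on a single Colab GPU.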

---

## 📚 Resources

- **Documentation**: https://rapidfire-ai-oss-docs.readthedocs-hosted.com/
- **GitHub**: https://github.com/RapidFireAI/rapidfireai
- **Discord**: Join our community for help

---

## ⚠️ Troubleshooting

**Services not starting?**
```python
# Check logs
!tail -50 mlflow.log
!tail -50 api.log
!tail -50 frontend.log
```

**Port conflicts?**
```python
# Set these before starting the services so they pick up the new ports
import os
os.environ['RF_MLFLOW_PORT'] = '5003'
os.environ['RF_API_PORT'] = '8081'
os.environ['RF_FRONTEND_PORT'] = '3001'
```
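
Rather than guessing replacement ports, you could detect which defaults are taken and let the operating system pick free ones. This is a sketch under a few assumptions: the pairing of each environment variable with its default port is inferred from the values shown in this notebook, the services bind to `127.0.0.1`, and both helper functions are hypothetical.

```python
import os
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something already accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


# Assumed mapping of environment variables to the default ports from step 3
defaults = {"RF_MLFLOW_PORT": 5002, "RF_API_PORT": 8080, "RF_FRONTEND_PORT": 3000}

for var, port in defaults.items():
    if port_in_use(port):
        new_port = find_free_port()
        os.environ[var] = str(new_port)
        print(f"{var}: {port} is busy, using {new_port}")
    else:
        print(f"{var}: {port} is free")
```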

**Native forwarding not working?**
- Try Cloudflare: `!rapidfireai start --colab --tunnel cloudflare`
- Or ngrok (requires token): `!rapidfireai start --colab --tunnel ngrok`

---

**Happy fine-tuning! 🎉**