# RL Swarm Worker Node (Google Colab)

This notebook runs a **worker node** that:
- Participates in training
- Coordinates with the coordinator node via Google Drive
- Discovers peers automatically
- Requires no blockchain or Docker

**Before running:**
1. Ensure the coordinator node is running (using `colab_coordinator.ipynb`)
2. Mount your Google Drive (same account as the coordinator)
3. Set the **same EXPERIMENT_NAME** as the coordinator
4. Set a **unique NODE_ID** for this worker
5. Run all cells in order

**To run multiple workers:** Open this notebook in separate Colab sessions, each with a different NODE_ID.

## 1. Configuration

In [None]:
# Experiment Configuration
EXPERIMENT_NAME = 'qwen_0.6b_seed42'  # MUST MATCH COORDINATOR
NODE_ROLE = 'worker'  # DO NOT CHANGE
NODE_ID = 'worker_1'  # MUST BE UNIQUE (worker_1, worker_2, worker_3, etc.)

# Model Configuration
MODEL_NAME = 'Gensyn/Qwen2.5-0.5B-Instruct'  # Should match coordinator
SEED = 42  # Should match coordinator

# Training Configuration
MAX_ROUNDS = 1000
NUM_GENERATIONS = 2
NUM_TRANSPLANT_TREES = 2

# Optional: HuggingFace Token (for pushing trained models)
HUGGINGFACE_TOKEN = None  # Set to your token or keep None

# Optional: Wandb Configuration
WANDB_API_KEY = None  # Set to your Wandb API key or keep None
WANDB_PROJECT = 'rl-swarm-colab'

print(f"✓ Experiment: {EXPERIMENT_NAME}")
print(f"✓ Node Role: {NODE_ROLE}")
print(f"✓ Node ID: {NODE_ID}")
print(f"✓ Model: {MODEL_NAME}")
print()
print("⚠ Make sure EXPERIMENT_NAME matches the coordinator!")

## 2. Mount Google Drive

In [None]:
from google.colab import drive
import os

# Mount Google Drive
drive.mount('/content/drive')

# Set base path (must be same as coordinator)
GDRIVE_BASE_PATH = '/content/drive/MyDrive/rl-swarm'

# Check if experiment exists
experiment_path = os.path.join(GDRIVE_BASE_PATH, 'experiments', EXPERIMENT_NAME)
if not os.path.exists(experiment_path):
    print(f"❌ Experiment '{EXPERIMENT_NAME}' not found!")
    print(f"   Expected at: {experiment_path}")
    print()
    print("Make sure:")
    print("  1. Coordinator is running")
    print("  2. EXPERIMENT_NAME matches the coordinator")
    print("  3. You're using the same Google Drive account")
    raise FileNotFoundError(f"Experiment not found: {EXPERIMENT_NAME}")
else:
    print(f"✓ Found experiment: {EXPERIMENT_NAME}")
    print(f"  Path: {experiment_path}")

## 3. System Setup & Dependencies

In [None]:
# Check GPU availability
import torch

if torch.cuda.is_available():
    print(f"✓ GPU available: {torch.cuda.get_device_name(0)}")
    print(f"  Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("⚠ No GPU detected - training will be slow")
    print("  Consider: Runtime > Change runtime type > GPU")

In [None]:
# Clone repository
import os
if not os.path.exists('/content/rl-swarm'):
    !git clone https://github.com/Elrashid/rl-swarm.git /content/rl-swarm
    print("✓ Repository cloned")
else:
    print("✓ Repository already exists")

%cd /content/rl-swarm

# Install dependencies
print("Installing dependencies (this may take 3-5 minutes)...")
!pip install -q -r requirements.txt
!pip install -q gensyn-genrl==0.1.9

print("✓ Dependencies installed")

## 4. Generate Peer Identity

In [None]:
import os
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.backends import default_backend

# Identity file stored in Google Drive (persists across sessions)
IDENTITY_PATH = os.path.join(GDRIVE_BASE_PATH, f'identity_{NODE_ID}.pem')

if not os.path.exists(IDENTITY_PATH):
    print("Generating new peer identity...")
    key = rsa.generate_private_key(
        public_exponent=65537,
        key_size=2048,
        backend=default_backend()
    )
    pem = key.private_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PrivateFormat.TraditionalOpenSSL,
        encryption_algorithm=serialization.NoEncryption()
    )
    with open(IDENTITY_PATH, 'wb') as f:
        f.write(pem)
    print(f"✓ New identity created: {IDENTITY_PATH}")
else:
    print(f"✓ Using existing identity: {IDENTITY_PATH}")

# Copy to working directory
!cp "{IDENTITY_PATH}" /content/rl-swarm/swarm.pem
print("✓ Identity ready")

## 5. Setup Wandb (Optional)

In [None]:
if WANDB_API_KEY:
    import wandb
    wandb.login(key=WANDB_API_KEY)
    print("✓ Wandb configured")
else:
    print("ℹ Wandb disabled (WANDB_API_KEY not set)")

## 6. Set Environment Variables

In [None]:
import os

# Set environment variables
os.environ['GDRIVE_PATH'] = GDRIVE_BASE_PATH
os.environ['EXPERIMENT_NAME'] = EXPERIMENT_NAME
os.environ['NODE_ROLE'] = NODE_ROLE
os.environ['NODE_ID'] = NODE_ID
os.environ['MODEL_NAME'] = MODEL_NAME
os.environ['SEED'] = str(SEED)
os.environ['IDENTITY_PATH'] = '/content/rl-swarm/swarm.pem'

if HUGGINGFACE_TOKEN:
    os.environ['HUGGINGFACE_ACCESS_TOKEN'] = HUGGINGFACE_TOKEN

if WANDB_API_KEY:
    os.environ['WANDB_API_KEY'] = WANDB_API_KEY
    os.environ['WANDB_PROJECT'] = WANDB_PROJECT

print("✓ Environment variables set")

## 7. Check Peer Discovery

Verify that we can discover the coordinator and other workers.
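
One plausible way a shared-folder discovery scheme can work is sketched below. This is an illustration only, not the implementation of `GDrivePeerDiscovery`: the functions `announce` and `list_peers`, the JSON record layout, and the five-minute staleness window are all assumptions made for the example.

```python
import json
import os
import time


def announce(discovery_dir, node_id, address):
    """Write (or refresh) this node's record in the shared folder (hypothetical layout)."""
    os.makedirs(discovery_dir, exist_ok=True)
    record = {"node_id": node_id, "address": address, "last_seen": time.time()}
    tmp_path = os.path.join(discovery_dir, f".{node_id}.tmp")
    with open(tmp_path, "w") as f:
        json.dump(record, f)
    # Write to a temp name, then rename, to reduce the chance that a
    # reader sees a half-written file.
    os.replace(tmp_path, os.path.join(discovery_dir, f"{node_id}.json"))


def list_peers(discovery_dir, self_id, max_age_seconds=300):
    """Return records of other nodes seen within max_age_seconds."""
    if not os.path.isdir(discovery_dir):
        return []
    now = time.time()
    peers = []
    for name in sorted(os.listdir(discovery_dir)):
        if not name.endswith(".json"):
            continue
        try:
            with open(os.path.join(discovery_dir, name)) as f:
                record = json.load(f)
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or partially synced files
        if record.get("node_id") == self_id:
            continue  # a node is not its own peer
        if now - record.get("last_seen", 0) <= max_age_seconds:
            peers.append(record)
    return peers
```

In a scheme like this, each node would call `announce` periodically, and an empty result from `list_peers` would simply mean that no other node has written a recent record yet.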

In [None]:
from rgym_exp.src.gdrive_discovery import GDrivePeerDiscovery
import os

discovery_path = os.path.join(GDRIVE_BASE_PATH, 'discovery')
discovery = GDrivePeerDiscovery(discovery_path)

# Discover existing peers
peers = discovery.discover_peers(max_peers=10)

if peers:
    print(f"✓ Discovered {len(peers)} peer(s):")
    for peer in peers:
        print(f"  - {peer}")
else:
    print("⚠ No peers discovered yet")
    print("  Make sure the coordinator is running")
    print("  Peer discovery will continue automatically during training")

## 8. Start Training

**This cell will run until interrupted or max rounds reached.**

The worker will:
- Connect to discovered peers
- Sync with coordinator's current round
- Train the model and submit rewards
- Save checkpoints every 10 rounds

**Monitor the logs below. Press the stop button to shut down gracefully.**
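
Stopping a cell raises `KeyboardInterrupt` in the notebook process, while the training itself runs in a child process. The sketch below shows one way to give that child time to clean up before the cell returns. The helper `run_with_graceful_stop` and its `grace_seconds` parameter are hypothetical and not part of this repository; whether the launcher actually performs cleanup on an interrupt is an assumption.

```python
import signal
import subprocess
import sys


def run_with_graceful_stop(cmd, grace_seconds=30):
    """Run cmd and return its exit code; on interrupt, let the child finish cleanly."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait()
    except KeyboardInterrupt:
        # Forward the interrupt in case the child did not receive it itself.
        proc.send_signal(signal.SIGINT)
        try:
            return proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()  # the child did not exit in time; force it to stop
            return proc.wait()
```

A non-zero return value indicates that the child exited with an error or was terminated by a signal, which is worth printing so that a failed launch is not mistaken for a normal exit.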

In [None]:
import sys
import subprocess

%cd /content/rl-swarm

print("="*60)
print(f"Starting Worker Node: {NODE_ID}")
print(f"Experiment: {EXPERIMENT_NAME}")
print(f"Model: {MODEL_NAME}")
print("="*60)
print()

# Run training
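try:
    result = subprocess.run([
        sys.executable, '-m', 'rgym_exp.runner.swarm_launcher',
        '--config-name', 'colab-gdrive'
    ])
    # subprocess.run does not raise on a non-zero exit code, so report it explicitly
    if result.returncode != 0:
        print(f"\n❌ Launcher exited with code {result.returncode}")
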
except KeyboardInterrupt:
    print("\n" + "="*60)
    print("Training interrupted by user")
    print("="*60)
except Exception as e:
    print(f"\n❌ Error: {e}")
    import traceback
    traceback.print_exc()

## 9. Monitor Progress (Optional)

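Run this cell to check progress. A Colab runtime executes one cell at a time, so this cell will not run while the training cell in section 8 is active in the same session. To monitor during training, open a separate Colab session on the same Google Drive account and run sections 1 to 3 there first.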

In [None]:
from rgym_exp.utils.experiment_manager import get_experiment_status, get_experiment_metrics
import pandas as pd

# Get current status
status = get_experiment_status(GDRIVE_BASE_PATH, EXPERIMENT_NAME)
print(f"Current Round: {status['current_round']}")
print(f"Current Stage: {status['current_stage']}")
print(f"Active Peers: {status['num_active_peers']}")
print(f"Total Submissions: {status['total_submissions']}")
print()

# Load and display recent metrics for this worker
try:
    df = get_experiment_metrics(GDRIVE_BASE_PATH, EXPERIMENT_NAME)
    if not df.empty:
        print(f"Recent metrics for {NODE_ID} (last 10 rounds):")
        recent = df[df['node_id'] == NODE_ID].tail(10)
        if not recent.empty:
            print(recent[['round', 'stage', 'my_reward']].to_string(index=False))
        else:
            print(f"No metrics for {NODE_ID} yet")
    else:
        print("No metrics available yet")
except Exception as e:
    print(f"Could not load metrics: {e}")

## 10. Resume Training (If Disconnected)

If your Colab session disconnects, re-run all cells above, keeping the same EXPERIMENT_NAME and NODE_ID. After that:
- The system automatically resumes from the last checkpoint
- Training continues from the last saved round
- Peer discovery reconnects to the coordinator automatically
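
To illustrate how resuming from the last checkpoint might find its starting point, here is a minimal sketch. It assumes checkpoints are stored as sub-folders named `round_<N>`; that layout and the helper `latest_checkpoint_round` are hypothetical, since the actual checkpoint format is not described in this notebook.

```python
import os
import re


def latest_checkpoint_round(checkpoint_dir):
    """Return the highest round number among round_<N> entries, or None if there are none."""
    pattern = re.compile(r"^round_(\d+)$")
    rounds = []
    if os.path.isdir(checkpoint_dir):
        for name in os.listdir(checkpoint_dir):
            match = pattern.match(name)
            if match:
                # Compare as integers so that round_20 ranks above round_9.
                rounds.append(int(match.group(1)))
    return max(rounds) if rounds else None
```

A return value of `None` would correspond to a fresh start with no saved round to resume from.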

## Troubleshooting

### Worker can't find experiment
- Verify that EXPERIMENT_NAME matches the coordinator's exactly
- Check that the coordinator has finished cell 7 (Initialize Experiment)
- Ensure you are using the same Google Drive account

### No peers discovered
- The coordinator must be running and past cell 8
- Wait 1-2 minutes for peer discovery to propagate
- Check that the `{GDRIVE_BASE_PATH}/discovery/` folder exists
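
Rather than re-running the discovery cell by hand while waiting, the check can be polled with a timeout. The helper below is a sketch, not part of the repository: `wait_for_peers`, `timeout_seconds`, and `poll_seconds` are hypothetical names, and the helper takes any zero-argument callable so it makes no assumptions about the discovery API.

```python
import time


def wait_for_peers(discover, timeout_seconds=120, poll_seconds=10, sleep=time.sleep):
    """Call discover() until it returns a non-empty result or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        peers = discover()
        if peers:
            return peers
        if time.monotonic() >= deadline:
            return []  # timed out without finding any peer
        sleep(poll_seconds)
```

With the objects from section 7, this could be called as `wait_for_peers(lambda: discovery.discover_peers(max_peers=10))`.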

### Out of memory errors
- Use a smaller model (e.g., Qwen2.5-0.5B instead of 1.5B)
- Reduce NUM_GENERATIONS to 1
- Enable GPU: Runtime > Change runtime type > GPU
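
As a rough guide to why model size matters, the memory needed just to hold the weights is the parameter count times the bytes per parameter. The helper below is a back-of-the-envelope sketch: `weight_memory_gb` is a hypothetical name, the parameter counts are the nominal figures from the model names, and 2 bytes per parameter assumes 16-bit weights. Training needs additional memory for gradients, optimizer state, and activations, so these numbers are lower bounds.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Nominal sizes taken from the model names, not exact parameter counts.
for label, params in [("0.5B", 0.5e9), ("1.5B", 1.5e9)]:
    print(f"{label}: ~{weight_memory_gb(params):.1f} GB for 16-bit weights")
```

Comparing these estimates with the total GPU memory printed in section 3 gives a first indication of whether a model is likely to fit.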

### Training too slow
- Check that a GPU is enabled and available
- Reduce the model size
- Check that the coordinator's round duration isn't too short