# MetaRL-Agent-Emergence: Google Colab Demo

This notebook provides a quick demo to:
- Clone the repository and install dependencies (requirements.txt + MetaWorld + MuJoCo)
- Run a small MT10 training for smoke-test speed
- Load the training logs and visualize training curves
- Document Colab caveats (MuJoCo, GPU, timeouts)

If a step fails, re-run the cell once; transient network issues can occur on Colab.

## 1) Runtime and environment
- Recommended: Runtime -> Change runtime type -> GPU (T4/A100 ok).
- Colab VMs reset when idle and lose their local files; checkpoints are saved under `/content/MetaRL-Agent-Emergence/outputs`, so copy them elsewhere (e.g., Google Drive) if you need to keep them.
- Python 3.10+ recommended.
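
Because the VM's disk is wiped on reset, it can help to copy the outputs directory to persistent storage after each run. The following is a minimal sketch, not part of the repository: `backup_outputs` is a hypothetical helper name, and the destination path is whatever persistent folder you choose.

```python
import shutil
from pathlib import Path


def backup_outputs(src: str, dst: str) -> int:
    """Copy the directory tree at src into dst.

    Returns the number of files present under dst afterwards,
    or 0 if src does not exist yet.
    """
    src_path, dst_path = Path(src), Path(dst)
    if not src_path.is_dir():
        return 0  # nothing to back up yet
    # dirs_exist_ok=True lets repeated backups overwrite earlier copies
    shutil.copytree(src_path, dst_path, dirs_exist_ok=True)
    return sum(1 for p in dst_path.rglob("*") if p.is_file())
```

On Colab, one option is to mount Google Drive with `from google.colab import drive; drive.mount('/content/drive')` and pass a folder under `/content/drive/MyDrive` as `dst`; the exact folder name is up to you.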


In [None]:
%%bash
set -e
echo 'Python:' $(python --version)
# The fallback must apply to nvidia-smi itself; with a pipe, `||` would only see the exit status of `head`
echo 'GPU:' $(nvidia-smi -L 2>/dev/null || echo 'No GPU')

# System libs helpful for mujoco & plotting
sudo apt-get update -y && sudo apt-get install -y libosmesa6-dev patchelf ffmpeg xvfb


In [None]:
# 2) Clone repo (this repo)
import os  # needed here; the later cells import it too, but this cell runs first
%cd /content
if not os.path.exists('MetaRL-Agent-Emergence'):
    !git clone https://github.com/sunghunkwag/MetaRL-Agent-Emergence.git
%cd MetaRL-Agent-Emergence


In [None]:
# 3) Python deps: requirements + mujoco + metaworld
%pip install -U pip wheel setuptools
%pip install -r requirements.txt

# MuJoCo bindings; mujoco-python-viewer is optional and unused in headless runs (dm-control is not needed)
%pip install mujoco==3.1.6 mujoco-python-viewer==0.1.4

# MetaWorld: the environment API differs between gym and gymnasium; pin compatible versions if needed.
# Use gymnasium if it is already installed; otherwise install gym for projects that expect the gym API.
try:
    import gymnasium as gym
    need_gym = False
except Exception:
    need_gym = True
if need_gym:
    %pip install gym==0.26.2

%pip install metaworld==2.0.0

# Wandb optional; disable if unavailable
%pip install wandb==0.17.9 || true


In [None]:
# 4) Quick import checks
import sys, os, subprocess, json
print('Python', sys.version)
try:
    import mujoco
    print('MuJoCo', mujoco.__version__)
except Exception as e:
    print('MuJoCo import failed:', e)

try:
    import metaworld
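    # Not every release is guaranteed to expose __version__, so fall back gracefully
    print('MetaWorld', getattr(metaworld, '__version__', 'unknown'))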
except Exception as e:
    print('MetaWorld import failed:', e)


## 2) Run a tiny MT10 training demo
The run uses a very small step budget (2,000 total steps, evaluation every 500) so that it finishes quickly. Increase these values for real experiments.
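
To make the step budget easy to scale up later, the command line can be built from named parameters. This is a sketch only: `build_train_command` is a hypothetical helper, and the flags (`--benchmark`, `--total_steps`, `--eval_interval`, `--logdir`, `--seed`) are the ones this notebook assumes the runner accepts, so adjust them if your script differs.

```python
import shlex


def build_train_command(total_steps=2000, eval_interval=500,
                        logdir="outputs/mt10_demo", seed=0, benchmark="MT10"):
    """Return the argv list for the (assumed) experiment runner CLI."""
    if eval_interval <= 0 or eval_interval > total_steps:
        raise ValueError("eval_interval must be in (0, total_steps]")
    return [
        "python", "experiments/run_experiment.py",
        "--benchmark", benchmark,
        "--total_steps", str(total_steps),
        "--eval_interval", str(eval_interval),
        "--logdir", logdir,
        "--seed", str(seed),
    ]


# Smoke-test defaults; pass larger values for a real run
smoke = build_train_command()
print(shlex.join(smoke))
```

The returned list can be passed to `subprocess.run(...)` directly, which avoids shell-quoting problems when a log directory contains spaces.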


In [None]:
# Try to use the project's experiment runner if available
import os, sys, pathlib, json, shutil
from pathlib import Path
repo = Path.cwd()
exp_py = repo/ 'experiments' / 'run_experiment.py'

if exp_py.exists():
    print('Found experiment runner:', exp_py)
    # Common CLI patterns; adjust flags if your script differs
    # We assume it supports selecting MT10 and setting small steps.
    !python experiments/run_experiment.py \
        --benchmark MT10 --total_steps 2000 --eval_interval 500 \
        --logdir outputs/mt10_demo --seed 0
else:
    print('experiments/run_experiment.py not found. Running a placeholder smoke test...')
    # Fallback: just validate the MuJoCo + MetaWorld env step loop
    import metaworld
    mt10 = metaworld.MT10()
    env_name = list(mt10.train_classes.keys())[0]
    env = mt10.train_classes[env_name]()
    # Pick a task that belongs to the chosen environment
    task = next(t for t in mt10.train_tasks if t.env_name == env_name)
    env.set_task(task)
    obs, _ = env.reset()
    for t in range(200):
        a = env.action_space.sample()
        obs, rew, term, trunc, info = env.step(a)
        if term or trunc:
            obs, _ = env.reset()
    Path('outputs/mt10_demo').mkdir(parents=True, exist_ok=True)
    with open('outputs/mt10_demo/log.jsonl','w') as f:
        for i in range(5):
            f.write(json.dumps({'step': i*100, 'return': float(i)}) + '\n')
    print('Fallback smoke test complete. Logs at outputs/mt10_demo')


## 3) Load logs and visualize curves
This cell attempts to parse a simple JSONL or CSV log in `outputs/mt10_demo`.
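
For reuse outside the notebook, the JSONL parsing can be packaged as a small function. The sketch below assumes the same log schema the cell expects (one JSON object per line with a `step` key and either a `return` or a `reward` key); `load_jsonl_curve` is a hypothetical name, not something the repository provides.

```python
import json
from pathlib import Path


def load_jsonl_curve(path):
    """Return (steps, returns) sorted by step, skipping malformed lines."""
    points = []
    for line in Path(path).read_text().splitlines():
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partially written or corrupt lines
        if not isinstance(obj, dict) or "step" not in obj:
            continue
        # Prefer 'return'; fall back to 'reward' if that is what was logged
        value = obj.get("return", obj.get("reward"))
        if value is None:
            continue
        points.append((obj["step"], value))
    points.sort(key=lambda p: p[0])
    steps = [s for s, _ in points]
    returns = [r for _, r in points]
    return steps, returns
```

Skipping bad lines rather than raising is a deliberate choice here: a Colab session that disconnects mid-write can leave a truncated final line in the log.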


In [None]:
import os, json, glob
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
logdir = Path('outputs/mt10_demo')
jsonl = list(logdir.glob('*.jsonl'))
csvs = list(logdir.glob('*.csv'))
steps, rets = [], []
if jsonl:
    with open(jsonl[0]) as f:
        for line in f:
            try:
                obj = json.loads(line)
                if 'step' in obj and ('return' in obj or 'reward' in obj):
                    steps.append(obj['step'])
                    rets.append(obj.get('return', obj.get('reward')))
            except Exception:
                pass
elif csvs:
    import csv
    with open(csvs[0]) as f:
        reader = csv.DictReader(f)
        for row in reader:
            s = row.get('step') or row.get('global_step') or row.get('t')
            r = row.get('return') or row.get('eval/return') or row.get('reward')
            if s is not None and r is not None:
                steps.append(float(s))
                rets.append(float(r))
else:
    print('No logs found in', str(logdir))

if steps:
    steps, rets = (np.array(steps), np.array(rets))
    order = np.argsort(steps)
    steps, rets = steps[order], rets[order]
    plt.figure(figsize=(6,4))
    plt.plot(steps, rets, marker='o')
    plt.xlabel('Steps')
    plt.ylabel('Return')
    plt.title('MT10 Demo: Training Curve')
    plt.grid(True)
    plt.show()
else:
    print('No plottable data found yet.')


## 4) Colab caveats
- MuJoCo: This notebook pins `mujoco==3.1.6`. Headless rendering works via OSMesa (set `MUJOCO_GL=osmesa` before importing `mujoco`). If code needs a display window, wrap the command in `xvfb-run`.
- GPU: MuJoCo simulation runs on the CPU, so environment stepping is often the bottleneck; a GPU can still speed up network updates.
- Time limits: Colab disconnects idle or long-running sessions; keep runs short or persist outputs to Google Drive.
- MetaWorld versions: The API changed between gym and gymnasium. This notebook tries gymnasium first, then falls back to gym.
- If your runner uses Hydra or argparse, adjust the flags in the training cell accordingly.
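
For the MuJoCo caveat, the rendering backend is selected through the `MUJOCO_GL` environment variable, which to our knowledge is read when `mujoco` is first imported. The sketch below therefore needs to run before any `import mujoco` in the session; `configure_headless_rendering` is a hypothetical helper name.

```python
import os


def configure_headless_rendering(backend: str = "osmesa") -> str:
    """Select the MuJoCo GL backend via the MUJOCO_GL environment variable.

    Call this before importing mujoco; changing it afterwards is not
    expected to affect an already-imported module.
    """
    allowed = {"osmesa", "egl", "glfw"}
    if backend not in allowed:
        raise ValueError(f"backend must be one of {sorted(allowed)}")
    os.environ["MUJOCO_GL"] = backend
    return os.environ["MUJOCO_GL"]


# OSMesa is software rendering and matches the libosmesa6-dev package installed earlier
configure_headless_rendering("osmesa")
```

On a GPU runtime, `"egl"` is usually the faster choice, but whether it works depends on the drivers available in the VM.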


## 5) Next steps
- Increase total_steps and adjust hyperparameters for real experiments.
- Integrate W&B or TensorBoard for richer logging.
- Replace the fallback loop with your project's full training pipeline once verified.
