# Mettabook

## Setup

In [5]:
# Optional: confirm you're set up to connect to the services used in this notebook
#    If the command does not run, run `./install.sh` from your terminal

!metta status --components=core,system,aws,wandb --non-interactive

Component | Installed  | Connected As              | Expected             | Status
----------------------------------------------------------------------------------
core      | Yes        | -                         | -                    | OK
system    | Yes        | -                         | -                    | OK
aws       | Yes        | -                         | 751442549699         | NOT CONNECTED
wandb     | Yes        | metta-research            | metta-research       | OK
----------------------------------------------------------------------------------
Some components need authentication. Run 'metta install' to set them up.
Components not connected: aws
This could be due to expired credentials, network issues, or broken installations.
To

In [2]:
%load_ext autoreload
%autoreload 2
import os
from datetime import datetime

import matplotlib.pyplot as plt
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

from experiments.notebooks.utils.metrics import fetch_metrics
from experiments.notebooks.utils.monitoring import monitor_training_statuses
from experiments.notebooks.utils.replays import show_replay
from experiments.notebooks.utils.training import launch_training
from metta.common.wandb.wandb_runs import find_training_runs

%matplotlib inline
plt.style.use("default")

print("Setup complete! Auto-reload enabled.")

Setup complete! Auto-reload enabled.


## Launch Training

In [4]:
# Example: Launch training

run_name = f"{os.environ.get('USER')}.training_run.{datetime.now().strftime('%Y-%m-%d_%H-%M')}.move_8way.3"
print(f"Launching training with run name: {run_name}...")

# See the `launch_training` function for all options
result = launch_training(
    run_name=run_name,
    # curriculum="env/mettagrid/arena/basic",
    wandb_tags=[f"{os.environ.get('USER')}-arena-experiment"],
    additional_args=[
        # Disable the cardinal move/rotate actions and enable 8-way movement
        "++trainer.env_overrides.game.actions.move.enabled=false",
        "++trainer.env_overrides.game.actions.rotate.enabled=false",
        "++trainer.env_overrides.game.actions.move_8way.enabled=true",
        "--skip-git-check",
        "--no-spot",
    ],
)

Launching training with run name: me.training_run.2025-08-06_10-16.move_8way.3...
Launching training job: me.training_run.2025-08-06_10-16.move_8way.3
Command: ./devops/skypilot/launch.py train run=me.training_run.2025-08-06_10-16.move_8way.3 +wandb.tags=["me-arena-experiment"] ++trainer.env_overrides.game.actions.move.enabled=false ++trainer.env_overrides.game.actions.rotate.enabled=false ++trainer.env_overrides.game.actions.move_8way.enabled=true --skip-git-check --no-spot
Submitted sky.jobs.launch request: 0c846dcc-6ac4-4965-a7e0-bc40e37b3108
- Check logs with: sky api logs 0c846dcc
- Or, visit: https://skypilot-api.softmax-research.net/dashboard/jobs

✓ Job launched successfully!
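
The run name above is assembled inline from `USER`, a timestamp, and an experiment suffix. If `USER` is unset, `os.environ.get('USER')` returns `None` and the name would start with `None.`. The sketch below wraps the same pattern in a helper with a fallback. `build_run_name` and the `"unknown"` fallback are hypothetical, for illustration only; they are not part of the notebook utilities.

```python
import os
from datetime import datetime
from typing import Optional


def build_run_name(experiment: str, seed: int, now: Optional[datetime] = None) -> str:
    """Hypothetical helper mirroring the run-name f-string used in the cell above."""
    # Fall back to a placeholder when USER is unset, so the name never starts with "None."
    user = os.environ.get("USER") or "unknown"
    # Same timestamp format as the launch cell: YYYY-MM-DD_HH-MM
    timestamp = (now or datetime.now()).strftime("%Y-%m-%d_%H-%M")
    return f"{user}.training_run.{timestamp}.{experiment}.{seed}"


# A fixed timestamp makes the result reproducible
example = build_run_name("move_8way", 3, now=datetime(2025, 8, 6, 10, 16))
print(example)
```

Passing the timestamp explicitly also makes it easier to give several seeds of the same experiment a shared prefix.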


In [27]:
!uv run sky jobs cancel --yes 3968


^C

Aborted!


In [9]:
!uv run sky jobs queue | grep "me.training_run"


4081  -     me.training_run.2025-08-06_10-16.move_8way.3               1x[A10G:1, L4:1]                  3 hrs ago   3h 45m 33s     3h 38m 31s    0            RUNNING    
4080  -     me.training_run.2025-08-06_10-16.move_8way.2               1x[L4:1, A10G:1]                  3 hrs ago   3h 45m 33s     3h 38m 55s    0            RUNNING    
4079  -     me.training_run.2025-08-06_10-15.move_8way.1               1x[A10G:1, L4:1]                  3 hrs ago   3h 46m 53s     3h 39m 53s    0            RUNNING    
4069  -     me.training_run.2025-08-06_09-24.move_cardinal.baseline.3  1x[A10G:1, L4:1]                  4 hrs ago   4h 37m 54s     4h 31m 38s    0            RUNNING    
4068  -     me.training_run.2025-08-06_09-24.move_cardinal.baseline.2  1x[A10G:1, L4:1]                  4 hrs ago   4h 37m 58s     4h 31m 12s    0            RUNNING    
4067  -     me.training_run.2025-08-06_09-23.move_cardinal.no_spot.3   1x[A10G:1, L4:1]                  4 hrs ago   4h 38m 43s     4h 32m 26s   

## Monitor Training Jobs

In [14]:
# Monitor Training
run_names = [
    "me.training_run.2025-08-06_10-16.move_8way.3",
    "me.training_run.2025-08-06_10-16.move_8way.2",
    "me.training_run.2025-08-06_10-15.move_8way.1",

    "me.training_run.2025-08-06_09-23.move_cardinal.no_spot.3",
    "me.training_run.2025-08-05_16-37.move_cardinal.no_spot",
    "me.training_run.2025-08-05_16-59.move_cardinal.no_spot",

    "me.training_run.2025-08-06_09-24.move_cardinal.baseline.3",
    "me.training_run.2025-08-06_09-24.move_cardinal.baseline.2",
    "me.training-run.2025-08-05_15-46.baseline-no-spot",
]

# Optional: instead, find all runs that meet some criteria
# run_names = find_training_runs(
#     # wandb_tags=["low_reward"],
#     # state="finished",
#     author=os.getenv("USER"),
#     limit=5,
# )

df = monitor_training_statuses(run_names, show_metrics=["_step", "overview/reward"])


## Fetch Metrics

In [17]:
metrics_dfs = fetch_metrics(run_names, samples=50000)

Fetching metrics for me.training_run.2025-08-06_10-16.move_8way.3: running, 2025-08-06T17:24:27Z
https://wandb.ai/metta-research/metta/runs/me.training_run.2025-08-06_10-16.move_8way.3...
  Fetched 1095 data points.
  Reward: mean=8.8243, max=15.2515
  Access with `metrics_dfs['me.training_run.2025-08-06_10-16.move_8way.3']`

Fetching metrics for me.training_run.2025-08-06_10-16.move_8way.2: running, 2025-08-06T17:24:01Z
https://wandb.ai/metta-research/metta/runs/me.training_run.2025-08-06_10-16.move_8way.2...
  Fetched 1130 data points.
  Reward: mean=8.8145, max=15.1978
  Access with `metrics_dfs['me.training_run.2025-08-06_10-16.move_8way.2']`

Fetching metrics for me.training_run.2025-08-06_10-15.move_8way.1: running, 2025-08-06T17:23:04Z
https://wandb.ai/metta-research/metta/runs/me.training_run.2025-08-06_10-15.move_8way.1...
  Fetched 1129 data points.
  Reward: mean=9.5186, max=15.6014
  Access with `metrics_dfs['me.training_run.2025-08-06_10-15.move_8way.1']`

Fetching metrics
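
The per-run reward lines printed above can be gathered into one table for side-by-side comparison. This is a minimal sketch, assuming `metrics_dfs` is a dict mapping run names to pandas DataFrames with an `overview/reward` column, as the access hint and the plotting code suggest. The data here is synthetic, and `summarize_reward` is a hypothetical helper, not part of `experiments.notebooks.utils`.

```python
import pandas as pd

# Synthetic stand-in for the dict returned by fetch_metrics (assumed shape)
metrics_dfs = {
    "run_a": pd.DataFrame({"_step": [0, 1, 2, 3], "overview/reward": [1.0, 2.0, 4.0, 3.0]}),
    "run_b": pd.DataFrame({"_step": [0, 1, 2, 3], "overview/reward": [0.5, 1.5, 2.5, 3.5]}),
}


def summarize_reward(dfs, metric="overview/reward"):
    """Hypothetical helper: one row per run with mean, max, and last value of `metric`."""
    rows = []
    for name, df in dfs.items():
        if metric not in df.columns:
            continue  # skip runs that never logged this metric
        series = df[metric].dropna()
        if series.empty:
            continue
        rows.append(
            {
                "run": name,
                "mean": series.mean(),
                "max": series.max(),
                "last": series.iloc[-1],
                "points": len(series),
            }
        )
    # Explicit columns keep set_index working even when no run qualifies
    summary = pd.DataFrame(rows, columns=["run", "mean", "max", "last", "points"])
    return summary.set_index("run").sort_values("mean", ascending=False)


summary = summarize_reward(metrics_dfs)
print(summary)
```

Sorting by mean reward puts the strongest runs first. Runs that have trained for different numbers of steps are not directly comparable this way.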

## Analyze Metrics

In [20]:
# Plot overview metrics for all fetched runs
if not metrics_dfs:
    print("No metrics data available. Please fetch metrics first.")
else:
    print(f"Plotting metrics for {len(metrics_dfs)} runs")

    # Collect every metric column that appears in at least one run
    all_columns = set()
    for df in metrics_dfs.values():
        all_columns.update(df.columns)

    # Metrics to plot, in display order
    columns = ["overview/reward", "losses/explained_variance"]
    plot_cols = []

    # Iterate over `columns` rather than the set so the subplot order is deterministic
    for col in columns:
        if col not in all_columns:
            continue
        # Keep the column only if at least one run has non-constant numeric data for it
        has_numeric_data = any(
            col in df.columns and pd.api.types.is_numeric_dtype(df[col]) and df[col].nunique() > 1
            for df in metrics_dfs.values()
        )
        if has_numeric_data:
            plot_cols.append(col)

    if not plot_cols:
        print("No plottable metrics found")
    else:
        # Calculate grid dimensions
        n_metrics = len(plot_cols)
        n_cols = min(3, n_metrics)  # Max 3 columns
        n_rows = (n_metrics + n_cols - 1) // n_cols

        # Create subplots
        fig = make_subplots(
            rows=n_rows,
            cols=n_cols,
            subplot_titles=[col.replace("overview/", "").replace("_", " ") for col in plot_cols],
            vertical_spacing=0.08,
            horizontal_spacing=0.1,
        )

        # Color palette for different runs
        colors = ["blue", "red", "green", "orange", "purple", "brown", "pink", "gray", "olive", "cyan"]

        # Add traces for each metric and each run
        for idx, col in enumerate(plot_cols):
            row = (idx // n_cols) + 1
            col_idx = (idx % n_cols) + 1

            # Plot each run for this metric
            for run_idx, (run_name, df) in enumerate(metrics_dfs.items()):
                if col in df.columns and "_step" in df.columns:
                    color = colors[run_idx % len(colors)]

                    # Only show legend on first subplot to avoid clutter
                    show_legend = idx == 0

                    fig.add_trace(
                        go.Scatter(
                            x=df["_step"],
                            y=df[col],
                            mode="lines",
                            name=run_name,
                            line=dict(color=color, width=2),
                            showlegend=show_legend,
                            legendgroup=run_name,  # Group all traces from same run
                        ),
                        row=row,
                        col=col_idx,
                    )

        # Update layout
        fig.update_layout(
            height=1050 * n_rows,
            showlegend=True,
            legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1),
        )

        # Update x-axes labels for bottom row
        for col_idx in range(1, n_cols + 1):
            fig.update_xaxes(title_text="Steps", row=n_rows, col=col_idx)

        fig.show()

Plotting metrics for 9 runs
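
The grid arithmetic in the plotting cell (ceiling division for the row count, then `idx // n_cols` and `idx % n_cols` for placement) can be checked in isolation. The following is a small sketch of that calculation; `grid_shape` and `grid_position` are hypothetical names introduced here for illustration.

```python
def grid_shape(n_metrics, max_cols=3):
    """Return (n_rows, n_cols) for a grid with at most `max_cols` columns."""
    if n_metrics < 1:
        raise ValueError("need at least one metric to lay out")
    n_cols = min(max_cols, n_metrics)
    # Ceiling division: enough rows to hold every metric
    n_rows = (n_metrics + n_cols - 1) // n_cols
    return n_rows, n_cols


def grid_position(idx, n_cols):
    """Map a 0-based metric index to the 1-based (row, col) that Plotly subplots expect."""
    return idx // n_cols + 1, idx % n_cols + 1


n_rows, n_cols = grid_shape(5)
positions = [grid_position(i, n_cols) for i in range(5)]
print(n_rows, n_cols)  # 2 3
print(positions)       # [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]
```

With the two metrics plotted in this notebook, the same calculation gives a single row with two columns.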


## View Replays

Display the replay viewer for each run in `run_names`:

In [22]:
# Show available replays
# replays = get_available_replays("daveey.lp.16x4.bptt8")

# Show the last replay for each run
for run_name in run_names:
    show_replay(run_name, step="last", width=1000, height=600)

Loading MettaScope viewer for me.training_run.2025-08-06_10-16.move_8way.3 at step 1,028,160,000...

Direct link: https://metta-ai.github.io/metta/?replayUrl=https://softmax-public.s3.amazonaws.com/replays/me.training_run.2025-08-06_10-16.move_8way.3/4f3b17aae769/e7777773-3b6e-4aee-9090-8403adc3a4a8.json.z


Loading MettaScope viewer for me.training_run.2025-08-06_10-16.move_8way.2 at step 1,028,160,000...

Direct link: https://metta-ai.github.io/metta/?replayUrl=https://softmax-public.s3.amazonaws.com/replays/me.training_run.2025-08-06_10-16.move_8way.2/3d7a4cb4f493/f1bad00b-42f1-4839-a515-74fe91d2ef5b.json.z


Loading MettaScope viewer for me.training_run.2025-08-06_10-15.move_8way.1 at step 1,028,160,000...

Direct link: https://metta-ai.github.io/metta/?replayUrl=https://softmax-public.s3.amazonaws.com/replays/me.training_run.2025-08-06_10-15.move_8way.1/458206df9265/7ae72fa0-fc49-4f67-9ff3-69f02237008f.json.z


Loading MettaScope viewer for me.training_run.2025-08-06_09-23.move_cardinal.no_spot.3 at step 822,528,000...

Direct link: https://metta-ai.github.io/metta/?replayUrl=https://softmax-public.s3.amazonaws.com/replays/me.training_run.2025-08-06_09-23.move_cardinal.no_spot.3/5b64ae7f5542/1112b6c8-deb7-4e60-a322-837ed9b5ba9c.json.z


Loading MettaScope viewer for me.training_run.2025-08-05_16-37.move_cardinal.no_spot at step 6,580,224,000...

Direct link: https://metta-ai.github.io/metta/?replayUrl=https://softmax-public.s3.amazonaws.com/replays/me.training_run.2025-08-05_16-37.move_cardinal.no_spot/8fc76034140d/5fef93e0-3608-4a93-b4fa-e59b6bf4a883.json.z


Loading MettaScope viewer for me.training_run.2025-08-05_16-59.move_cardinal.no_spot at step 6,168,960,000...

Direct link: https://metta-ai.github.io/metta/?replayUrl=https://softmax-public.s3.amazonaws.com/replays/me.training_run.2025-08-05_16-59.move_cardinal.no_spot/d207b63280e0/bd4caeed-813b-4de3-bc2b-47e955913cd1.json.z


Loading MettaScope viewer for me.training_run.2025-08-06_09-24.move_cardinal.baseline.3 at step 1,233,792,000...

Direct link: https://metta-ai.github.io/metta/?replayUrl=https://softmax-public.s3.amazonaws.com/replays/me.training_run.2025-08-06_09-24.move_cardinal.baseline.3/d48dbeb9a7e5/a601c3ac-6ce4-4068-bb0f-a5e9b94ae537.json.z


Loading MettaScope viewer for me.training_run.2025-08-06_09-24.move_cardinal.baseline.2 at step 1,233,792,000...

Direct link: https://metta-ai.github.io/metta/?replayUrl=https://softmax-public.s3.amazonaws.com/replays/me.training_run.2025-08-06_09-24.move_cardinal.baseline.2/c439abe93c4f/0b177c94-1d1c-47eb-bce5-9111fcb1bab4.json.z


Loading MettaScope viewer for me.training-run.2025-08-05_15-46.baseline-no-spot at step 6,374,592,000...

Direct link: https://metta-ai.github.io/metta/?replayUrl=https://softmax-public.s3.amazonaws.com/replays/me.training-run.2025-08-05_15-46.baseline-no-spot/9f1ed4be401d/82cc2d82-8277-4591-952d-4c99649481fc.json.z
