# LIBERO Benchmark Guide: Running GetiAction/LeRobot Policies

This notebook provides a comprehensive guide to:
1. Understanding the **LIBERO benchmark** for robot manipulation
2. Using `LiberoGym` to interact with LIBERO environments
3. Running **GetiAction/LeRobot policies** (Diffusion, ACT) for evaluation

## What is LIBERO?

**LIBERO** (Lifelong robotic learning BEnchmaRk with knOwledge transfer) is a benchmark designed for:
- **Lifelong learning** in robotics
- **Knowledge transfer** across tasks
- **Standardized evaluation** of manipulation policies

It provides **130 diverse manipulation tasks** across 5 task suites, using a simulated Franka Panda robot.

## 1. Setup & Imports

In [None]:
# Core imports
import matplotlib.pyplot as plt
import numpy as np
import torch

# GetiAction imports
from getiaction.data import Feature, FeatureType, NormalizationParameters
from getiaction.devices import get_available_device
from getiaction.gyms.libero import LiberoGym
from getiaction.policies import ACT, ACTModel

# Check device (supports CUDA, XPU, and CPU)
device = get_available_device()
print(f"Using device: {device}")

## 2. LIBERO Benchmark Overview

| Suite | Tasks | Max Steps | Focus |
|-------|-------|-----------|-------|
| `libero_spatial` | 10 | 280 | Spatial reasoning (same objects, different positions) |
| `libero_object` | 10 | 280 | Object generalization (different objects, same actions) |
| `libero_goal` | 10 | 300 | Goal-conditioned tasks (same scene, different goals) |
| `libero_10` | 10 | 520 | Mixed difficulty benchmark |
| `libero_90` | 90 | 400 | Large-scale comprehensive benchmark |
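
As a quick sanity check, the per-suite task counts in the table add up to the 130 tasks quoted earlier. The dictionary below only restates the table; it is not read from the LIBERO package, and `step_budget` is an illustrative upper bound rather than a LIBERO metric.

```python
# Task counts and step limits copied from the table above (not queried from LIBERO).
SUITES = {
    "libero_spatial": {"tasks": 10, "max_steps": 280},
    "libero_object": {"tasks": 10, "max_steps": 280},
    "libero_goal": {"tasks": 10, "max_steps": 300},
    "libero_10": {"tasks": 10, "max_steps": 520},
    "libero_90": {"tasks": 90, "max_steps": 400},
}

total_tasks = sum(suite["tasks"] for suite in SUITES.values())

# Upper bound on environment steps when running one episode of every task.
step_budget = sum(suite["tasks"] * suite["max_steps"] for suite in SUITES.values())

print(f"{len(SUITES)} suites, {total_tasks} tasks")
print(f"Step budget for one episode per task: {step_budget}")
```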

In [None]:
try:
    from libero.libero import benchmark
except ModuleNotFoundError:
    msg = "LIBERO is not installed. Install it with:\n  pip install hf-libero\nOr with uv:\n  uv pip install hf-libero"
    raise ImportError(msg) from None

# List available tasks
for suite_name in ["libero_spatial", "libero_object"]:
    suite = benchmark.get_benchmark_dict()[suite_name]()
    tasks = suite.get_task_names()
    print(f"\n{suite_name} ({len(tasks)} tasks):")
    for i, task in enumerate(tasks[:3]):
        print(f"  [{i}] {task}")
    if len(tasks) > 3:
        print(f"  ... and {len(tasks) - 3} more")

## 3. Create LiberoGym Environment

In [None]:
# Create a LiberoGym instance
gym = LiberoGym(
    task_suite="libero_spatial",
    task_id=0,
    observation_height=256,
    observation_width=256,
    obs_type="pixels_agent_pos",  # Images + proprioception
    control_mode="relative",  # Delta actions
)

print(f"Task: {gym.task_name}")
print(f"Max episode steps: {gym.max_episode_steps}")
print(f"Action space: {gym.action_space.shape}")

In [None]:
# Reset and inspect observation
obs, info = gym.reset(seed=42)

print(f"Observation type: {type(obs).__name__}")
print(f"\nImages: {list(obs.images.keys())}")
for name, img in obs.images.items():
    print(f"  {name}: {img.shape}")

print(f"\nState: {obs.state.shape}")
print("  Format: [eef_pos(3), axis_angle(3), gripper(2)]")
print(f"  Values: {obs.state.squeeze().numpy()}")
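
The state layout printed above can be unpacked into named components. The sketch below assumes the `[eef_pos(3), axis_angle(3), gripper(2)]` ordering stated in the cell; `split_state` is a hypothetical helper, and the example values are made up rather than taken from the environment.

```python
import numpy as np


def split_state(state):
    """Split a flat 8-dim state vector into named components (hypothetical helper)."""
    state = np.asarray(state).reshape(-1)
    if state.shape[0] != 8:
        raise ValueError(f"expected 8 values, got {state.shape[0]}")
    return {
        "eef_pos": state[0:3],     # end-effector position
        "axis_angle": state[3:6],  # end-effector orientation as axis-angle
        "gripper": state[6:8],     # two gripper values
    }


# Made-up state for illustration; a real one would come from obs.state.
example_state = np.array([0.1, 0.2, 0.3, 0.0, 0.0, 1.57, 0.04, -0.04], dtype=np.float32)
parts = split_state(example_state)
for name, values in parts.items():
    print(f"{name}: {values}")
```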

In [None]:
# Visualize camera views
fig, axes = plt.subplots(1, 2, figsize=(10, 4))

img1 = obs.images["image"].squeeze(0).permute(1, 2, 0).numpy()
img2 = obs.images["image2"].squeeze(0).permute(1, 2, 0).numpy()

axes[0].imshow(img1)
axes[0].set_title("Front Camera (agentview)")
axes[0].axis("off")

axes[1].imshow(img2)
axes[1].set_title("Eye-in-Hand Camera")
axes[1].axis("off")

plt.suptitle(f"Task: {gym.task_name[:50]}...")
plt.tight_layout()
plt.show()

## 4. Define Policy Features (Manual Approach)

GetiAction uses `Feature` dataclasses to define input/output shapes. This is the **manual approach**; you can also use `ACT.from_dataset()` for automatic feature extraction (see Section 5).

In [None]:
# Input features matching LiberoGym output (with normalization for ACT)
input_features = {
    "image": Feature(
        ftype=FeatureType.VISUAL,
        shape=(3, 256, 256),
        normalization_data=NormalizationParameters(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    ),
    "image2": Feature(
        ftype=FeatureType.VISUAL,
        shape=(3, 256, 256),
        normalization_data=NormalizationParameters(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    ),
    "state": Feature(
        ftype=FeatureType.STATE,
        shape=(8,),
        normalization_data=NormalizationParameters(mean=[0.0] * 8, std=[1.0] * 8),
    ),
}

# Output features (7-dim action)
output_features = {
    "action": Feature(
        ftype=FeatureType.ACTION,
        shape=(7,),
        normalization_data=NormalizationParameters(mean=[0.0] * 7, std=[1.0] * 7),
    ),
}

print("Input features:")
for name, feat in input_features.items():
    print(f"  {name}: {feat.ftype.name} {feat.shape}")
print(f"\nOutput: action {output_features['action'].shape}")
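
To see what the image normalization parameters above imply, the sketch below applies the usual channel-wise `(x - mean) / std` transform to a dummy image. It assumes images are scaled to `[0, 1]` and that the policy normalizes in this standard way; the notebook does not confirm either, and `normalize` is an illustrative function, not part of GetiAction.

```python
import numpy as np


def normalize(x, mean, std):
    """Channel-wise (x - mean) / std for a (C, H, W) array."""
    mean = np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(-1, 1, 1)
    return (x - mean) / std


# Dummy image with values in [0, 1); real images come from the environment.
image = np.random.default_rng(0).random((3, 256, 256), dtype=np.float32)
normalized = normalize(image, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
print(f"range before: [{image.min():.3f}, {image.max():.3f}]")
print(f"range after:  [{normalized.min():.3f}, {normalized.max():.3f}]")
```

Under these assumptions, a mean and std of 0.5 map `[0, 1]` onto `[-1, 1]`, while the state and action parameters (mean 0, std 1) leave values unchanged.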

## 5. Create First-Party ACT Policy

Two approaches to create an ACT policy:
1. **Manual**: Define features yourself (shown above)
2. **From Dataset**: Use `ACT.from_dataset()` for automatic feature extraction

In [None]:
# Option 1: Manual feature definition (from Section 4)
act_model = ACTModel(
    input_features=input_features,
    output_features=output_features,
    chunk_size=100,
    dim_model=256,
    n_encoder_layers=2,
    n_decoder_layers=1,
)
act_policy = ACT(model=act_model)
act_policy.to(device)
act_policy.eval()

print(f"✅ First-party ACT policy created on {device}")
print(f"   Parameters: {sum(p.numel() for p in act_policy.parameters()):,}")

### Option 2: Using `ACT.from_dataset()` (Recommended)

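This approach automatically extracts features from a LeRobot-format dataset on HuggingFace. The cell below uses an ALOHA dataset as a stand-in, because the LIBERO datasets may still need migration to the v3.0 format; the same pattern applies to LIBERO once a compatible dataset is available: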

In [None]:
# Option 2: Create ACT from a LeRobot dataset (automatic feature extraction)
# Note: LIBERO datasets on HuggingFace may need format migration for LeRobot v3.0
# Here we demonstrate with a compatible ALOHA dataset - same pattern works for LIBERO

from getiaction.data.lerobot import LeRobotDataModule

# Create datamodule from a compatible LeRobot dataset
# For LIBERO, use "lerobot/libero_spatial" once migrated to v3.0 format
datamodule = LeRobotDataModule(
    repo_id="lerobot/aloha_sim_transfer_cube_human",  # Example with ALOHA (2 arms, 4 cameras)
    train_batch_size=32,
    data_format="getiaction",  # Use GetiAction's Observation format
)

# Create ACT policy directly from dataset (features extracted automatically!)
act_policy_from_dataset = ACT.from_dataset(
    dataset=datamodule.train_dataset,
    chunk_size=100,
    dim_model=256,
    n_encoder_layers=2,
    n_decoder_layers=1,
)
act_policy_from_dataset.to(device)
act_policy_from_dataset.eval()

print(f"✅ ACT.from_dataset() created policy on {device}")
print(f"   Parameters: {sum(p.numel() for p in act_policy_from_dataset.parameters()):,}")
print("\n📊 Features extracted automatically from dataset:")
print(f"   Observation keys: {list(datamodule.train_dataset.observation_features.keys())}")
print(f"   Action keys: {list(datamodule.train_dataset.action_features.keys())}")

## 6. Run Policy Evaluation

In [None]:
import time


def run_episode(gym, policy, max_steps=50, seed=42):
    """Run a single episode with a policy.

    Handles both chunked (ACT) and non-chunked (Diffusion) policies.
    """
    obs, _ = gym.reset(seed=seed)
    device = next(policy.parameters()).device
    obs = obs.to(device)

    frames, actions, rewards = [], [], []
    start = time.time()

    for _ in range(max_steps):
        with torch.no_grad():
            action = policy(obs)
            # Handle chunked outputs (ACT returns [batch, chunk_size, action_dim])
            if action.dim() == 3:
                action = action[:, 0, :]

        action_np = action.squeeze(0).cpu().numpy()
        actions.append(action_np)

        obs, reward, done, truncated, _ = gym.step(action_np)
        obs = obs.to(device)
        rewards.append(reward)
        frames.append(obs.images["image"].squeeze(0).permute(1, 2, 0).cpu().numpy())

        if done or truncated:
            break

    return {
        "frames": np.array(frames),
        "actions": np.array(actions),
        "sum_reward": sum(rewards),
        "steps": len(frames),
        "fps": len(frames) / (time.time() - start),
        "success": gym.check_success(),
    }


# Run ACT policy
print("Running first-party ACT policy...")
act_result = run_episode(gym, act_policy, max_steps=30)
print(f"  Steps: {act_result['steps']}, Success: {act_result['success']}, FPS: {act_result['fps']:.1f}")
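
The chunk handling inside `run_episode` can be looked at in isolation. The sketch below uses NumPy arrays in place of policy outputs; `select_action` is a hypothetical helper that mirrors the `action.dim() == 3` branch above, and the shapes are the ones used in this notebook (chunk size 100, 7-dim actions).

```python
import numpy as np


def select_action(policy_output):
    """Return one action per batch item from chunked or non-chunked output."""
    out = np.asarray(policy_output)
    if out.ndim == 3:
        # Chunked output [batch, chunk_size, action_dim]: keep the first action only.
        return out[:, 0, :]
    if out.ndim == 2:
        # Non-chunked output [batch, action_dim]: already a single action.
        return out
    raise ValueError(f"unexpected output shape: {out.shape}")


chunked = np.zeros((1, 100, 7))  # stand-in for a chunked (ACT-style) output
single = np.zeros((1, 7))        # stand-in for a non-chunked output
print(select_action(chunked).shape, select_action(single).shape)
```

Keeping only the first action means the policy is queried at every step and the rest of each chunk is discarded. Executing several actions per chunk would reduce inference calls, at the cost of reacting more slowly to new observations.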

## 7. Visualize Rollout

Plot evenly spaced frames recorded by `run_episode`, then the action trajectories:

In [None]:
# Visualize ACT rollout trajectory
fig, axes = plt.subplots(1, 5, figsize=(15, 3))
indices = np.linspace(0, len(act_result["frames"]) - 1, 5, dtype=int)
for i, idx in enumerate(indices):
    axes[i].imshow(act_result["frames"][idx])
    axes[i].set_title(f"Step {idx}")
    axes[i].axis("off")

plt.suptitle("ACT Policy Rollout", fontsize=14)
plt.tight_layout()
plt.show()
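
The frame selection above relies on `np.linspace(..., dtype=int)`, which truncates the evenly spaced positions to integers. A small sketch of how that behaves for long, short, and empty rollouts follows; `sample_indices` is a hypothetical helper that additionally removes duplicates and guards against an empty frame list.

```python
import numpy as np


def sample_indices(n_frames, n_samples=5):
    """Pick up to n_samples evenly spaced, unique frame indices."""
    if n_frames <= 0:
        return np.array([], dtype=int)
    idx = np.linspace(0, n_frames - 1, n_samples, dtype=int)
    # Short rollouts yield repeated indices; keep each frame only once.
    return np.unique(idx)


for n in (30, 3, 0):
    print(n, sample_indices(n).tolist())
```

If this helper were used for plotting, the number of subplots would need to follow `len(indices)` instead of being fixed at 5.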

In [None]:
# Visualize actions
actions = act_result["actions"]

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

axes[0].plot(actions[:, :3])
axes[0].legend(["Δx", "Δy", "Δz"])
axes[0].set_title("Position Actions")
axes[0].set_xlabel("Step")
axes[0].grid(alpha=0.3)

axes[1].plot(actions[:, 3:6])
axes[1].plot(actions[:, 6], "k--", linewidth=2, label="gripper")
axes[1].legend(["Δroll", "Δpitch", "Δyaw", "gripper"])
axes[1].set_title("Rotation + Gripper Actions")
axes[1].set_xlabel("Step")
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()

## 8. Third-Party Integration: LeRobot Policies

GetiAction also supports LeRobot policies (Diffusion, ACT, VQ-BeT, etc.) via the `LeRobotPolicy` wrapper:

In [None]:
# LeRobot integration uses different feature format
from lerobot.configs.types import FeatureType as LRFeatureType
from lerobot.configs.types import PolicyFeature

from getiaction.policies.lerobot import LeRobotPolicy

# LeRobot-style features (note the different naming convention)
lr_input_features = {
    "observation.images.image": PolicyFeature(type=LRFeatureType.VISUAL, shape=(3, 256, 256)),
    "observation.images.image2": PolicyFeature(type=LRFeatureType.VISUAL, shape=(3, 256, 256)),
    "observation.state": PolicyFeature(type=LRFeatureType.STATE, shape=(8,)),
}
lr_output_features = {
    "action": PolicyFeature(type=LRFeatureType.ACTION, shape=(7,)),
}

# Create LeRobot Diffusion policy
diffusion_policy = LeRobotPolicy(
    policy_name="diffusion",
    input_features=lr_input_features,
    output_features=lr_output_features,
    config_kwargs={"crop_shape": None},
)
diffusion_policy.to(device)
diffusion_policy.eval()

print(f"✅ LeRobot Diffusion policy created on {device}")
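
The naming difference between the two feature dictionaries can be summarized as a simple key mapping. The sketch below is inferred only from the keys used in this notebook (`image`, `image2`, `state`, `action`); `to_lerobot_key` is a hypothetical helper, not a GetiAction or LeRobot function.

```python
def to_lerobot_key(name, kind):
    """Map a flat feature name to a LeRobot-style dotted key (hypothetical helper)."""
    if kind == "visual":
        return f"observation.images.{name}"
    if kind == "state":
        return "observation.state"
    if kind == "action":
        return "action"
    raise ValueError(f"unknown feature kind: {kind}")


# Feature names used earlier in this notebook, paired with their kind.
features = {"image": "visual", "image2": "visual", "state": "state", "action": "action"}
for name, kind in features.items():
    print(f"{name:>7} -> {to_lerobot_key(name, kind)}")
```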

In [None]:
# Compare first-party ACT vs LeRobot Diffusion
print("Comparing policies...")

print("\n1. First-party ACT:")
act_result = run_episode(gym, act_policy, max_steps=20)
print(f"   Steps: {act_result['steps']}, Success: {act_result['success']}, FPS: {act_result['fps']:.1f}")

print("\n2. LeRobot Diffusion:")
diffusion_result = run_episode(gym, diffusion_policy, max_steps=20)
print(
    f"   Steps: {diffusion_result['steps']}, Success: {diffusion_result['success']}, FPS: {diffusion_result['fps']:.1f}",
)

In [None]:
# Side-by-side comparison
fig, axes = plt.subplots(2, 5, figsize=(15, 6))

# ACT trajectory
indices = np.linspace(0, len(act_result["frames"]) - 1, 5, dtype=int)
for i, idx in enumerate(indices):
    axes[0, i].imshow(act_result["frames"][idx])
    axes[0, i].set_title(f"Step {idx}")
    # Hide ticks only: axis("off") would also hide the row label set below
    axes[0, i].set_xticks([])
    axes[0, i].set_yticks([])
axes[0, 0].set_ylabel("ACT\n(first-party)", fontsize=11)

# Diffusion trajectory
indices = np.linspace(0, len(diffusion_result["frames"]) - 1, 5, dtype=int)
for i, idx in enumerate(indices):
    axes[1, i].imshow(diffusion_result["frames"][idx])
    axes[1, i].set_title(f"Step {idx}")
    axes[1, i].set_xticks([])
    axes[1, i].set_yticks([])
axes[1, 0].set_ylabel("Diffusion\n(LeRobot)", fontsize=11)

plt.suptitle("Policy Comparison: First-Party vs Third-Party", fontsize=14)
plt.tight_layout()
plt.show()

In [None]:
# Cleanup
gym.close()
print("✅ Environment closed.")

## Summary

### What You Learned:
- **LIBERO**: 5 task suites, 130 manipulation tasks
- **LiberoGym**: Gymnasium-style wrapper whose observations (camera images plus an 8-dim state) can be passed directly to GetiAction policies
- **First-party ACT**: Native GetiAction implementation
- **Third-party**: LeRobot policies (Diffusion, ACT, VQ-BeT) via wrapper


### Compatible LeRobot Datasets (v3.0):
| Dataset | Description |
|---------|-------------|
| `lerobot/aloha_sim_transfer_cube_human` | ALOHA bimanual cube transfer |
| `lerobot/pusht` | 2D pushing task |
| `lerobot/aloha_mobile_*` | Mobile ALOHA tasks |

> **Note**: Some LIBERO datasets on HuggingFace use older formats. Check [LeRobot docs](https://huggingface.co/lerobot) for dataset migration.

### Next Steps:
1. Train ACT on demonstrations using `ACT.from_dataset()`
2. Load pretrained checkpoints from HuggingFace
3. Run full benchmark evaluation with LiberoGym
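
For the third step, per-episode results in the shape returned by `run_episode` can be aggregated into a success rate. The sketch below is one way to do that; `summarize` is a hypothetical helper, and the result dictionaries are made up for illustration rather than measured from any policy.

```python
def summarize(results):
    """Aggregate per-episode result dicts into benchmark-style metrics."""
    if not results:
        raise ValueError("no episodes to summarize")
    n = len(results)
    return {
        "episodes": n,
        "success_rate": sum(bool(r["success"]) for r in results) / n,
        "mean_steps": sum(r["steps"] for r in results) / n,
        "mean_reward": sum(r["sum_reward"] for r in results) / n,
    }


# Made-up results for illustration only; real ones would come from run_episode().
fake_results = [
    {"success": True, "steps": 120, "sum_reward": 1.0},
    {"success": False, "steps": 280, "sum_reward": 0.0},
    {"success": True, "steps": 150, "sum_reward": 1.0},
    {"success": False, "steps": 280, "sum_reward": 0.0},
]
summary = summarize(fake_results)
for key, value in summary.items():
    print(f"{key}: {value}")
```

In a full evaluation, the list would typically hold several seeds per task, with `max_steps` set to the suite's step limit from Section 2 instead of the short horizons used in this notebook.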