# LeRobot PushT Environment Demo

This notebook demonstrates the gym-pusht environment from LeRobot. PushT is a 2D pushing task where the goal is to push a T-shaped block to a target location.

## Import Required Libraries

In [15]:
!pip install matplotlib pandas seaborn scikit-learn
!pip install ipython
# Install the gym-pusht environment with a compatible pymunk version
!pip install gym-pusht pymunk==6.11.1

**Note:** If a different pymunk version was already imported in this session, it stays loaded after the installation cell above runs. Restart the kernel if you see compatibility errors.

In [16]:
import gymnasium as gym
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import clear_output
import time

In [17]:
# Suppress UserWarning messages (this filter does not cover DeprecationWarning)
import warnings
warnings.filterwarnings('ignore', category=UserWarning)

In [None]:
# Verify the pymunk version (6.11.1 is pinned above; the check below accepts any 6.x release)
import pymunk
print(f"Pymunk version: {pymunk.version}")
if not pymunk.version.startswith('6.'):
    print("⚠️  WARNING: Wrong pymunk version! Please restart kernel and reinstall.")
else:
    print("✓ Pymunk version is correct!")

In [18]:
# Import gym_pusht to register the environment
import gym_pusht.envs

## Create the PushT Environment

In [19]:
# Create the environment
env = gym.make('gym_pusht/PushT-v0', render_mode='rgb_array')

print(f"Observation space: {env.observation_space}")
print(f"Action space: {env.action_space}")

Observation space: Box(0.0, [512.         512.         512.         512.           6.28318531], (5,), float64)
Action space: Box(0.0, 512.0, (2,), float32)
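
The five-element observation printed above is easier to read with named fields. The sketch below assumes the layout `[agent_x, agent_y, block_x, block_y, block_angle]`, which matches the printed bounds (four values up to 512 and one up to 2π) but should be checked against the gym-pusht documentation. `STATE_FIELDS` and `describe_state` are hypothetical names introduced here for illustration and are not part of gym-pusht.

```python
import numpy as np

# Assumed field order for the 5-D "state" observation (not confirmed by this notebook):
# [agent_x, agent_y, block_x, block_y, block_angle]
STATE_FIELDS = ("agent_x", "agent_y", "block_x", "block_y", "block_angle")


def describe_state(obs):
    """Map a 5-element state vector to named fields (hypothetical helper)."""
    obs = np.asarray(obs, dtype=float)
    if obs.shape != (len(STATE_FIELDS),):
        raise ValueError(f"expected shape (5,), got {obs.shape}")
    return dict(zip(STATE_FIELDS, obs.tolist()))


# Hand-written example vector; in the notebook you would pass `observation` instead.
example = describe_state([256.0, 256.0, 300.0, 200.0, np.pi / 2])
print(example)
```

This only applies when the environment returns a flat state vector, as in the observation space shown above.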


## Reset Environment and Visualize Initial State

In [20]:
# Reset the environment
observation, info = env.reset(seed=42)

# Render and display the initial state
frame = env.render()

plt.figure(figsize=(8, 8))
plt.imshow(frame)
plt.title("Initial State of PushT Environment")
plt.axis('off')
plt.show()

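# The observation is a dict only for pixel-based observation types; otherwise it is an array
obs_shape = observation['pixels'].shape if isinstance(observation, dict) else observation.shape
print(f"Observation shape: {obs_shape}")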

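**Note:** If this cell fails with `AttributeError: 'Space' object has no attribute 'add_collision_handler'`, the kernel has most likely loaded an incompatible pymunk release. Install `pymunk==6.11.1`, restart the kernel, and rerun the cells from the top.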

## Run Random Actions

In [None]:
# Run a few steps with random actions
num_steps = 100
total_reward = 0

observation, info = env.reset(seed=42)

for step in range(num_steps):
    # Sample a random action
    action = env.action_space.sample()
    
    # Step the environment
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    
    # Break if episode ends
    if terminated or truncated:
        print(f"Episode ended at step {step + 1}")
        break

print(f"\nTotal steps: {step + 1}")
print(f"Total reward: {total_reward:.4f}")
print(f"Terminated: {terminated}, Truncated: {truncated}")
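
The rollout loop above can be wrapped in a small reusable function. This is a minimal sketch, assuming only the Gymnasium API already used in this notebook (`reset` returning `(observation, info)` and `step` returning a five-tuple). `run_episode` and the `policy` callable are hypothetical names introduced here, not part of gym-pusht or LeRobot.

```python
def run_episode(env, policy, max_steps=100, seed=None):
    """Roll out `policy` in a Gymnasium-style env.

    `policy` is any callable mapping an observation to an action.
    Returns (rewards, terminated, truncated).
    """
    observation, info = env.reset(seed=seed)
    rewards = []
    terminated = truncated = False

    for _ in range(max_steps):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        rewards.append(float(reward))

        # Stop as soon as the environment signals the end of the episode
        if terminated or truncated:
            break

    return rewards, terminated, truncated


# Usage with the environment created earlier (random policy):
# rewards, terminated, truncated = run_episode(
#     env, lambda obs: env.action_space.sample(), max_steps=100, seed=42
# )
```

Returning the per-step rewards rather than only their sum keeps the helper usable for the reward plots later in the notebook.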

## Visualize Episode with Random Actions

In [None]:
# Collect frames for visualization
frames = []
rewards = []
num_steps = 50

observation, info = env.reset(seed=123)
frames.append(env.render())

for step in range(num_steps):
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    
    frames.append(env.render())
    rewards.append(reward)
    
    if terminated or truncated:
        break

print(f"Collected {len(frames)} frames")
print(f"Total reward: {sum(rewards):.4f}")

## Display Selected Frames

In [None]:
# Display 6 frames evenly spaced throughout the episode
num_display = 6
indices = np.linspace(0, len(frames) - 1, num_display, dtype=int)

fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes = axes.flatten()

for i, idx in enumerate(indices):
    axes[i].imshow(frames[idx])
    axes[i].set_title(f"Step {idx}")
    axes[i].axis('off')

plt.tight_layout()
plt.show()

## Plot Reward Over Time

In [None]:
# Plot the rewards
plt.figure(figsize=(12, 4))
plt.plot(rewards, label='Reward per step')
plt.plot(np.cumsum(rewards), label='Cumulative reward', linestyle='--')
plt.xlabel('Step')
plt.ylabel('Reward')
plt.title('Rewards During Episode')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

print(f"Average reward per step: {np.mean(rewards):.4f}")
print(f"Max reward: {np.max(rewards):.4f}")
print(f"Min reward: {np.min(rewards):.4f}")

## Explore Action Space

In [None]:
# Sample and print some actions
print("Sample actions from the action space:")
for i in range(5):
    action = env.action_space.sample()
    print(f"Action {i+1}: {action}")

print("\nAction space bounds:")
print(f"Low: {env.action_space.low}")
print(f"High: {env.action_space.high}")
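
Learned policies often work with actions scaled to `[-1, 1]` rather than raw coordinates. The sketch below shows one way to convert between the two, using bounds copied from the action space printed earlier (`Box(0.0, 512.0, (2,), float32)`). `normalize_action` and `denormalize_action` are hypothetical helpers introduced here; this is not necessarily how LeRobot's policies normalize actions.

```python
import numpy as np

# Bounds copied from the printed action space: Box(0.0, 512.0, (2,), float32)
ACTION_LOW = np.zeros(2, dtype=np.float32)
ACTION_HIGH = np.full(2, 512.0, dtype=np.float32)


def normalize_action(action, low=ACTION_LOW, high=ACTION_HIGH):
    """Map an action from [low, high] to [-1, 1]."""
    action = np.asarray(action, dtype=np.float32)
    return 2.0 * (action - low) / (high - low) - 1.0


def denormalize_action(normalized, low=ACTION_LOW, high=ACTION_HIGH):
    """Map from [-1, 1] back to [low, high], clipping out-of-range inputs."""
    normalized = np.clip(np.asarray(normalized, dtype=np.float32), -1.0, 1.0)
    return low + (normalized + 1.0) * 0.5 * (high - low)


center = normalize_action([256.0, 256.0])
print(center)
print(denormalize_action([1.5, -1.0]))
```

In the notebook, `env.action_space.low` and `env.action_space.high` could be passed as `low` and `high` instead of the hard-coded constants.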

## Clean Up

In [None]:
# Close the environment
env.close()
print("Environment closed successfully!")

## Next Steps

Now that you've explored the PushT environment, you can:

1. **Train a policy** - Use LeRobot's ACT, Diffusion, or other policies
2. **Load demonstrations** - Load pre-recorded expert demonstrations from Hugging Face
3. **Collect data** - Record your own demonstrations for imitation learning
4. **Evaluate policies** - Test trained models on the environment
5. **Visualize with Rerun** - Use rerun-sdk for 3D visualization

Check out the LeRobot documentation for more examples!