BenchPush is a comprehensive benchmarking suite designed for mobile robots performing pushing-based tasks. The goal is to provide researchers with a standardized platform for training and evaluating algorithms in both pushing-based navigation and manipulation. The suite includes a variety of simulated environments, evaluation metrics that capture task efficiency and interaction effort, as well as policy templates with reference implementations.
Check out our paper "Bench-Push: Benchmarking Pushing-based Navigation and Manipulation Tasks for Mobile Robots", currently under review for the 23rd Conference on Robotics and Vision, 2026.
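As a toy illustration of the two metric families, the sketch below computes a hypothetical efficiency score (optimal path length over driven path length) and a hypothetical effort score (total distance obstacles were displaced). The function names and formulas here are illustrative assumptions, not BenchPush's actual metric definitions.

```python
import math

def path_length(points):
    """Total Euclidean length of a piecewise-linear path."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def efficiency_score(path, shortest_length):
    """Hypothetical efficiency: optimal length over driven length (at most 1)."""
    driven = path_length(path)
    return shortest_length / driven if driven > 0 else 0.0

def effort_score(obstacle_paths):
    """Hypothetical effort: total distance obstacles were pushed."""
    return sum(path_length(p) for p in obstacle_paths)

# A detour from (0, 0) to (4, 0) versus a straight-line optimum of length 4.
detour = [(0, 0), (2, 2), (4, 0)]
eff = efficiency_score(detour, shortest_length=4.0)  # ~0.707 for this detour
effort = effort_score([[(1, 1), (1, 2)]])            # one obstacle pushed 1 unit
```

Higher efficiency and lower effort are better under these toy definitions; BenchPush's actual metrics may weight these quantities differently.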
Create a 3D Maze environment:

```python
env = gym.make('maze-NAMO-mujoco-v0')
```

Create a 2D Maze environment:

```python
env = gym.make('maze-NAMO-v0')
```

This environment features a static maze structure with randomly initialized obstacles. The robot's task is to navigate from a starting position to a goal location while minimizing path length and obstacle collisions.
Create a 3D Ship-Ice environment:

```python
env = gym.make('ship-ice-mujoco-v0')
```

Create a 2D Ship-Ice environment:

```python
env = gym.make('ship-ice-v0')
```

In this task, an autonomous surface vehicle must reach a horizontal goal line ahead while minimizing collisions with broken ice floes in the channel.
Create a 3D Box-Delivery environment:

```python
env = gym.make('box-delivery-mujoco-v0')
```

Create a 2D Box-Delivery environment:

```python
env = gym.make('box-delivery-v0')
```

The Box-Delivery environment consists of a set of movable boxes to be delivered to a designated receptacle. The robot is tasked to deliver all boxes using its front bumper.
Create a 3D Area-Clearing environment:

```python
env = gym.make('area-clearing-mujoco-v0')
```

Create a 2D Area-Clearing environment:

```python
env = gym.make('area-clearing-v0')
```

This environment consists of a set of movable boxes and a clearance area. The task of the robot is to remove all boxes from this clearance area.

The Area-Clearing environment.
A public PyPI release will be added after the review period.
- Download the project
- Go to the project directory and install dependencies.

```shell
cd BenchPush-CDF6
pip install -r requirements.txt
```

- Install the Gym environment

```shell
pip install -e .
```

The steps above are sufficient to run all Ship-Ice and Maze environments. To run Box-Delivery and Area-Clearing, please install the shortest path module as follows.
- Install the `spfa` package.

```shell
git clone https://github.com/IvanIZ/spfa.git
cd spfa
pip install -e .
```

```python
import benchpush.environments
import gymnasium as gym

env = gym.make('ship-ice-v0')
observation, info = env.reset()
terminated = truncated = False
while not (terminated or truncated):
    action = your_policy(observation)
    observation, reward, terminated, truncated, info = env.step(action)
    env.render()
```

To configure the parameters for each environment, please refer to the configuration examples for Maze, Ship-Ice, Box-Delivery, and Area-Clearing.
```python
from benchpush.baselines.base_class import BasePolicy

class CustomPolicy(BasePolicy):
    def __init__(self) -> None:
        super().__init__()
        # initialize the custom policy here
        ...

    def train(self):
        # train the custom policy here, if needed
        ...

    def act(self, observation, **kwargs):
        # define how the custom policy acts in the environment
        ...

    def evaluate(self, num_eps: int, model_eps: str = 'latest'):
        # define how the custom policy is evaluated here
        ...
```

```python
from benchpush.common.metrics.base_metric import BaseMetric
import CustomPolicy1  # some custom policies
import CustomPolicy2
import CustomPolicy3

# initialize policies to be evaluated
policy1 = CustomPolicy1()
policy2 = CustomPolicy2()
policy3 = CustomPolicy3()

# run evaluations
num_eps = 200  # number of episodes to evaluate each policy
benchmark_results = []
benchmark_results.append(policy1.evaluate(num_eps=num_eps))
benchmark_results.append(policy2.evaluate(num_eps=num_eps))
benchmark_results.append(policy3.evaluate(num_eps=num_eps))

# plot efficiency and effort scores
BaseMetric.plot_algs_scores(benchmark_results, save_fig_dir='./')
```

| Tasks | Baselines |
|---|---|
| Maze | SAC¹, PPO¹, RRT Planning² |
| Ship-Ice | SAC¹, PPO¹, ASV Planning³﹐⁴ |
| Box-Delivery | SAC¹, PPO¹, SAM⁵ |
| Area-Clearing | SAC¹, PPO¹, SAM⁵, GTSP⁶ |
¹: Reinforcement learning policies integrated with Stable-Baselines3.
²: Planning-based policy using an RRT planner.
³: Planning-based policy using an ASV ice navigation lattice planner.
⁴: Planning-based policy using a predictive ASV ice navigation planner.
⁵: Spatial Action Maps policy.
⁶: A Generalized Traveling Salesman Problem (GTSP) policy. Please see the Appendix for details.
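All baselines above implement the `BasePolicy` interface shown earlier. As a minimal self-contained sketch of that pattern, the toy random policy below substitutes a stub base class for `benchpush.baselines.base_class.BasePolicy` (an assumption made so the example runs without the package installed; the class and return values are illustrative, not a BenchPush baseline):

```python
import random

class BasePolicy:
    """Stub standing in for benchpush.baselines.base_class.BasePolicy."""
    def __init__(self):
        self.trained = False

class RandomPolicy(BasePolicy):
    """Toy baseline: samples uniform actions, ignoring the observation."""
    def __init__(self, action_low=-1.0, action_high=1.0, seed=0):
        super().__init__()
        self.rng = random.Random(seed)
        self.low, self.high = action_low, action_high

    def train(self):
        # nothing to learn for a random baseline
        self.trained = True

    def act(self, observation, **kwargs):
        # one continuous action per call, independent of the observation
        return self.rng.uniform(self.low, self.high)

    def evaluate(self, num_eps: int, model_eps: str = 'latest'):
        # placeholder evaluation: return per-episode scores
        return {'episodes': num_eps, 'scores': [0.0] * num_eps}

policy = RandomPolicy(seed=42)
policy.train()
action = policy.act(observation=None)  # a float in [-1, 1]
```

A real policy would replace the bodies of `train`, `act`, and `evaluate` with learning, inference, and environment rollouts respectively.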
You may download our trained model weights from here.
The GTSP policy relies on a fork of the GLNS solver, which requires the Julia programming language to be installed.
The path to the GLNS solver can be configured as a parameter. Please refer to the configuration example for Area-Clearing.


