RMBench-2022

Brief Introduction

In this work, we present RMBench, the first benchmark for robotic manipulations, which have high-dimensional continuous action and state spaces. We implement and evaluate reinforcement learning algorithms that directly use observed pixels as inputs.

This repository is the official implementation of our paper: Y. Xiang et al., “RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control,” Oct. 2022, doi: 10.48550/arXiv.2210.11262.

RL algorithms

Tasks

Manipulation tasks

We use the dm_control software package, which provides task suites for reinforcement learning agents in an articulated-body simulation. We focus on manipulation tasks with a 3D robotic arm, which fall into five categories: lifting, reaching, placing, stacking, and reassembling. They are described briefly below.

| Category | Task | Description |
| --- | --- | --- |
| Lifting | Lift brick | Elevate a brick above a threshold height. |
| | Lift large box | Elevate a large box above a threshold height. The box is too large to be grasped by the gripper, requiring non-prehensile manipulation. |
| Reaching | Reach site | Move the end effector to a target location in 3D space. |
| | Reach brick | Move the end effector to a brick resting on the ground. |
| Placing | Place cradle | Place a brick inside a concave "cradle" situated on a pedestal. |
| | Place brick | Place a brick on top of another brick that is attached to the top of a pedestal. Unlike the stacking tasks below, the two bricks are not required to be snapped together in order to obtain the maximum reward. |
| Stacking | Stack 2 bricks | Snap together two bricks, one of which is attached to the floor. |
| | Stack 2 bricks movable base | Same as "stack 2 bricks", except both bricks are movable. |
| Reassembling | Reassemble 5 bricks random order | The episode begins with all five bricks already assembled in a stack, with the bottom brick attached to the floor. The agent must disassemble the top four bricks and reassemble them in the opposite order. |
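The tasks above are exposed through dm_control's `manipulation` module, where each task comes in a `*_features` and a `*_vision` variant (the latter provides pixel observations, which is what this benchmark uses). A minimal loading sketch, assuming dm_control and MuJoCo are installed; `pick_vision_tasks` is a hypothetical helper added here for illustration:

```python
def pick_vision_tasks(names):
    """Keep only the pixel-observation variants (names ending in '_vision')."""
    return [n for n in names if n.endswith('_vision')]

if __name__ == '__main__':
    # Requires dm_control + MuJoCo; see the Installation section below.
    from dm_control import manipulation

    # manipulation.ALL lists every registered task name.
    print(pick_vision_tasks(manipulation.ALL))

    # Load one pixel-based task and take the initial timestep.
    env = manipulation.load('reach_site_vision', seed=0)
    timestep = env.reset()
    print(env.action_spec())  # bounded, continuous action vector
```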

Installation

  1. Install MuJoCo
  • Obtain a license on the MuJoCo website.
  • Download the MuJoCo binaries from the MuJoCo website, e.g. 'mujoco210_linux.zip'.
  • Unzip the downloaded archive into ~/.mujoco/mujoco210:
$ mkdir -p ~/.mujoco/mujoco210
$ cp mujoco210_linux.zip ~/.mujoco/mujoco210
$ cd ~/.mujoco/mujoco210
$ unzip mujoco210_linux.zip
  • Place your license key file mjkey.txt at ~/.mujoco/mujoco210:
$ cp mjkey.txt ~/.mujoco/mujoco210 
$ cp mjkey.txt ~/.mujoco/mujoco210/mujoco210_linux/bin
  • Add environment variables: use MUJOCO_PY_MJKEY_PATH and MUJOCO_PY_MUJOCO_PATH to specify the MuJoCo license key path and the MuJoCo directory path.
$ export MUJOCO_PY_MJKEY_PATH=~/.mujoco/mujoco210/mjkey.txt
$ export MUJOCO_PY_MUJOCO_PATH=~/.mujoco/mujoco210/mujoco210_linux
  • Append the MuJoCo subdirectory bin path into the env variable LD_LIBRARY_PATH.
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco210/bin
  2. Install the required Python libraries
$ pip install -r requirements.txt

How to run?

For example, to train an agent with the DrQ-v2 algorithm on the 'reach site' task:

$ cd 00_DrQv2
$ python drqv2_train.py task=reach_site
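Other tasks can be launched the same way by changing the `task=` argument. A sketch of a sweep over several tasks; the task identifiers below are assumptions based on the table above and should be checked against the names the training script actually accepts:

```shell
# Dry-run sweep: print the training command for a few tasks.
# Replace `echo` with a direct invocation to actually launch training.
for task in reach_site reach_brick lift_brick lift_large_box; do
  echo "python drqv2_train.py task=$task"
done
```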

Some Results

When training finishes, you can use 'plot_curve.py' to plot the reward curves.
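The exact log format depends on the algorithm directory, but the core of such a plot is a moving-average smoothing pass over per-episode rewards. A minimal sketch, assuming one scalar reward per evaluation episode; `smooth` is a hypothetical helper, not a function from the repository:

```python
import numpy as np

def smooth(rewards, window=10):
    """Moving average over a 1-D reward sequence (output is window-1 shorter)."""
    rewards = np.asarray(rewards, dtype=float)
    if len(rewards) < window:
        return rewards  # too short to smooth; return unchanged
    return np.convolve(rewards, np.ones(window) / window, mode='valid')

if __name__ == '__main__':
    import matplotlib.pyplot as plt
    # Placeholder data standing in for logged per-episode rewards.
    episode_rewards = np.random.default_rng(0).uniform(0, 1, size=200).cumsum()
    plt.plot(smooth(episode_rewards, window=10))
    plt.xlabel('evaluation episode')
    plt.ylabel('smoothed reward')
    plt.savefig('reward_curve.png')
```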

(Figure: reward curves on the manipulation tasks.)

Acknowledgements

Part of this code is inspired by SpinningUp2018 and DrQ-v2.

Citation

Please consider citing our paper in your publications:

@misc{xiang2022rmbench,
      doi = {10.48550/ARXIV.2210.11262},
      url = {https://arxiv.org/abs/2210.11262},
      author = {Xiang, Yanfei and Wang, Xin and Hu, Shu and Zhu, Bin and Huang, Xiaomeng and Wu, Xi and Lyu, Siwei},
      keywords = {Robotics (cs.RO), Artificial Intelligence (cs.AI), FOS: Computer and information sciences},
      title = {RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control},
      publisher = {arXiv},
      year = {2022},
      copyright = {arXiv.org perpetual, non-exclusive license}
}
