In this work, we present RMBench, the first benchmark for robotic manipulation tasks with high-dimensional continuous action and state spaces. We implement and evaluate reinforcement learning algorithms that directly use observed pixels as inputs.
This repository is the official implementation of our paper: Y. Xiang et al., “RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control,” Oct. 2022, doi: 10.48550/arXiv.2210.11262.
- VPG Sutton et al., 2000
- TRPO Schulman et al., 2015
- PPO Schulman et al., 2017
- DDPG Lillicrap et al., 2016
- TD3 Fujimoto et al., 2018
- SAC Haarnoja et al., 2018
- DrQ-v2 Yarats et al., 2021
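The on-policy methods above (VPG, TRPO, PPO) all build their gradient estimates from discounted returns. As a minimal illustration of that shared ingredient, here is a reward-to-go computation in NumPy (the function name and example values are ours, not taken from this repo):

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Compute reward-to-go R_t = sum_k gamma^k * r_{t+k} for each step."""
    returns = np.zeros_like(rewards, dtype=np.float64)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# Toy three-step episode with gamma = 0.5
print(discounted_returns(np.array([1.0, 0.0, 1.0]), gamma=0.5))
# -> [1.25 0.5  1.  ]
```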
We utilize the dm_control software package, which provides task suites for reinforcement learning agents in an articulated-body simulation. We focus on manipulation tasks with a 3D robotic arm, which can be divided into five categories: lifting, placing, reaching, stacking, and reassembling. They are described briefly below.
| Category | Task | Description |
| --- | --- | --- |
| Lifting | Lift brick | Elevate a brick above a threshold height. |
| | Lift large box | Elevate a large box above a threshold height. The box is too large to be grasped by the gripper, requiring non-prehensile manipulation. |
| Reaching | Reach site | Move the end effector to a target location in 3D space. |
| | Reach brick | Move the end effector to a brick resting on the ground. |
| Placing | Place cradle | Place a brick inside a concave 'cradle' situated on a pedestal. |
| | Place brick | Place a brick on top of another brick that is attached to the top of a pedestal. Unlike the stacking tasks below, the two bricks are not required to be snapped together to obtain the maximum reward. |
| Stacking | Stack 2 bricks | Snap together two bricks, one of which is attached to the floor. |
| | Stack 2 bricks movable base | Same as 'stack 2 bricks', except both bricks are movable. |
| Reassembling | Reassemble 5 bricks random order | The episode begins with all five bricks already assembled in a stack, with the bottom brick attached to the floor. The agent must disassemble the top four bricks and reassemble them in the opposite order. |
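The tasks above correspond to dm_control's `manipulation` suite. A hedged sketch of how one might map benchmark categories to dm_control task names and load one (the exact task-name strings, including the `_vision` suffix for pixel observations, are our assumption based on dm_control's naming convention; check `manipulation.ALL` locally):

```python
# Candidate dm_control task names per benchmark category.
# These strings are assumptions, not taken from this repo.
BENCHMARK_TASKS = {
    "lifting": ["lift_brick_vision", "lift_large_box_vision"],
    "reaching": ["reach_site_vision", "reach_brick_vision"],
    "placing": ["place_cradle_vision", "place_brick_vision"],
    "stacking": ["stack_2_bricks_vision", "stack_2_bricks_movable_base_vision"],
    "reassembling": ["reassemble_5_bricks_random_order_vision"],
}

def load_task(name, seed=0):
    # Import inside the function so the mapping above stays usable
    # even when dm_control is not installed.
    from dm_control import manipulation
    return manipulation.load(name, seed=seed)
```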
- Install MuJoCo
- Obtain a license on the MuJoCo website.
- Download the MuJoCo binaries from the MuJoCo website, e.g. 'mujoco210_linux.zip'.
- Unzip the downloaded archive into ~/.mujoco/mujoco210:
$ mkdir -p ~/.mujoco/mujoco210
$ cp mujoco210_linux.zip ~/.mujoco/mujoco210
$ cd ~/.mujoco/mujoco210
$ unzip mujoco210_linux.zip
- Place your license key file mjkey.txt at ~/.mujoco/mujoco210.
$ cp mjkey.txt ~/.mujoco/mujoco210
$ cp mjkey.txt ~/.mujoco/mujoco210/mujoco210_linux/bin
- Add environment variables: use MUJOCO_PY_MJKEY_PATH and MUJOCO_PY_MUJOCO_PATH to specify the MuJoCo license key path and the MuJoCo directory path.
$ export MUJOCO_PY_MJKEY_PATH=~/.mujoco/mujoco210/mjkey.txt
$ export MUJOCO_PY_MUJOCO_PATH=~/.mujoco/mujoco210/mujoco210_linux
- Append the MuJoCo bin subdirectory to the env variable LD_LIBRARY_PATH.
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco210/mujoco210_linux/bin
- Install the required Python libraries
$ pip install -r requirements.txt
For example, to train an agent with the DrQ-v2 algorithm on the 'reach site' task:
$ cd 00_DrQv2
$ python drqv2_train.py task=reach_site
When training finishes, you can use 'plot_curve.py' to plot the reward curves.
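'plot_curve.py' is part of this repo; as a rough stand-in for what it does, a smoothed reward curve can be produced with a simple moving average like this (the toy data and window size are our choices, not the repo's defaults):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

def moving_average(x, window=10):
    """Smooth a 1-D reward series with a sliding-window mean."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

# Toy data standing in for logged episode rewards.
rewards = np.cumsum(np.random.default_rng(0).normal(0.5, 1.0, 500))
smoothed = moving_average(rewards, 10)

plt.plot(rewards, alpha=0.3, label="raw")
plt.plot(np.arange(9, 500), smoothed, label="smoothed")
plt.xlabel("episode")
plt.ylabel("reward")
plt.legend()
plt.savefig("reward_curve.png")
```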
Part of this code is inspired by SpinningUp (2018) and DrQ-v2.
Please consider citing our paper in your publications:
@misc{xiang2022rmbench,
doi = {10.48550/ARXIV.2210.11262},
url = {https://arxiv.org/abs/2210.11262},
author = {Xiang, Yanfei and Wang, Xin and Hu, Shu and Zhu, Bin and Huang, Xiaomeng and Wu, Xi and Lyu, Siwei},
keywords = {Robotics (cs.RO), Artificial Intelligence (cs.AI), FOS: Computer and information sciences},
title = {RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control},
publisher = {arXiv},
year = {2022},
copyright = {arXiv.org perpetual, non-exclusive license}
}