An open source robotics benchmark for meta- and multi-task reinforcement learning
Latest commit 0e14dc2 Oct 14, 2019



Meta-World is an open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks. We aim to provide task distributions that are sufficiently broad to evaluate meta-RL algorithms' generalization ability to new behaviors.

For more information, please refer to our website.


Meta-World is based on MuJoCo, which has a proprietary dependency we can't set up for you. Please follow the instructions in the mujoco-py package for help. Once you're ready to install everything, clone this repository and install:

git clone
cd metaworld
pip install -e .

Using the benchmark

Here is a list of benchmark environments for meta-RL (ML*) and multi-task-RL (MT*):

  • ML1 is a meta-RL benchmark environment to test few-shot adaptation to goal variation within one task. You can choose a task from 50 available tasks.
  • ML10 is a meta-RL benchmark environment to test few-shot adaptation to new tasks with 10 meta-train tasks and 3 test tasks.
  • ML45 is a meta-RL benchmark environment to test few-shot adaptation to new tasks with 45 meta-train tasks and 5 test tasks.
  • MT10 and MT50 are multi-task-RL benchmark environments for learning a multi-task policy that performs 10 and 50 training tasks, respectively. Observations in MT10 and MT50 are augmented with a one-hot vector that identifies the task.
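
To make the one-hot augmentation concrete, here is a minimal sketch in plain Python. It is not Meta-World's implementation: the helper names `one_hot` and `augment_observation` are hypothetical, and appending the task identifier after the raw observation is an assumption made for illustration.

```python
def one_hot(task_index, num_tasks):
    """Return a list of length num_tasks with 1.0 at task_index and 0.0 elsewhere."""
    if not 0 <= task_index < num_tasks:
        raise ValueError("task_index must be in [0, num_tasks)")
    vec = [0.0] * num_tasks
    vec[task_index] = 1.0
    return vec


def augment_observation(obs, task_index, num_tasks):
    """Append the task's one-hot identifier to a flat observation (assumed layout)."""
    return list(obs) + one_hot(task_index, num_tasks)


# Hypothetical 3-dimensional observation from task 2 of a 10-task benchmark.
augmented = augment_observation([0.1, 0.2, 0.3], task_index=2, num_tasks=10)
print(len(augmented))  # 13: 3 observation entries plus 10 task-identity entries
```

With this layout, a single policy network can receive the same input shape for every task while still being told which task it is solving.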


We provide two extra methods that extend the gym.Env interface for meta-RL and multi-task RL:

  • sample_tasks(self, meta_batch_size): Return a list of tasks with a length of meta_batch_size.
  • set_task(self, task): Set the task of a multi-task environment.
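
The contract of these two methods can be illustrated with a toy class. This is only a sketch: `ToyMultiTaskEnv`, its `goals` argument, and the choice to represent a task as a goal index are all hypothetical and say nothing about how Meta-World represents tasks internally.

```python
import random


class ToyMultiTaskEnv:
    """Hypothetical stand-in exposing sample_tasks and set_task (not Meta-World code)."""

    def __init__(self, goals, seed=0):
        self._goals = list(goals)
        self._rng = random.Random(seed)
        self._active_goal = None

    def sample_tasks(self, meta_batch_size):
        # Return a list of length meta_batch_size; here a "task" is just a goal index.
        return [self._rng.randrange(len(self._goals)) for _ in range(meta_batch_size)]

    def set_task(self, task):
        # Make the given task the active one for subsequent interaction.
        self._active_goal = self._goals[task]

    @property
    def active_goal(self):
        return self._active_goal


env = ToyMultiTaskEnv(goals=[(0.0, 0.6), (0.1, 0.7), (-0.1, 0.8)])
tasks = env.sample_tasks(4)  # four tasks, possibly with repeats
env.set_task(tasks[0])       # activate the first sampled task
```

The point of the split is that sampling is separate from activation: a meta-RL algorithm can sample a batch of tasks once and then switch among them as needed.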

Running ML1

from metaworld.benchmarks import ML1

print(ML1.available_tasks())  # Check out the available tasks

env = ML1.get_train_tasks('pick-place-v1')  # Create an environment with task `pick-place-v1`
tasks = env.sample_tasks(1)  # Sample a task (in this case, a goal variation)
env.set_task(tasks[0])  # Set task

obs = env.reset()  # Reset environment
a = env.action_space.sample()  # Sample an action
obs, reward, done, info = env.step(a)  # Step the environment with the sampled random action
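
The single step above extends naturally to a loop over several sampled tasks. The helper below is a hedged sketch rather than part of Meta-World: `collect_random_rollouts` and its `horizon` parameter are hypothetical names, and the function only assumes that `env` exposes the methods used in the ML1 example (`sample_tasks`, `set_task`, `reset`, `step`, and `action_space.sample`).

```python
def collect_random_rollouts(env, meta_batch_size, horizon):
    """Run one random-action episode per sampled task and return the episode returns.

    Assumes `env` follows the interface used in the ML1 example; the episode
    is cut off after `horizon` steps or when the environment reports done.
    """
    returns = []
    for task in env.sample_tasks(meta_batch_size):
        env.set_task(task)   # switch to this task (e.g. a goal variation)
        env.reset()          # start a fresh episode for the task
        episode_return = 0.0
        for _ in range(horizon):
            action = env.action_space.sample()
            _, reward, done, _ = env.step(action)
            episode_return += reward
            if done:
                break
        returns.append(episode_return)
    return returns
```

Assuming an `env` created as in the ML1 example, a call such as `collect_random_rollouts(env, meta_batch_size=5, horizon=150)` would give a random-policy baseline per sampled task; the values 5 and 150 are arbitrary illustrations.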

Running ML10 and ML45

Create an environment with train tasks:

from metaworld.benchmarks import ML10
ml10_train_env = ML10.get_train_tasks()

Create an environment with test tasks:

ml10_test_env = ML10.get_test_tasks()

Running MT10 and MT50

Create an environment with train tasks:

from metaworld.benchmarks import MT10
mt10_train_env = MT10.get_train_tasks()

Create an environment with test tasks (note that the train tasks and test tasks for multi-task (MT) environments are the same):

mt10_test_env = MT10.get_test_tasks()

Running Single-Task Environments

Meta-World can also be used as a normal gym.Env for single-task benchmarking. Here is an example of creating a pick_place environment:

from metaworld.envs.mujoco.sawyer_xyz import SawyerReachPushPickPlaceEnv
env = SawyerReachPushPickPlaceEnv()

Contributors and Acknowledgement

Meta-World is a work by Tianhe Yu (Stanford University), Deirdre Quillen (UC Berkeley), Zhanpeng He (Columbia University), Ryan Julian (University of Southern California), Karol Hausman (Google AI), Chelsea Finn (Stanford University) and Sergey Levine (UC Berkeley).

If you use Meta-World for your academic research, please kindly cite Meta-World with the following BibTeX:

  Author = {Tianhe Yu and Deirdre Quillen and Zhanpeng He and Ryan Julian and Karol Hausman and Chelsea Finn and Sergey Levine},
  Title = {Meta-World: A Benchmark and Evaluation for Multi-Task and Meta-Reinforcement Learning},
  Year = {2019},
  url = ""

The code for Meta-World was originally based on multiworld, which was developed by Vitchyr H. Pong, Murtaza Dalal, Ashvin Nair, Shikhar Bahl, Steven Lin, Soroush Nasiriany, Kristian Hartikainen and Coline Devin. The Meta-World authors are grateful for their efforts in providing such a great framework as the foundation of our work. We would also like to thank Russell Mendonca for his work on reward functions for some of the environments.

Contributing to Meta-World

We welcome all contributions to Meta-World. Please refer to the contributor's guide.
