We introduce myGym, a toolkit for fast prototyping of neural networks in robotic manipulation and navigation. The toolbox is fully modular, so you can train your network with different robots, in several environments, and on various tasks. You can also create a curriculum of tasks with increasing complexity and test your network on them. An automatic evaluation and benchmark tool for your developed model is included. We have pretrained the YOLACT network for visual recognition of all objects in the simulator, so you can reward your networks based on visual sensors only. We keep training the current state-of-the-art algorithms to provide baselines for the tasks in the toolbox. There is also a leaderboard showing the algorithms with the best generalization capability, tested on the tasks in our basic curriculum.
Environment | Gym-v0 is suitable for manipulation, navigation and planning tasks |
---|---|
Workspaces | Table, Collaborative table, Maze, Vertical maze, Drawer, Darts, Football, Fridge, Stairs, Baskets |
Vision | Cartesians, RGB, Depth, Class, Centroid, Bounding Box, Semantic Mask, Latent Vector |
Robots | 8 robotic arms, 2 dualarms, humanoid |
Robot actions | Absolute, Relative, Joints |
Objects | 54 objects in 5 categories |
Tasks | Reach, Press, Switch, Turn, Push, Pick, Place, PicknPlace, Throw, Hit, Catch, Navigate |
Randomizers | Light, Texture, Size, Camera position |
Baselines | Tensorflow, Pytorch |
Learn more about the toolbox in our documentation
Learnability is represented as a single value metric that evaluates algorithms under various conditions, allowing us to compare different RL algorithms. The number of conditions is limited for practical reasons, as the number of training configurations grows exponentially with each new condition, and each configuration requires standalone training and evaluation. Therefore, we limited the total number of combinations to
Pos. | Algorithm | Score |
---|---|---|
1. | PPO2 | 30.11 |
2. | TRPO | 28.75 |
3. | ACKTR | 27.5 |
4. | SAC | 27.43 |
5. | PPO | 27.21 |
6. | myAlgo | 15.00 |
We have developed a fully modular toolbox where the user can easily combine the predefined elements into a custom environment. There are specific modules for each component of the simulation, as depicted in the following scheme.
Ubuntu 18.04, 20.04
Python 3
GPU acceleration strongly recommended
Clone the repository:
git clone https://github.com/incognite-lab/mygym.git
cd mygym
We recommend creating a conda environment:
conda env create -f environment.yml
conda activate mygym
Install myGym:
python setup.py develop
If you want to use the pretrained visual modules, please download them first:
cd myGym
sh download_vision.sh
If you want to use the pretrained baseline models, download them here:
cd myGym
sh download_baselines.sh
You can visualize the virtual gym env prior to the training.
python test.py
The default workspace will be activated, and the commands sent to the robot joints are random.
There are also visual outputs from the active cameras (both RGB and Depth):
Find more details about this function in the documentation
Run the default training without specifying the parameters:
python train.py
The training will start with the GUI window and a standstill visualization. Wait until the first evaluation after 10000 steps to check the progress:
After 100000 steps, the arm is able to reach the goal object with 80% accuracy:
There are more training tutorials in the documentation
Run the training using the following command:
python train.py --config ./configs/train_press.json
Wait until the first evaluation after 100000 steps to check the progress:
After 250000 steps the arm is able to press the button with 90% accuracy:
There are more training tutorials in the documentation
Run the training using the following command:
python train.py --config ./configs/train_switch.json
Wait until the first evaluation after 50000 steps to check the progress:
After 250000 steps the arm is able to switch the lever with 80% accuracy:
There are more training tutorials in the documentation
Run the training using the following command:
python train.py --config ./configs/train_turn.json
Wait until the first evaluation after 250000 steps to check the progress:
After 500000 steps the arm is able to complete the turn task with 90% accuracy:
There are more training tutorials in the documentation
As myGym is modular, you can easily train with different robots:
python train.py --robot jaco
You can also change the workspace within the gym, the task, or the goal object. If you want to store an output video, just add the record parameter:
python train.py --workspace collabtable --robot panda --task push --task_objects wrench --record 1
You can fully control the environment, robot, object, task, reward, learning parameters and logging from the command line:
python train.py --env_name Gym-v0 --workspace table --engine=pybullet --render=opengl --camera=8 --gui=1 --visualize=1 --robot=kuka --robot_action=joints --robot_init=[0.5, 0.5, 2.0] --task_type=reach --task_objects=[hammer] --used_objects=None --object_sampling_area=[-0.2, 0.7, 0.3, 0.9, 0.65, 0.65] --reward_type=gt --reward=distance --distance_type=euclidean --train=1 --train_framework=tensorflow --algo=ppo2 --max_episode_steps=1024 --algo_steps=1024 --steps=500000 --eval_freq=5000 --eval_episodes=100 --test_after_train=0 --logdir=trained_models --model_path=./trained_models/test/best_model.zip --record=0
Learn more about the simulation parameters in the documentation
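When a call has this many parameters, it can help to assemble it programmatically. The following is a minimal sketch, not part of myGym: the helper `build_train_command` is hypothetical, the flag names are copied from the command above, and it assumes that `train.py` accepts list values as a single bracketed argument. Passing the arguments as a list avoids the shell splitting values such as `[0.5, 0.5, 2.0]` at the spaces.

```python
import shlex


def build_train_command(params, script="train.py"):
    """Turn a dict of parameters into an argv list (hypothetical helper)."""
    argv = ["python", script]
    for name, value in params.items():
        if isinstance(value, (list, tuple)):
            # Render lists in the bracketed form used in the README example.
            value = "[" + ", ".join(str(v) for v in value) + "]"
        argv.append(f"--{name}={value}")
    return argv


# Flag names taken from the full command above; values are the README defaults.
params = {
    "robot": "kuka",
    "robot_action": "joints",
    "robot_init": [0.5, 0.5, 2.0],
    "task_type": "reach",
    "task_objects": ["hammer"],
    "steps": 500000,
    "eval_freq": 5000,
}

argv = build_train_command(params)
# shlex.join (Python 3.8+) quotes each argument so the line is safe to paste into a shell.
print(shlex.join(argv))
```

The resulting `argv` list can be handed to `subprocess.run(argv)` from inside the `myGym` directory.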
As the parametric definition is problematic in more complex projects, we provide config files that help with the reproducibility of results. An example of a basic config file is [here](myGym/configs/train_example.conf). You can clone and edit this file according to your needs and run the training just by typing:
python train.py --config ./configs/train_example.json
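To derive a variant of a config without editing it by hand, the file can be loaded, modified, and saved under a new name. The sketch below is an illustration only: the actual schema of myGym config files is not shown here, so the keys are assumed to mirror the command-line flags, and `load_config` is a hypothetical helper rather than a myGym function.

```python
import json
import os
import tempfile


def load_config(path, overrides=None):
    """Read a JSON config and apply optional overrides (hypothetical helper)."""
    with open(path) as f:
        config = json.load(f)
    config.update(overrides or {})
    return config


# Assumed keys, mirroring the command-line flags; the real file may differ.
example = {
    "env_name": "Gym-v0",
    "workspace": "table",
    "robot": "kuka",
    "task_type": "reach",
    "algo": "ppo2",
    "steps": 500000,
}

with tempfile.TemporaryDirectory() as tmp:
    base_path = os.path.join(tmp, "train_example.json")
    with open(base_path, "w") as f:
        json.dump(example, f, indent=4)

    # Clone the base config with a different robot and save it next to the original.
    config = load_config(base_path, overrides={"robot": "panda"})
    variant_path = os.path.join(tmp, "train_panda.json")
    with open(variant_path, "w") as f:
        json.dump(config, f, indent=4)
    variant_exists = os.path.exists(variant_path)
```

The saved variant could then be passed to `train.py` through `--config`, in the same way as the example file.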
We have developed scripts for parallel training to speed up this process. You can edit the desired parameter in train_parallel.py and run it:
python train_parallel.py
The default config will train 4 parallel simulations with different RL algorithms under the same conditions. After several training steps, you can see the difference in performance among the algorithms. For better performance, the background visualization is turned off.
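The idea of running several trainings side by side can be sketched as follows. This is not the implementation of `train_parallel.py`, whose internals are not shown here. It assumes that `--algo` and `--gui` can be combined with `--config` to override single values, and the algorithm identifiers other than `ppo2` are guesses based on the leaderboard names.

```python
import subprocess

# "ppo2" appears in the README; the other identifiers are assumed.
ALGOS = ["ppo2", "trpo", "acktr", "sac"]


def parallel_commands(config_path, algos=ALGOS):
    """Build one train.py command per algorithm, with the GUI switched off."""
    return [
        ["python", "train.py", "--config", config_path, "--algo", algo, "--gui", "0"]
        for algo in algos
    ]


def run_parallel(commands):
    """Start all trainings at once and wait for every process to finish."""
    processes = [subprocess.Popen(command) for command in commands]
    return [process.wait() for process in processes]


commands = parallel_commands("./configs/train_example.json")
for command in commands:
    print(" ".join(command))

# Uncomment inside the myGym directory to actually launch the trainings:
# exit_codes = run_parallel(commands)
```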
You can use the test script for the visualization of pretrained models:
python test.py --config ./trained_models/yourmodel/train.json
It will load the pretrained model and test it in the task and workspace defined in the config file.
Automatic evaluation and logging are included in the train script. They are controlled by the parameters --eval_freq and --eval_episodes. The log files are stored in the folder with the trained model, so you can easily visualize the learning progress after the training. GIFs are also stored for each evaluation period so that you can compare the robot's performance during training. We have also implemented evaluation in tensorboard:
tensorboard --logdir ./trained_models/yourmodel
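To estimate how much evaluation a run will perform, the two parameters can be combined with the total number of steps. The sketch below assumes that an evaluation runs every `eval_freq` training steps, starting at step `eval_freq`, and that each evaluation plays `eval_episodes` episodes; the function `evaluation_schedule` is hypothetical and only does the arithmetic.

```python
def evaluation_schedule(steps, eval_freq, eval_episodes):
    """Summarize when and how often evaluation would run (assumed schedule)."""
    eval_points = list(range(eval_freq, steps + 1, eval_freq))
    return {
        "num_evaluations": len(eval_points),
        "first_eval_step": eval_points[0] if eval_points else None,
        "total_eval_episodes": len(eval_points) * eval_episodes,
    }


# Values from the full command-line example: 500000 steps, evaluation every 5000 steps.
schedule = evaluation_schedule(steps=500000, eval_freq=5000, eval_episodes=100)
print(schedule)
```

With these values the run would contain 100 evaluations and 10000 evaluation episodes in total, which is worth keeping in mind when choosing `--eval_freq` for long trainings.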
If you want to interactively compare different parameters, just run tensorboard without model dir specification:
As myGym allows curriculum learning, the workspaces and tasks are concentrated in a single gym, so that you can easily transfer the robot between them. The basic environment is called Gym-v0. More gyms for navigation and multi-agent collaboration are in preparation.
Robot | Type | Gripper | DOF | Parameter value |
---|---|---|---|---|
UR-3 | arm | no gripper | 6 | ur3 |
UR-5 | arm | no gripper | 6 | ur5 |
UR-10 | arm | no gripper | 6 | ur10 |
Kuka IIWA | arm | magnetic | 6 | kuka |
Reachy | arm | passive palm | 7 | reachy |
Leachy | arm | passive palm | 7 | leachy |
Franka Emika | arm | gripper | 7 | panda |
Jaco arm | arm | two finger | 13 | jaco |
Gummiarm | arm | passive palm | 13 | gummi |
ABB Yumi | dualarm | two finger | 12 | yummi |
ReachyLeachy | dualarm | passive palms | 14 | reachy_and_leachy |
Pepper | humanoid | -- | 20 | WIP |
Thiago | humanoid | -- | 19 | WIP |
Atlas | humanoid | -- | 28 | WIP |
Name | Type | Suitable tasks | Parameter value |
---|---|---|---|
Tabledesk | manipulation | Reach, Push, Pick, Place, PicknPlace | table |
Drawer | manipulation | Reach, Pick, PicknPlace | drawer |
Fridge | manipulation | Reach, Push, Open, Close, Pick | fridge |
Baskets | manipulation | Throw, Hit | baskets |
Darts | manipulation | Throw, Hit | darts |
Football | manipulation | Throw, Hit | football |
Collaborative table | collaboration | Give, Hold, Move together | collabtable |
Vertical maze | planning | -- | veticalmaze |
Maze | navigation | -- | maze |
Stairs | navigation | -- | stairs |
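
The table above can be turned into a small lookup that checks a workspace and task combination before a training is started. This is a sketch only: the dictionary is transcribed from the "Suitable tasks" column, `is_suitable` is a hypothetical helper, and suitability does not imply that the task is already implemented in that workspace.

```python
# Transcribed from the "Suitable tasks" column; keys are the parameter values.
SUITABLE_TASKS = {
    "table": {"Reach", "Push", "Pick", "Place", "PicknPlace"},
    "drawer": {"Reach", "Pick", "PicknPlace"},
    "fridge": {"Reach", "Push", "Open", "Close", "Pick"},
    "baskets": {"Throw", "Hit"},
    "darts": {"Throw", "Hit"},
    "football": {"Throw", "Hit"},
    "collabtable": {"Give", "Hold", "Move together"},
}


def is_suitable(workspace, task):
    """Return True if the task is listed as suitable for the workspace."""
    tasks = SUITABLE_TASKS.get(workspace, set())
    return task.lower() in {name.lower() for name in tasks}


print(is_suitable("table", "reach"))
print(is_suitable("darts", "pick"))
```

Workspaces without listed tasks, such as the mazes and stairs, are simply reported as not suitable for any task.
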
Workspace | Reach | Pick | Place | PicknPlace | Push | Throw | Hit | Open | Close | Kick | Give |
---|---|---|---|---|---|---|---|---|---|---|---|
Tabledesk | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
Drawer | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
Fridge | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
Baskets | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
Darts | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
Football | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
Collaborative table | ✔️ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
The new global evaluation metric, which we call *learnability*, allows the user to evaluate and compare algorithms in a more systematic fashion. Learnability is defined as a general ability to learn irrespective of environmental conditions. The goal is to test an algorithm with respect to the complexity of the environment. We have decomposed the environment complexity into independent scales. The first scale is dedicated to the complexity of the task. The second scale captures the complexity of the robotic body that is controlled by the neural network. The third scale stands for the temporal complexity of the environment.
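Because the scales are independent, every combination of levels is a separate training configuration, and the total is the product of the number of levels on each scale. The sketch below only enumerates such a grid. The levels are hypothetical placeholders, not the ones used for the leaderboard, and the learnability score itself is not computed here.

```python
from itertools import product

# Hypothetical levels for the three complexity scales.
scales = {
    "task": ["reach", "push", "picknplace"],
    "robot": ["kuka", "panda", "jaco"],
    "temporal": [1, 2, 3],
}

# One configuration per combination of levels across all scales.
configs = [dict(zip(scales, combo)) for combo in product(*scales.values())]

print(len(configs))  # 3 * 3 * 3 = 27
print(configs[0])
```

Adding a fourth scale with three levels would triple the count to 81, which is why the number of conditions has to stay limited.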
Core team:
Contributors:
Radoslav Skoviera, Peter Basar, Michael Tesar, Vojtech Pospisil, Jiri Kulisek, Anastasia Ostapenko, Sara Thu Nguyen