This is the official code, implemented by Jan Wöhlke, accompanying the AAMAS 2020 paper "A Performance-Based Start State Curriculum Framework for Reinforcement Learning" by Jan Wöhlke, Felix Schmitt, and Herke van Hoof. The paper is available at http://www.ifaamas.org/Proceedings/aamas2020/pdfs/p1503.pdf. The code allows users to reproduce and extend the results reported in the paper. Please cite the paper when reporting, reproducing, or extending the results:
@inproceedings{wohlke2020performance,
title={A Performance-Based Start State Curriculum Framework for Reinforcement Learning},
author={W{\"o}hlke, Jan and Schmitt, Felix and van Hoof, Herke},
booktitle={Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems},
pages={1503--1511},
year={2020}
}
This software is a research prototype, solely developed for and published as part of the publication cited above. It will neither be maintained nor monitored in any way.
Place the folder code where you want to execute the code.
Add the path of the folder code to your $PYTHONPATH environment variable.
You need a Python setup with the following packages:
- python>=3.6
- gym==0.10.9 (install in editable mode with pip install -e if you want to run the MuJoCo experiments)
- numpy>=1.15.2
- scipy>=1.3.1
- torch==0.3.1
- matplotlib>=3.0.1
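Because two of the packages are pinned to exact (and old) versions, it can help to check the environment before starting a long run. The following is a minimal sketch, not part of the released code: the helper names version_tuple, satisfies, and check_environment are hypothetical, and the check assumes that each package exposes a __version__ attribute.

```python
import importlib
import re

# Requirements copied from the list above: (package, operator, version).
REQUIREMENTS = [
    ("gym", "==", "0.10.9"),
    ("numpy", ">=", "1.15.2"),
    ("scipy", ">=", "1.3.1"),
    ("torch", "==", "0.3.1"),
    ("matplotlib", ">=", "3.0.1"),
]


def version_tuple(version):
    """Turn '1.15.2' (or '0.3.1.post2') into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'post2'
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies(installed, operator, required):
    """Check one pin; only the '==' and '>=' operators used above are handled."""
    have, want = version_tuple(installed), version_tuple(required)
    if operator == "==":
        return have[:len(want)] == want
    if operator == ">=":
        return have >= want
    raise ValueError("unsupported operator: " + operator)


def check_environment():
    """Print one line per requirement; missing packages are reported, not raised."""
    for name, operator, required in REQUIREMENTS:
        try:
            installed = importlib.import_module(name).__version__
        except ImportError:
            print(f"{name}: not installed (need {operator}{required})")
            continue
        status = "ok" if satisfies(installed, operator, required) else "MISMATCH"
        print(f"{name}: {installed} (need {operator}{required}) -> {status}")
```

Calling check_environment() prints one status line per package.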
The start scripts for the experiments can be found here.
For the experiments in Section 5.1 / Figure 2 you find the start scripts here:
For the experiments in Sections 5.2.1, 5.2.4, and 5.2.7 / Figures 3 and 5 you find the start scripts here:
- UST
- SG PMM
- SG PMN
- RC
- SPCRL PMM
- TPG PMM
- ASP
- ASP RC
- SAGG-RIAC
- UST USYM
- RC USYM
- SPCRL PMM USYM
- TPG PMM USYM
- SG PMM USYM
Run the scripts with: python [scriptname] --seed=[seed option]
There are different seed options to run the number of random seeds used for the respective experiment:
- "all" : run all / the first 10 seeds one after another
- "s1" to "s10" (or "s50") : run an individual random seed
- "p1" to "p2" (or "p10") : run a slice of 5 random seeds one after another
- "ps1" to "ps4" : run the first three, second three, second-to-last two, or last two random seeds, respectively (for 10 seeds)
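To make the seed options concrete, the sketch below maps an option string to the seed indices it would cover. It is an illustration under assumptions, not the parsing code used by the start scripts: the function name seeds_for_option is hypothetical, seeds are assumed to be numbered consecutively from 1, "all" is assumed to mean the first 10 seeds, and the "p" slices are assumed to be consecutive blocks of 5.

```python
def seeds_for_option(option, num_seeds=10):
    """Return the (1-based) seed indices covered by a seed option string."""
    if option == "all":
        # Assumption: "all" covers the first 10 seeds.
        return list(range(1, 11))
    if option.startswith("ps"):
        # Fixed split of 10 seeds into groups of 3, 3, 2, and 2.
        groups = {1: [1, 2, 3], 2: [4, 5, 6], 3: [7, 8], 4: [9, 10]}
        index = int(option[2:])
        if index not in groups:
            raise ValueError("unknown seed option: " + option)
        return groups[index]
    if option.startswith("s"):
        # A single seed, e.g. "s7" -> [7].
        index = int(option[1:])
        if not 1 <= index <= num_seeds:
            raise ValueError("seed out of range: " + option)
        return [index]
    if option.startswith("p"):
        # A block of 5 consecutive seeds, e.g. "p2" -> [6, 7, 8, 9, 10].
        index = int(option[1:])
        first = 5 * (index - 1) + 1
        if index < 1 or first + 4 > num_seeds:
            raise ValueError("slice out of range: " + option)
        return list(range(first, first + 5))
    raise ValueError("unknown seed option: " + option)
```

For an experiment with 50 seeds, seeds_for_option("p10", num_seeds=50) covers seeds 46 to 50 under these assumptions.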
Copy the files provided in the gym folder into the corresponding folders of your (editable) gym installation, keeping the same folder structure. This adds some new files and replaces some existing ones.
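Copying the files by hand works, but the step can also be scripted. The following is a minimal sketch of such an overlay copy; the function name overlay_tree is hypothetical, and the two paths you pass in (the provided gym folder and the root of your editable gym installation) depend on your setup. Since existing files are overwritten, consider backing up the installation first.

```python
import os
import shutil


def overlay_tree(source_root, target_root):
    """Copy every file under source_root to the same relative path under target_root.

    Existing files in target_root are overwritten; all other files are left alone.
    Returns the sorted list of relative paths that were copied.
    """
    copied = []
    for directory, _subdirs, filenames in os.walk(source_root):
        relative = os.path.relpath(directory, source_root)
        destination = os.path.normpath(os.path.join(target_root, relative))
        os.makedirs(destination, exist_ok=True)
        for filename in filenames:
            shutil.copy2(os.path.join(directory, filename),
                         os.path.join(destination, filename))
            copied.append(os.path.normpath(os.path.join(relative, filename)))
    return sorted(copied)
```

The returned list shows which files were added or replaced, which is useful for checking the result against the provided folder.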
You need to have MuJoCo version 1.5 (mujoco/150) installed and a valid license key for it. We have no experience whether the code works with newer MuJoCo versions.
You need to install the following additional Python package:
- mujoco-py<1.50.2,>=1.50.1 (our version: 1.50.1.68)
Furthermore, you need to get / modify the MuJoCo model files for the point mass, the ant, and the key insertion. For this purpose, you can perform the following steps:
- Get the arm3d_key_tight.xml from here and place it in the /gym/gym/envs/mujoco/assets folder of your editable gym installation
- Run this script specifying the location of the assets folder in line 126 (bottom)
- An ant_new.xml and a point_spiral_2D.xml will be generated
The start scripts for the experiments can be found here.
For the experiments in Section 5.2.5 / Figure 6a you find the start scripts here:
For the experiments in Section 5.2.6 / Figure 6b you find the start scripts here:
For the experiments in Section 5.3 / Figure 8 you find the start scripts here:
Run the scripts with: python [scriptname] --seed=[seed option]
There are different seed options to run the number of random seeds used for the respective experiment:
- "all" : run all / the first 10 seeds one after another
- "s1" to "s10" (or "s35") : run an individual random seed
- "p1" to "p2" (or "p7") : run a slice of 5 random seeds one after another
- "ps1" to "ps4" : run the first three, second three, second-to-last two, or last two random seeds, respectively (for 10 seeds)
To arrive at the curves shown in the paper, plot the test goal reaching probabilities stored in the "results_ust_ ..." files of all random seeds of the respective experiment as mean ± standard error.
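The aggregation across seeds can be sketched as follows. This is an illustration under assumptions, not the plotting code used for the paper: the function name mean_and_standard_error is hypothetical, the results are assumed to be already loaded as one list of probabilities per seed (the file format is not covered here), and the standard error is computed from the sample standard deviation (n - 1 in the denominator), which may differ from the convention used for the published figures.

```python
import math
import statistics


def mean_and_standard_error(curves):
    """Aggregate per-seed curves into a mean curve and a standard-error curve.

    curves: one list of test goal reaching probabilities per random seed,
            all evaluated at the same training iterations (same length).
    """
    if len(curves) < 2:
        raise ValueError("need at least two seeds to estimate a standard error")
    if len({len(curve) for curve in curves}) != 1:
        raise ValueError("all curves must have the same length")
    means, errors = [], []
    # zip(*curves) yields, for each evaluation point, the values of all seeds.
    for values in zip(*curves):
        means.append(statistics.mean(values))
        # Standard error = sample standard deviation / sqrt(number of seeds).
        errors.append(statistics.stdev(values) / math.sqrt(len(values)))
    return means, errors
```

With matplotlib, the mean curve can then be drawn with plot and the band from mean - error to mean + error with fill_between.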
A-Performance-Based-Start-State-Curriculum-Framework-for-RL is open-sourced under the AGPL-3.0 license. See the LICENSE file for details.
For a list of other open source components included in A-Performance-Based-Start-State-Curriculum-Framework-for-RL, see the file 3rd-party-licenses.txt.