Meta Reinforcement Learning (RL²)

Introduction

Meta reinforcement learning is, in short, meta-learning applied to the field of reinforcement learning: the training and test tasks are different, but are drawn from the same family of problems. This project applies Meta Reinforcement Learning (RL²) techniques to Actor-Critic methods (A2C, A3C) in order to develop agents capable of acting over multiple environment distributions.

Deep reinforcement learning (deep RL) has been successful in learning sophisticated behaviors automatically; however, the learning process requires a huge number of trials. In contrast, animals can learn new tasks in just a few trials, benefiting from their prior knowledge about the world. RL² addresses this gap by representing the "fast" learning process itself as a recurrent neural network: the algorithm is encoded in the weights of the RNN, which are learned slowly through a general-purpose ("slow") RL algorithm. The RNN receives all the information a typical RL algorithm would receive, including observations, actions, rewards, and termination flags, and it retains its state across episodes in a given Markov Decision Process (MDP). The activations of the RNN store the state of the "fast" RL algorithm on the current (previously unseen) MDP. (source: https://arxiv.org/abs/1611.02779)
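
To make the data flow concrete, below is a minimal sketch of the RL² interaction loop. It is not this repository's implementation: it assumes a classic Gym-style env and a hypothetical recurrent policy rnn_policy exposing initial_state() and step() methods.

    import numpy as np

    def rl2_trial(env, rnn_policy, episodes_per_trial=2):
        """Run one RL^2 trial: multiple episodes on the SAME sampled MDP."""
        hidden = rnn_policy.initial_state()   # reset once per trial, not per episode
        prev_action, prev_reward, prev_done = 0, 0.0, True
        for _ in range(episodes_per_trial):
            obs, done = env.reset(), False    # new episode, same MDP
            while not done:
                # The RNN input carries everything a typical RL algorithm
                # sees: observation, previous action, reward, done flag.
                x = np.concatenate([obs, [prev_action, prev_reward, float(prev_done)]])
                action, hidden = rnn_policy.step(x, hidden)  # hidden state persists
                obs, reward, done, _ = env.step(action)      # classic Gym step API
                prev_action, prev_reward, prev_done = action, reward, done
        return hidden  # activations now encode the "fast" RL state for this MDP

Because the hidden state is reset only between trials (not between episodes), the network can exploit what it learned in early episodes of a trial to act better in later ones.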

Documentation

You can find the report of the project here.

Getting Started

The weights of all the trained models are available at the following link.

Once you have downloaded it, place the unzipped history and saved_models folders inside the /tmp folder (at the root level of the project).

Installation

Below you can find the setup command for each OS/processor combination:

$ make
    > "+------------------------------------------------------+"
    > "|         OS         |  Hardware  |    Setup Command   |"
    > "+------------------------------------------------------+"
    > "|   Windows/Linux    |   - GPU    |  'make setup.CPU'  |"
    > "|   Windows/Linux    |   + GPU    |  'make setup.GPU'  |"
    > "|    Apple macOS     |    + M1    |  'make setup.M1'   |"
    > "|    Apple macOS     |    - M1    |  'make setup.CPU'  |"
    > "+------------------------------------------------------+"

For instance, if you have macOS with an Intel chip, you have to run:

$ make setup.CPU

Alternatively, you can find all the different versions of the requirements inside the /tools/requirements folder.

Apple M1

If you are using the new Apple M1 chip, make sure hdf5 is installed by running:

$ brew install hdf5

Running the Tests

Usage

$ python src/run.py -h
    usage: run.py [-h] --config CONFIG

    MetaRL-RL2

    optional arguments:
        -h, --help       show this help message and exit
        --config CONFIG  path of the configurations json file.

Configurations

The base structure of the JSON configuration file can be found inside this file.

Look at the configs folder for additional configuration examples.
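
As a minimal illustration, a configuration can also be generated programmatically. Note that mode is the only parameter documented in this README (values "training", "inference", "render"); all other fields are defined by the base schema linked above, so the sketch below is an assumption rather than the full format:

    import json

    # Hypothetical minimal configuration: only the `mode` key is documented
    # in this README; the remaining fields are defined by the project's
    # base configuration schema and are omitted here.
    config = {"mode": "training"}

    # The file name is arbitrary; the configs folder is a natural place for it.
    with open("configs/my_config.json", "w") as f:
        json.dump(config, f, indent=2)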

Training

In order to train all the models from scratch, you can uncomment all the rows inside this file. Once you have done that, you have to run:

$ sh scripts/train.all.sh

or you can simply execute one configuration at a time by running:

$ python src/run.py --config "<path_of_your_json_config>"

where the JSON file has the parameter mode="training".

Inference

In order to run the inference procedure on all the models, you can uncomment all the rows inside this file. Once you have done that, you have to run:

$ sh scripts/test.all.sh

or you can simply execute one configuration at a time by running:

$ python src/run.py --config "<path_of_your_json_config>"

where the JSON file has the parameter mode="inference".

Rendering

In order to run the rendering procedure on all the models, you can uncomment all the rows inside this file. Once you have done that, you have to run:

$ sh scripts/render.all.sh

or you can simply execute one configuration at a time by running:

$ python src/run.py --config "<path_of_your_json_config>"

where the JSON file has the parameter mode="render".

References

Papers

Environments:

Authors

License

This project is licensed under the MIT License. See the LICENSE.md file for details.
