
zoharri/MBRL2



By Zohar Rimon, Aviv Tamar and Gilad Adler


Official implementation of the paper Meta Reinforcement Learning with Finite Training Tasks - a Density Estimation Approach.

Citation

@inproceedings{rimon2022mbrl2,
  title={Meta Reinforcement Learning with Finite Training Tasks - a Density Estimation Approach},
  author={Rimon, Zohar and Tamar, Aviv and Adler, Gilad},
  booktitle={Neural Information Processing Systems (NeurIPS)},
  year={2022}
}

General

This code is based on the open-sourced VariBad repository of Zintgraf et al. - https://github.com/lmzintgraf/varibad. For a general overview of the repository, we refer the reader to the original VariBad repository.

Requirements

The requirements can be found in requirements.txt. One can create a suitable conda environment with:

   conda create -n mbrl2 python=3.7
   conda activate mbrl2
   pip install -r requirements.txt

Dream Environment Options

Besides the config options introduced in the VariBad repo, this repository adds the following options (an example command using some of them follows the list):

  1. env_num_train_goals, env_num_eval_goals - number of training and evaluation environments
  2. num_dream_envs - number of dream environment processes
  3. use_kde - use KDE to sample new latents; if false, the learned prior is used
  4. use_mixup - use the Mixup technique to sample new latents instead of regular KDE
  5. delay_dream - number of iterations to delay the initialization of the dream environments by
  6. update_kde_interval - iteration interval between KDE updates
  7. kde_from_train - create the KDE using an oracle policy
  8. kde_from_running_latents - use a pool of latents gathered during training for the dream environment estimation
  9. freeze_vae - don't train the VAE (only the policy)
  10. delayed_freeze - stop the VAE training after a given number of iterations
  11. train_vae_on_dream - train the VAE to reconstruct the reward over the dream environments
  12. clone_dream_vae - use a separate VAE for the dream environments
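
These flags are passed to main.py in the same way as the reproduction commands below. As an illustrative sketch (not one of the paper's experiments; the experiment name, interval and delay values here are placeholders), a KDE dream-environment run could look like:

    python main.py --exp_name kde_dream_example --env_type pointrobot_varibad \
                   --env_num_train_goals 20 --num_dream_envs 4 \
                   --update_kde_interval 10 --delay_dream 100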

Reproducing Results

In order to reproduce the experiments shown in the paper:

  1. For the 20 real training environments and 4 dream environments experiment:

    python main.py --exp_name 20_train_4_kde_dream --env_type pointrobot_varibad \
                   --env_num_train_goals 20 --num_dream_envs 4
  2. For the 30 real training environments and 6 dream environments experiment:

    python main.py --exp_name 30_train_6_kde_dream --env_type pointrobot_varibad \
                   --env_num_train_goals 30 --num_dream_envs 6

To use Mixup dream environments instead of KDE, add the --use_mixup flag.

To reproduce the exact figures from the paper, one needs to run all the seeds specified in utils/plot_helpers.py (for a specific experiment) and then run utils/plot_helpers.py.

For example, to reproduce the 30 real training environments experiment (VariBad vs VariBad dream) run seeds:

seeds = [3, 13, 23, 33, 43, 53, 63, 73, 83, 93, 103, 200, 201, 202, 203]
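
As a sketch, one way to launch all of these seeds is a simple shell loop. The --seed argument is assumed to be inherited from the underlying VariBad configuration, so double-check the argument name in the repository's config files before running:

    for seed in 3 13 23 33 43 53 63 73 83 93 103 200 201 202 203; do
        python main.py --exp_name 30_train_6_kde_dream --env_type pointrobot_varibad \
                       --env_num_train_goals 30 --num_dream_envs 6 --seed $seed
    done

Once all runs finish, the figure is generated by running python utils/plot_helpers.py, as noted above.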

Contact

Zohar Rimon - zohar.rimon@gmail.com

About

Model Based Regularization for Bayesian Reinforcement Learning
