Task-Specific Exploration in Meta-Reinforcement Learning via Task Reconstruction

Code base for the Latent Space Exploration via Task Reconstruction (LaSER) model. Contains scripts for meta-training and meta-testing LaSER on the MEWA, Meta-World, and HopperMass benchmarks.

Instructions

We recommend running the code through Docker.

Prerequisites

Docker
NVIDIA CUDA

Wandb Logging (Optional)

If you wish to log your runs to wandb, create a file named docker/.env and add your API key inside: WANDB_API_KEY=<YOUR_WANDB_API_KEY>

If not using wandb, run your scripts with the --no-wandb parameter.
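The file described above can be created from a terminal. This is a minimal sketch, assuming it is run from the repository root and that the key is available in the WANDB_API_KEY environment variable; if the variable is unset, the placeholder from this README is written instead and must be edited by hand. The chmod step is an optional precaution, not something the repository requires.

```shell
# Assumption: the current directory is the repository root.
mkdir -p docker

# Write the single line expected in docker/.env.
# Falls back to the README placeholder if WANDB_API_KEY is not set.
printf 'WANDB_API_KEY=%s\n' "${WANDB_API_KEY:-<YOUR_WANDB_API_KEY>}" > docker/.env

# Optional: make the key readable only by the current user.
chmod 600 docker/.env
```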

Create Docker Image

docker compose -f docker/docker-compose.yml up --build

Run Docker Container

To meta-train LaSER's encoder, exploration policy and task policy on the MEWA benchmark, use:

docker compose -f docker/docker-compose.yml run --rm run_laser --env-type mewa

Note: The benchmark and any of LaSER's hyperparameters defined in the files of the config directory can be overridden by passing them as CLI arguments to the command above.

To run on Meta-World or HopperMass, set --env-type to metaworld_ml10 or mujoco, respectively.

To replicate the results from Sec. 5.3 of the paper, run on MEWA with oracle contexts and a fixed set of tasks by adding the arguments --ablation_true_task True --ablation_fixed_tasks True to the MEWA command above.
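The benchmark and ablation variants differ only in the arguments passed to run_laser. As a convenience, the sketch below assembles the corresponding command lines. The helper name laser_cmd is hypothetical and not part of the repository; the compose file, service name, and flags are the ones given in this README. The helper only prints commands, so nothing is launched until a printed line is executed.

```shell
# Hypothetical helper (not part of the repository): prints the docker compose
# command for a benchmark, followed by any extra CLI arguments.
laser_cmd() {
  echo "docker compose -f docker/docker-compose.yml run --rm run_laser --env-type $*"
}

# One command per benchmark: MEWA, Meta-World, and HopperMass.
for benchmark in mewa metaworld_ml10 mujoco; do
  laser_cmd "$benchmark"
done

# Sec. 5.3 ablation on MEWA: oracle contexts and a fixed set of tasks.
laser_cmd mewa --ablation_true_task True --ablation_fixed_tasks True
```

To start a run, copy one of the printed lines into the terminal.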

Decoupled running

LaSER's pre-training phase, which optimizes only the encoder and exploration policy, can be run on its own:

docker compose -f docker/docker-compose.yml run --rm run_laser --env-type mewa --no-task-train

Afterward, the task policy optimization phase can be run as:

docker compose -f docker/docker-compose.yml run --rm run_laser --env-type mewa --no-exp-train --save-path <PRE_TRAIN_SAVE_PATH>

where <PRE_TRAIN_SAVE_PATH> points to a results directory containing the pre-trained models for the encoder and exploration policy.

If running ablations with oracle contexts, the pre-training phase is automatically skipped, and there is no need to set the --save-path argument.

Acknowledgments

LaSER was built on top of several open-source repositories: garage, VariBAD, and TrMRL.
