RStoican/LaSER

Task-Specific Exploration in Meta-Reinforcement Learning via Task Reconstruction

Code base for the Latent Space Exploration via Task Reconstruction (LaSER) model. Contains scripts for meta-training and meta-testing LaSER on the MEWA, Meta-World, and HopperMass benchmarks.

Instructions

We recommend running the code through Docker.

Prerequisites

  1. Docker
  2. NVIDIA CUDA

Wandb Logging (Optional)

To log your runs to wandb, create a file named docker/.env and add your API key to it: WANDB_API_KEY=<YOUR_WANDB_API_KEY>
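A minimal sketch of creating that file from the repository root. It assumes the key is already exported in your shell as WANDB_API_KEY; if it is not, the placeholder is written instead and needs to be replaced by hand.

```shell
# Run from the repository root.
# Assumption: WANDB_API_KEY is already exported in the current shell.
# If it is unset, the literal placeholder is written and must be edited manually.
mkdir -p docker
printf 'WANDB_API_KEY=%s\n' "${WANDB_API_KEY:-<YOUR_WANDB_API_KEY>}" > docker/.env
chmod 600 docker/.env  # keep the key readable by your user only
```

Since the file holds a secret, it is usually best kept out of version control.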

If not using wandb, run your scripts with the --no-wandb parameter.

Create Docker Image

docker compose -f docker/docker-compose.yml up --build

Run Docker Container

To meta-train LaSER's encoder, exploration policy and task policy on the MEWA benchmark, use:

docker compose -f docker/docker-compose.yml run --rm run_laser --env-type mewa

Note: The benchmark and any of LaSER's hyperparameters defined in the files of the config directory can be overridden by passing them as CLI arguments to the command above.

To run on Meta-World or HopperMass, set --env-type to metaworld_ml10 or mujoco, respectively.
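As a quick reference, the loop below prints the run command for each of the three --env-type values named above. It is a dry-run sketch, not part of the repository: remove the leading echo to launch the runs, which would then execute one after another.

```shell
# Dry run: print the command for each benchmark instead of executing it.
# mewa -> MEWA, metaworld_ml10 -> Meta-World, mujoco -> HopperMass
for env_type in mewa metaworld_ml10 mujoco; do
  echo docker compose -f docker/docker-compose.yml run --rm run_laser --env-type "$env_type"
done
```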

To replicate the results from the paper's Sec. 5.3, run on MEWA using oracle contexts and a fixed set of tasks by setting the arguments --ablation_true_task True --ablation_fixed_tasks True
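Put together with the MEWA run command, the ablation could be wrapped as follows. The function name laser_ablation is a hypothetical convenience for this sketch and does not exist in the repository; the flags are the ones listed above.

```shell
# Hypothetical wrapper (not part of the repository) around the MEWA run command
# with the two ablation flags set. Extra arguments are forwarded unchanged.
laser_ablation() {
  docker compose -f docker/docker-compose.yml run --rm run_laser \
    --env-type mewa \
    --ablation_true_task True \
    --ablation_fixed_tasks True \
    "$@"
}
```

For example, laser_ablation --no-wandb would run the same ablation without wandb logging.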

Decoupled running

LaSER's pre-training phase, which optimizes only the encoder and exploration policy, can be run on its own:

docker compose -f docker/docker-compose.yml run --rm run_laser --env-type mewa --no-task-train

Afterward, the task policy optimization phase can be run as:

docker compose -f docker/docker-compose.yml run --rm run_laser --env-type mewa --no-exp-train --save-path <PRE_TRAIN_SAVE_PATH> 

where <PRE_TRAIN_SAVE_PATH> points to a results directory containing pre-trained models for the encoder and exploration policy.

If running ablations with oracle contexts, the pre-training phase is automatically skipped, and there is no need to set the --save-path argument.

Acknowledgments

LaSER builds on several open-source repositories: garage, VariBAD, and TrMRL.
