# Meta Reinforcement Learning with Latent Variable Gaussian Processes

## Description

Implementation of *Meta Reinforcement Learning with Latent Variable Gaussian Processes*. The repository includes:

  • PILCO with (variational) sparse Gaussian processes, built on GPflow
  • A latent variable Gaussian process model (ML-GP) for meta-learning
  • The cartpole swing-up task from the paper (note: requires DartEnv)
  • The pendulum task from OpenAI Gym

See the paper for details.
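As a rough illustration of the ML-GP idea (a sketch of the concept, not this repository's API): the dynamics model conditions on a per-task latent variable by augmenting each state-action input with that task's embedding before Gaussian process regression. A minimal NumPy sketch, with all data and names hypothetical:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_predict_mean(X_train, y_train, X_test, noise=1e-2):
    """Posterior mean of standard GP regression."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_test, X_train)
    return K_s @ np.linalg.solve(K, y_train)

# Hypothetical data: two tasks, each contributing (state, action) transitions.
rng = np.random.default_rng(0)
sa = rng.normal(size=(10, 3))           # state-action inputs
task_id = np.array([0] * 5 + [1] * 5)   # task index of each transition
H = np.array([[0.0], [1.0]])            # per-task latent embeddings
                                        # (inferred jointly in the paper)

# Augment each input with its task's latent variable, then regress as usual.
X = np.hstack([sa, H[task_id]])
y = rng.normal(size=10)
mean = gp_predict_mean(X, y, X[:2])     # predictions are task-aware via H
```

In the actual model the embeddings are latent variables inferred variationally alongside the sparse GP, which is what allows a new task to be identified from just a few observations.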

Core requirements:

  • TensorFlow (v1.12.0)
  • GPflow (v1.3.0)
  • OpenAI Gym (v0.10.8)
  • (Optional) DartEnv

See requirements.txt for other dependencies.
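Assuming a pip-based setup, the pinned versions above could be installed with something like:

```shell
# Hypothetical install command; versions match the pins listed above.
# DartEnv is optional and is not on PyPI, so it is not included here.
pip install tensorflow==1.12.0 gpflow==1.3.0 gym==0.10.8
```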

## Cartpole Swingup

*(Plot: success rate vs. number of trials)*

Figure 1: Average success rate on the training and test tasks per trial (SGP-I: independent training; ML-GP: meta-learning model). The ML-GP model solves 3/4 test tasks in the first trial.

## Latent Task Embedding

Figure 2: Live inference of the latent task variable for test tasks. Each frame corresponds to one additional observation from a previously unseen task. Labels indicate the mass and length of the pendulum in the cartpole swing-up tasks.

## Run example

```shell
# Additional keyword arguments are available; see run_pilco.py.
python run_pilco.py \
  --env=PendulumEnv \
  --seed=1 \
  --model_name=MLSVGP \
  --episode_length=50 \
  --planning_horizon=10
```