The reimplementation of Model Predictive Path Integral (MPPI) from the paper "Information Theoretic MPC for Model-Based Reinforcement Learning" (Williams et al., 2017) for the pendulum OpenAI Gym environment

ferreirafabio/mppi_pendulum

MPPI implementation with the OpenAI gym pendulum environment

This repository implements Model Predictive Path Integral (MPPI) control as introduced in the paper "Information Theoretic MPC for Model-Based Reinforcement Learning" (Williams et al., 2017), using the OpenAI Gym pendulum environment as the forward model.
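To illustrate what "forward model" means in this setting, the sketch below rolls a control sequence through a hand-written pendulum step function and accumulates the cost. The constants, the dynamics, and the cost are assumptions modeled loosely on the classic Gym pendulum task, not code taken from this repository or from Gym, and the names `pendulum_step` and `rollout_cost` are hypothetical.

```python
import numpy as np

# Assumed constants, loosely following the classic Gym pendulum task.
G, M, L, DT = 10.0, 1.0, 1.0, 0.05
MAX_TORQUE, MAX_SPEED = 2.0, 8.0


def angle_normalize(theta):
    """Wrap an angle to the interval [-pi, pi)."""
    return ((theta + np.pi) % (2.0 * np.pi)) - np.pi


def pendulum_step(state, u):
    """Advance (theta, theta_dot) by one Euler step; theta = 0 is upright.

    Returns the next state and the cost incurred in the current state.
    """
    theta, theta_dot = state
    u = float(np.clip(u, -MAX_TORQUE, MAX_TORQUE))
    cost = angle_normalize(theta) ** 2 + 0.1 * theta_dot ** 2 + 0.001 * u ** 2
    accel = 3.0 * G / (2.0 * L) * np.sin(theta) + 3.0 / (M * L ** 2) * u
    theta_dot = float(np.clip(theta_dot + accel * DT, -MAX_SPEED, MAX_SPEED))
    theta = theta + theta_dot * DT
    return (theta, theta_dot), cost


def rollout_cost(state, controls):
    """Sum the step costs obtained by applying `controls` from `state`."""
    total = 0.0
    for u in controls:
        state, cost = pendulum_step(state, u)
        total += cost
    return total
```

MPPI only needs this kind of interface: given a start state and a candidate control sequence, return the total trajectory cost. In the repository that role is played by the Gym environment itself.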

Requirements

  • OpenAI Gym
  • numpy

Gists of the paper

The paper derives an optimal control law as a (noise-) weighted average over sampled trajectories. In particular, the optimization problem is posed to compute the control input such that the controlled distribution Q is pushed as close as possible to the optimal distribution Q*. This corresponds to minimizing the KL divergence between Q and Q*.

The gists from the paper:

  • the noise assumption v_t ~ N(u_t, Σ) stems from noise in low-level controllers

  • the noise term can be pulled out of the Monte-Carlo approximation (η) equation and neatly interpreted as a weight for the MC samples in the iterative update law

  • given the optimal control input distribution Q*, the optimal control is derived as u*_t = ∫ q*(V) v_t dV

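  • computing the integral directly is not possible since q* is unknown; instead, importance sampling is used to sample from the proposal distribution q:

    u*_t = ∫ q(V) w(V) v_t dV = E_Q[w(V) v_t],  with w(V) = q*(V) / q(V)

    where the normalizing constant of the importance weight w(V) can be approximated by the Monte-Carlo estimate given in Algorithm 2 as η, yielding:

    u_t^(i+1) = u_t^(i) + sum_{n=1..N} w(E_n) ε_t^n

    (E_n is the n-th sampled noise sequence and ε_t^n its value at time t), which resembles an iterative procedure to improve the MC estimate by using a more accurate importance sampler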
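The weighted-average update described above can be sketched in a few lines of numpy. This is a minimal illustration under stated assumptions, not the repository's implementation: the function names `mppi_weights` and `mppi_update` are hypothetical, the temperature `lam` and the toy quadratic cost are chosen only for the example, and the control-cost term that the paper includes in the trajectory cost is omitted.

```python
import numpy as np


def mppi_weights(costs, lam=1.0):
    """Weights exp(-(S_k - beta) / lam) / eta over K rollout costs S_k."""
    costs = np.asarray(costs, dtype=float)
    beta = costs.min()                      # subtract the minimum for numerical stability
    unnormalized = np.exp(-(costs - beta) / lam)
    eta = unnormalized.sum()                # Monte-Carlo estimate of the normalizer
    return unnormalized / eta


def mppi_update(u, noise, costs, lam=1.0):
    """Return u_t + sum_k w_k * eps_t^k for every time step t.

    u:     (T,) nominal control sequence
    noise: (K, T) perturbations eps^k sampled around u
    costs: (K,) total cost of each perturbed rollout
    """
    w = mppi_weights(costs, lam)
    return u + w @ noise


rng = np.random.default_rng(0)
T, K = 5, 200
u = np.zeros(T)
noise = rng.normal(0.0, 1.0, size=(K, T))
# Toy cost (an assumption): squared distance of each perturbed sequence from 1.0.
costs = ((u + noise - 1.0) ** 2).sum(axis=1)
u_new = mppi_update(u, noise, costs, lam=1.0)
```

Low-cost rollouts receive exponentially larger weights, so the nominal sequence moves toward the perturbations that performed well. Repeating the update with noise sampled around the new sequence gives the iterative procedure mentioned in the last bullet.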
