
Mountain Car Continuous

This repository contains implementations of algorithms that solve (or attempt to solve) the continuous mountain car problem, which has continuous state and action spaces. The environment is provided by OpenAI Gym (MountainCarContinuous-v0). The code in this repo uses the TensorFlow 1.1 library.
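
For reference, a random-policy episode in this environment can be run as follows. This is a minimal sketch assuming the classic gym reset/step API of that era, not code from this repository:

```python
import gym

# Minimal sketch (not code from this repo): a single random-policy
# episode in the continuous mountain car environment, using the
# classic gym API (reset returns the state; step returns 4 values).
env = gym.make("MountainCarContinuous-v0")
state = env.reset()  # state is (position, velocity)
total_reward, done = 0.0, False
while not done:
    action = env.action_space.sample()  # a 1-D continuous force
    state, reward, done, info = env.step(action)
    total_reward += reward
print("total reward:", total_reward)
```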

The following algorithms are implemented:

REINFORCE with Stochastic Policy Gradient:

  • located in the rl.reinforce module

  • uses a Gaussian Policy Gradient

  • mu and sigma are both learned (sigma does not depend on the state, while mu does)

  • the features for mu are learned by an MLP with two tanh layers of 128 units each, given the raw state

  • sigma is parameterized by a separate vector of learned weights, adjusted using the same Gaussian policy gradient; the vector is summed to a scalar, and the exponential of that scalar is used as sigma (the standard deviation); see the code sketch after this list

  • the optimization is done in minibatches, and the batches are shuffled

  • the class of interest is rl.reinforce.agent.MLPStochasticPolicyAgent

  • the entry point is the simulator.py script in the rl.reinforce module

  • the repository includes a plot of the average total reward for each episode over 10 trials

(MountainCarContinuous-v0 defines "solving" as getting an average reward of 90.0 over 100 consecutive trials.)
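
To make the policy parameterization above concrete, here is a minimal TensorFlow 1.x sketch of such a Gaussian policy and its REINFORCE loss. All names are illustrative; this is not the code of rl.reinforce.agent.MLPStochasticPolicyAgent, just one way the described setup can be written:

```python
import numpy as np
import tensorflow as tf

# Hypothetical sketch (illustrative names; not the repo's actual code).
# The state is 2-D (position, velocity) and the action is a 1-D force.
state = tf.placeholder(tf.float32, [None, 2], name="state")
action = tf.placeholder(tf.float32, [None, 1], name="action")
ret = tf.placeholder(tf.float32, [None, 1], name="return")  # discounted return

def dense(x, n_out, name):
    # plain fully-connected layer
    n_in = x.get_shape().as_list()[1]
    w = tf.get_variable(name + "_w", [n_in, n_out],
                        initializer=tf.random_normal_initializer(stddev=0.1))
    b = tf.get_variable(name + "_b", [n_out], initializer=tf.zeros_initializer())
    return tf.matmul(x, w) + b

# mu depends on the state: an MLP with two tanh layers of 128 units
h1 = tf.tanh(dense(state, 128, "h1"))
h2 = tf.tanh(dense(h1, 128, "h2"))
mu = dense(h2, 1, "mu")

# sigma does not depend on the state: a learned parameter vector is
# summed to a scalar, whose exponential is the standard deviation
sigma_params = tf.get_variable("sigma_params", [8],
                               initializer=tf.zeros_initializer())
sigma = tf.exp(tf.reduce_sum(sigma_params))

# Gaussian log-likelihood of the sampled action; REINFORCE maximizes
# the return-weighted log-probability, so we minimize its negation
log_prob = (-0.5 * tf.log(2.0 * np.pi * sigma ** 2)
            - (action - mu) ** 2 / (2.0 * sigma ** 2))
loss = -tf.reduce_mean(log_prob * ret)
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)
```

Training would then repeatedly feed shuffled minibatches of (state, action, discounted return) tuples into train_op, matching the minibatch scheme described above.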
