README.md
gym_experiment.py
policy_gradient.py
test_pg.py
tf_util.py

tensorflow-policy-gradient

Still under construction...

Dependencies

  • Python 2.7
  • TensorFlow >= 0.8.0
  • NumPy >= 1.10.0
  • OpenAI Gym
  • matplotlib

Quick try

Run

python gym_experiment.py

to train a softmax policy (without a bias term) with vanilla policy gradient on the CartPole task. The return per episode is noisy but trends upward during training until it reaches the maximum (200).
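
For readers who want the idea without opening the source, here is a minimal NumPy sketch of a bias-free softmax policy and a single vanilla policy gradient (REINFORCE) update. It is not the code in policy_gradient.py, which uses TensorFlow. The function names, the discount factor gamma, and the learning rate lr are assumptions made for illustration only.

```python
import numpy as np


def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()


def action_probs(theta, state):
    # Linear softmax policy without bias: logits = state . theta,
    # where theta has shape (state_dim, n_actions).
    return softmax(state.dot(theta))


def grad_log_prob(theta, state, action):
    # Gradient of log pi(action | state) with respect to theta:
    # outer(state, onehot(action) - pi(. | state)).
    probs = action_probs(theta, state)
    onehot = np.zeros_like(probs)
    onehot[action] = 1.0
    return np.outer(state, onehot - probs)


def discounted_returns(rewards, gamma):
    # Return-to-go G_t = r_t + gamma * G_{t+1}, computed backwards.
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_update(theta, states, actions, rewards, gamma=0.99, lr=0.01):
    # One gradient ascent step on the episode's return estimate.
    # gamma and lr are hypothetical defaults, not values from the repository.
    returns = discounted_returns(rewards, gamma)
    grad = np.zeros_like(theta)
    for s, a, g in zip(states, actions, returns):
        grad += g * grad_log_prob(theta, s, a)
    return theta + lr * grad
```

To use the sketch, collect the states, actions, and rewards of one episode by sampling actions from action_probs, then call reinforce_update once per episode. Because theta has no bias row, the policy is uniform whenever the state vector is all zeros.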