Reinforcement Learning (REINFORCE Actor-Critic) on the CartPole Environment
Reinforcement Learning using Policy Gradient

Algorithm: REINFORCE Actor-Critic
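As a rough illustration of the idea behind this algorithm (not the repository's actual code, which is not reproduced here), the sketch below computes the two quantities such a method is usually built on: the discounted return G_t of each step in an episode, and the advantage A_t = G_t - V(s_t), where V(s_t) is the critic's value estimate. In the common formulation, the actor's log-probabilities are weighted by A_t and the critic is regressed toward G_t. The function names, the discount factor, and the critic values are assumptions made up for this example.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(returns, value_estimates):
    """Advantage A_t = G_t - V(s_t): how much the rollout beat the critic's estimate."""
    return [g - v for g, v in zip(returns, value_estimates)]


# Toy three-step episode with a CartPole-style reward of 1 per step.
rewards = [1.0, 1.0, 1.0]
critic_values = [1.5, 1.5, 0.5]  # made-up critic outputs V(s_t), for illustration only
returns = discounted_returns(rewards, gamma=0.5)
adv = advantages(returns, critic_values)
print(returns)  # [1.75, 1.5, 1.0]
print(adv)      # [0.25, 0.0, 0.5]
```

A positive advantage means the action led to a better outcome than the critic predicted, so its probability is pushed up; a negative one pushes it down.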

OpenAI Gym environment: CartPole

Requirements:

TensorFlow 1.0.1

OpenAI Gym 0.81

Run Command:

python policyGradient.py

Output:

A reward monitor window appears. The model improves with each episode, which you can see in the monitor as the total reward per episode increases.

Note: The total reward converges to 200 because CartPole gives a reward of 1 per step and, in OpenAI Gym 0.81, terminates each rollout after 200 steps.
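The cap can be illustrated with a small toy model of a rollout. This is a minimal sketch that does not call Gym at all: `episode_total_reward` and its parameters are hypothetical names chosen for this example, and the 200-step limit and the reward of 1 per step are taken from the note above.

```python
def episode_total_reward(steps_until_fall, max_steps=200, reward_per_step=1.0):
    """Total reward of one rollout that ends when the pole falls or the step limit is hit."""
    total = 0.0
    for step in range(max_steps):
        if step >= steps_until_fall:
            break  # the pole fell before the step limit
        total += reward_per_step
    return total


print(episode_total_reward(37))     # weak policy: pole falls after 37 steps
print(episode_total_reward(10000))  # strong policy: stopped by the step limit
```

However long the policy could keep the pole balanced, the total reward of a rollout cannot exceed the step limit, which is why the curve flattens at 200.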
