cartpole-ppo-ai

Description

The Jupyter Notebook trains and evaluates an agent in the CartPole-v0 (OpenAI Gym) environment using the Proximal Policy Optimization (PPO) algorithm.
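
As a rough illustration of what PPO optimizes, the sketch below computes the commonly used clipped surrogate objective in plain Python. The function name `ppo_clip_objective`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for illustration; the notebook's own implementation may differ.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of samples.

    Hypothetical helper for illustration only, not taken from the notebook.
    Each argument is a list of floats with one entry per sampled action.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The objective is maximized during training; clipping removes the incentive to move the new policy far from the one that collected the data.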

A reward of +1 is provided for every step taken, including the termination step. The state space has 4 dimensions: cart position, cart velocity, pole angle, and pole velocity at the tip. Given this information, the agent has to learn how to select the best actions. Two discrete actions are available, corresponding to:

0 - 'Push cart to the left'
1 - 'Push cart to the right'

For more details about the CartPole environment, see https://github.com/openai/gym/wiki/CartPole-v0
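
The interaction described above can be sketched as a simple episode loop. This is a minimal sketch assuming the classic Gym API, where `reset()` returns the state and `step(action)` returns `(state, reward, done, info)`; newer Gym and Gymnasium releases use different return values. The names `random_policy` and `run_episode` are hypothetical and do not come from the notebook.

```python
import random


def random_policy(state):
    """Ignore the state and pick 0 (push left) or 1 (push right) at random."""
    return random.choice([0, 1])


def run_episode(env, policy, max_steps=200):
    """Run one episode and return the total reward collected.

    Assumes the classic Gym interface:
    reset() -> state, step(action) -> (state, reward, done, info).
    """
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done, _info = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


# Usage with Gym installed (not executed here):
#   import gym
#   env = gym.make("CartPole-v0")
#   print(run_episode(env, random_policy))
```

A trained PPO agent would replace `random_policy` with a function that samples an action from its learned policy given the 4-dimensional state.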
