Cliff walking reinforcement learning example, with a variety of RL algorithms

woutervanheeswijk/cliff_walking_public

Cliff Walking

This project demonstrates a number of common reinforcement learning (RL) algorithms, applied to Sutton & Barto's cliff walking problem. The aim is to aid understanding of RL mechanisms in a comprehensible environment. For this purpose, the code is relatively integrated and hard-coded. I intermittently add new algorithms and refactor the code.

The project currently contains the following algorithms:

  • Q-learning
  • SARSA
  • Deep Q-learning
  • Discrete policy gradient
  • Deep policy gradient
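
To make the difference between the first two entries concrete, here is a minimal tabular sketch of Q-learning and SARSA on a cliff walking grid. It is not the code from this repository: the 4x12 layout, the -1 and -100 rewards, and the reset to the start after a fall follow the usual textbook setup, and all names (`step`, `train`, `greedy_path`) and hyperparameter values are assumptions made for illustration.

```python
import random

ROWS, COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Move one cell; stepping onto the cliff costs -100 and resets to START."""
    d_row, d_col = ACTIONS[action]
    row = min(max(state[0] + d_row, 0), ROWS - 1)
    col = min(max(state[1] + d_col, 0), COLS - 1)
    if row == 3 and 0 < col < 11:  # cliff cells on the bottom row
        return START, -100, False
    return (row, col), -1, (row, col) == GOAL


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = q[state]
    return values.index(max(values))


def train(algorithm, episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Train a Q-table with either "q_learning" or "sarsa"."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state, done = START, False
        action = epsilon_greedy(q, state, epsilon, rng)
        while not done:
            next_state, reward, done = step(state, action)
            if algorithm == "sarsa":
                # On-policy: bootstrap from the action that will actually be taken.
                next_action = epsilon_greedy(q, next_state, epsilon, rng)
                target = q[next_state][next_action]
            else:
                # Off-policy: bootstrap from the best action in the next state.
                target = max(q[next_state])
            # q[GOAL] is never updated, so the terminal target stays 0.
            q[state][action] += alpha * (reward + gamma * target - q[state][action])
            if algorithm != "sarsa":
                next_action = epsilon_greedy(q, next_state, epsilon, rng)
            state, action = next_state, next_action
    return q


def greedy_path(q, max_steps=100):
    """Follow the greedy policy from START, for at most max_steps moves."""
    state, path = START, [START]
    for _ in range(max_steps):
        values = q[state]
        state, _, done = step(state, values.index(max(values)))
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    for name in ("q_learning", "sarsa"):
        print(name, greedy_path(train(name)))
```

The two algorithms differ only in the bootstrap target. In the textbook treatment of this problem, that difference tends to make Q-learning learn the path along the cliff edge, while SARSA tends to prefer a safer detour because its target accounts for exploratory moves.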

Neural network approaches are incorporated using TensorFlow.
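
The discrete policy gradient variant needs no neural network. As a rough illustration of the idea, the sketch below applies a REINFORCE-style update to a table of softmax action preferences. It is a generic sketch under stated assumptions, not the implementation in this repository: the function names, the `theta` table layout, the learning rate, and the omission of a baseline are all choices made here for brevity.

```python
import math


def softmax(preferences):
    """Turn action preferences into probabilities (numerically stable)."""
    top = max(preferences)
    exps = [math.exp(p - top) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_update(theta, episode, alpha=0.1, gamma=1.0):
    """Apply one REINFORCE update to a tabular softmax policy.

    theta:   dict mapping state -> list of action preferences (hypothetical layout)
    episode: list of (state, action, reward) tuples in time order
    """
    # Accumulate gradients first so every step uses the same policy parameters.
    delta = {state: [0.0] * len(prefs) for state, prefs in theta.items()}
    ret = 0.0
    for state, action, reward in reversed(episode):
        ret = reward + gamma * ret  # return from this step onward
        probs = softmax(theta[state])
        for a, prob in enumerate(probs):
            # Gradient of log softmax: indicator(a == action) - pi(a | state)
            grad = (1.0 if a == action else 0.0) - prob
            delta[state][a] += ret * grad
    for state, grads in delta.items():
        for a, g in enumerate(grads):
            theta[state][a] += alpha * g
    return theta


if __name__ == "__main__":
    theta = {"s": [0.0, 0.0]}
    reinforce_update(theta, [("s", 0, 1.0)])
    print(softmax(theta["s"]))
```

An action followed by a positive return becomes more likely, and one followed by a negative return becomes less likely. In cliff walking every reward is negative, so in practice a baseline is often subtracted from the return to reduce variance; that refinement is left out here.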

My series of blog posts at Towards Data Science provides descriptions and interpretations of the implemented algorithms and their results:

  • Q-learning and SARSA
  • Monte Carlo learning
  • Discrete Policy Gradient
  • Deep Q-Learning
  • Deep Policy Gradient
