In this repo I explore the Sarsa, Sarsamax (Q-learning), and Expected Sarsa methods to solve the RL task CliffWalking-v0 from OpenAI Gym.
All three algorithms are implemented in the Jupyter notebook Temporal_Difference.ipynb. Running all of its cells trains an agent to solve the environment with each of the three methods (Sarsa, Sarsamax, Expected Sarsa).
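
The three methods differ only in how they build the TD target for the next state. The sketch below illustrates that difference on a tabular Q array; it is not taken from the notebook. The function names (`epsilon_greedy_probs`, `td_target`, `td_update`) and the hyperparameter values are assumptions chosen for illustration, and the policy is assumed to be epsilon-greedy.

```python
import numpy as np


def epsilon_greedy_probs(q_row, epsilon):
    """Action probabilities of an epsilon-greedy policy for one state."""
    n_actions = len(q_row)
    probs = np.full(n_actions, epsilon / n_actions)
    probs[np.argmax(q_row)] += 1.0 - epsilon
    return probs


def td_target(method, Q, reward, next_state, next_action, gamma, epsilon):
    """Return the TD target used by each of the three methods."""
    if method == "sarsa":
        # On-policy: value of the action actually selected in the next state.
        return reward + gamma * Q[next_state, next_action]
    if method == "sarsamax":
        # Q-learning: value of the greedy action in the next state.
        return reward + gamma * np.max(Q[next_state])
    if method == "expected_sarsa":
        # Expectation of the next-state values under the epsilon-greedy policy.
        probs = epsilon_greedy_probs(Q[next_state], epsilon)
        return reward + gamma * np.dot(probs, Q[next_state])
    raise ValueError(f"unknown method: {method}")


def td_update(Q, state, action, target, alpha):
    """Move Q[state, action] a step of size alpha toward the target."""
    Q[state, action] += alpha * (target - Q[state, action])
    return Q


# CliffWalking-v0 has 48 states (a 4 x 12 grid) and 4 actions.
Q = np.zeros((48, 4))
Q[1] = [1.0, 2.0, 3.0, 4.0]  # made-up values for the next state

targets = {
    method: td_target(method, Q, reward=-1.0, next_state=1,
                      next_action=0, gamma=1.0, epsilon=0.1)
    for method in ("sarsa", "sarsamax", "expected_sarsa")
}
Q = td_update(Q, state=0, action=2, target=targets["sarsamax"], alpha=0.5)
```

In a full training loop the next action for Sarsa would be sampled from the same epsilon-greedy policy, and the Q values of a terminal state stay at zero so that its target reduces to the reward alone.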
To use this code you need to install the following packages:
- numpy
- jupyter
- matplotlib
- seaborn
- OpenAI Gym

This project is licensed under the GNU General Public License v3.0.