Value Iteration, Policy Iteration, and Q-Learning in Frozen lake gym env

yahsiuhsieh/frozen-lake

Frozen Lake

Value iteration, policy iteration, and Q-learning in the Frozen Lake gym environment.

The goal of this game is to go from the starting state (S) to the goal state (G) by walking only on frozen tiles (F) and avoiding holes (H). However, the ice is slippery, so you won't always move in the direction you intend (a stochastic environment).
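To make the slippery dynamics concrete, here is a minimal sketch of the transition model in plain Python. It assumes the standard 4x4 map and the usual Frozen Lake slip rule, where the intended direction and its two perpendicular directions are each taken with probability 1/3. The README does not state the map size, and the names `transitions` and `step_deterministic` are illustrative, not taken from this repository.

```python
# Assumed layout: the standard 4x4 Frozen Lake map (not stated in the README).
MAP = ["SFFF",
       "FHFH",
       "FFFH",
       "HFFG"]
N_ROWS, N_COLS = len(MAP), len(MAP[0])
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3  # action encoding used by gym's FrozenLake
MOVES = {LEFT: (0, -1), DOWN: (1, 0), RIGHT: (0, 1), UP: (-1, 0)}


def step_deterministic(state, direction):
    """Move one tile in `direction`, staying in place at the grid border."""
    row, col = divmod(state, N_COLS)
    d_row, d_col = MOVES[direction]
    row = min(max(row + d_row, 0), N_ROWS - 1)
    col = min(max(col + d_col, 0), N_COLS - 1)
    return row * N_COLS + col


def transitions(state, action):
    """Return [(probability, next_state, reward, done), ...] for a slippery step."""
    row, col = divmod(state, N_COLS)
    if MAP[row][col] in "HG":  # holes and the goal are terminal
        return [(1.0, state, 0.0, True)]
    outcomes = []
    # Slipping: the intended direction and its two perpendicular
    # directions are each taken with probability 1/3.
    for direction in ((action - 1) % 4, action, (action + 1) % 4):
        nxt = step_deterministic(state, direction)
        tile = MAP[nxt // N_COLS][nxt % N_COLS]
        outcomes.append((1 / 3, nxt, float(tile == "G"), tile in "HG"))
    return outcomes


# Trying to move right from the start tile can also slide the agent down or up.
for prob, nxt, reward, done in transitions(0, RIGHT):
    print(f"p={prob:.2f} next_state={nxt} reward={reward} done={done}")
```

States are numbered row by row from the top-left corner, so state 0 is the start tile and state 15 is the goal.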

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Built With

  • Python 3.6.10

  • gym >= 0.15.4

  • numpy >= 1.16.2

  • matplotlib >= 3.1.1

Code Organization

.
├── src                     # Python scripts
│   ├── value_iteration.py  # VI algorithm
│   ├── policy_iteration.py # PI algorithm
│   ├── q_learning.py       # Q-learning algorithm
│   └── utils.py            # Utility functions
├── images                  # Results
└── README.md

Tests

Three methods are available: policy iteration, value iteration, and Q-learning. Each lives in the script of the same name.

For example, to try policy iteration, run:

python policy_iteration.py
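
For readers who want to see what one of these methods does before running the scripts, the following is a rough, dependency-free sketch of value iteration on the slippery 4x4 lake. It is not the code in `value_iteration.py`. The map layout, the discount factor `gamma`, the tolerance `tol`, and all function names are assumptions made for illustration.

```python
MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]  # assumed standard 4x4 layout
SIZE = 4
N_STATES, N_ACTIONS = SIZE * SIZE, 4
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # left, down, right, up


def tile(state):
    """Return the map character ('S', 'F', 'H' or 'G') for a state index."""
    return MAP[state // SIZE][state % SIZE]


def move(state, direction):
    """Move one tile in `direction`, staying in place at the grid border."""
    row, col = divmod(state, SIZE)
    d_row, d_col = MOVES[direction]
    row = min(max(row + d_row, 0), SIZE - 1)
    col = min(max(col + d_col, 0), SIZE - 1)
    return row * SIZE + col


def q_value(values, state, action, gamma):
    """Expected return of `action` in `state` under the slippery dynamics."""
    total = 0.0
    # The intended direction and its two perpendiculars are equally likely.
    for direction in ((action - 1) % 4, action, (action + 1) % 4):
        nxt = move(state, direction)
        reward = 1.0 if tile(nxt) == "G" else 0.0
        total += (reward + gamma * values[nxt]) / 3
    return total


def value_iteration(gamma=0.99, tol=1e-8):
    """Sweep Bellman optimality backups until the largest change is below `tol`."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if tile(s) in "HG":
                continue  # terminal states keep value 0
            best = max(q_value(values, s, a, gamma) for a in range(N_ACTIONS))
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = [
        max(range(N_ACTIONS), key=lambda a: q_value(values, s, a, gamma))
        for s in range(N_STATES)
    ]
    return values, policy


values, policy = value_iteration()
print("V(start) =", round(values[0], 3))
print("greedy policy:", policy)
```

Policy iteration alternates a full policy evaluation with the same greedy improvement step, while Q-learning estimates the same action values from sampled episodes instead of a known transition model.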

Results

The resulting plot shows the average success rate versus the number of episodes.


Average success rate of the value iteration algorithm over 50 episodes.
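
One plausible way to obtain such a curve is to record a 1 for every episode that reaches the goal and a 0 otherwise, then take the cumulative mean after each episode. The sketch below assumes that definition; the README does not say exactly how the scripts compute the average, and `running_success_rate` is a hypothetical helper name.

```python
def running_success_rate(outcomes):
    """Return the cumulative mean of 0/1 episode outcomes after each episode."""
    rates, successes = [], 0
    for episode, outcome in enumerate(outcomes, start=1):
        successes += outcome
        rates.append(successes / episode)
    return rates


# Made-up outcomes for five episodes, for illustration only.
outcomes = [1, 0, 1, 1, 0]
print(running_success_rate(outcomes))
```

Passing the returned list to matplotlib's `plt.plot` gives a success-rate-versus-episode curve.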
