This project uses basic Reinforcement Learning (RL) methods to help a robot find a safe path across a frozen lake, with several holes covered by patches of very thin ice, to pick up a frisbee.
There are two maps, a 4*4 grid and a 10*10 grid. The proportion of holes to the total number of states is kept at 0.25 on both. On both maps, the start point is in the top-left corner and the destination is in the bottom-right corner. The robot's available actions are moving up, down, left, and right. When the robot falls into a hole, it receives a -1 reward; if it picks up the frisbee, it receives a +1 reward.
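As an illustration of these dynamics, here is a minimal sketch of such a grid world in plain Python. The class name, the hole layout, and the stay-in-place behavior at walls are assumptions for the example, not the project's actual code:

# Illustrative sketch of the grid world described above; the project's
# own environment implementation may differ.
class FrozenLake:
    # action id -> (row delta, col delta): up, down, left, right
    ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

    def __init__(self, size=4, holes=((1, 1), (1, 3), (2, 3), (3, 0))):
        self.size = size
        self.holes = set(holes)            # ~25% of the cells are holes
        self.goal = (size - 1, size - 1)   # frisbee in the bottom-right corner
        self.state = (0, 0)                # start in the top-left corner

    def reset(self):
        self.state = (0, 0)
        return self.state

    def step(self, action):
        dr, dc = self.ACTIONS[action]
        # Moves that would leave the grid keep the robot in place.
        row = min(max(self.state[0] + dr, 0), self.size - 1)
        col = min(max(self.state[1] + dc, 0), self.size - 1)
        self.state = (row, col)
        if self.state in self.holes:       # fell into a hole: -1, episode ends
            return self.state, -1.0, True
        if self.state == self.goal:        # got the frisbee: +1, episode ends
            return self.state, +1.0, True
        return self.state, 0.0, False      # ordinary frozen cell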
In this project, three basic RL methods are applied to both maps: first-visit Monte Carlo control without exploring starts, SARSA with an epsilon-greedy behavior policy, and Q-learning with an epsilon-greedy behavior policy.
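To make the temporal-difference methods concrete, here is a minimal sketch of tabular Q-learning with an epsilon-greedy behavior policy. The hyperparameter values are generic defaults, not the project's exact settings, and the env interface follows the FrozenLake sketch above:

import numpy as np
from collections import defaultdict

def epsilon_greedy(Q, state, n_actions, epsilon):
    # With probability epsilon pick a random action, otherwise the greedy one.
    if np.random.rand() < epsilon:
        return np.random.randint(n_actions)
    return int(np.argmax(Q[state]))

def q_learning_episode(env, Q, alpha=0.1, gamma=0.99, epsilon=0.1):
    # Run one episode, updating Q in place with the Q-learning rule.
    state = env.reset()
    done = False
    while not done:
        action = epsilon_greedy(Q, state, len(env.ACTIONS), epsilon)
        next_state, reward, done = env.step(action)
        # Off-policy target: bootstrap from the best next-state action.
        # SARSA would instead use the action actually taken in next_state.
        target = reward + gamma * (0.0 if done else np.max(Q[next_state]))
        Q[state][action] += alpha * (target - Q[state][action])
        state = next_state

# Usage with the FrozenLake sketch above:
# Q = defaultdict(lambda: np.zeros(4))
# env = FrozenLake(size=4)
# for _ in range(500):
#     q_learning_episode(env, Q)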
[Result plots for first-visit Monte Carlo control, SARSA, and Q-learning, each on the 4*4 and 10*10 maps]
1. Environment settings:
ale-py==0.7.1
certifi==2016.2.28
cloudpickle==2.0.0
cycler==0.10.0
gym==0.20.0
importlib-metadata==4.8.1
importlib-resources==5.2.2
kiwisolver==1.3.1
matplotlib==3.3.4
numpy==1.19.5
opencv-python==4.5.3.56
Pillow==8.3.2
pyglet==1.5.21
pyparsing==2.4.7
python-dateutil==2.8.2
scipy==1.5.4
six==1.16.0
typing-extensions==3.10.0.2
wincertstore==0.2
zipp==3.6.0
You can use the requirements file in the folder to set up the environment:
Format:
pip install -r [FILE_PATH]/requirements.txt
You should replace the [FILE_PATH] with the real file path on your computer.
Example:
pip install -r E:/ME5406_Project_1/requirements.txt
2. Run main.py
You need to specify the RL algorithm type (MONTE_CARLO/SARSA/Q_LEARNING), the map size (4/10), and the number of training iterations, then run main.py from the command line.
Format:
python [FILE_PATH]/main.py -t <algorithm_type> -s <map_size> -i <training_iteration>
You should replace the [FILE_PATH] with the real file path on your computer.
Example:
python E:/ME5406_Project_1/main.py -t Q_LEARNING -s 4 -i 500
This runs Q_LEARNING on the 4*4 map for 500 training iterations.
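For reference, the -t/-s/-i flags above could be parsed with argparse roughly as follows. This is only a sketch: the long option names and defaults are illustrative, and the actual main.py may differ:

import argparse

# Sketch of how main.py's command-line flags could be parsed.
parser = argparse.ArgumentParser(description="Frozen-lake RL trainer")
parser.add_argument("-t", "--type", required=True,
                    choices=["MONTE_CARLO", "SARSA", "Q_LEARNING"],
                    help="RL algorithm to use")
parser.add_argument("-s", "--size", type=int, choices=[4, 10], default=4,
                    help="map size (4 for the 4*4 map, 10 for the 10*10 map)")
parser.add_argument("-i", "--iteration", type=int, default=500,
                    help="number of training iterations")
args = parser.parse_args()
print(args.type, args.size, args.iteration)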
After training, plots of the average reward and the average Q value will be shown.
If the training is successful, the route will also be shown.
I built this project from scratch by myself. Although some of the code may look redundant and clumsy, I still hope it helps you. My senior schoolmate Zhang Yifeng also helped me a lot; his code, which is completely different from mine, is much more elegant. Here is his GitHub home page: https://github.com/zhangyifengdavid .
If you think this repository helps you, please give me a star!