Atari-Games-using-Q-Learning

Overview

This project applies Q-learning, a reinforcement learning technique, to several environments from the Gym library in order to optimize the agent's actions.

Modular code is easier to reuse, as the usage examples later in this document show.
So, the project is divided into three modules:

Build modular code consisting of functions that can be reused across multiple environments.
Tune alpha, gamma, and/or epsilon using decay over episodes.
Implement a grid search to discover the best hyperparameters.
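
For readers new to the method, here is a minimal sketch of the standard tabular Q-learning update and epsilon-greedy action selection that alpha, gamma, and epsilon control. It is not the code from Atari_RL.py: the names `epsilon_greedy`, `q_update`, and `step`, and the five-state corridor, are made up for illustration, and only the standard library is used so that it runs without Gym.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Apply one tabular Q-learning update in place."""
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])


def step(state, action):
    """Hypothetical corridor with states 0..4 (not a Gym environment).

    Action 0 moves left, action 1 moves right; reaching state 4 ends the episode.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == 4
    reward = 10 if done else -1
    return next_state, reward, done


rng = random.Random(0)  # fixed seed so the run is repeatable
q_table = [[0.0, 0.0] for _ in range(5)]  # one row per state, one column per action

for _ in range(500):
    state, done = 0, False
    while not done:
        action = epsilon_greedy(q_table[state], epsilon=0.1, rng=rng)
        next_state, reward, done = step(state, action)
        q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.6)
        state = next_state

# Greedy action for each non-terminal state after training
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(4)]
print(greedy_policy)
```

Here alpha sets how far each update moves the stored value toward the new target, gamma weights future reward against immediate reward, and epsilon is the fraction of steps spent exploring.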

Usage

Install

The code depends on a few libraries. Install them first by running these commands:

!pip install cmake 'gym[atari]' scipy
!pip install 'gym[atari]'
!pip install 'autorom[accept-rom-license]'
!pip install 'gym[atari,accept-rom-license]==0.21.0'

Import the project file

from Atari_RL import *

Train on the environment

To train the model on an environment, pass the environment's name to the train_model function.

Train on the Taxi-v3 environment

env_name = 'Taxi-v3'
frames, AVG_timesteps, AVG_penalities = train_model(env_name)
print(f"Average timesteps per episode: {AVG_timesteps}")
print(f"Average penalties per episode: {AVG_penalities}")

""" Output:
Episode: 100000
Training finished.
Results after 100 episodes:
Average timesteps per episode: 20.88
Average penalties per episode: 0.0
"""

Train and Evaluate

Train and Evaluate on the Taxi-v3 environment

# Specify the game required
env_name = 'Taxi-v3'
# Return the game environment as an object
env = get_env(env_name)
# Build the Q-table with the chosen learning parameters
q_table = q_table_train(env, alpha=0.1, gamma=0.6, epsilon=0.9)
# Evaluate the model by returning the average timesteps and penalties
frames, AVG_timesteps, AVG_penalities = model_evaluate(env, q_table)
# Visualize the game frame by frame
print_frames(frames)
# Print the model's average timesteps and average penalties
print(f"Average timesteps per episode: {AVG_timesteps}")
print(f"Average penalties per episode: {AVG_penalities}")

Train and Evaluate on the FrozenLake-v1 environment

# Specify the game required
env_name = 'FrozenLake-v1'
# Return the game environment as an object
env = get_env(env_name)
# Build the Q-table with the chosen learning parameters
q_table = q_table_train(env, alpha=0.1, gamma=0.6, epsilon=0.9)
# Evaluate the model by returning the average timesteps and penalties
frames, AVG_timesteps, AVG_penalities = model_evaluate(env, q_table)
# Visualize the game frame by frame
print_frames(frames)
# Print the model's average timesteps and average penalties
print(f"Average timesteps per episode: {AVG_timesteps}")
print(f"Average penalties per episode: {AVG_penalities}")
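
To make the two reported metrics concrete, here is a rough sketch of how average timesteps and average penalties per episode could be computed from a trained Q-table. It is an assumption about what an evaluation loop might look like, not the actual `model_evaluate` from Atari_RL.py: `evaluate_greedy`, the `Corridor` class, its simplified `reset`/`step` interface, and the `max_steps` cap are all hypothetical. The penalty value of -10 mirrors the reward Taxi-v3 gives for an illegal pickup or drop-off.

```python
def evaluate_greedy(env, q_table, episodes=100, penalty_reward=-10, max_steps=200):
    """Run greedy episodes and return (average timesteps, average penalties)."""
    total_steps = 0
    total_penalties = 0
    for _ in range(episodes):
        state, done, steps = env.reset(), False, 0
        while not done and steps < max_steps:
            # Always take the action with the highest stored Q-value
            action = max(range(len(q_table[state])), key=lambda a: q_table[state][a])
            state, reward, done = env.step(action)
            if reward == penalty_reward:
                total_penalties += 1
            steps += 1
        total_steps += steps
    return total_steps / episodes, total_penalties / episodes


class Corridor:
    """Hypothetical 5-state corridor: action 0 moves left, action 1 moves right."""

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        if action == 0 and self.state == 0:
            # Walking into the left wall counts as a penalty
            return self.state, -10, False
        self.state += 1 if action == 1 else -1
        done = self.state == 4
        return self.state, (20 if done else -1), done


# A hand-written table that always prefers "right", standing in for a trained one
q_table = [[0.0, 1.0] for _ in range(5)]
avg_steps, avg_penalties = evaluate_greedy(Corridor(), q_table)
print(f"Average timesteps per episode: {avg_steps}")
print(f"Average penalties per episode: {avg_penalties}")
```

The step cap is there so that a poorly trained table, which might loop forever under a purely greedy policy, still produces a finite average.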

Tuning the Parameters using Decay Over Episodes Technique

The project also includes a function to train and evaluate the model with the decay-over-episodes technique, which shrinks a parameter as training progresses using this equation: parameter = parameter * (1 - parameter * decay_factor)

# Hyperparameters
alpha = 0.1
gamma = 0.9
epsilon = 0.9
# Apply the decay-over-episodes technique with a decay factor of 0.1
decay_over = True
decay_factor = 0.1
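
To see what the decay equation does to a parameter, the short sketch below applies it to epsilon for a few episodes. The helper name `decay` and the five-episode loop are illustrative assumptions; how and when Atari_RL.py applies the update internally is not shown here.

```python
def decay(parameter, decay_factor):
    """One decay step: parameter * (1 - parameter * decay_factor)."""
    return parameter * (1 - parameter * decay_factor)


epsilon = 0.9
decay_factor = 0.1

history = [epsilon]
for episode in range(5):  # assumed: one decay step per episode
    epsilon = decay(epsilon, decay_factor)
    history.append(epsilon)

# The first step takes 0.9 to 0.9 * (1 - 0.09) = 0.819
print([round(value, 4) for value in history])
```

Because the shrink factor itself depends on the parameter, the decay slows down as the value gets smaller: a large epsilon drops quickly at first and then levels off gradually.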

Tuning on the Taxi-v3 environment

env_name = 'Taxi-v3'
frames, AVG_timesteps, AVG_penalities = train_model(env_name, alpha_para=alpha, gamma_para=gamma, epsilon_para=epsilon, decay_over=decay_over, decay_factor=decay_factor)
print(f"Average timesteps per episode: {AVG_timesteps}")
print(f"Average penalties per episode: {AVG_penalities}")

""" Output:
Episode: 100000
Average timesteps per episode: 20.88
Average penalties per episode: 0.0
"""

Tuning on the FrozenLake-v1 environment

env_name = 'FrozenLake-v1'
frames, AVG_timesteps, AVG_penalities = train_model(env_name, alpha_para=0.1, gamma_para=0.6, epsilon_para=0.9, decay_over=True, decay_factor=0.1)
print(f"Average timesteps per episode: {AVG_timesteps}")
print(f"Average penalties per episode: {AVG_penalities}")

""" Output:
Episode: 100000
Average timesteps per episode: 8.31
Average penalties per episode: 0.0
"""

Use the Grid Search

Grid search tries every combination of the given hyperparameter values and keeps the one that yields the fewest penalties and the fewest timesteps per episode.
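
The idea can be sketched in a few lines with `itertools.product`. This is not the `grid_search` function from Atari_RL.py: `grid_search_sketch` and `fake_evaluate` are hypothetical names, the scoring function is an arbitrary stand-in for real training, and comparing penalties first and timesteps second is an assumption about how the two criteria are combined.

```python
from itertools import product


def grid_search_sketch(parameters, evaluate):
    """Try every combination; keep the lowest (penalties, timesteps) pair."""
    names = list(parameters)
    best = None
    for values in product(*(parameters[name] for name in names)):
        candidate = dict(zip(names, values))
        avg_timesteps, avg_penalties = evaluate(**candidate)
        score = (avg_penalties, avg_timesteps)  # assumed ordering: penalties first
        if best is None or score < best[0]:
            best = (score, candidate)
    (best_penalties, best_timesteps), best_params = best
    return best_params, best_timesteps, best_penalties


def fake_evaluate(alpha, gamma, epsilon):
    """Arbitrary stand-in for training plus evaluation (not real results)."""
    avg_timesteps = 10 + 20 * abs(alpha - 0.3) + 10 * abs(gamma - 0.9)
    avg_penalties = 0.0
    return avg_timesteps, avg_penalties


params = {'alpha': [0.9, 0.6, 0.3], 'gamma': [0.9, 0.6, 0.3], 'epsilon': [0.9, 0.6, 0.3]}
best_params, best_time, best_penalties = grid_search_sketch(params, fake_evaluate)
print('Best_parameters:', best_params)
```

With three values for each of the three parameters, the search covers 3 x 3 x 3 = 27 combinations, so the total cost is 27 full training runs per environment.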

On the Taxi-v3 environment

env_name = "Taxi-v3"
params = {'alpha': [0.9, 0.6, 0.3], 'gamma': [0.9, 0.6, 0.3], 'epsilon': [0.9, 0.6, 0.3]}
best_params1, best_AVGtime1, best_AVGpenalties1, best_frame1 = grid_search(env_name=env_name, parameters=params, decay_over=False, decay_factor=0.1)
print('Best_parameters:', best_params1)
print('Average timesteps per episode:', best_AVGtime1)
print('Average penalties per episode:', best_AVGpenalties1)

""" Output:
Best_parameters: {'alpha': 0.6, 'gamma': 0.3, 'epsilon': 0.9}
Average timesteps per episode: 12.43
Average penalties per episode: 0.0
"""

On the FrozenLake-v1 environment

env_name = "FrozenLake-v1"
params = {'alpha': [0.9, 0.6, 0.3], 'gamma': [0.9, 0.6, 0.3], 'epsilon': [0.9, 0.6, 0.3]}
best_params1, best_AVGtime1, best_AVGpenalties1, best_frame1 = grid_search(env_name=env_name, parameters=params, decay_over=False, decay_factor=0.1)
print('Best_parameters:', best_params1)
print('Average timesteps per episode:', best_AVGtime1)
print('Average penalties per episode:', best_AVGpenalties1)

""" Output:
Best_parameters: {'alpha': 0.9, 'gamma': 0.9, 'epsilon': 0.9}
Average timesteps per episode: 5.1
Average penalties per episode: 0.0
"""
