Implementation of Double DQN (D2QN) with PyTorch.
If you understand Persian and want to find out more, check out my Virgool post :D
Clone the project with:

```bash
git clone git@github.com:mhyrzt/D2QN.git
```

To train D2QN, run the following command in a terminal:

```bash
python trainer.py
```

To run a simulation with the trained ANN:

```bash
python play.py
```
A class for storing, plotting & logging the history of rewards and epsilon:

- `add` → add new values for the reward and epsilon arrays.
- `log` → log the last reward and episode from the arrays.
- `plot` → plot the epsilon and reward arrays.
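A minimal sketch of what such a class might look like (the class name `History` and the plotting details are assumptions for illustration, not necessarily the repo's exact code):

```python
import matplotlib.pyplot as plt


class History:  # hypothetical name for the tracker described above
    def __init__(self):
        self.rewards = []
        self.epsilons = []

    def add(self, reward, epsilon):
        # append the latest episode reward and epsilon value
        self.rewards.append(reward)
        self.epsilons.append(epsilon)

    def log(self):
        # print the last recorded reward and the current episode index
        print(f"episode {len(self.rewards)} | reward: {self.rewards[-1]}")

    def plot(self):
        # plot the reward and epsilon arrays side by side
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
        ax1.plot(self.rewards)
        ax1.set_title("reward")
        ax2.plot(self.epsilons)
        ax2.set_title("epsilon")
        plt.show()
```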
The main purpose of this class is to implement epsilon-greedy for balancing exploration and exploitation. It takes two arguments:

- gym environment: for taking random actions.
- torch ANN model: for predicting the best action.

```python
epsilon = Epsilon(env, model)
```

- `_rand` → generate a random floating-point number in the range of 0 to 1.
- `get_action` → predict the best action based on the ANN model.
- `take_action` → based on the random number, return either a random action or the best action from the model.
- `decrease` → decrease epsilon by multiplying it with a constant.
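A minimal epsilon-greedy sketch with this interface (the decay constant and minimum value are assumptions; the repo's implementation may differ):

```python
import random
import torch


class Epsilon:
    def __init__(self, env, model, value=1.0, decay=0.995, minimum=0.01):
        self.env = env          # gym environment (source of random actions)
        self.model = model      # torch ANN (source of greedy actions)
        self.value = value      # current epsilon
        self.decay = decay      # multiplicative decay constant (assumed)
        self.minimum = minimum  # lower bound on epsilon (assumed)

    def _rand(self):
        # random float in [0, 1)
        return random.random()

    def get_action(self, state):
        # greedy action: argmax over the model's predicted Q-values
        with torch.no_grad():
            q_values = self.model(torch.as_tensor(state, dtype=torch.float32))
        return int(q_values.argmax().item())

    def take_action(self, state):
        # explore with probability epsilon, otherwise exploit
        if self._rand() < self.value:
            return self.env.action_space.sample()
        return self.get_action(state)

    def decrease(self):
        # multiplicative decay, clipped at the minimum
        self.value = max(self.minimum, self.value * self.decay)
```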
A replay buffer for storing the info and stats of each step, also known as an experience:

- Current State
- Action
- Reward
- Next State
- Is Terminal (done)

As arguments it takes two numbers:

- `max_len` → maximum number of experiences to store.
- `batch_size` → number of experiences for random sampling.

```python
buffer = ReplyBuffer(5_000, 128)
```

- `add` → store a new experience.
- `sample` → draw a random sample of `self.batch_size` experiences.
- `can_sample` → check whether sampling is possible yet.
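A minimal sketch of such a buffer built on `collections.deque` (illustrative; the repo's `ReplyBuffer` may store and batch experiences differently):

```python
import random
from collections import deque


class ReplyBuffer:
    def __init__(self, max_len, batch_size):
        self.buffer = deque(maxlen=max_len)  # oldest experiences fall off
        self.batch_size = batch_size

    def add(self, state, action, reward, next_state, done):
        # store one (s, a, r, s', done) transition
        self.buffer.append((state, action, reward, next_state, done))

    def can_sample(self):
        # sampling is possible once we have at least one full batch
        return len(self.buffer) >= self.batch_size

    def sample(self):
        # uniform random mini-batch of stored experiences
        return random.sample(self.buffer, self.batch_size)
```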
Our ANN model for predicting Q-values. As arguments it takes 3 parameters:

- the shape of the state, which represents the input dimension;
- the number of possible actions, which represents the output dimension;
- an array of numbers which represents the hidden layers and their sizes.

```python
model = Model(4, 2, (32, 32, 32))
```

- `copy` → create a copy of the model and return it.
- `save` → save the model to a file.
- `load` → load the model from a file.
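A minimal MLP sketch matching that constructor (the activation choice and save/load details are assumptions):

```python
import copy as _copy
import torch
import torch.nn as nn


class Model(nn.Module):
    def __init__(self, state_shape, n_actions, hidden=(32, 32, 32)):
        super().__init__()
        layers = []
        in_dim = state_shape
        for size in hidden:
            layers += [nn.Linear(in_dim, size), nn.ReLU()]
            in_dim = size
        layers.append(nn.Linear(in_dim, n_actions))  # one Q-value per action
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

    def copy(self):
        # deep copy, e.g. for the target network in Double DQN
        return _copy.deepcopy(self)

    def save(self, path):
        torch.save(self.state_dict(), path)

    def load(self, path):
        self.load_state_dict(torch.load(path))
        return self
```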
The main implementation of D2QN, which ties all of the classes above together.
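At its core, Double DQN differs from vanilla DQN in one step of the target computation: the online network selects the next action, while the target network evaluates it. A sketch of that update (the function name, batch layout, and loss choice are assumptions, not the repo's exact code):

```python
import torch
import torch.nn.functional as F


def double_dqn_update(model, target, optimizer, batch, gamma=0.99):
    # batch holds tensors: states (B, state_dim), actions (B,) long,
    # rewards (B,), next_states (B, state_dim), dones (B,) float
    states, actions, rewards, next_states, dones = batch

    # Q(s, a) from the online network for the actions actually taken
    q_values = model(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    with torch.no_grad():
        # Double DQN: the online network SELECTS the next action...
        next_actions = model(next_states).argmax(dim=1, keepdim=True)
        # ...while the target network EVALUATES it
        next_q = target(next_states).gather(1, next_actions).squeeze(1)
        targets = rewards + gamma * next_q * (1 - dones)

    loss = F.mse_loss(q_values, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

This decoupling of selection and evaluation is what reduces the Q-value overestimation that plain DQN suffers from.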