
D2QN


Implementation of Double DQN with PyTorch.

If you understand Persian and want to find out more, check out my Virgool post :D

Installation

Clone the project with:

git clone git@github.com:mhyrzt/D2QN.git

To train D2QN, run the following command in a terminal:

python trainer.py

To run a simulation with the trained ANN:

python play.py

Modules/Classes

History

For storing, plotting, and logging the history of rewards and epsilon.

Methods

  • add → add new reward and epsilon values.
  • log → log the last reward and episode from the arrays.
  • plot → plot the epsilon and reward arrays.
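As an illustration, a minimal sketch of what a History-like class could look like (the repository's actual implementation may differ; matplotlib use is an assumption):

```python
class History:
    """Hypothetical sketch of a reward/epsilon history tracker."""

    def __init__(self):
        self.rewards = []
        self.epsilons = []

    def add(self, reward, epsilon):
        # append the latest reward and epsilon values
        self.rewards.append(reward)
        self.epsilons.append(epsilon)

    def log(self):
        # print the last recorded episode's reward and epsilon
        episode = len(self.rewards)
        print(f"episode {episode}: reward={self.rewards[-1]}, epsilon={self.epsilons[-1]:.3f}")

    def plot(self):
        # plot both arrays side by side (assumes matplotlib is installed)
        import matplotlib.pyplot as plt
        fig, (ax1, ax2) = plt.subplots(1, 2)
        ax1.plot(self.rewards)
        ax1.set_title("reward")
        ax2.plot(self.epsilons)
        ax2.set_title("epsilon")
        plt.show()
```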

Epsilon

The main purpose of this class is to implement the epsilon-greedy policy for balancing exploration and exploitation. It takes two arguments:

  1. a gym environment: for taking random actions.
  2. a PyTorch ANN model: for predicting the best action.
epsilon = Epsilon(env, model)

Methods

  • _rand → generate a random floating-point number in the range [0, 1).

  • get_action → predict the best action using the ANN model.

  • take_action → based on a random number, return either a random action or the best action from the model.

  • decrease → decrease epsilon by multiplying it by a constant.
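The methods above can be sketched roughly as follows (a simplified illustration, not the repository's code; the decay constant, floor value, and the way the model is called are assumptions):

```python
import random


class Epsilon:
    """Hypothetical sketch of the epsilon-greedy helper."""

    def __init__(self, env, model, start=1.0, decay=0.995, minimum=0.01):
        self.env = env      # gym environment, used for random actions
        self.model = model  # ANN that scores each action
        self.value = start
        self.decay_rate = decay
        self.minimum = minimum

    def _rand(self):
        # random float in [0, 1)
        return random.random()

    def get_action(self, state):
        # greedy action: index of the largest Q-value from the model
        q_values = self.model(state)
        return max(range(len(q_values)), key=q_values.__getitem__)

    def take_action(self, state):
        # explore with probability epsilon, otherwise exploit
        if self._rand() < self.value:
            return self.env.action_space.sample()
        return self.get_action(state)

    def decrease(self):
        # multiplicative decay, floored at a minimum value
        self.value = max(self.minimum, self.value * self.decay_rate)
```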

ReplyBuffer

Stores the info and stats of each step, also known as an experience:

  • Current State
  • Action
  • Reward
  • Next State
  • Is Terminal (done)

As arguments it takes two numbers:

  1. max_len → maximum number of experiences to store.

  2. batch_size → number of experiences drawn in each random sample.

buffer = ReplyBuffer(5_000, 128)

Methods

  • add → store a new experience.
  • sample → draw a random sample of self.batch_size experiences.
  • can_sample → check whether sampling is possible.
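A minimal sketch of such a buffer, assuming a deque-backed store that drops the oldest experiences once max_len is reached (the repository's actual implementation may differ):

```python
import random
from collections import deque


class ReplyBuffer:
    """Hypothetical sketch of the experience replay buffer."""

    def __init__(self, max_len, batch_size):
        # oldest experiences are discarded automatically once full
        self.buffer = deque(maxlen=max_len)
        self.batch_size = batch_size

    def add(self, state, action, reward, next_state, done):
        # store one transition tuple
        self.buffer.append((state, action, reward, next_state, done))

    def can_sample(self):
        # sampling only makes sense once a full batch is available
        return len(self.buffer) >= self.batch_size

    def sample(self):
        # uniform random minibatch without replacement
        return random.sample(self.buffer, self.batch_size)
```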

Model

Our ANN model for predicting Q-values. As arguments it takes 3 parameters:

  1. the shape of the state, which represents the input dimension.
  2. the number of possible actions, which represents the output dimension.
  3. an array of numbers representing the hidden layers and their sizes.
model = Model(4, 2, (32, 32, 32))

Methods

  • copy → create a copy of the model and return it.
  • save → save the model to a file.
  • load → load the model from a file.
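A plausible sketch of this model as a fully-connected PyTorch network (layer sizes and activation choice are assumptions; the repository's architecture may differ):

```python
import torch
import torch.nn as nn


class Model(nn.Module):
    """Hypothetical sketch of the Q-network."""

    def __init__(self, state_dim, n_actions, hidden=(32, 32, 32)):
        super().__init__()
        self._args = (state_dim, n_actions, hidden)
        layers = []
        in_dim = state_dim
        for h in hidden:
            layers += [nn.Linear(in_dim, h), nn.ReLU()]
            in_dim = h
        layers.append(nn.Linear(in_dim, n_actions))  # one Q-value per action
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

    def copy(self):
        # fresh instance with identical weights (useful as a target network)
        clone = Model(*self._args)
        clone.load_state_dict(self.state_dict())
        return clone

    def save(self, path):
        torch.save(self.state_dict(), path)

    def load(self, path):
        self.load_state_dict(torch.load(path))
```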

Agent

The main implementation of D2QN, which ties together all of the classes above.
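The key idea that distinguishes Double DQN from vanilla DQN is that the online network selects the next action while a target network evaluates it, reducing Q-value overestimation. A sketch of the target computation (the function name and signature are illustrative, not taken from the repository):

```python
import torch


def double_dqn_targets(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Hypothetical sketch of the Double DQN target computation."""
    with torch.no_grad():
        # the online network SELECTS the best next action...
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
        # ...but the target network EVALUATES it, which reduces the
        # overestimation bias of vanilla DQN
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        # terminal transitions (done == 1) contribute only the reward
        return rewards + gamma * next_q * (1.0 - dones)
```

The agent would regress the online network's Q-values toward these targets and periodically sync the target network via something like the Model copy method described above.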

Results

CartPole-v1


LunarLander-v2

