sandipan1/BitFlipper

BitFlipper

BitFlipper environment in OpenAI gym format

Problem Statement

Let's say there is a binary array of size n (e.g. [1,0,0,0,1,1]) and we want to reach a different array of the same size (e.g. [0,0,0,1,1,0]). The only action allowed is a single bit flip at the i-th position (0-indexed). The reward is 0 only when the final state is reached; otherwise the reward is -1.

initial = [1,0,0,0,1,1]
flip at index 2 = [1,0,1,0,1,1]
reward = -1

flip at index 5 = [1,0,1,0,1,0]
reward = -1

The goal is to make an agent learn to reach the final state from a given initial state. The only observations the agent gets are the reward and the current state.
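The dynamics described above can be sketched in a few lines of plain Python. This is only an illustration of the problem statement, not the repository's environment: the class name BitFlipSketch and its reset arguments are hypothetical, and the reset/step shape simply mirrors the usual gym convention.

```python
import random


class BitFlipSketch:
    """Hypothetical stand-in for the environment; not the repository's class."""

    def __init__(self, n=6, seed=None):
        self.n = n
        self.rng = random.Random(seed)
        self.state = [0] * n
        self.goal = [0] * n

    def _random_bits(self):
        return [self.rng.randint(0, 1) for _ in range(self.n)]

    def reset(self, initial=None, goal=None):
        # Use the given arrays if provided, otherwise sample random bits.
        self.state = list(initial) if initial is not None else self._random_bits()
        self.goal = list(goal) if goal is not None else self._random_bits()
        return list(self.state)

    def step(self, action):
        # Flip the bit at 0-based index `action`.
        self.state[action] ^= 1
        done = self.state == self.goal
        # Sparse reward: 0 only when the goal array is reached, else -1.
        reward = 0 if done else -1
        return list(self.state), reward, done, {}


env = BitFlipSketch(n=6)
env.reset(initial=[1, 0, 0, 0, 1, 1], goal=[0, 0, 0, 1, 1, 0])
print(env.step(2))  # ([1, 0, 1, 0, 1, 1], -1, False, {})
print(env.step(5))  # ([1, 0, 1, 0, 1, 0], -1, False, {})
```

The two printed steps reproduce the example above. Because every non-goal step returns -1, a random policy almost never sees a reward of 0 for larger n, which is what makes the problem hard for plain DQN.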

Steps to run

Clone the repo
In a conda env or virtualenv:
pip install -e .
To run DQN on the BitFlipper environment, call main() from dqn.py

To run DQN+HER on the BitFlipper environment, call main() from dqn_her.py
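For context on what the HER variant adds, here is a minimal sketch of hindsight relabeling using the "final" strategy from the Hindsight Experience Replay paper: after an episode, each transition is stored a second time with the goal replaced by the state the episode actually ended in. The function names and the transition tuple layout are assumptions for illustration and may not match dqn_her.py.

```python
def reward_fn(state, goal):
    # Same sparse reward as the environment: 0 at the goal, -1 otherwise.
    return 0 if state == goal else -1


def relabel_final(episode):
    """Relabel an episode with its own final state as the goal.

    `episode` is a list of (state, action, next_state, goal) tuples
    (a hypothetical layout chosen for this sketch).
    Returns (state, action, reward, next_state, new_goal, done) tuples.
    """
    new_goal = episode[-1][2]  # state actually reached at the end
    relabeled = []
    for state, action, next_state, _old_goal in episode:
        reward = reward_fn(next_state, new_goal)
        relabeled.append((state, action, reward, next_state, new_goal, reward == 0))
    return relabeled


goal = [0, 0, 0, 1, 1, 0]
episode = [
    ([1, 0, 0, 0, 1, 1], 2, [1, 0, 1, 0, 1, 1], goal),
    ([1, 0, 1, 0, 1, 1], 5, [1, 0, 1, 0, 1, 0], goal),
]
for transition in relabel_final(episode):
    print(transition)
```

With the original goal both transitions carry a reward of -1; after relabeling, the last transition carries a reward of 0, so the replay buffer contains at least one successful example per episode.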

Related papers:

Deep Q Networks: http://www.davidqiu.com:8888/research/nature14236.pdf

Hindsight Experience Replay: https://arxiv.org/pdf/1707.01495.pdf