Deep Reinforcement Learning in Large Discrete Action Spaces

This is a PyTorch implementation of the paper "Deep Reinforcement Learning in Large Discrete Action Spaces" (Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin).

Installation

To install the relevant libraries, run the following command:

pip install -r requirements.txt

Demonstration of Model

Demonstration video: cartpole_demo.gif

Reward versus training steps, where `k` is the number of nearest neighbours:

Results for `k = 1`	Results for `k = 10`
reward_vs_steps_k1.png	reward_vs_steps_k10.png
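The parameter `k` controls how many of the nearest discrete actions are considered at each step. As a rough illustration of the idea from the paper (not the code in wolp.py), the sketch below takes a continuous proto-action, finds its `k` nearest discrete actions, and returns the one the critic scores highest. The names `wolpertinger_select`, `toy_q`, and the toy action set are assumptions made for this example only.

```python
import heapq
import math


def wolpertinger_select(proto_action, actions, q_fn, k):
    """Return the index of the best action among the k nearest to proto_action."""
    # Keep the indices of the k discrete actions closest (Euclidean) to the proto-action.
    nearest = heapq.nsmallest(
        k, range(len(actions)), key=lambda i: math.dist(actions[i], proto_action)
    )
    # Score each candidate with the critic and return the highest-valued one.
    return max(nearest, key=lambda i: q_fn(actions[i]))


# Toy setup (hypothetical): 11 evenly spaced one-dimensional actions in [-1, 1].
actions = [(-1.0 + 0.2 * i,) for i in range(11)]
toy_q = lambda a: -abs(a[0] - 0.6)  # toy critic that prefers actions near 0.6
proto = (0.05,)                     # stand-in for the actor's output

best_k1 = wolpertinger_select(proto, actions, toy_q, k=1)
best_k10 = wolpertinger_select(proto, actions, toy_q, k=10)
print(best_k1, best_k10)
```

With `k = 1` the critic has no choice and the result is simply the nearest neighbour of the proto-action; with a larger `k` the critic can override the actor's suggestion when a nearby action has a higher estimated value.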

Train the agent

To train the agent, run the main.ipynb notebook provided in the repository. To change the training parameters, edit the values in the `Arguments` class.
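The fields of the `Arguments` class are defined in main.ipynb and are not listed in this README. Purely to illustrate the pattern, a settings class of this kind might look like the sketch below; every field name and default value here is hypothetical and should be replaced by the ones found in the notebook.

```python
class Arguments:
    """Hypothetical settings container; the real fields live in main.ipynb."""

    def __init__(self):
        self.k_nearest = 10            # hypothetical: number of nearest neighbours (k)
        self.max_episode_length = 500  # hypothetical: step limit per episode
        self.seed = 0                  # hypothetical: random seed


# Override a value before training starts.
args = Arguments()
args.k_nearest = 1
print(vars(args))
```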

Test the agent

After training the agent as described above, run the following code to test it on the continuous CartPole environment (`ContinuousCartPoleEnv`).

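import gym
from gym import wrappers  # wrappers.Monitor is only available in older gym releases

# Assumes the environment class is defined in ContinuousCartPole.py in this repository.
from ContinuousCartPole import ContinuousCartPoleEnv

# `agent` is the trained agent created in main.ipynb.
env_to_wrap = ContinuousCartPoleEnv()
# Record the run to ./demo, overwriting any previous recording.
env = wrappers.Monitor(env_to_wrap, './demo', force=True)

for i_episode in range(1):
    observation = env.reset()
    ep_reward = 0
    for t in range(500):
        env.render()
        action = agent.select_action(observation)
        observation, reward, done, info = env.step(action)
        ep_reward += reward
        if done:
            print("Episode finished after {} timesteps".format(t + 1))
            break
    print("Episode reward:", ep_reward)

# Close the Monitor wrapper (finalizes the recording), then the underlying environment.
env.close()
env_to_wrap.close()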
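The test code above runs a single episode. To get a less noisy estimate, the same loop can be wrapped in a helper that runs several episodes and collects their rewards. The sketch below is not part of the repository: `evaluate` and `CountdownEnv` are hypothetical names, and `CountdownEnv` is only a stand-in so the example runs without gym. It assumes the old-style gym API used above, where `step` returns a 4-tuple.

```python
def evaluate(env, policy, n_episodes=5, max_steps=500):
    """Run `policy` for `n_episodes` and return the list of episode rewards."""
    rewards = []
    for _ in range(n_episodes):
        observation = env.reset()
        ep_reward = 0.0
        for _ in range(max_steps):
            # Old-style gym API: step returns (observation, reward, done, info).
            observation, reward, done, _info = env.step(policy(observation))
            ep_reward += reward
            if done:
                break
        rewards.append(ep_reward)
    return rewards


class CountdownEnv:
    """Hypothetical stand-in: each episode lasts `horizon` steps, reward 1 per step."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        return float(self.t), 1.0, self.t >= self.horizon, {}


episode_rewards = evaluate(CountdownEnv(horizon=3), policy=lambda obs: 0.0, n_episodes=4)
mean_reward = sum(episode_rewards) / len(episode_rewards)
print(episode_rewards, mean_reward)
```

With the trained agent, the call would be along the lines of `evaluate(env, agent.select_action)`, with `env` created as in the test code above.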

Acknowledgements

Our DDPG code is based on the implementation provided by ghliu/pytorch-ddpg.
The Wolpertinger agent code and action_space.py are based on the implementation of the paper provided by jimkon/Deep-Reinforcement-Learning-in-Large-Discrete-Action-Spaces.

Reference

If you find this work useful and want to reference it, please cite the following paper:

@article{DBLP:journals/corr/Dulac-ArnoldESC15,
  author    = {Gabriel Dulac{-}Arnold and
               Richard Evans and
               Peter Sunehag and
               Ben Coppin},
  title     = {Reinforcement Learning in Large Discrete Action Spaces},
  journal   = {CoRR},
  volume    = {abs/1512.07679},
  year      = {2015},
  url       = {http://arxiv.org/abs/1512.07679},
  archivePrefix = {arXiv},
  eprint    = {1512.07679},
  timestamp = {Mon, 13 Aug 2018 16:46:25 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/Dulac-ArnoldESC15},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

