Skip to content

PyTorch implementation of the paper "Deep Reinforcement Learning in Large Discrete Action Spaces" (Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin).

Notifications You must be signed in to change notification settings

nikhil3456/Deep-Reinforcement-Learning-in-Large-Discrete-Action-Spaces

Repository files navigation

Deep Reinforcement Learning in Large Discrete Action Spaces

This is a PyTorch implementation of the paper "Deep Reinforcement Learning in Large Discrete Action Spaces" (Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin).

Installation

To install the relevant libraries, run the following command:

pip install -r requirements.txt

Demonstration of Model

Demonstration video:

cartpole_demo

Results for k = 1 (K is the number of nearest neighbours) Results for k = 10 (K is the number of nearest neighbours)
reward_k_1.png reward_k_1.png

Train the agent

To train the agent, simply run the main.ipynb file provided in the repository. The parameters can be updated by changing the values in the Arguments class.

Test the agent

After training the agent using the above code, run the following code to test it on the cartpole environment.

import gym
from gym import wrappers
env_to_wrap = ContinuousCartPoleEnv()
env = wrappers.Monitor(env_to_wrap, './demo', force = True)
env.reset()
for i_episode in range(1):
    observation = env.reset()
    ep_reward = 0
    for t in range(500):
        env.render()
        action = agent.select_action(observation)
        observation, reward, done, info = env.step(action)
        ep_reward += reward
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
    print(ep_reward)
env_to_wrap.close()
env.close()

Acknowledgements

Reference

If you are interested in the work and want to cite it, please acknowledge the following paper:

@article{DBLP:journals/corr/Dulac-ArnoldESC15,
  author    = {Gabriel Dulac{-}Arnold and
               Richard Evans and
               Peter Sunehag and
               Ben Coppin},
  title     = {Reinforcement Learning in Large Discrete Action Spaces},
  journal   = {CoRR},
  volume    = {abs/1512.07679},
  year      = {2015},
  url       = {http://arxiv.org/abs/1512.07679},
  archivePrefix = {arXiv},
  eprint    = {1512.07679},
  timestamp = {Mon, 13 Aug 2018 16:46:25 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/Dulac-ArnoldESC15},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Collaborators

  1. Shashank Srikanth
  2. Nikhil Bansal

About

PyTorch implementation of the paper "Deep Reinforcement Learning in Large Discrete Action Spaces" (Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin).

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published