Updated 1.dqn for compatibility with PyTorch 0.4 and 1.0 #24

Open
wants to merge 4 commits into
base: master


@joleeson joleeson commented Feb 8, 2019

  1. Updated for compatibility with the latest PyTorch versions (more thorough than the recommendations in "Update to run on torch 0.4", #20):
  • no longer uses the deprecated "Variable" class
  • uses appropriate dtypes
  • CPU/GPU-agnostic code
  • uses tensor.item() to convert 0-dimensional tensors to ordinary Python numbers
  2. Changed the algorithm so that it more closely matches Mnih et al. (2015) and other DQN literature:
  • linear epsilon decay
  • frame stacking
  • training now runs once every 4 environment steps for Atari environments
  • option to use Huber loss instead of RMS loss in compute_td_loss()
  3. Borrowed the monitoring wrapper from OpenAI's Baselines to log training progress.
  4. Modified the wrappers to accommodate stacked frames (see "frame_stack default to False", #9) and to output them as a LazyFrames object. The data axes are swapped as appropriate for PyTorch, i.e. (no. of channels)x(breadth)x(height).
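For readers who have not seen the PyTorch 0.4+ idioms listed in the first item, here is a minimal sketch of what a Variable-free, device-agnostic compute_td_loss() with a Huber option could look like. It is not the code from this PR: the argument layout, gamma=0.99, the toy nn.Linear model and the two-transition batch are all assumptions made for illustration. It also bootstraps from the same network rather than a separate target network. The non-Huber branch uses mean squared error, which is presumably what the description calls "RMS loss".

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Pick the device once; every tensor and module below follows it (CPU/GPU agnostic).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def compute_td_loss(model, batch, gamma=0.99, use_huber=True):
    """One-step TD loss for DQN without the deprecated Variable wrapper.

    Hypothetical signature: `batch` is (state, action, reward, next_state, done).
    """
    state, action, reward, next_state, done = batch

    # Explicit dtypes: float32 for observations and rewards, int64 for action indices.
    state = torch.tensor(state, dtype=torch.float32, device=device)
    next_state = torch.tensor(next_state, dtype=torch.float32, device=device)
    action = torch.tensor(action, dtype=torch.int64, device=device)
    reward = torch.tensor(reward, dtype=torch.float32, device=device)
    done = torch.tensor(done, dtype=torch.float32, device=device)

    # Q(s, a) for the actions that were actually taken.
    q_value = model(state).gather(1, action.unsqueeze(1)).squeeze(1)

    # The bootstrap target is treated as a constant, so no gradient flows through it.
    with torch.no_grad():
        next_q_value = model(next_state).max(1)[0]
        target = reward + gamma * next_q_value * (1.0 - done)

    if use_huber:
        # smooth_l1_loss is the Huber loss with threshold 1.
        return F.smooth_l1_loss(q_value, target)
    return F.mse_loss(q_value, target)


# Toy usage: 4-dimensional observations, 2 actions, a batch of 2 transitions.
model = nn.Linear(4, 2).to(device)
batch = (
    [[0.0] * 4, [1.0] * 4],  # state
    [0, 1],                  # action
    [1.0, 0.0],              # reward
    [[1.0] * 4, [0.0] * 4],  # next_state
    [0.0, 1.0],              # done
)
loss = compute_td_loss(model, batch)
loss.backward()
loss_value = loss.item()  # 0-dimensional tensor -> ordinary Python float
```

The torch.no_grad() block replaces the old pattern of wrapping the target in a fresh Variable to detach it, and loss.item() replaces indexing into loss.data.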

Updated for PyTorch 0.4.
Made changes such that the algorithm more closely matches that in Mnih et al. (2015) and other DQN literature:
- linear epsilon decay
- frame stacking
- training frequency is now once every 4 steps in the environment for Atari env
- option of using Huber loss instead of RMS loss in compute_td_loss()
Also borrowed the logging facility from OpenAI's Baselines.
- Borrowed the monitoring wrapper from OpenAI's Baselines to log training progress.
- Modified the wrappers to accommodate stacked frames and to output them as a LazyFrames object. The data axes are swapped as appropriate for PyTorch, i.e. (no. of channels)x(breadth)x(height).
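The linear epsilon decay and the once-every-4-steps training schedule mentioned above can be sketched in plain Python as follows. The function names are hypothetical, and the default values (epsilon annealed from 1.0 to 0.1 over one million steps, learning starting after 50,000 steps) are taken from Mnih et al. (2015) rather than from this PR, whose actual settings may differ.

```python
def linear_epsilon(step, eps_start=1.0, eps_final=0.1, decay_steps=1000000):
    """Linearly anneal epsilon from eps_start to eps_final, then hold it there."""
    # Fraction of the decay that has elapsed, clamped to [0, 1].
    fraction = min(max(step, 0) / decay_steps, 1.0)
    return eps_start + fraction * (eps_final - eps_start)


def should_train(step, learning_starts=50000, train_freq=4):
    """Return True on the environment steps where a gradient update is due."""
    # Wait until the replay buffer has warmed up, then update every train_freq steps.
    return step >= learning_starts and step % train_freq == 0
```

In a training loop, linear_epsilon(step) would feed the epsilon-greedy action choice on every environment step, while should_train(step) would gate the call to compute_td_loss() and the optimizer update.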
"import torch.nn.functional as F\n",
"\n",
"import os\n",
"import logger\n",

@colin-leu colin-leu May 21, 2019

Which specific module is this? I can't find a module named logger.
