# ParamNoise

A comparison of parameter space noise methods for exploration in deep reinforcement learning

## Links to papers

[Parameter Space Noise for Exploration](https://openreview.net/forum?id=ByBAl2eAZ&noteId=ByBAl2eAZ)
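The adaptive ("parameter space noise") scheme adjusts the noise standard deviation so the perturbed policy stays within a target action-space distance of the unperturbed one. A minimal sketch of that update rule; the 1.01 scaling factor follows the paper's default, and the function name is illustrative:

```python
def update_noise_scale(sigma, distance, threshold, alpha=1.01):
    """Adaptive parameter-noise scale update (Plappert et al.).

    Grow sigma when the perturbed policy's actions stay close to the
    unperturbed policy's; shrink it when they drift past the threshold.
    """
    if distance < threshold:
        return sigma * alpha  # policies too similar: explore more
    return sigma / alpha      # policies too different: back off
```

Applied once per rollout, this keeps exploration roughly constant in action space even as the network's sensitivity to its parameters changes during training.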

[Noisy Networks For Exploration](https://openreview.net/forum?id=rywHCPkAW&noteId=rywHCPkAW)
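NoisyNets instead replace each deterministic weight with `w = mu + sigma * eps` and learn `mu` and `sigma` by gradient descent (the "learned" method compared below). The factorised-Gaussian variant samples only `n_in + n_out` noise variables per layer rather than `n_in * n_out`. A pure-Python sketch of that noise construction, with illustrative names:

```python
import math
import random


def f(x):
    # NoisyNets scaling function: f(x) = sign(x) * sqrt(|x|)
    return math.copysign(math.sqrt(abs(x)), x)


def factorized_noise(n_in, n_out, rng=random):
    """Factorised Gaussian noise (Fortunato et al.).

    Samples n_in + n_out unit Gaussians and forms the per-weight
    noise as the outer product eps[i][j] = f(eps_out_i) * f(eps_in_j).
    """
    eps_in = [f(rng.gauss(0.0, 1.0)) for _ in range(n_in)]
    eps_out = [f(rng.gauss(0.0, 1.0)) for _ in range(n_out)]
    return [[eo * ei for ei in eps_in] for eo in eps_out]
```

A noisy weight is then `mu[i][j] + sigma[i][j] * eps[i][j]`, with fresh noise drawn per forward pass during training.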

## Resources

## TODOs

- Implement PPO and MuJoCo env handling
- Revisit logging; make sure everything needed to reproduce the papers' results is recorded
- Implement plotting (matplotlib is in the Logger object; maybe try out visdom)
- More tests (exercise different combinations of arguments to ensure everything interacts well)
- Begin experiments (start with MuJoCo; it's cheaper)

## Atari Games to Test

- Alien: adaptive helps a lot; learned shows no improvement
- Enduro: both methods improve over the baseline
- Seaquest: adaptive helps; learned performs worse than the baseline
- Space Invaders: adaptive helps, but learned helps more
- WizardOfWor: adaptive is worse than the baseline, but learned helps a lot

## MuJoCo Environments to Test

- Hopper
- Walker2d
- HalfCheetah
- Sparse versions of these? (from rllab)