[WIP] Refactoring RL models #3

Merged
merged 105 commits into from Aug 29, 2018

hill-a (Owner) commented Jul 18, 2018

  • refactored A2C, ACER, ACKTR, DDPG, DeepQ, GAIL, TRPO, PPO1 and PPO2 under a single consistent base class
  • added callback to refactored algorithm training
  • added saving and loading to refactored algorithms
  • refactored ACER, DDPG, GAIL, PPO1 and TRPO to fit with A2C, PPO2 and ACKTR policies
  • added new policies for most algorithms (Mlp, MlpLstm, MlpLnLstm, Cnn, CnnLstm and CnnLnLstm)
  • added dynamic environment switching (so continual RL is now feasible)
  • added prediction from observation and action probability from observation for all the algorithms
  • fixed graph issues so that model names no longer collide
  • fixed behavior_clone weight loading for GAIL
  • fixed TensorFlow allocating all of the GPU VRAM
  • fixed models so that they are all compatible with vectorized environments
  • fixed set_global_seed to update gym.spaces's random seed
  • fixed PPO1 and TRPO performance issues when learning identity function
  • added new tests for loading, saving, continuous actions and learning the identity function
  • fixed DQN wrapping for Atari
  • added saving and loading for the VecNormalize wrapper
  • added automatic detection of action space (for the policy network)
  • fixed ACER buffer with constant values assuming n_stack=4
  • fixed some RL algorithms not clipping the action to be in the action_space, when using gym.spaces.Box
  • refactored algorithms can now take either a gym.Env or a str (if the environment name is registered)
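
To make the shape of the changes above more concrete, here is a minimal, self-contained sketch of what a shared model interface along these lines could look like. It is not the code from this PR: `BaseRLModel`, `ScaledModel`, `ToyEnv`, `ENV_REGISTRY`, and the callback signature are all hypothetical names chosen for illustration, and plain Python stands in for gym and TensorFlow. It only aims to show how environment lookup by name, environment switching, a training callback, action clipping, and saving/loading can live in one base class.

```python
import pickle
from abc import ABC, abstractmethod


class ToyEnv:
    """Hypothetical stand-in for a gym environment with a 1-D Box action space."""
    action_low, action_high = -1.0, 1.0


# Hypothetical registry playing the role of gym's registered environment names.
ENV_REGISTRY = {"Toy-v0": ToyEnv}


class BaseRLModel(ABC):
    """Shared interface: each algorithm subclasses this and fills in two hooks."""

    def __init__(self, env):
        self.params = {}
        self.set_env(env)

    def set_env(self, env):
        # Accept either a registered name or an environment instance.
        # Calling this again later swaps the environment and keeps the params.
        self.env = ENV_REGISTRY[env]() if isinstance(env, str) else env

    @abstractmethod
    def _train_step(self, step):
        """Update self.params for one timestep."""

    @abstractmethod
    def _raw_action(self, observation):
        """Return the unclipped action for an observation."""

    def learn(self, total_timesteps, callback=None):
        for step in range(total_timesteps):
            self._train_step(step)
            # Assumed convention: a callback returning False stops training early.
            if callback is not None and callback(step, self) is False:
                break
        return self

    def predict(self, observation):
        # Clip so the action always lies inside the action space bounds.
        action = self._raw_action(observation)
        return min(max(action, self.env.action_low), self.env.action_high)

    def save(self, path):
        with open(path, "wb") as file:
            pickle.dump(self.params, file)

    @classmethod
    def load(cls, path, env):
        model = cls(env)
        with open(path, "rb") as file:
            model.params = pickle.load(file)
        return model


class ScaledModel(BaseRLModel):
    """Toy 'algorithm': action = gain * observation; training grows the gain."""

    def __init__(self, env):
        super().__init__(env)
        self.params = {"gain": 0.0}

    def _train_step(self, step):
        self.params["gain"] += 0.5

    def _raw_action(self, observation):
        return self.params["gain"] * observation
```

With this sketch, `ScaledModel("Toy-v0").learn(5).predict(1.0)` returns an action clipped to the upper bound of the toy action space, and calling `set_env` again switches environments while keeping the learned parameters.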

TODO:

  • Finish refactoring HER
  • Refactor ACKTR and ACER to support continuous actions

@araffin araffin changed the base branch from fixes_cleanup to master July 27, 2018 14:42
@araffin araffin merged commit 282e2ec into master Aug 29, 2018
@hill-a hill-a deleted the refactoring branch August 29, 2018 11:49
araffin added a commit that referenced this pull request Apr 19, 2021
* Fixed typo

* Update changelog.rst

Co-authored-by: Rouslan Placella <rouslan@placella.com>
Miffyli added a commit that referenced this pull request Apr 19, 2021
* Faster tests

* Add github workflow

* Faster test

* Fix MPI dependency

* Faster HER tests + fix CI

* No specific python version for pytype

* Fixes + add badge

* Fix multiprocessing error

* Separate TD3 test

* Add tolerance for deterministic check

* Better tests for saving/loading

* Remove unnecessary check

* Add comment about VecEnv start method

* Make pytype happy

* Debug MPI

* Add MPI step

* Move MPI tests outside pytest

* Two processes for GitHub CI

* Deactivate check moments

* Fix import error

* Copy version.txt to docker container (#2)

* Copy version.txt to docker container

* Update changelog

* Update Max username

Co-authored-by: Al Nejati <anej001@aucklanduni.ac.nz>

* Fixed typo (#3)

* Fixed typo

* Update changelog.rst

Co-authored-by: Rouslan Placella <rouslan@placella.com>

* Warn users to switch to Stable Baselines3 (#4)

* Warn users to switch to Stable Baselines3

* Fix for pytype

* Fix other pytype error

* Clarify update warning

* Remove python 3.5 build

* Allow test to fail

* Flaky TD3 test

* Fix argument

Co-authored-by: Anssi <kaneran21@hotmail.com>

Co-authored-by: Al Nejati <anej001@aucklanduni.ac.nz>
Co-authored-by: Rouslan Placella <rouslan@placella.com>
Co-authored-by: Anssi <kaneran21@hotmail.com>

2 participants