Decentralized Distributed PPO #245
Motivation and Context
Implementation of Decentralized Distributed PPO (DD-PPO), a simple, synchronous method for distributed reinforcement learning.
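At its core, the synchronous scheme works by having each worker compute gradients on its own rollouts and then averaging (all-reducing) those gradients across workers before every optimizer step, so all workers take identical updates. A minimal sketch of that averaging step, with the all-reduce simulated in plain Python rather than `torch.distributed` (function name and data are illustrative, not from this PR):

```python
def allreduce_mean(worker_grads):
    """Average per-parameter gradients across workers.

    Simulates the gradient all-reduce at the heart of a synchronous
    distributed update: every worker ends up with the same averaged
    gradient and therefore takes the same optimizer step.
    """
    n_workers = len(worker_grads)
    n_params = len(worker_grads[0])
    return [
        sum(grads[i] for grads in worker_grads) / n_workers
        for i in range(n_params)
    ]

# Three workers, each holding gradients for two parameters.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
avg = allreduce_mean(grads)  # [3.0, 4.0]
```

In the real implementation the averaging is done by a collective communication backend (e.g. NCCL or Gloo) rather than a Python loop, but the invariant is the same: after the reduce, every worker holds identical gradients.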
How Has This Been Tested
Tested locally. Adding tests via pytest is awkward because not all builds of PyTorch support distributed mode.
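One common way to handle this in a test suite is to guard distributed tests with a capability check so they skip cleanly on builds without distributed support. A hedged sketch (the guard function and test are illustrative, not part of this PR):

```python
import importlib.util
import unittest


def distributed_available() -> bool:
    """Return True only if torch is installed and reports distributed support."""
    # Hypothetical guard: avoids importing torch when it is absent.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.distributed.is_available()


class DistributedSmokeTest(unittest.TestCase):
    @unittest.skipUnless(distributed_available(), "torch.distributed unavailable")
    def test_distributed_import(self):
        import torch.distributed  # noqa: F401
```

pytest collects `unittest`-style skips as well, so the same guard works under either runner.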
Types of changes
* Refactor baselines to allow for extension
* Fix eval
* Make clip an after-backward
* Put ddppo on top of refactor
* Making things work as intended again
* Update README
* Update readme more
* Fix table
* Update
* Add task
* readme
* before_step not after_backward
* Fixes
* Add val SPL to weights table
* Add running mean and var code
* Loop over rollout step instead of function for full rollout
* Fix merge
* DD-PPO v2
* Make it work again!
* Little changes
* Fixes
* Make it work!
* Rebased!
* Update authors
* Whoops
* Move EXIT into the inner most loop
* Add contextlib.suppress()
* Hmmmm
* Format
* Fix bug
* Update readme
* Fix env_utils bug
* Update README
* Eval on val and add an option to not take the checkpoints config
* black
* @mathfac comments
* Episode iterator upgrades
* Make default slightly lower
* Fix env.py
* fix is true
* Update docstring
* Fix formatting
* Turn off shuffle!
* Added pre-commit, added isort-seed, changed CI black and isort, added pyproject config
* Make things interruptible and support pretrained
* Precommit and isort
* Preemption seems to work
* Loading pretrained weights works!
* Should be good to go!
* Make config a little better looking
* Add visual encoder caching
* Fix shuffling
* Little refactor
```diff
@@            Coverage Diff             @@
##           master     #245      +/-   ##
==========================================
- Coverage    79.3%   74.95%    -4.36%
==========================================
  Files          86       95        +9
  Lines        5896     6456      +560
==========================================
+ Hits         4676     4839      +163
- Misses       1220     1617      +397
```