MDP Playground integration into bsuite #38

RaghuSpaceRajan · 2021-03-30T21:27:16Z

Hope you're doing well!

Following our discussions, I added MDP Playground experiments into bsuite for the following dimensions of MDP Playground: Delay, Transition Noise, Reward Noise, Reward Sparsity, Rewardable Sequence Length

Here is a short summary of the changes we have made:

Added MDP Playground environments and experiments into bsuite
Updated analysis Jupyter notebook:
- Added a spoke in the bsuite spider plot for MDP Playground environments.
- Added additional analyses cells for individual MDP Playground experiments:
  - Delay, Transition Noise, Reward Noise, Reward Sparsity, Rewardable Sequence Length
Added a guide on how to add new experiments into bsuite to CONTRIBUTING.md
Updated setup.py to include mdp-playground as a dependency
Changed HPs for A2C because performance was very noisy with the old ones
Removed Jupyter notebook conflicts with deepmind:master
Removed .gitignore conflicts with deepmind:master

Improvements still needed:

The code still needs to conform to the Python style that bsuite follows.
The tests probably need some improvement.

Please let us know your inputs and feedback on what to do next!

Best regards,
Raghu Rajan.

Co-authored-by: suresh-guttikonda guttikondasuresh7@gmail.com
Co-authored-by: guttikon guttikon@informatik.uni-freiburg.de

* Integrating MDP Playground environments into bsuite: Added MDP Playground environments and experiments into bsuite Analysis Jupyter notebook: Added a spoke in the bsuite spider plot for MDP Playground environments. Added additional analyses cells for individual MDP Playground experiments: Delay, Transition Noise, Reward Noise, Reward Sparsity, Rewardable Sequence Length * Removed Jupyter notebook conflicts with deepmind:master * Removed .gitignore conflicts with deepmind:master Co-authored-by: suresh-guttikonda <guttikondasuresh7@gmail.com> Co-authored-by: guttikon <guttikon@informatik.uni-freiburg.de>

…eriments in bsuite.

google-cla · 2021-03-30T21:27:21Z

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

sguttikon · 2021-03-31T11:48:00Z

@googlebot I fixed it.

google-cla · 2021-03-31T11:48:05Z

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

sguttikon · 2021-03-31T11:52:29Z

@googlebot I fixed it.

RaghuSpaceRajan · 2021-04-28T15:56:30Z

Hi @iosband and @yotam,

Could you please restart the workflow tests? I managed to fix the failing pytype tests in the latest commit and pytest tests are also passing locally.

yotam · 2021-10-07T22:23:13Z

bsuite/baselines/tf/actor_critic/run.py

@@ -43,8 +43,8 @@
 # algorithm
 flags.DEFINE_integer('seed', 42, 'seed for random number generation')
 flags.DEFINE_integer('num_hidden_layers', 2, 'number of hidden layers')
-flags.DEFINE_integer('num_units', 64, 'number of units per hidden layer')
-flags.DEFINE_float('learning_rate', 1e-2, 'the learning rate')
+flags.DEFINE_integer('num_units', 50, 'number of units per hidden layer')


We should do any agent changes in a separate commit

yotam · 2021-10-07T22:23:54Z