
Update vectorized reinforcement learning #104

Merged
merged 2 commits into master from refactor/updated-vecgymne
Jun 6, 2024

Conversation

engintoklu
Collaborator

This pull request updates the vectorized reinforcement learning functionalities of EvoTorch so that they are compatible with the gymnasium 1.0.x API, while preserving compatibility with gymnasium 0.29.x.

In more detail, this pull request introduces an EvoTorch-specific SyncVectorEnv implementation as an alternative to gymnasium's SyncVectorEnv class. This custom SyncVectorEnv preserves the classical auto-reset behavior on which EvoTorch relies, allowing us to transition to gymnasium 1.0.x.
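To illustrate what "classical auto-reset" means here, the sketch below shows a vector environment in which a sub-environment that finishes its episode is reset within the same `step()` call, so the returned observation already belongs to the next episode (the behavior gymnasium 0.29.x vector envs provided). This is a hedged illustration, not EvoTorch's actual implementation; the names `DummyEnv` and `ClassicalSyncVectorEnv` are hypothetical.

```python
import numpy as np

class DummyEnv:
    """Hypothetical toy env following the gymnasium API; episode ends after 3 steps."""
    def __init__(self):
        self.t = 0
    def reset(self):
        self.t = 0
        return np.array([0.0]), {}
    def step(self, action):
        self.t += 1
        terminated = self.t >= 3
        return np.array([float(self.t)]), 1.0, terminated, False, {}

class ClassicalSyncVectorEnv:
    """Same-step ("classical") auto-reset: a finished sub-env is reset
    immediately, and the new episode's first observation is returned."""
    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]
    def reset(self):
        return np.stack([env.reset()[0] for env in self.envs])
    def step(self, actions):
        obs_batch, rew_batch, done_batch = [], [], []
        for env, action in zip(self.envs, actions):
            obs, rew, terminated, truncated, info = env.step(action)
            if terminated or truncated:
                obs, _ = env.reset()  # immediate (same-step) auto-reset
            obs_batch.append(obs)
            rew_batch.append(rew)
            done_batch.append(terminated or truncated)
        return np.stack(obs_batch), np.asarray(rew_batch), np.asarray(done_batch)

venv = ClassicalSyncVectorEnv([DummyEnv, DummyEnv])
venv.reset()
for _ in range(3):
    obs, rew, done = venv.step([0, 0])
# after the 3rd step, both episodes ended and were reset in the same call
print(obs.ravel().tolist(), done.tolist())  # → [0.0, 0.0] [True, True]
```

By contrast, gymnasium 1.0.x vector environments defer the reset to the following `step()` call, which is why code relying on the classical behavior needs a custom implementation.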

Having our own SyncVectorEnv also allows us to introduce these performance-related improvements:

  • observations, rewards, etc. reported by SyncVectorEnv are now moved to the device on which the policies are executed;
  • sub-environments of SyncVectorEnv that have reached their maximum number of episodes are no longer stepped.
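A rough sketch of these two ideas follows. The names (`DeviceAwareVectorEnv`, `episodes_per_env`, `DummyEnv`) are illustrative assumptions, not EvoTorch's actual API: batched results are moved to the policy's device in one call, and a sub-environment that has spent its episode budget is skipped rather than stepped.

```python
import torch

class DummyEnv:
    """Hypothetical toy env; episode ends after 3 steps."""
    def __init__(self):
        self.t = 0
    def reset(self):
        self.t = 0
        return [0.0], {}
    def step(self, action):
        self.t += 1
        terminated = self.t >= 3
        return [float(self.t)], 1.0, terminated, False, {}

class DeviceAwareVectorEnv:
    """Sketch of (i) reporting results on the policy's device and
    (ii) skipping sub-envs that have used up their episode budget."""
    def __init__(self, env_fns, device="cpu", episodes_per_env=1):
        self.envs = [fn() for fn in env_fns]
        self.device = torch.device(device)
        self.episodes_per_env = episodes_per_env
        self.episode_counts = [0] * len(self.envs)
    def reset(self):
        self.episode_counts = [0] * len(self.envs)
        obs = torch.tensor([env.reset()[0] for env in self.envs])
        return obs.to(self.device)
    def step(self, actions):
        n = len(self.envs)
        obs = torch.zeros(n, 1)
        rew = torch.zeros(n)
        done = torch.zeros(n, dtype=torch.bool)
        for i, env in enumerate(self.envs):
            if self.episode_counts[i] >= self.episodes_per_env:
                done[i] = True  # (ii) budget spent: do not step this sub-env
                continue
            o, r, terminated, truncated, _ = env.step(actions[i])
            if terminated or truncated:
                self.episode_counts[i] += 1
            obs[i] = torch.as_tensor(o)
            rew[i] = float(r)
            done[i] = terminated or truncated
        # (i) move the batched results to the device the policy runs on
        return obs.to(self.device), rew.to(self.device), done.to(self.device)

venv = DeviceAwareVectorEnv([DummyEnv, DummyEnv], device="cpu", episodes_per_env=1)
venv.reset()
for _ in range(3):
    obs, rew, done = venv.step([0, 0])
# both sub-envs spent their single-episode budget; this step skips them
obs, rew, done = venv.step([0, 0])
print(rew.tolist(), done.tolist())  # → [0.0, 0.0] [True, True]
```

Skipping finished sub-environments matters in evolutionary RL, where each solution is typically evaluated for a fixed number of episodes and stepping a finished sub-environment would waste compute.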

Brax-related notebook examples are also refactored. Instead of including the entire brax example in a single notebook, there are now two notebooks: one focusing on training and the other on visualization. The visualization notebook is updated so that it works correctly with the latest version of brax.

@engintoklu engintoklu added the enhancement New feature or request label Jun 3, 2024
@engintoklu engintoklu self-assigned this Jun 3, 2024
Fix a typo in the code
@flukeskywalker flukeskywalker merged commit d995b97 into master Jun 6, 2024
1 of 2 checks passed
@flukeskywalker flukeskywalker deleted the refactor/updated-vecgymne branch June 6, 2024 18:32