v2.8.0: Fix for `MaskablePPO` and markdown doc
Breaking Changes:
- Removed support for Python 3.9, please upgrade to Python >= 3.10
- Upgraded to Stable-Baselines3 >= 2.8.0
- Set `strict=True` for every call to `zip(...)`
New Features:
- Added official support for Python 3.13
Bug Fixes:
- Fixed `MaskablePPO` and `RecurrentPPO` inaccurate `n_updates` counting when `target_kl` early exits the training loop
- Fixed `RecurrentPPO` and `MaskablePPO` `forward` and `predict` not reshaping the action before clipping it (@immortal-boy)
- Do not call `forward()` method directly in `RecurrentPPO` (@immortal-boy)
- Fixed `MaskableCategorical.apply_masking()` crashing with `ValueError: Simplex` when cached `probs` deviate from sum=1 in float32 with large action spaces (torch 2.9+) (@kirann-05)
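
To make the `n_updates` fix more concrete, here is a minimal pure-Python sketch of counting updates per epoch rather than assuming all epochs ran. It is not the library's implementation: `count_updates`, the `kl_per_epoch` trace, the `1.5 * target_kl` threshold, and the choice to count the interrupted epoch are all assumptions made for illustration.

```python
def count_updates(kl_per_epoch, n_epochs, target_kl=None):
    """Hypothetical helper: count epochs actually run, honouring early exit."""
    n_updates = 0
    for epoch in range(n_epochs):
        continue_training = True
        # Stand-in for the minibatch loop: one approximate KL value per epoch.
        approx_kl = kl_per_epoch[epoch]
        # Assumed early-stopping rule: stop once KL exceeds 1.5 * target_kl.
        if target_kl is not None and approx_kl > 1.5 * target_kl:
            continue_training = False
        # Count this epoch (assumption: an interrupted epoch still counts).
        n_updates += 1
        if not continue_training:
            break
    return n_updates


kl_trace = [0.01, 0.02, 0.09, 0.01]  # hypothetical KL values
naive_count = len(kl_trace)          # assumes every epoch always runs
accurate_count = count_updates(kl_trace, n_epochs=4, target_kl=0.05)
print(naive_count, accurate_count)
```

Incrementing the counter inside the loop keeps it in step with the epochs that were actually executed, whereas adding `n_epochs` once after the loop overcounts whenever the KL check stops training early.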
Documentation:
- Switched to markdown documentation (using MyST parser)
New Contributors
- @immortal-boy made their first contribution in #320
- @kirann-05 made their first contribution in #326
Full Changelog: v2.7.1...v2.8.0