What's Changed
- feat: support step for selective subenvs by @vermouthdky in #75
- fix: reset data idx by @PanAndy in #76
- feat: integrate OpenRLHF training framework by @lkevinzc in #80
- refactor: change verifier to oat.math_grader && allow idx specification by @vermouthdky in #84
- Integrate RL2 with Gem environments by @simonucl in #83
- feat: add 4 more games and random version of each game by @vermouthdky in #86
- feat: align with oat's multi-turn ppo apis by @lkevinzc in #85
- feat: multi-agent env api design by @Benjamin-eecs in #87
- Grpo4 by @anyasims in #89
- feat: support MCP tool and MCPMark env by @cameron-chen in #82
- feat: add visual math environment by @lkevinzc in #93
- feat: add non-multi-processing math grading by @lkevinzc in #95
- feat: integrate tinker-cookbook & add spawn() for lightweight parallel environments by @cameron-chen in #96
- feat: add single-file tinker example by @lkevinzc in #97
- chore: update readme by @lkevinzc in #99
- doc: add bibtex by @lkevinzc in #100
- chore: bump version by @lkevinzc in #101
- chore: fix typo by @lkevinzc in #102
- docs: refactor the examples folder and add readme for tinker by @cameron-chen in #98
New Contributors
- @PanAndy made their first contribution in #76
- @simonucl made their first contribution in #83
- @Benjamin-eecs made their first contribution in #87
Full Changelog: v0.0.4...v0.1.0