Release v0.2.6

hijkzzz released this 30 Apr 02:33

Changes

Upgraded vLLM to v0.4.1 @mgerstgrasser @wuxibin89 @hijkzzz
Upgraded Transformers to v4.40.1 and DeepSpeed to v0.14.0 @hijkzzz
Fixed typo in train_ppo_ray.py @mickelliu
Fixed a size mismatch between output_state_dict (148) and state_dict (149) in model saving @hijkzzz
Added support for --colocate_actor_ref and --colocate_critic_reward in train_ppo_ray.py @hijkzzz
Added support for offloading the reward and reference models in Ray PPO @hijkzzz
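
As a rough illustration of what the two colocation flags could mean for GPU placement, here is a minimal sketch. Only the flag names --colocate_actor_ref and --colocate_critic_reward come from the notes above. Treating them as boolean switches, the build_parser and placement_groups helpers, and the role names are assumptions made for this example, not the actual train_ppo_ray.py code.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the two flag names are taken from the release notes.
    # Modeling them as boolean switches (store_true) is an assumption.
    parser = argparse.ArgumentParser()
    parser.add_argument("--colocate_actor_ref", action="store_true")
    parser.add_argument("--colocate_critic_reward", action="store_true")
    return parser


def placement_groups(args):
    """Return lists of model roles that would share one group of GPUs."""
    groups = []
    # Actor and reference model: one shared group if colocated, else one each.
    if args.colocate_actor_ref:
        groups.append(["actor", "ref"])
    else:
        groups += [["actor"], ["ref"]]
    # Critic and reward model: same rule.
    if args.colocate_critic_reward:
        groups.append(["critic", "reward"])
    else:
        groups += [["critic"], ["reward"]]
    return groups


if __name__ == "__main__":
    args = build_parser().parse_args(["--colocate_actor_ref"])
    print(placement_groups(args))  # [['actor', 'ref'], ['critic'], ['reward']]
```

With neither flag set, this sketch yields four separate groups; with both set, it yields two.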

Contributors

wuxibin89, mgerstgrasser, and 2 other contributors
