
Releases: OpenLLMAI/OpenRLHF

Release v0.2.0

19 Feb 14:50

Changes

Release v0.1.10

18 Feb 16:06

Changes

  • Fixed save_models for named_buffer @wuxibin89
  • Fixed a vLLM generation hang bug (requires vLLM < 0.2.7) @hijkzzz

Release v0.1.9

29 Jan 05:35

Changes

Release v0.1.8

25 Jan 23:42

Changes

  • Upgraded transformers to version 4.37
  • Fixed the gradient checkpointing configuration in Ray RLHF @wuxibin89
  • Fixed the loss coefficient for PPO-ptx @hijkzzz
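For context, PPO-ptx (from InstructGPT) mixes the PPO policy loss with a pretraining language-modeling loss scaled by a coefficient. A minimal sketch of that combination (the names `ppo_ptx_loss` and `ptx_coef` are illustrative, not OpenRLHF's actual API):

```python
def ppo_ptx_loss(ppo_loss: float, ptx_loss: float, ptx_coef: float = 0.05) -> float:
    """Combine the PPO policy loss with a pretraining (ptx) loss.

    PPO-ptx adds a pretraining language-modeling term, scaled by
    ptx_coef, to reduce regression on pretraining-distribution tasks
    during RLHF. Getting this coefficient wrong over- or under-weights
    the pretraining term, which is the kind of bug the fix above addresses.
    """
    return ppo_loss + ptx_coef * ptx_loss
```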

Release v0.1.7

23 Jan 04:25

Changes

  • Fixed LLaMA RoPE initialization bug for ZeRO3 @wuxibin89
  • Fixed a DPO training script bug @hijkzzz

Release v0.1.6

15 Jan 14:21

Changes

  • Fixed DeepSpeed configs to improve PPO training stability @hijkzzz

Release v0.1.5

11 Jan 10:13

Changes

  • Optimized the DeepSpeed configuration, improving performance by 30%+ with Adam Offload @hijkzzz
  • Added support for QLoRA and LoRA in all stages @hijkzzz
  • Fixed Mixtral 8x7B balancing loss bugs @hijkzzz

Release v0.1.4

10 Jan 05:29

Changes

  • Fixed reward model training when using the Hugging Face ZeRO3 initialization API (for models with 70B+ parameters) @wuxibin89
  • Added support for the Mixtral 8x7B balancing loss (--balancing_loss_coef) @hijkzzz
  • Fixed an issue with vllm_engine when tp=1 @wuxibin89
  • Fixed ZeRO2 model saving bugs @hijkzzz
  • Added the --grad_accum_dtype argument to reduce the memory usage of CPUAdam @hijkzzz
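For context, the Mixtral balancing loss mentioned above is the Switch-Transformers-style auxiliary loss that penalizes uneven token routing across MoE experts; the coefficient corresponds to a flag like --balancing_loss_coef. A minimal plain-Python sketch of the idea (not OpenRLHF's actual tensor implementation):

```python
def balancing_loss(router_probs, expert_assignments, num_experts, coef=0.01):
    """Load-balancing auxiliary loss for MoE routing (Switch Transformers style).

    router_probs: per-token softmax over experts, e.g. [[0.5, 0.5], ...]
    expert_assignments: chosen expert index for each token
    Computes num_experts * sum_i(f_i * P_i), where f_i is the fraction of
    tokens routed to expert i and P_i is the mean router probability for
    expert i; the product is minimized under a uniform routing distribution.
    """
    n = len(expert_assignments)
    # f_i: fraction of tokens dispatched to each expert
    f = [sum(1 for a in expert_assignments if a == i) / n for i in range(num_experts)]
    # P_i: average router probability mass assigned to each expert
    p = [sum(probs[i] for probs in router_probs) / n for i in range(num_experts)]
    return coef * num_experts * sum(fi * pi for fi, pi in zip(f, p))
```

Balanced routing yields the minimum value (coef, since the f·P sum equals 1/num_experts per expert), while collapsing all tokens onto one expert doubles it in the two-expert case.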

Release v0.1.3

08 Jan 16:28

Changes

  • Fixed Hugging Face reward model saving @wuxibin89
  • Improved mask_mean in the loss functions @hijkzzz
  • Fixed num_actions and action_mask handling @ZiyiLiubird
  • Optimized PPO performance in the example scripts (set micro_batch_size=4) @hijkzzz
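For context, the mask_mean improvement above concerns averaging a per-token loss only over valid action positions (those marked by action_mask), so padding tokens do not dilute the loss. A minimal plain-Python sketch of such a masked mean (not OpenRLHF's actual tensor code):

```python
def mask_mean(values, mask):
    """Mean of `values` over positions where `mask` is 1.

    Dividing by the number of masked (valid) positions, rather than the
    full sequence length, keeps padded tokens from shrinking the loss.
    Returns 0.0 when the mask selects nothing, to avoid division by zero.
    """
    total = sum(v * m for v, m in zip(values, mask))
    count = sum(mask)
    return total / count if count else 0.0
```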

Release v0.1.2

05 Jan 07:45

Changes

  • Fixed reward model hidden size and value_head initialization @wuxibin89
  • Fixed model saving bugs @hijkzzz