Releases · OpenLLMAI/OpenRLHF
Release v0.2.7
Release v0.2.6
Changes
- Upgraded vLLM to v0.4.1 @mgerstgrasser @wuxibin89 @hijkzzz
- Upgraded Transformers to v4.40.1 and DeepSpeed to v0.14.0 @hijkzzz
- Fixed typo in train_ppo_ray.py @mickelliu
- Fixed size mismatch between output_state_dict (148) and state_dict (149) in model saving @hijkzzz
- Added support for --colocate_actor_ref and --colocate_critic_reward in train_ppo_ray.py (see the colocation sketch after this list) @hijkzzz
- Added support for offloading the reward and reference models in Ray PPO @hijkzzz
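Colocation here means scheduling two model workers onto the same GPU bundle so they share one device instead of each claiming its own. As a minimal, illustrative Ray sketch of that idea (not OpenRLHF's actual implementation; the `ModelWorker` class and resource numbers are invented, and it assumes one visible GPU):

```python
import ray
from ray.util.placement_group import placement_group
from ray.util.scheduling_strategies import PlacementGroupSchedulingStrategy

ray.init()

# A single bundle with one GPU; both workers are packed into it, which is
# the essence of --colocate_actor_ref: the actor and reference models
# share the same device rather than occupying two GPUs.
pg = placement_group([{"GPU": 1, "CPU": 2}])
ray.get(pg.ready())  # blocks until the bundle is reserved

@ray.remote(num_gpus=0.5)  # each worker takes half of the shared GPU
class ModelWorker:
    def __init__(self, name):
        self.name = name

    def ping(self):
        return self.name

strategy = PlacementGroupSchedulingStrategy(placement_group=pg)
actor = ModelWorker.options(scheduling_strategy=strategy).remote("actor")
ref = ModelWorker.options(scheduling_strategy=strategy).remote("ref")
print(ray.get([actor.ping.remote(), ref.ping.remote()]))  # ['actor', 'ref']
```

Offloading is complementary: a colocated reward or reference model that is idle can be pushed off the GPU between uses to free memory.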
Release v0.2.5
Changes
- Added Chinese README.md @khazic
- Added KD (knowledge distillation) Trainer and Loss (see the sketch after this list) @ifromeast
- Fixed num_training_steps @wuxibin89
- Updated requirements.txt @kfertakis
- Fixed an error in rm_trainer.py caused by the 'margin' variable being a list @StwayneXG
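The KD trainer pairs a trainable student with a frozen teacher and trains the student to match the teacher's token-level distribution. As a rough sketch of what such a loss typically computes (the function and argument names below are invented, not the repository's API):

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, mask, temperature=1.0):
    """Token-level distillation: forward KL from the teacher's softened
    distribution to the student's, averaged over unmasked tokens.
    Shapes: logits are (B, T, V), mask is (B, T)."""
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    s_logprobs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL(teacher || student) per token, summed over the vocabulary.
    per_token = (t_probs * (t_probs.clamp_min(1e-10).log() - s_logprobs)).sum(-1)
    # Zero out prompt/padding positions, then average over real tokens.
    return (per_token * mask).sum() / mask.sum()
```

In practice a term like this is usually mixed with the ordinary cross-entropy on ground-truth labels via a weighting coefficient.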
Release v0.2.4
Changes
- Fixed DPO masked loss function (see the sketch after this list) @hijkzzz
- Fixed Yi-34B tokenizer (--disable_fast_tokenizer) #240 @hijkzzz
- Supported wandb.login() (--wandb True) #231 @mgerstgrasser
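The masked-loss fix concerns which tokens enter the DPO log-probabilities: only response tokens should count, with prompt and padding positions excluded. A minimal sketch of the intended computation (helper names are illustrative, not the repository's actual code):

```python
import torch
import torch.nn.functional as F

def masked_logps(logits, labels, response_mask):
    """Per-sequence log-prob summed over response tokens only.
    Assumes logits are already aligned with labels (i.e. shifted).
    Shapes: logits (B, T, V), labels (B, T), response_mask (B, T)."""
    per_token = torch.gather(
        F.log_softmax(logits, dim=-1), 2, labels.unsqueeze(-1)
    ).squeeze(-1)
    # Prompt and padding positions contribute nothing to the sum.
    return (per_token * response_mask).sum(-1)

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO objective on the masked sequence log-probs."""
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -F.logsigmoid(beta * margin).mean()
```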
Release v0.2.3
Release v0.2.2
Changes
- Fixed LlamaRotaryEmbedding for Transformers v4.38.1 @hijkzzz
- Switched to lazy vLLM engine initialization @wuxibin89
- Added Chinese PR docs @catqaq
- Fixed tensor shape docs @Thecats-Jfm
Release v0.2.1
Release v0.2.0
Changes
- Supported vLLM 0.3.1 @wuxibin89
Release v0.1.10
Changes
- Fixed save_models for named_buffer @wuxibin89
- Fixed vLLM generation hang bug (requires vLLM<0.2.7) @hijkzzz
Release v0.1.9
Changes
- Supported input_template #203 @rbao2018
- Supported KTO #201 (see the sketch after this list) @Dylancer1998
- Upgraded HuggingFace Transformers to v4.37.1
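KTO differs from DPO in that it optimizes over unpaired samples, each labeled simply desirable or undesirable, rather than preference pairs. As a loose sketch of the loss shape (names are hypothetical; the reference point `z0` is crudely approximated from the batch here, whereas the KTO paper uses a proper policy/reference KL estimate):

```python
import torch

def kto_loss(policy_logps, ref_logps, desirable, beta=0.1):
    """Simplified KTO over a batch of unpaired samples.
    policy_logps/ref_logps: (B,) sequence log-probs; desirable: (B,) 0/1."""
    logratio = policy_logps - ref_logps
    # Crude stand-in for the paper's reference point (a KL estimate).
    z0 = logratio.detach().mean().clamp(min=0)
    value = torch.where(
        desirable.bool(),
        torch.sigmoid(beta * (logratio - z0)),   # push desirable outputs up
        torch.sigmoid(beta * (z0 - logratio)),   # push undesirable outputs down
    )
    return (1 - value).mean()
```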