Add Reward Model training #1246

Draft

dmahan93 wants to merge 13 commits into EleutherAI:main from dmahan93:add-rm

Commits on Jun 21, 2024

Add a chat data preprocessing script

dmahan93 committed Jun 21, 2024
a950f8b

add EOT at end of a chat

dmahan93 committed Jun 21, 2024
e360e24

- add different packing impl (Unpacked, packing until overflow)
- fix labels to also have valid/test implementations
- fix label masking in _get_batch to also include anything from get_ltor_masks_and_position_ids

dmahan93 committed Jun 21, 2024
9ee4a8f

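
The commit above names two packing modes without describing them. As a rough illustration only, here is one plausible reading of "Unpacked" and "packing until overflow" for tokenized documents. The function names, the truncation rule, and the greedy strategy are assumptions made for this sketch and are not taken from the PR's code.

```python
from typing import List


def unpacked(docs: List[List[int]], seq_length: int) -> List[List[int]]:
    """Assumed 'Unpacked' mode: one document per sequence, truncated to seq_length."""
    return [doc[:seq_length] for doc in docs]


def pack_until_overflow(docs: List[List[int]], seq_length: int) -> List[List[int]]:
    """Assumed 'packing until overflow' mode: greedily concatenate documents and
    start a new sequence whenever the next document would not fit."""
    packed: List[List[int]] = []
    current: List[int] = []
    for doc in docs:
        doc = doc[:seq_length]  # a single document never exceeds one sequence
        if current and len(current) + len(doc) > seq_length:
            packed.append(current)  # next document would overflow: close this sequence
            current = []
        current = current + doc
    if current:
        packed.append(current)
    return packed


if __name__ == "__main__":
    token_docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]]
    print(pack_until_overflow(token_docs, seq_length=5))
    print(unpacked(token_docs, seq_length=5))
```

Under this reading, documents are never split across sequences, which keeps each chat example intact at the cost of some unused positions per sequence.
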
update README.md

dmahan93 committed Jun 21, 2024
0678573

Commits on Jun 24, 2024

Merge remote-tracking branch 'origin/add-chat-template-based-datasets' into add-dpo

dmahan93 committed Jun 24, 2024
15e3059

Commits on Jun 25, 2024

- Add metrics to forward step to add DPO specific metrics that are useful (accuracy, etc)
- Add reference model setup for DPO
- Add pairwise dataset for positive/negative pairs
- Add DPO loss

dmahan93 committed Jun 25, 2024
2d20d86

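
For readers unfamiliar with the pieces listed in the commit above (reference model, positive/negative pairs, DPO loss, accuracy metric), the following is a minimal sketch of the standard DPO objective in plain Python. It is not the PR's implementation: the function names, the default `beta`, and the use of per-sequence summed log-probabilities as inputs are assumptions for illustration.

```python
import math
from typing import List, Tuple


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss_and_accuracy(
    policy_chosen: List[float],
    policy_rejected: List[float],
    ref_chosen: List[float],
    ref_rejected: List[float],
    beta: float = 0.1,  # assumed default, not taken from the PR
) -> Tuple[float, float]:
    """Mean DPO loss and pairwise accuracy from per-sequence log-probabilities.

    Each list holds one summed log-probability per pair (policy or reference
    model, chosen or rejected completion).
    """
    losses = []
    correct = 0
    for pc, pr, rc, rr in zip(policy_chosen, policy_rejected, ref_chosen, ref_rejected):
        chosen_reward = beta * (pc - rc)      # implicit reward of the chosen sample
        rejected_reward = beta * (pr - rr)    # implicit reward of the rejected sample
        margin = chosen_reward - rejected_reward
        losses.append(-log_sigmoid(margin))
        correct += int(margin > 0)            # pair counts as correct if chosen wins
    n = len(losses)
    return sum(losses) / n, correct / n


if __name__ == "__main__":
    loss, acc = dpo_loss_and_accuracy([-1.0], [-3.0], [-2.0], [-2.0], beta=1.0)
    print(loss, acc)
```

Because the reference log-probabilities enter only as fixed inputs, they can be computed once ahead of training, which is presumably what the later "precompute logprobs" commits address.
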
Update arguments.py to use train_label_data_paths instead of label_data_paths

dmahan93 committed Jun 25, 2024
c045006

Merge remote-tracking branch 'origin/add-different-packing-impl' into add-dpo

dmahan93 committed Jun 25, 2024
eed3643

- Bugfixes from upstreaming....

dmahan93 committed Jun 25, 2024
0392080

- add precompute logprobs...

dmahan93 committed Jun 25, 2024
361f459

Commits on Jun 26, 2024

- Finishing up precompute logprobs...

dmahan93 committed Jun 26, 2024
7398e07

- update readme for DPO...

dmahan93 committed Jun 26, 2024
51af714

Commits on Jun 28, 2024

- Add RM training

dmahan93 committed Jun 28, 2024
06c851e
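
The commit message says only "Add RM training", so the loss it uses is not stated here. A common choice for reward models trained on chosen/rejected pairs is the Bradley-Terry pairwise loss, sketched below in plain Python. Treat the loss choice, the function names, and the scalar-score inputs as assumptions for illustration rather than a description of this PR.

```python
import math
from typing import List, Tuple


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def pairwise_rm_loss(
    chosen_scores: List[float], rejected_scores: List[float]
) -> Tuple[float, float]:
    """Mean Bradley-Terry loss, -log(sigmoid(r_chosen - r_rejected)), and the
    fraction of pairs where the chosen sample scores higher."""
    losses = []
    correct = 0
    for r_c, r_r in zip(chosen_scores, rejected_scores):
        margin = r_c - r_r
        losses.append(-log_sigmoid(margin))
        correct += int(margin > 0)
    n = len(losses)
    return sum(losses) / n, correct / n


if __name__ == "__main__":
    loss, acc = pairwise_rm_loss([2.0, 0.0], [0.0, 1.0])
    print(loss, acc)
```

The structure mirrors the DPO objective from the earlier commits, with a learned scalar score in place of the policy/reference log-probability ratio.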
