Feature/remove reward instructor #2289
Conversation
❌ pre-commit failed.
In terms of checking whether we need functionality from …
@CloseChoice since you are actively developing in the ML parts of Open-Assistant, I would like to invite you to join the OA ML team on Discord. Please ping me (andreaskoepf).
Changes look good. Not sure we actually still need the old RM configs, but it's probably fine to keep them around for a little longer.
thx!
@theblackcat102 Is all relevant RM code already part of the new trainer_rm so that the old reward/instructor can be deleted? At least the old rank datasets were not ported yet. What about the rankgen loss, etc.?
Do we have a dataset with which I can check whether that works correctly? I don't have access to the private OA data, but I guess any other dataset suitable for RM should suffice, even if I have to manipulate it a bit.
@andreaskoepf rankgen didn't have good results to back it up anyway. I think it's fine if we just move on.
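(For context on what would replace the rankgen loss: the common pairwise reward-model objective is a log-sigmoid margin over chosen/rejected scores. A minimal sketch, assuming the model produces a scalar score per reply; this is not necessarily the repo's actual implementation:)

```python
import torch
import torch.nn.functional as F

def pairwise_rm_loss(chosen_scores: torch.Tensor, rejected_scores: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry style pairwise loss: -log(sigmoid(r_chosen - r_rejected))."""
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()
```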
@CloseChoice I saw you added RM configs for deberta; however, the existing code doesn't work for deberta due to the limited choices in TOKENIZER_CONFIGS under utils.py. Have you tried running a webgpt example with these new RM configs?
@theblackcat102 I tried, but so far I failed. Currently we only support … I just added the configs from …
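(For reference, a hedged sketch of what extending such a tokenizer-config mapping for deberta might look like; `TokenizerConfig`, its fields, and the special tokens here are assumptions, not verified against the actual `model_training/utils.py`:)

```python
from dataclasses import dataclass, field

# Hypothetical sketch only: the real TOKENIZER_CONFIGS in model_training/utils.py
# may use different fields and keys.
@dataclass
class TokenizerConfig:
    special_tokens: dict = field(default_factory=dict)

TOKENIZER_CONFIGS = {
    # assumed new entry for deberta; the pad token name is a guess
    "deberta-v3": TokenizerConfig(special_tokens={"pad_token": "[PAD]"}),
}
```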
❌ pre-commit failed.
I ran `python trainer_rm.py --configs defaults_rm debug_rm` with this branch and ran into the following error:

```
File "xxx/Open-Assistant/model/.venv/lib/python3.10/site-packages/torch/cuda/amp/grad_scaler.py", line 210, in _unscale_grads_
    raise ValueError("Attempting to unscale FP16 gradients.")
ValueError: Attempting to unscale FP16 gradients.
```

I'm not the first to have this problem, but any ideas how to fix it?
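A frequent cause of this ValueError is keeping the model's parameters in fp16 (e.g. via `model.half()`) while also using `GradScaler`, whose `unscale_` expects fp32 gradients; autocast is supposed to handle the fp16 compute instead. A minimal sketch of the usual pattern in generic PyTorch, not the actual trainer_rm setup:

```python
import torch
import torch.nn as nn

# Keep parameters in fp32 (do NOT call model.half()); GradScaler.unscale_
# raises "Attempting to unscale FP16 gradients." when grads are fp16.
model = nn.Linear(16, 1).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(8, 16, device="cuda")
optimizer.zero_grad()
with torch.autocast(device_type="cuda", dtype=torch.float16):
    loss = model(x).mean()        # forward runs in fp16 where safe
scaler.scale(loss).backward()
scaler.unscale_(optimizer)        # ok here: parameters and grads are fp32
scaler.step(optimizer)
scaler.update()
```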
❌ pre-commit failed.
I removed the configs where I ran into problems with the tokenizer config.
@CloseChoice the problem with webgpt, hf_summary and …
@theblackcat102 I still get the errors on main.
@theblackcat102 @andreaskoepf @dvruette Any reasons why we should not merge this?
@shahules786 is this a bug?

```diff
-for data in dataset:
+for item in dataset:
```
yes, that is a bug I fixed on the fly
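(To illustrate the failure mode: renaming a loop variable without updating the body silently reads a stale outer variable rather than raising an error. A toy sketch, not the OA code itself:)

```python
dataset = ["a", "b", "c"]
item = "stale"

# Bug: the loop binds `data`, but the body still reads `item`,
# so every iteration sees the stale outer value instead of the element.
for data in dataset:
    print(item)   # prints "stale" three times

# Fixed: loop variable and body agree.
for item in dataset:
    print(item)   # prints "a", "b", "c"
```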
LGTM
closes #2049
I updated `model_training/utils.py` with all the functionality I could find diverging from `reward/instructor`.

I still have a few questions:

- Do we still need functionality from `reward/instructor`, e.g. in `experimental_dataset.py` and `cls_dataset.py`, and also the `webgpt_return_format` function in utils?
- Should we use `webgpt` and `hf_summary` as reward model training data? Currently only the `oasst_export` data is defined as training data for the RM model.
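(For illustration of what RM training data generally needs, regardless of which sources are enabled: one prompt with replies ordered from best to worst, from which a trainer derives (chosen, rejected) pairs. A hypothetical record; field names are made up and not the `oasst_export`/`webgpt` schema:)

```python
# Hypothetical ranking record for RM training.
example = {
    "prompt": "Explain what a reward model does.",
    "replies_ranked_best_to_worst": [
        "It scores candidate answers so RLHF can prefer better ones.",
        "It is a kind of model.",
    ],
}

# A trainer would turn the ranking into (chosen, rejected) pairs:
replies = example["replies_ranked_best_to_worst"]
pairs = [
    (better, worse)
    for i, better in enumerate(replies)
    for worse in replies[i + 1:]
]
print(pairs)
```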