
add training code for reward model #222

Merged
merged 23 commits into from Jan 1, 2023


theblackcat102
Collaborator

Trainer code to train a single-score reward model. It currently supports WebGPT and the raw human-feedback summary datasets from OpenAI. See the README and rank_datasets.py for more details.
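For readers unfamiliar with single-score reward models, here is a minimal sketch of one common pairwise ranking objective. It is an assumption for illustration only: the PR text does not state which loss the trainer uses, and the function name `pairwise_rank_loss` is hypothetical. The idea is that the model gives each answer one scalar score, and the loss is small when the preferred answer scores higher.

```python
import math


def pairwise_rank_loss(score_better, score_worse):
    """Return -log(sigmoid(score_better - score_worse)).

    Hypothetical helper, not taken from the PR. Both arguments are
    plain floats: one scalar score per candidate answer.
    """
    diff = score_better - score_worse
    # -log(sigmoid(diff)) equals log(1 + exp(-diff)).
    # Branch on the sign so exp() never receives a large positive value.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


if __name__ == "__main__":
    # Preferred answer scored higher: small loss.
    print(pairwise_rank_loss(2.0, -1.0))
    # Preferred answer scored lower: larger loss.
    print(pairwise_rank_loss(-1.0, 2.0))
```

In a real trainer the scores would come from the model as tensors and the loss would be averaged over a batch; plain floats are used here only to keep the sketch dependency-free.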

Collaborator

@andreaskoepf andreaskoepf left a comment


Thanks for the PR! First step: please install and run pre-commit; it is mandatory for all code that enters this repo.

@@ -1,4 +1,4 @@
{
"python.formatting.provider": "black",
Collaborator


Sorry, but we require all contributors to use the same pre-commit rules.

Collaborator Author


Just updated, please review.

Collaborator


I still see the provider as autopep8, when it should be black. Do you maybe have a local commit that you didn't push yet?

@andreaskoepf andreaskoepf self-requested a review January 1, 2023 11:50
Collaborator

@andreaskoepf andreaskoepf left a comment


Overall nice training code! Thanks a lot, also for instantly responding to change requests on Discord.

model/reward/instructor/requirements.txt (outdated, resolved)
from rank_datasets import DataCollatorForPairRank, HFSummary, WebGPT
from torch.utils.data import DataLoader
from transformers import AutoTokenizer

Collaborator


Very useful file. A short docstring at the beginning would be nice to explain its purpose and how it is used during development (e.g. dataloader test, batch-shape inspection).
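A minimal sketch of what such a module docstring could look like, shown on a dependency-free stand-in script. The docstring wording and the helper names `collate_pairs` and `batch_shape` are hypothetical and are not taken from the PR; the real file imports `DataCollatorForPairRank`, `HFSummary` and `WebGPT` from `rank_datasets` instead.

```python
"""Dev utility: smoke-test the pair-ranking data pipeline.

Purpose (hypothetical wording, adjust to the real script):
  * build a few (better, worse) token-id pairs,
  * run them through a collate function,
  * print the resulting batch shape so padding bugs show up early.

Run directly during development; it is not part of the training entry point.
"""


def collate_pairs(examples, pad_id=0):
    """Pad tokenized (better, worse) pairs to one common length.

    Stand-in for a real collator; token ids are plain lists of ints.
    """
    # Flatten pairs so that rows alternate: better, worse, better, worse, ...
    rows = [seq for better, worse in examples for seq in (better, worse)]
    width = max(len(row) for row in rows)
    return [row + [pad_id] * (width - len(row)) for row in rows]


def batch_shape(batch):
    """Return (rows, columns) of a rectangular list-of-lists batch."""
    return (len(batch), len(batch[0]) if batch else 0)


if __name__ == "__main__":
    toy = [([1, 2, 3], [1, 2]), ([4], [4, 5, 6, 7])]
    print("batch shape:", batch_shape(collate_pairs(toy)))
```

Keeping the purpose and usage in the docstring means a reader opening the file knows right away that it is a development aid rather than library code.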

Collaborator

@yk yk left a comment


just reset the formatting provider in settings.json to black, otherwise LGTM, thank you very much!


@theblackcat102
Collaborator Author

@yk yeah, that was my mistake. I just reset the format setting.

@yk yk merged commit 29c6491 into LAION-AI:main Jan 1, 2023

4 participants