two more datasets #2301
Conversation
```python
    return train, validation


def load_open_ai_summarize_from_feedback():
```
Maybe it's better to just make the split manually here and in `load_open_ai_summarize_from_feedback`? I'm not sure what's best, but the default split seems unbalanced in terms of sizes...
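For illustration, a minimal sketch of what a manual split could look like, assuming the Hugging Face `datasets` API and the `openai/summarize_from_feedback` comparisons config; the exact loader and split ratio in the PR may differ:

```python
from datasets import concatenate_datasets, load_dataset

# Sketch of a manual split: merge the dataset's default splits, then cut a
# fixed-ratio train/validation split so the sizes are controlled by us
# rather than by whatever the default split happens to be.
dataset = load_dataset("openai/summarize_from_feedback", "comparisons")
merged = concatenate_datasets([dataset["train"], dataset["validation"]])
splits = merged.train_test_split(test_size=0.1, seed=42)
train, validation = splits["train"], splits["test"]
```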
```python
    assert split in ("train", "test")
    if sep_token is None:
        sep_token = " . "

    if mode == "rm":
```
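For context, a hedged sketch of what a loader branch like this typically does; the function and field names here are hypothetical, not the PR's actual code:

```python
# Hypothetical sketch: in "rm" (reward-model) mode, the prompt and each
# ranked answer are joined with sep_token into a single string.
def make_examples(question, answers, mode, split="train", sep_token=None):
    assert split in ("train", "test")
    if sep_token is None:
        sep_token = " . "
    if mode == "rm":
        # one "<question> . <answer>" string per ranked answer
        return [question + sep_token + a for a in answers]
    # other modes might keep raw (question, answer) pairs instead
    return [(question, a) for a in answers]
```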
These are changes in the `reward/instructor` folder; they are not used by the `trainer_rm` code. I suggest removing these changes.
Thanks!
Reverts unrelated changes to the `AnthropicRLHF` class in `model/reward/instructor/rank_datasets.py` introduced by #2301.