two more datasets #2301
Conversation
```python
    return train, validation


def load_open_ai_summarize_from_feedback():
```
Maybe it's better to just make the split manually here and in `load_open_ai_summarize_from_feedback`? I'm not sure what's best, but the default split seems unbalanced in terms of sizes...
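For illustration, a minimal sketch of what a manual split could look like, assuming the Hugging Face `datasets` API and the `openai/summarize_from_feedback` comparisons config; the exact loader and split ratio in the PR may differ:

```python
from datasets import concatenate_datasets, load_dataset

# Sketch of a manual split: merge the dataset's default splits, then cut a
# fixed-ratio train/validation split so the sizes are controlled by us
# rather than by whatever the default split happens to be.
dataset = load_dataset("openai/summarize_from_feedback", "comparisons")
merged = concatenate_datasets([dataset["train"], dataset["validation"]])
splits = merged.train_test_split(test_size=0.1, seed=42)
train, validation = splits["train"], splits["test"]
```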
```python
    assert split in ("train", "test")
    if sep_token is None:
        sep_token = " . "

    if mode == "rm":
```
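For context, a hedged sketch of what a loader branch like this typically does; the function and field names here are hypothetical, not the PR's actual code:

```python
# Hypothetical sketch: in "rm" (reward-model) mode, the prompt and each
# ranked answer are joined with sep_token into a single string.
def make_examples(question, answers, mode, split="train", sep_token=None):
    assert split in ("train", "test")
    if sep_token is None:
        sep_token = " . "
    if mode == "rm":
        # one "<question> . <answer>" string per ranked answer
        return [question + sep_token + a for a in answers]
    # other modes might keep raw (question, answer) pairs instead
    return [(question, a) for a in answers]
```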
These are changes in the `reward/instructor` folder; they are not used by the `trainer_rm` code. I suggest removing these changes.
Thanks!
Reverts unrelated changes to the `AnthropicRLHF` class in `model/reward/instructor/rank_datasets.py` introduced by #2301.