Add RLHF Reward Trainer and Loss #3435

asdataminer · 2023-06-08T01:35:07Z

Code Pull Requests

Please provide the following:

a clear explanation of what your code does
if applicable, a reference to an issue
a reproducible test for your PR (code, config and data sample)

Documentation Pull Requests

Note that the documentation HTML files are in docs/ while the Markdown sources are in mkdocs/docs.

If you are proposing a modification to the documentation you should change only the Markdown files.

api.md is automatically generated from the docstrings in the code, so if you want to change something in that file, first modify the docstring in ludwig/api.py, then run mkdocs/code_docs_autogen.py, which will regenerate mkdocs/docs/api.md.

github-actions · 2023-06-08T01:46:03Z

Unit Test Results

6 files ±0 · 6 suites ±0 · ⏱️ 42m 52s (-36m 46s)
2 779 tests (+2 746): 2 718 ✔️ (+2 689), 9 💤 (+5), 52 ❌ (+52)
8 343 runs (+8 244): 8 154 ✔️ (+8 067), 33 💤 (+21), 156 ❌ (+156)

For more details on these failures, see this check.

Results for commit 439ec2a. ± Comparison against base commit 9112470.

This pull request removes 33 and adds 2779 tests. Note that renamed tests count towards both.

Removed tests:
tests.integration_tests.test_cli ‑ test_reproducible_cli_runs[horovod-experiment-1919-0]
tests.integration_tests.test_cli ‑ test_reproducible_cli_runs[horovod-experiment-1919-1]
tests.integration_tests.test_cli ‑ test_reproducible_cli_runs[horovod-experiment-31-0]
tests.integration_tests.test_cli ‑ test_reproducible_cli_runs[horovod-experiment-31-1]
tests.integration_tests.test_cli ‑ test_reproducible_cli_runs[horovod-train-1919-0]
tests.integration_tests.test_cli ‑ test_reproducible_cli_runs[horovod-train-1919-1]
tests.integration_tests.test_cli ‑ test_reproducible_cli_runs[horovod-train-31-0]
tests.integration_tests.test_cli ‑ test_reproducible_cli_runs[horovod-train-31-1]
tests.integration_tests.test_cli ‑ test_train_cli_horovod
tests.integration_tests.test_experiment ‑ test_experiment_model_resume_distributed[horovod]
…

Added tests:
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_image_augmentation[augmentation_pipeline_ops0]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_image_augmentation[augmentation_pipeline_ops1]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_image_augmentation[augmentation_pipeline_ops2]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[None]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[augmentation_pipeline_ops1]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[augmentation_pipeline_ops2]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[augmentation_pipeline_ops4]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_invalid_augmentation_parameters[random_horizontal_flip]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_load_model_with_augmentation_pipeline
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_local_model_training_with_augmentation_pipeline[preprocessing0-encoder0-False]
…

This pull request removes 4 skipped tests and adds 9 skipped tests. Note that renamed tests count towards both.

Removed skipped tests:
tests.integration_tests.test_horovod ‑ test_horovod_gpu_memory_limit
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[ames_housing.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[mercedes_benz_greener.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[sarcos.ecd.yaml]

Added skipped tests:
tests.ludwig.automl.test_base_config
tests.ludwig.automl.test_utils
tests.ludwig.backend.test_ray
tests.ludwig.benchmarking.test_profiler
tests.ludwig.data.test_ray_data
tests.ludwig.models.test_training_determinism ‑ test_training_determinism_ray_backend
tests.ludwig.utils.test_fs_utils ‑ test_get_fs_and_path_invalid_windows
tests.ludwig.utils.test_hyperopt_ray_utils ‑ test_grid_strategy[test_1]
tests.ludwig.utils.test_hyperopt_ray_utils ‑ test_grid_strategy[test_2]

♻️ This comment has been updated with latest results.

asdataminer added 6 commits June 5, 2023 12:07

Make preprocessing modifications V1

2bc1d01

Add dataset validation

d6cc331

Small edit

ea07940

Add tests

e993d7b

Small edits

98db29b

Another small edit

518feca

asdataminer added 6 commits June 11, 2023 20:04

Modify processing strategy

8ac5101

Small edit

b61a174

Another small edit

e78152d

Small edit

440cbec

Add loss items

b12f8ac

Add trainer

936194d

asdataminer force-pushed the rlhf_reward_loss branch from 1f51206 to 936194d Compare June 13, 2023 13:23

asdataminer added 2 commits June 13, 2023 06:41

Modify reward model trainer

2b74e7b

Small edits

becf832

asdataminer force-pushed the rlhf_reward_loss branch from 446cf0f to becf832 Compare June 13, 2023 13:43

asdataminer added 2 commits June 13, 2023 08:14

Add trainer, data edits

7d4243f

Add schema changes

316e2bf

asdataminer force-pushed the rlhf_reward_loss branch from 3279c22 to 316e2bf Compare June 14, 2023 20:43

Add refactored processing logic and trainer

dd675d6

asdataminer force-pushed the rlhf_reward_loss branch from 9b64a1a to dd675d6 Compare June 14, 2023 23:34

Style edits

cae42ad

asdataminer force-pushed the rlhf_reward_loss branch from b5d61b9 to cae42ad Compare June 14, 2023 23:40

Modify tests

a0808cf

asdataminer force-pushed the rlhf_reward_loss branch from 336b52b to a0808cf Compare June 14, 2023 23:43

More test edits

9b46959

asdataminer force-pushed the rlhf_reward_loss branch from b9b379e to 9b46959 Compare June 15, 2023 00:00

asdataminer added 2 commits June 14, 2023 18:36

Make reward model a separate model type

cc40f7c

Additional refactor edits

a93a1b8

asdataminer force-pushed the rlhf_reward_loss branch from 132af98 to a93a1b8 Compare June 15, 2023 02:01

Style edits

e311741

asdataminer force-pushed the rlhf_reward_loss branch from f6c84b1 to e311741 Compare June 15, 2023 02:07

Add text encoder

1662827

asdataminer force-pushed the rlhf_reward_loss branch from 55e8d2b to 1662827 Compare June 15, 2023 02:13

Small edits

4860265

asdataminer force-pushed the rlhf_reward_loss branch from 0750f2d to 4860265 Compare June 15, 2023 02:37

asdataminer added 2 commits June 14, 2023 21:04

Modify trainer, tests passing

9cf18d6

Bug fix

82bbffe

asdataminer force-pushed the rlhf_reward_loss branch from 439ec2a to 82bbffe Compare June 20, 2023 14:26

asdataminer added 2 commits June 20, 2023 07:53

Small edit

3675f42

Another edit

4f852ea

asdataminer force-pushed the rlhf_reward_loss branch from c2527f0 to 4f852ea Compare June 20, 2023 15:04

Reward loss test

b6ef5d1

mhabedank closed this Oct 21, 2024
