
Fake DPO / KTO #599

Merged: 30 commits into main from psi/dpofakepairs, Feb 6, 2024
Conversation

@psinger (Collaborator) commented Jan 31, 2024

This PR adds the simple KTO loss.

It currently requires building pairs of accepted samples and random rejected samples outside of LLM Studio.
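For context, a paired KTO-style loss scores each accepted/rejected pair against detached batch-level KL reference points. The sketch below follows the general shape of KTO reference implementations; the class name matches the registry hunk later in this PR, but the hyperparameters and exact form here are assumptions, not the PR's verbatim code.

```python
import torch
import torch.nn as nn


class KTOPairLoss(nn.Module):
    """Sketch of a paired KTO-style loss (illustrative, not the repo's exact code)."""

    def __init__(self, beta: float = 0.1):
        super().__init__()
        self.beta = beta

    def forward(
        self,
        policy_chosen_logps: torch.Tensor,
        policy_rejected_logps: torch.Tensor,
        reference_chosen_logps: torch.Tensor,
        reference_rejected_logps: torch.Tensor,
    ) -> torch.Tensor:
        # Detached batch-level KL estimates act as reference points.
        chosen_kl = (policy_chosen_logps - reference_chosen_logps).mean().clamp(min=0).detach()
        rejected_kl = (policy_rejected_logps - reference_rejected_logps).mean().clamp(min=0).detach()

        chosen_logratios = policy_chosen_logps - reference_chosen_logps
        rejected_logratios = policy_rejected_logps - reference_rejected_logps

        # Push chosen samples above the rejected reference point and
        # rejected samples below the chosen reference point.
        losses = torch.cat(
            (
                1 - torch.sigmoid(self.beta * (chosen_logratios - rejected_kl)),
                1 - torch.sigmoid(self.beta * (chosen_kl - rejected_logratios)),
            ),
            0,
        )
        return losses.mean()
```

Because each term is `1 - sigmoid(...)`, the per-pair losses stay in (0, 1), and the loss only needs sequence-level log-probabilities for the policy and a frozen reference model.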

@psinger psinger marked this pull request as ready for review February 1, 2024 13:48
# merges the LoRA layers into the base model.
# This is needed if one wants to use the base model as a standalone model.
logger.info("Merging LoRA layers with base model.")
if device == "cpu":
@psinger (Collaborator, Author) commented:
This was a side fix I did in this PR; some models can't be merged on CPU in float16.
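The merge itself is just weight arithmetic: each LoRA adapter collapses into its base weight as W' = W + (B @ A) * (alpha / r). The toy example below illustrates that arithmetic with made-up names and shapes; doing this in float16 on CPU can fail because some half-precision ops are not implemented there, which is presumably why the code special-cases device == "cpu".

```python
import torch

# Toy LoRA merge: merged weight = W + (B @ A) * (alpha / r).
# All names and shapes are illustrative, not LLM Studio's code.
torch.manual_seed(0)
out_dim, in_dim, r, alpha = 8, 8, 4, 16
W = torch.randn(out_dim, in_dim)
A = torch.randn(r, in_dim)   # LoRA "down" projection
B = torch.randn(out_dim, r)  # LoRA "up" projection
scaling = alpha / r

merged = W + (B @ A) * scaling

# The merged weight reproduces base + adapter outputs exactly.
x = torch.randn(in_dim)
assert torch.allclose(merged @ x, W @ x + (B @ (A @ x)) * scaling, atol=1e-5)
```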

@@ -95,6 +142,7 @@ class Losses:
"DPOLoss": DPOLoss,
"HingeLoss": HingeLoss,
"IPOLoss": IPOLoss,
"KTOPairLoss": KTOPairLoss,
@maxjeblick (Contributor) commented:
I guess KTOPairLoss needs to be added to the LOSS_REDUCTION dict.
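The point here is that the repo keeps two parallel registries that can drift out of sync. A generic illustration of the failure mode (the dict names match the hunk above, but the values and reduction semantics are assumptions, not taken from the repo):

```python
# Generic sketch: two parallel registries that must stay in sync.
# Values and the meaning of the reduction flag are assumed here.
LOSSES = {"DPOLoss": object, "KTOPairLoss": object}  # name -> loss class
LOSS_REDUCTION = {"DPOLoss": True}  # name -> whether the loss reduces itself

# A simple consistency check flags the newly added loss as missing:
missing = set(LOSSES) - set(LOSS_REDUCTION)
assert missing == {"KTOPairLoss"}
```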

Mid-term it may make sense to add the get_batch_logps function directly to the loss calculation instead of using it in the model (and pass an output dict with logits + labels to the loss functions). But not high priority atm.
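For reference, a common shape of such a `get_batch_logps` helper in DPO-style trainers sums the log-probabilities of the label tokens while skipping ignored positions. This is a generic sketch, not the repository's exact function:

```python
import torch


def get_batch_logps(
    logits: torch.Tensor,   # (batch, seq_len, vocab)
    labels: torch.Tensor,   # (batch, seq_len)
    ignore_index: int = -100,
) -> torch.Tensor:
    """Sum label-token log-probs per sequence, masking ignored positions.

    Generic sketch in the style of DPO trainers, not the repo's code.
    """
    # Shift so that tokens < n predict token n.
    labels = labels[:, 1:].clone()
    logits = logits[:, :-1, :]
    mask = labels != ignore_index
    labels[~mask] = 0  # any valid index; contribution is masked out below
    per_token_logps = torch.gather(
        logits.log_softmax(-1), dim=2, index=labels.unsqueeze(2)
    ).squeeze(2)
    return (per_token_logps * mask).sum(-1)
```

Moving this into the loss (and passing logits + labels through an output dict) would keep the model code loss-agnostic, as the comment suggests.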

@psinger (Collaborator, Author) commented Feb 5, 2024:

@maxjeblick any idea why this mypy error happens here while it does not for DPOLoss?
https://github.com/h2oai/h2o-llmstudio/actions/runs/7783291836/job/21221448386?pr=599#step:5:32

@maxjeblick (Contributor) commented:
No, no idea, looks strange.

@psinger (Collaborator, Author) commented Feb 5, 2024:

> No, no idea, looks strange.

After spending an hour on it without managing to solve the issue (apart from manually casting everything), I decided to remove the return type annotation for both losses in order not to waste more time on it.

@maxjeblick (Contributor) left a review comment:

Thanks a lot, LGTM! Let's maybe add a note in the README that we added KTO loss (and how to use it currently).

@psinger psinger merged commit a7050b3 into main Feb 6, 2024
5 checks passed
@psinger psinger deleted the psi/dpofakepairs branch February 6, 2024 13:20
@psinger psinger mentioned this pull request Feb 14, 2024

3 participants