
[DPOTrainer] Fix DPO trainer + mistral + FA2 #1290

Merged
merged 1 commit into from
Jan 30, 2024

Conversation

younesbelkada
Contributor

@younesbelkada younesbelkada commented Jan 30, 2024

Fixes: #1217
Fixes: huggingface/transformers#26877
Fixes: #1266

Simply setting use_cache=False circumvents all issues with FA-2 + DPO + Mistral. In fact, we should bypass that check entirely, since we are not in text-generation mode when computing the loss. By default, use_cache is retrieved from the model config, which always falls back to True. Since the cache is never used when purely computing logits, this change is fully backward compatible.
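A minimal sketch of the fallback behavior described above (the class and function names here are illustrative, not the actual transformers/trl code): an explicit use_cache kwarg wins, otherwise the value falls back to the model config, which defaults to True — so a loss-only forward pass must pass use_cache=False explicitly to avoid building a KV cache.

```python
class Config:
    use_cache = True  # transformers configs default this to True


def forward(input_ids, config, use_cache=None):
    # Mirrors the transformers convention: an explicit kwarg wins,
    # otherwise the value falls back to the model config (True).
    use_cache = use_cache if use_cache is not None else config.use_cache
    logits = [[0.0] * 4 for _ in input_ids]  # placeholder logits
    past_key_values = [] if use_cache else None  # stand-in for the KV cache
    return logits, past_key_values


cfg = Config()
# Loss computation only needs logits; passing use_cache=False keeps the
# model from building a KV cache (the FA-2 + Mistral failure path).
logits, cache = forward([1, 2, 3], cfg, use_cache=False)
assert cache is None
```

Since generation never happens inside the loss computation, hardcoding use_cache=False changes nothing for existing users, which is why the description calls the change fully backward compatible.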

cc @kashif @vwxyzjn

To test this, I reproduced the issue by adding --attn_implementation "flash_attention_2" to the DPO shell script on an A100 machine, and I confirm this PR fixes it. Unfortunately, our CI runners are not compatible with FA2, so we cannot add a slow test for this.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@kashif
Collaborator

kashif commented Jan 30, 2024

Thank you, great help!

@kashif kashif merged commit b415224 into main Jan 30, 2024
9 checks passed
@kashif kashif deleted the fix-dpo-fa2 branch January 30, 2024 07:25
lapp0 pushed a commit to lapp0/trl that referenced this pull request May 10, 2024