update in DPO raise several problems... #1440

holarissun · 2024-03-18T21:52:53Z

get_hh was removed, so there is no dataset now.
The import from trl.commands.cli_utils import DpoScriptArguments, init_zero_verbose, TrlParser fails with: No module named 'trl.commands'

lvwerra · 2024-03-20T08:53:34Z

Those changes are currently only on main. Did you install TRL from source?
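
For anyone hitting the same error, here is a minimal sketch of how to check whether the installed TRL build already ships the `trl.commands` module. The helper names `has_module` and `installed_version` are hypothetical (not part of TRL), and the sketch uses only the Python standard library.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def has_module(name: str) -> bool:
    """Return True if `name` can be located.

    Note: for a dotted name, find_spec imports the parent package(s),
    but not the target module itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package (e.g. `trl` itself) is missing or broken.
        return False


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("trl version:", installed_version("trl"))
    print("trl.commands available:", has_module("trl.commands"))
```

If the second line prints `False`, the installed release probably predates the change. Installing from source, for example with `pip install git+https://github.com/huggingface/trl.git`, should pick up whatever is currently on main.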

vwxyzjn · 2024-03-20T14:29:39Z

Would you like to give #1456 a try?

python examples/scripts/dpo.py \
    --dataset_name=trl-internal-testing/hh-rlhf-trl-style \
    --model_name_or_path=gpt2 \
    --per_device_train_batch_size 4 \
    --max_steps 1000 \
    --learning_rate 1e-3 \
    --gradient_accumulation_steps 1 \
    --logging_steps 10 \
    --eval_steps 500 \
    --output_dir="dpo_anthropic_hh" \
    --warmup_steps 150 \
    --report_to wandb \
    --bf16 \
    --logging_first_step \
    --no_remove_unused_columns

github-actions · 2024-04-18T15:04:59Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

github-actions bot closed this as completed Apr 26, 2024
