
Online DPO scheduler step before optimizer step #1872

Closed
edbeeching opened this issue Jul 25, 2024 · 1 comment

Comments

@edbeeching
Collaborator

I see this warning in the test logs when testing online DPO:

tests/test_online_dpo_trainer.py::TestOnlineDPOTrainer::test_online_dpo_trainer_training
  C:\hostedtoolcache\windows\Python\3.10.11\x64\lib\site-packages\torch\optim\lr_scheduler.py:216: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate

So it seems that `optimizer.step()` and `lr_scheduler.step()` are being called in the wrong order.
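For reference, a minimal sketch of the ordering PyTorch expects in a training loop (placeholder model, optimizer, and data for illustration, not the actual Online DPO trainer code):

```python
import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)

for _ in range(3):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 4)).sum()
    loss.backward()
    optimizer.step()   # update parameters first
    scheduler.step()   # then advance the learning-rate schedule
```

Calling `scheduler.step()` before `optimizer.step()` triggers the warning above and causes the first value of the schedule to be skipped.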

@qgallouedec
Member

Fixed in #1918
