
lr_warmup should not be passed when adafactor is used as the optimizer #617

Closed
martianunlimited opened this issue Apr 13, 2023 · 3 comments

Comments

@martianunlimited

It's not really a major bug, more of an inconvenience: the GUI passes lr_warmup to the training command whenever it is non-zero, which causes library/train_util.py to raise a ValueError if the optimizer is Adafactor. The error is raised midway through the run, after latents are cached and just before the dataloader is created, rather than at the start of train_network. This wastes time and makes it harder for users who are not used to reading stack traces to pick out what caused the failure.

ValueError: adafactor:0.0001 does not require num_warmup_steps. Set None or 0.
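For context, the guard in library/train_util.py behaves roughly like the sketch below. This is an illustration of the failure mode, not the actual library source; the function and parameter names are assumptions:

```python
# Rough illustration (not the actual library/train_util.py code) of the
# guard that produces the ValueError above. The names check_scheduler_args,
# optimizer_type, lr, and num_warmup_steps are assumptions.
def check_scheduler_args(optimizer_type: str, lr: float, num_warmup_steps: int):
    # Adafactor manages its own learning-rate schedule, so a warmup step
    # count is rejected outright rather than silently ignored.
    if optimizer_type.lower() == "adafactor" and num_warmup_steps:
        raise ValueError(
            f"{optimizer_type}:{lr} does not require num_warmup_steps. Set None or 0."
        )
```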

Suggested fixes, in order of preference:
a) Pop up a warning in the GUI if the user did not set lr_warmup to 0 when the optimizer is set to Adafactor (recommended; see the sketch after this list)
or
b) Raise an error in train_network.py when an invalid combination of optimizer and lr_warmup is used.
c) Change train_util.py to raise a warning and ignore the lr_warmup value (not recommended)
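A minimal sketch of option a), assuming a plain-Python validation hook that runs before the GUI assembles the training command. The function name and the way the warning is surfaced are illustrative, not the actual kohya_ss helpers:

```python
# Hypothetical GUI-side validation for option a); not the actual
# kohya_ss code. Intended to run before the training command is built.
def warn_on_invalid_lr_warmup(optimizer: str, lr_warmup: float) -> bool:
    """Return True if training may proceed, False if the user must fix settings."""
    if optimizer.lower() == "adafactor" and lr_warmup != 0:
        # Surface the problem up front instead of letting train_util.py
        # raise a ValueError after latents are already cached.
        print(
            "Warning: Adafactor does not use num_warmup_steps. "
            "Set 'LR warmup (% of steps)' to 0 before starting training."
        )
        return False
    return True
```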

Unless there are differing opinions as to why a) and b) are not the way to go, I will go ahead and send a pull request for a) and b) over the weekend with said "fix".

@bmaltais
Owner

I can't change train_network.py because it is maintained by kohya in his repo. I can implement option a easily enough ;-)

@bmaltais
Owner

The dev branch now has the fix.

@bmaltais mentioned this issue Apr 18, 2023
@danielaixer

Does setting "LR warmup (% of steps)" to "0" act as a workaround?
