Why is learning rate in R50 logs half of the config lr? #53

Closed
NaomiEX opened this issue Feb 5, 2024 · 2 comments

Comments


NaomiEX commented Feb 5, 2024

When looking at your R50 logs and configs, I noticed that the lr in the logs is 3e-4 after warmup:
image

whereas the config sets the lr to 6e-4:
image

Is this the intended behaviour? This doesn't show up with the R101 config, and I was wondering why it behaves like this.

Author

NaomiEX commented Feb 5, 2024

Never mind, it turns out that the TextLoggerHook prints only the lr of the first param group, which happens to be the img_backbone in this case, and its lr is halved.

As a suggestion to the authors: you could change the order of the param groups, for example by creating self.head or self.neck before self.img_backbone in sparse4d.py, to avoid this confusion. I spent an hour trying to figure out why my lr was getting halved.
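
To illustrate the behaviour described in this comment, here is a minimal sketch in plain Python. It does not call mmcv or PyTorch: the list of dicts only imitates the shape of an optimizer's param groups, the helper names `build_param_groups` and `logged_lr` are hypothetical, and the 0.5 multiplier is an assumption inferred from the halved value in the logs.

```python
# Sketch only: plain dicts stand in for the optimizer's param groups.
BASE_LR = 6e-4          # lr from the config, as quoted in the issue
BACKBONE_LR_MULT = 0.5  # assumed multiplier, inferred from the halved lr in the logs


def build_param_groups(module_order, lr_mult):
    """Return one group per module, in the order the modules were created."""
    return [
        {"name": name, "lr": BASE_LR * lr_mult.get(name, 1.0)}
        for name in module_order
    ]


def logged_lr(param_groups):
    """Mimic a logger that reports only the first group's lr."""
    return param_groups[0]["lr"]


groups = build_param_groups(
    ["img_backbone", "neck", "head"],  # img_backbone is created first
    {"img_backbone": BACKBONE_LR_MULT},
)

print("logged lr:", logged_lr(groups))
print("all lrs:  ", {g["name"]: g["lr"] for g in groups})
```

In this sketch the logged value is the backbone's scaled lr, while the neck and head still train at the configured 6e-4. The real optimizer typically holds one group per parameter rather than per module, so this is a simplification.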

@NaomiEX NaomiEX closed this as completed Feb 5, 2024
Owner

linxuewu commented Feb 5, 2024

> Nevermind, it turns out that the TextLoggerHook prints out the lr of the first param group, which happens to be the img_backbone in this case, and its lr is halved.
>
> Although as a suggestion to the authors, you could change the order of param groups, like create self.head or self.neck first in sparse4d.py instead of self.img_backbone to avoid this confusion as I spent an hour trying to figure out why my lr was getting halved

This is a design flaw in mmcv. Changing the model order cannot fundamentally solve the problem, for example when an lr scale is also set for the head or neck.
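
The point about reordering can be sketched in plain Python as well. The example below is an illustration under assumptions, not mmcv code: the head multiplier of 0.25 is invented for the example, and `first_group_lr` and `lr_summary` are hypothetical helpers. It shows that once more than one module has an lr scale, the first group's lr can differ from the configured lr whichever module comes first, whereas a per-module summary does not depend on order.

```python
BASE_LR = 6e-4
# Hypothetical multipliers: the head value is invented for illustration.
LR_MULT = {"img_backbone": 0.5, "head": 0.25}


def group_lrs(module_order):
    """Return (name, lr) pairs in module creation order."""
    return [(name, BASE_LR * LR_MULT.get(name, 1.0)) for name in module_order]


def first_group_lr(module_order):
    """What a logger reading only the first group would report."""
    return group_lrs(module_order)[0][1]


def lr_summary(module_order):
    """Order-independent view: lr per module, sorted by module name."""
    return dict(sorted(group_lrs(module_order)))


backbone_first = ["img_backbone", "neck", "head"]
head_first = ["head", "neck", "img_backbone"]

# Neither ordering reports the configured lr of 6e-4.
print("backbone first:", first_group_lr(backbone_first))
print("head first:    ", first_group_lr(head_first))

# The summary is identical for both orderings.
print(lr_summary(backbone_first) == lr_summary(head_first))
```

Reporting every distinct lr, or one value per module, is one possible way to make such logs unambiguous. Whether and how to do that inside the training framework is outside the scope of this sketch.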
