Fix small bugs in hpopt #873
Conversation
- Reduced the search space for the init and final lr ratios and max_lr.
- Fixed the warmup epochs to be an integer (and not None).
- Fixed the batch size to be an integer (powers of 2).
- Replaced quniform with qrandint to ensure integer values for aggregation_norm, depth, ffn_hidden_dim, ffn_num_layers, and message_hidden_dim.
Thanks for the PR!
Couple small questions/suggestions. Thanks for starting this!
Thanks for making this PR.
I have addressed all the comments. Please re-review and merge!
…ch space accordingly
Made some more fixes re: warmup epochs. Open for re-review.
LGTM
All comments addressed. Open for re-review and merge.
@akshatzalte one last small thing, then it's good to go.
Description
Hyperparameter optimization was producing very small (~10^-10) learning rates, which seemed wrong to people in the devs meeting (5/20/2024). This was because the search space was too vast.
Also, the batch size was not currently enforced to be an integer.
Reduction of search space and fixing the values in hpopt.
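To illustrate the scale of the problem with hypothetical numbers (not the actual sampled values): when max_lr and the init lr ratio are each drawn from ranges spanning many orders of magnitude, their product can easily collapse to ~10^-10.

```python
# Hypothetical values for illustration only:
max_lr = 1e-3          # a plausible sampled max learning rate
init_lr_ratio = 1e-7   # a ratio sampled from an overly wide log-uniform range

# init_lr is derived as max_lr * ratio, so a tiny ratio gives a tiny lr.
init_lr = max_lr * init_lr_ratio
print(init_lr)  # on the order of 1e-10, far too small to train with
```

Narrowing the ratio ranges keeps the derived learning rates in a usable region.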
Example / Current workflow
In the current workflow, quniform outputs a float value, which is incorrect for parameters that must be integers.
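The difference can be mimicked with a minimal stdlib sketch (these are rough stand-ins for Ray Tune's samplers, not their actual implementations; the function names only mirror tune.quniform and tune.qrandint):

```python
import random

def quniform(low, high, q):
    """Stand-in for tune.quniform: sample uniformly, round to a multiple
    of q. Note the result is a float (e.g. 3.0), not an int."""
    return round(random.uniform(low, high) / q) * q

def qrandint(low, high, q=1):
    """Stand-in for tune.qrandint: the sampled value is an integer
    multiple of q, so it is safe to use directly as e.g. a layer count."""
    return random.randint(low // q, high // q) * q

depth_float = quniform(2, 6, 1.0)  # float such as 3.0 -- breaks int-only params
depth_int = qrandint(2, 6)         # always an int
```

This is why the PR swaps quniform for qrandint on the integer-valued parameters.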
For the batch size, tune.choice() is used to simplify it and enforce that only 16, 32, ..., 256 can be selected.
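A minimal sketch of constraining the batch size to powers of two; with Ray Tune, such a list would typically be passed to tune.choice(...):

```python
# Powers of two from 16 to 256 -- the only batch sizes the search may pick.
batch_size_choices = [2 ** i for i in range(4, 9)]
print(batch_size_choices)  # [16, 32, 64, 128, 256]

# In a Ray Tune search space this would appear as, e.g.:
#   "batch_size": tune.choice(batch_size_choices)
```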
Bugfix / Desired workflow
- Reduced the search space for the init and final lr ratios and max_lr.
- Fixed the warmup epochs to be an integer (and not None).
- Fixed the batch size to be an integer (powers of 2).
- Replaced quniform with qrandint to ensure integer values for aggregation_norm, depth, ffn_hidden_dim, ffn_num_layers, and message_hidden_dim.
Questions
None.
Relevant issues
#851
Checklist