Closed
Labels: stat:contributions welcome (feature request issues, kept separate from bug reports), type:feature (new feature or request)
Description
From the BERT paper...
> For fine-tuning, most model hyperparameters are the same as in pre-training, with the exception of the batch size, learning rate, and number of training epochs. The dropout probability was always kept at 0.1. The optimal hyperparameter values are task-specific, but we found the following range of possible values to work well across all tasks:
> - Batch size: 16, 32
> - Learning rate (Adam): 5e-5, 3e-5, 2e-5
> - Number of epochs: 2, 3, 4
We should allow our BERT finetuning script to do this search automatically. KerasTuner is a good fit for this.
Steps:
- Add a `keras-tuner` dependency to the `examples` extras in `setup.py`.
- Remove the epochs, batch size, and learning rate arguments from `run_glue_finetuning.py`.
- Use KerasTuner to run a hyperparameter search over the above value ranges, selecting on the validation set.