Reduce search space for AllenNLP example. #1542
Conversation
  optimizer = torch.optim.SGD(model.parameters(), lr=lr)

  data_loader = torch.utils.data.DataLoader(
-     train_dataset, batch_size=64, collate_fn=allennlp.data.allennlp_collate
+     train_dataset, batch_size=16, collate_fn=allennlp.data.allennlp_collate
I changed the batch size, as I found that a smaller batch size improved accuracy.
examples/allennlp/allennlp_simple.py
@@ -105,11 +100,11 @@ def objective(trial):
      if DEVICE > -1:
          model.to(torch.device("cuda:{}".format(DEVICE)))

-     lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
+     lr = trial.suggest_float("lr", 1e-2, 1e-1, log=True)
I increased the lower bound of the learning rate, as I found that model training didn't succeed at all with smaller values.
Updated in 74c8a23: lower bound changed from 1e-2 to 1e-3.
Sounds good 👍
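For reference, `log=True` in `trial.suggest_float` makes Optuna sample the learning rate log-uniformly over the range. A minimal standalone sketch of what log-uniform sampling over [1e-3, 1e-1] means (plain Python, not Optuna's implementation; the function name is illustrative):

```python
import math
import random

def sample_log_uniform(low, high, rng=random):
    """Draw a value log-uniformly from [low, high]: sample uniformly
    in log space, then exponentiate back to the original scale."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(0)
samples = [sample_log_uniform(1e-3, 1e-1, rng) for _ in range(1000)]

# Every draw stays inside the requested bounds.
assert all(1e-3 <= s <= 1e-1 for s in samples)

# Roughly half of the draws fall below the geometric midpoint (1e-2),
# whereas a plain uniform draw over [1e-3, 1e-1] would put ~90% above it.
below_mid = sum(s < 1e-2 for s in samples)
print(below_mid)
```

This is why a log scale suits learning rates: each order of magnitude gets equal probability mass.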
@@ -123,7 +118,7 @@ def objective(trial):
      validation_data_loader=validation_data_loader,
      validation_metric="+" + TARGET_METRIC,
      patience=None,  # `patience=None` since it could conflict with AllenNLPPruningCallback
-     num_epochs=50,
+     num_epochs=30,
I decreased the number of epochs to reduce the execution time of each trial.
Codecov Report

@@           Coverage Diff           @@
##           master    #1542   +/-   ##
=======================================
  Coverage   89.15%   89.15%
=======================================
  Files         104      104
  Lines        7891     7891
=======================================
  Hits         7035     7035
  Misses        856      856
=======================================

Continue to review the full report at Codecov.
Thank you for your PR. I confirmed that the trials finished within ten minutes, and it basically looks good to me except for a typo in the class name.
@DatasetReader.register("subsample")
class SubsampledDatasetReader(allennlp.data.dataset_readers.TextClassificationJsonReader):
This class is called SubsampleDatasetReader in allennlp_simple.py. Could you tell me which is correct?
- class SubsampledDatasetReader(allennlp.data.dataset_readers.TextClassificationJsonReader):
+ class SubsampleDatasetReader(allennlp.data.dataset_readers.TextClassificationJsonReader):
Thank you for the review.
Initially, I created SubSampledDataset, which describes a reader for a subsampled dataset. But I renamed it to SubsampleDataset, since I could describe this class as a sub-sample dataset. (And I thought it would be better to use only present-tense words in the filenames if possible.)
I'm not confident about this convention, so I'd like to get your feedback. Do you think SubsampledDatasetReader is better?
I think either one is OK, because I found both use cases.
SubsampledDatasetReader can be seen in the following issue: Limit dataset size allenai/allennlp#3099 (comment). (BTW, itertools.islice was employed in the code example in that issue. It can be useful if the dataset is too large to fit in memory.) subsample is also used in the variable name batch_subsample_map.
I prefer SubsampleDatasetReader for its simplicity, but it is not a strong opinion.
OK, so let me leave the dataset reader name as is.

Regarding "islice ... It can be useful if the dataset is too large to fit in memory": thank you for pointing that out, you're right! I gave islice a try in 4437f48 and found the dataset is quite balanced:

train: Counter({'0': 1003, '1': 997})
validation: Counter({'1': 489, '0': 511})
I also ran HPO on a GPU to confirm that training went well.
python allennlp_simple.py
....
[I 2020-07-22 13:38:58,896] Trial 21 finished with value: 0.704 and parameters: {'dropout': 0.18415084463564746, 'embedding_dim': 33, 'output_dim': 69, 'max_filter_size': 4, 'num_filters': 55, 'lr': 0.09093981266227305}. Best is trial 12 with value: 0.725.
Number of finished trials: 22
Best trial:
Value: 0.725
Params:
dropout: 0.31246648830863727
embedding_dim: 22
output_dim: 98
max_filter_size: 4
num_filters: 38
lr: 0.09838055114180111
So I decided to remove train_test_split and simply use islice.
Thanks for the PR! The changes look basically good to me. I have a minor comment.
LGTM!
LGTM! Thank you for your contribution.
Motivation
The current allennlp_simple.py may be computationally expensive, and it sometimes causes a timeout: https://github.com/optuna/optuna/actions/runs/175854473
Description of the changes
In this PR, I reduce the hyperparameter search space for the model.
Additionally, I extracted the dataset reader from the script into its own module, because I'll also use it in allennlp_jsonnet.py.
I tested this new example on GitHub Actions to confirm that it reduces the execution time of each trial: https://github.com/himkt/optuna/runs/894041267