
CUDA out of memory. What can I do to improve model performance? #6

Open
mrxiaohe opened this issue Mar 17, 2019 · 3 comments
@mrxiaohe

I have a Tesla GPU with only 16 GB of memory -- much less than what you used for the experiment described in the Medium article. As a result, I had to reduce the max sequence length from 512 to 128 and the batch size from 32 to 16. After 4 epochs, the validation accuracies for the various toxic comment categories were around 0.6 to 0.65. I wonder if increasing the number of epochs would help improve performance.

In addition, is there a way to continue training a model? Say after 4 epochs, if the validation results are not good, can I continue training rather than restart with a larger number of epochs? Is it sufficient to just rerun `fit()`?

Thanks!
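Whether rerunning `fit()` resumes training depends on the library, but in Keras-style APIs a repeated `fit()` call continues from the model's current weights as long as the model object isn't rebuilt. A minimal sketch of that behavior, using a hypothetical toy trainer (not the repo's actual BERT code):

```python
# Toy stand-in trainer: fit() updates the weight in place, so calling it
# again continues from the current state rather than restarting from scratch.
class TinyTrainer:
    def __init__(self):
        self.w = 0.0  # single parameter; the optimum is w = 3.0

    def loss(self):
        return (self.w - 3.0) ** 2

    def fit(self, epochs, lr=0.1):
        # plain gradient descent on the squared-error loss
        for _ in range(epochs):
            grad = 2 * (self.w - 3.0)
            self.w -= lr * grad
        return self.loss()

trainer = TinyTrainer()
first = trainer.fit(epochs=4)   # initial run
second = trainer.fit(epochs=4)  # rerunning fit() resumes; loss keeps falling
assert second < first
```

The same pattern applies to real frameworks: as long as the model and optimizer state are kept alive between calls, another `fit()` picks up where the last one left off (though a learning-rate schedule, if any, may restart).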

@ghost

ghost commented Mar 17, 2019

Are you using BERT-large or BERT-base? With BERT-base, you should get very good results with a seq len of 256 and a batch size of 16 (I did, anyway...).

Google's recommended seq-len/batch-size combinations are listed at https://github.com/google-research/bert#out-of-memory-issues.
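Beyond shrinking the sequence length or batch size, gradient accumulation is a common way to keep a large effective batch on limited GPU memory: gradients from several micro-batches are summed before a single optimizer step. A hedged NumPy sketch (an illustration, not code from this repo) showing that the accumulated gradient equals the full-batch one:

```python
import numpy as np

# Simulate an effective batch of 32 on hardware that only fits
# micro-batches of 16, for a linear least-squares model.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0])

w = np.zeros(4)
micro = 16
accum_grad = np.zeros_like(w)
for start in range(0, len(X), micro):
    xb, yb = X[start:start + micro], y[start:start + micro]
    pred = xb @ w
    # scale each micro-batch gradient by the FULL batch size, then sum
    accum_grad += xb.T @ (pred - yb) / len(X)

# Reference: one gradient over the whole batch at once
full_grad = X.T @ (X @ w - y) / len(X)
assert np.allclose(accum_grad, full_grad)
```

In PyTorch the same idea is usually expressed by calling `backward()` on each micro-batch (gradients accumulate in `.grad` by default) and only calling `optimizer.step()` and `optimizer.zero_grad()` once per effective batch.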

@mrxiaohe
Author

I am using BERT-large uncased. Did you get your results after only 4 epochs?

@mrxiaohe
Author

@tombriles I changed the model from large to base (uncased), and now a max seq len of 256 no longer causes the out-of-memory error (it did before with the large model). I will report back on performance once training is done!
