
Set Layers that are frozen to eval mode in BERT during training #3419

Open
successar opened this issue Nov 2, 2019 · 2 comments

Comments


@successar successar commented Nov 2, 2019

Hi

I was wondering if there is a way to switch the dropout and layer-norm layers in BERT to eval mode during training when we set the requires_grad parameter to False for PretrainedBert here:

for param in model.parameters():
    param.requires_grad = requires_grad

The problem I see is that the allennlp trainer loop calls model.train() on the whole model, which switches BERT back to train mode even if I modify the initialization code above to set it to eval mode.

My current setup is to set these layers to eval mode during the forward call of the model (see the sketch below). Is there a better way?

https://discourse.allennlp.org/t/bert-set-to-eval-mode-when-requires-grad-false/103/2?u=sarthak_jain
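
Concretely, that forward-call workaround looks roughly like the following minimal sketch in plain PyTorch (the class name, the encoder attribute, and the classifier head are illustrative placeholders, not the actual model from the issue):

import torch
from torch import nn


class FrozenBertClassifier(nn.Module):
    """Sketch: keep a frozen encoder in eval mode during training."""

    def __init__(self, encoder: nn.Module, hidden_dim: int = 768, num_labels: int = 2):
        super().__init__()
        self.encoder = encoder  # e.g. the pretrained BERT module
        for param in self.encoder.parameters():
            param.requires_grad = False
        self.classifier = nn.Linear(hidden_dim, num_labels)

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # The trainer calls model.train() on the whole model, so push the
        # frozen encoder back to eval mode on every forward pass to keep
        # its dropout and layer-norm in inference behavior.
        self.encoder.eval()
        encoded = self.encoder(inputs)
        return self.classifier(encoded)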

@kernelmachine kernelmachine (Contributor) commented Nov 5, 2019

Based on the discourse thread, I'll set this to Contributions Welcome.

@matt-gardner matt-gardner (Member) commented Nov 8, 2019

Putting in my comment from the discourse thread:

Hmm, sounds like we’d want to override model.train() to handle this properly. It also sounds a bit messy to get correct for everything, but if you can think of a clean solution, I think this is definitely a problem that we’d want to fix in the library. Feel free to open an issue about this in the repo, and I’ll mark it as “contributions welcome”.

You should be able to override train() on your own model class, also. That would be a good way to test this to see if it’s possible to do it in a clean way that will generalize to other models. If you can, then a PR to add it to the base model class would be lovely.
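
For reference, overriding train() on a model class could look roughly like this minimal sketch in plain PyTorch (the class name and the encoder attribute are illustrative placeholders, not AllenNLP's actual API):

from torch import nn


class FrozenBertModel(nn.Module):
    """Sketch: override train() so a frozen sub-module stays in eval mode."""

    def __init__(self, encoder: nn.Module):
        super().__init__()
        self.encoder = encoder  # the frozen pretrained BERT module
        for param in self.encoder.parameters():
            param.requires_grad = False

    def train(self, mode: bool = True) -> "FrozenBertModel":
        # Let nn.Module.train() flip every sub-module first ...
        super().train(mode)
        # ... then force the frozen encoder back into eval mode so its
        # dropout and layer-norm behave as they do at inference time.
        self.encoder.eval()
        return self

If this turns out to generalize cleanly, the same override could live on the base model class, as suggested above.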
