Set layers that are frozen to eval mode in BERT during training #3419
I was wondering if there is a way to switch the dropout and layer-norm layers in BERT to eval mode during training when we set the requires_grad parameter to False for PretrainedBert here -
The problem I see is that the allennlp trainer loop calls model.train() on the whole model, which turns BERT back to train mode even if I modify the initialization code above to set it to eval mode.
My current workaround is to set those layers to eval mode inside the model's forward call.
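The forward-call workaround could look roughly like this (a minimal PyTorch sketch; `MyModel` and the stand-in `bert` module are placeholders, not the actual code from the issue):

```python
import torch
import torch.nn as nn


class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Stand-in for a frozen pretrained BERT encoder.
        self.bert = nn.Sequential(nn.Linear(4, 4), nn.Dropout(0.5))
        for p in self.bert.parameters():
            p.requires_grad = False
        self.classifier = nn.Linear(4, 2)

    def forward(self, x):
        # Workaround: force the frozen encoder back to eval mode on
        # every forward pass, because the trainer calls model.train()
        # on the whole model before each epoch.
        self.bert.eval()
        return self.classifier(self.bert(x))


m = MyModel()
m.train()           # trainer puts everything in train mode
out = m(torch.ones(2, 4))
```

After the forward pass, `m.bert.training` is False (dropout disabled) while the rest of the model stays in train mode — but this has to run on every call, which is what makes it feel hacky.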
Putting in my comment from the discourse thread:
Hmm, sounds like we’d want to override model.train() to handle this properly. It also sounds a bit messy to get correct for everything, but if you can think of a clean solution, I think this is definitely a problem that we’d want to fix in the library. Feel free to open an issue about this in the repo, and I’ll mark it as “contributions welcome”.
You should be able to override train() on your own model class, also. That would be a good way to test this to see if it’s possible to do it in a clean way that will generalize to other models. If you can, then a PR to add it to the base model class would be lovely.
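A sketch of that `train()` override on your own model class might look like the following (assumed names — `FrozenBertModel`, `bert`, `classifier` — are illustrative, not from the library):

```python
import torch.nn as nn


class FrozenBertModel(nn.Module):
    def __init__(self, bert: nn.Module, classifier: nn.Module):
        super().__init__()
        self.bert = bert
        self.classifier = classifier
        # Freeze the encoder's weights.
        for p in self.bert.parameters():
            p.requires_grad = False

    def train(self, mode: bool = True):
        # Let the trainer toggle train/eval mode as usual...
        super().train(mode)
        # ...then force the frozen encoder back to eval mode, so its
        # dropout layers stay disabled even during training.
        self.bert.eval()
        return self


model = FrozenBertModel(
    bert=nn.Sequential(nn.Linear(4, 4), nn.Dropout(0.5)),
    classifier=nn.Linear(4, 2),
)
model.train()  # trainer's call; encoder still ends up in eval mode
```

Because `nn.Module.train()` is the single entry point the trainer uses, overriding it once keeps the fix out of `forward()` entirely.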