Hello, and thanks @mszulc913 for bringing this up. It seems that when a huggingface model is loaded with from_pretrained, its training state defaults to False, and it stays that way during training_step. That's indeed not ideal and could have negative consequences, especially for components like Dropout in the pretrained model.
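A minimal sketch of why this matters, using plain torch (no transformers or Lightning assumed): a module left in eval mode silently disables Dropout, so training still runs, just without the regularization the user expects.

```python
import torch
from torch import nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
train_out = drop(x)  # roughly half the entries zeroed, the rest scaled by 2

drop.eval()
eval_out = drop(x)   # identity pass-through: nothing is zeroed

assert (eval_out == x).all()       # eval mode: Dropout is a no-op
assert (train_out == 0).any()      # train mode: Dropout actually drops
```

Nothing errors out in either mode, which is exactly what makes the bug silent.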
I'd prefer not to add a warning. It could confuse users who fine-tune models in which parts are frozen (and thus have to remain in eval mode); that was the main motivation of #18951. The model summary update (#19468) should give some more visibility, and it could also be made even more explicit there.
Thank you @awaelchli for the reply. Indeed, the updated model summary will help a lot, but I'm not sure it's enough. My main concern is that pre-trained models are ubiquitous today, and this behavior risks turning users away from PL because of a regression they don't have the time or will to investigate.
Description & Motivation
I feel like #18951 is not visible enough. The change made it super easy to have silent correctness bugs in the codebase when using libraries like huggingface.
Pitch
Add a warning about modules that are set to eval mode before training.
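To make the pitch concrete, here is a hypothetical sketch of such a check (this is not a real Lightning API; the helper name and message are made up): before training starts, list every submodule whose .training flag is False and emit a warning.

```python
import warnings
from torch import nn

def warn_if_eval_before_training(model: nn.Module) -> list:
    # Hypothetical helper sketching the pitched warning: collect named
    # submodules that are in eval mode (skipping the unnamed root).
    offenders = [name for name, mod in model.named_modules()
                 if name and not mod.training]
    if offenders:
        warnings.warn(
            f"These submodules are in eval mode before training starts: {offenders}. "
            "If this is unintentional, call model.train(); if they are "
            "deliberately frozen, this warning can be ignored."
        )
    return offenders

# Example: the Dropout submodule was left in eval mode.
model = nn.Sequential(nn.Linear(2, 2), nn.Dropout())
model[1].eval()
warn_if_eval_before_training(model)  # warns and returns ['1']
```

A check like this would also need an opt-out for the intentional-freezing case that motivated #18951 in the first place.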
Alternatives
Update the documentation and tutorials.
Additional context
No response
cc @Borda