Add a post init method to all models #14431

Merged
merged 6 commits into from Nov 18, 2021

sgugger
Collaborator

@sgugger sgugger commented Nov 17, 2021

What does this PR do?

This PR introduces the proper fix for #14388 by adding a new post_init method to each model, which replaces the current init_weights() call. The method can run any code that requires the model to be fully initialized, such as init_weights() or the gradient checkpointing BC fix (and more if needed in the future).
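
For readers who want to see the shape of the change, here is a minimal sketch of the pattern described above. It is not the code from this PR: `BaseModel`, `ToyModel`, the dict-based config, the `calls` list, and `_gradient_checkpointing_bc_fix` are hypothetical names made up for illustration. Only `post_init` and `init_weights` come from the description.

```python
class BaseModel:
    """Hypothetical stand-in for a shared model base class."""

    def __init__(self, config):
        self.config = config
        self.gradient_checkpointing = False
        self.calls = []  # records hook order, for illustration only

    def init_weights(self):
        # Subclasses override this to initialize their own parameters.
        self.calls.append("init_weights")

    def _gradient_checkpointing_bc_fix(self):
        # Hypothetical name: honor a legacy config flag once the model
        # has been fully constructed.
        if self.config.get("gradient_checkpointing", False):
            self.gradient_checkpointing = True
        self.calls.append("gradient_checkpointing_bc_fix")

    def post_init(self):
        # Single entry point for everything that needs a fully built model.
        self.init_weights()
        self._gradient_checkpointing_bc_fix()


class ToyModel(BaseModel):
    def __init__(self, config):
        super().__init__(config)
        self.weights = [None] * config["hidden_size"]
        # Before this pattern, the last line here would be self.init_weights().
        self.post_init()

    def init_weights(self):
        self.weights = [0.0 for _ in self.weights]
        super().init_weights()


model = ToyModel({"hidden_size": 3, "gradient_checkpointing": True})
```

In this sketch, each model's `__init__` ends with one `post_init()` call, so further end-of-construction steps can be added in the base class without touching every model.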

Contributor

@patrickvonplaten patrickvonplaten left a comment

LGTM!

I'm wondering whether keeping self.init_weights() in each model's __init__() method might be better for readability, as it's now harder for users to see where and how the model weights are initialized. But I don't feel strongly about it.

Member

@LysandreJik LysandreJik left a comment

This looks good to me and I don't think it hurts readability, but I don't feel strongly about it. If you'd rather the self.init_weights call stay in each modeling file, that's also fine by me @patrickvonplaten

@sgugger sgugger merged commit d83b0e0 into master Nov 18, 2021
@sgugger sgugger deleted the post_init branch November 18, 2021 13:38
@patil-suraj patil-suraj mentioned this pull request Jan 7, 2022
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
* Add a post init method to all models

* Fix tests

* Fix last tests

* Fix templates

* Add comment

* Forgot to save