Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 29807, sinusoidal positional encodings overwritten by post_init() #29813

Merged
merged 6 commits into from
Mar 27, 2024

Conversation

hovnatan
Copy link
Contributor

@hovnatan hovnatan commented Mar 22, 2024

What does this PR do?

Fixes #29807, sinusoidal positional encodings overwritten by post_init(). First time contributing, please let me know of any issues, comments.

Before submitting

Who can review?

@ArthurZucker, @younesbelkada, @amyeroberts

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! IMO we should rather place

        if config.sinusoidal_pos_embds:
            create_sinusoidal_embeddings(
                n_pos=config.max_position_embeddings, dim=config.dim, out=self.position_embeddings.weight
            )

in the _init_weights function.
Also this should be made BC as older version did not have the requirement on _init_weights being the only mode of initializing the weights

@hovnatan
Copy link
Contributor Author

hovnatan commented Mar 25, 2024

@ArthurZucker _init_weights() is applied to every module in the model. I moved sinusoidal positional encoding generation after post_init() in my latest commit. Is that ok?

@ArthurZucker
Copy link
Collaborator

It does but you can check if the name is "position_embeddings"! 🤗

@hovnatan
Copy link
Contributor Author

AFAIK post_init() uses pytorch Module's apply(init_func) functionality which doesn't pass name of the module to the init_func(), see https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.apply.

@hovnatan
Copy link
Contributor Author

Anyways, please look at my latest changes. They solve the problem, but not sure if this is optimal

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect! Thanks for iterating

@ArthurZucker ArthurZucker merged commit a81cf9e into huggingface:main Mar 27, 2024
17 checks passed
itazap pushed a commit that referenced this pull request May 14, 2024
…#29813)

* Check for requires_grad when initing weights

* Add unit test

* Move sinusoidal positional encoding generation after post_init()

* Add modules to skip init list

* Move create_sinusoidal_embeddings to _init_weights
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sinusoidal positional encodings overwritten by post_init()
2 participants