Add documentation and error messages for w_init and w_init_scale to avoid confusion #541

Merged: 1 commit, Oct 18, 2022

@nlsfnr nlsfnr commented Oct 13, 2022

As reported in #535, the `w_init` and `w_init_scale` arguments to `MultiHeadAttention` are causing confusion. The cause is that `w_init` has a placeholder default value of `None`, which preserves backwards compatibility until `w_init_scale` is fully deprecated.

This PR adds documentation and extends an error message to make the situation clearer.
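For readers unfamiliar with the pattern, here is a minimal, library-free sketch of how a placeholder `None` default can coexist with a deprecated argument. It is not the code from this PR: `resolve_w_init`, `variance_scaling`, and the error wording are hypothetical stand-ins chosen for illustration, and the real checks in `MultiHeadAttention` may differ.

```python
import warnings
from typing import Callable, List, Optional

# Hypothetical initializer type: takes a size, returns that many weights.
Initializer = Callable[[int], List[float]]


def variance_scaling(scale: float) -> Initializer:
    """Hypothetical stand-in for a library initializer factory."""
    def init(n: int) -> List[float]:
        return [scale] * n
    return init


def resolve_w_init(
    w_init: Optional[Initializer] = None,
    w_init_scale: Optional[float] = None,
) -> Initializer:
    """Pick an initializer from the new and the deprecated argument.

    `w_init` defaults to None only as a placeholder, so that callers who
    still pass `w_init_scale` keep working during the deprecation period.
    """
    if w_init is not None and w_init_scale is not None:
        raise ValueError("Provide only `w_init`, not `w_init_scale`.")
    if w_init is None and w_init_scale is None:
        raise ValueError(
            "Provide a weight initializer via `w_init`. It will become "
            "mandatory once `w_init_scale` is fully deprecated."
        )
    if w_init is None:
        # Deprecated path: build an initializer from the legacy scale.
        warnings.warn(
            "`w_init_scale` is deprecated; pass `w_init` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return variance_scaling(w_init_scale)
    return w_init
```

Under these assumptions, passing neither argument fails loudly with a message that explains why `w_init` looks optional in the signature but is not, which is the kind of confusion the PR description refers to.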

copybara-service bot merged commit 660cdd4 into google-deepmind:main on Oct 18, 2022