[BUG] RMSNorm Implementation #180

Open
0seba opened this issue Mar 27, 2024 · 2 comments
Labels: bug, no-issue-activity

0seba commented Mar 27, 2024

I think you should be dividing by the scale, not multiplying by it, in the following line:

return normed * self.scale * self.gamma

This is the scale definition:

https://github.com/kyegomez/zeta/blob/7dbb6a62f83413977a922d5fc6dec1b11f734bc3/zeta/nn/modules/rms_norm.py#L29C9-L29C31

self.scale = dim**-0.5

And the RMSNorm formula:

[image: RMSNorm formula]
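
The formula image is not recoverable here. Assuming it showed the standard RMSNorm definition (the symbols $a_i$, $g_i$ and $n$ below are my notation, with $n$ playing the role of `dim` and $g$ the role of `self.gamma`), the relationship to an L2-normalized vector can be sketched as:

```latex
\bar{a}_i = \frac{a_i}{\operatorname{RMS}(\mathbf{a})}\, g_i,
\qquad
\operatorname{RMS}(\mathbf{a}) = \sqrt{\frac{1}{n}\sum_{j=1}^{n} a_j^{2}}
                               = \frac{\lVert \mathbf{a} \rVert_2}{\sqrt{n}}

\frac{a_i}{\operatorname{RMS}(\mathbf{a})}
  = \sqrt{n}\,\frac{a_i}{\lVert \mathbf{a} \rVert_2}
  = \frac{a_i / \lVert \mathbf{a} \rVert_2}{n^{-1/2}}
```

So if `normed` is the L2-normalized input and the scale is `dim**-0.5`, matching this formula means dividing `normed` by the scale (equivalently, multiplying by `dim**0.5`).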

Edit:

Also, I think the normalization should be over dim -1, not -2:

normed = F.normalize(x, dim=-2)
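
Both points can be checked numerically without PyTorch. This is a minimal sketch, assuming `F.normalize` divides by the L2 norm along the chosen axis (with a small `eps` floor) and that the input has shape `(..., dim)`, so each row below stands for one feature vector. The helper names are hypothetical and not part of zeta.

```python
import math


def l2_normalize(row, eps=1e-12):
    """Mimic F.normalize along one axis: row / max(||row||_2, eps)."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / max(norm, eps) for v in row]


def rms_norm_reference(row, gamma):
    """RMSNorm written straight from the formula: x_i / RMS(x) * gamma_i."""
    rms = math.sqrt(sum(v * v for v in row) / len(row))
    return [v / rms * g for v, g in zip(row, gamma)]


def rms_norm_divide_by_scale(row, gamma):
    """Proposed fix: L2-normalize over the feature axis, then divide by dim**-0.5."""
    scale = len(row) ** -0.5
    return [n / scale * g for n, g in zip(l2_normalize(row), gamma)]


def rms_norm_multiply_by_scale(row, gamma):
    """Behavior reported in the issue: multiply by dim**-0.5 instead."""
    scale = len(row) ** -0.5
    return [n * scale * g for n, g in zip(l2_normalize(row), gamma)]


# A (tokens, features) batch. Each row is normalized on its own,
# which is what dim=-1 means for a (..., dim) input.
x = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.5, 2.5, -3.5]]
gamma = [1.0, 1.0, 1.0, 1.0]

for row in x:
    ref = rms_norm_reference(row, gamma)
    fixed = rms_norm_divide_by_scale(row, gamma)
    reported = rms_norm_multiply_by_scale(row, gamma)
    # Dividing by the scale reproduces the formula.
    assert all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(ref, fixed))
    # Multiplying by the scale leaves the output too small by a factor of dim.
    assert all(
        math.isclose(a, b * len(row), rel_tol=1e-9) for a, b in zip(ref, reported)
    )
```

In module terms the fix would presumably look like `F.normalize(x, dim=-1) / self.scale * self.gamma`, though that exact line is my assumption rather than something taken from the repository.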

0seba added the bug label Mar 27, 2024

Hello there, thank you for opening an Issue ! 🙏🏻 The team was notified and they will get back to you asap.


Stale issue message

Development

No branches or pull requests

2 participants