
Possible bug in gradient size of mul_backward_kernel()? #27

Open
prashanthr05 opened this issue Nov 17, 2022 · 0 comments

Comments

@prashanthr05

Hi guys,

First of all, congratulations on this great work and thank you for making the code open-source.

While going through the implementation details, I noticed that in the mul_backward_kernel() method the incoming gradient is read as

Grad dZ(grad + i*Group::N);

i.e. indexed into memory with stride N, which appears to be the embedding (representation) size. Shouldn't it instead be indexed with stride K, matching the Grad type

using Grad = Eigen::Matrix<scalar_t,1,Group::K>;

which is the dimension of the tangent space and is what satisfies the chain rule described in Eq. (14) of the paper?
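
For illustration, here is a minimal standalone sketch of what I mean (not the lietorch source; Group, N = 7 and K = 6 are hypothetical stand-ins for an SE3-like group). It only shows how the pointer offset interacts with the 1 x K Grad type under the two possible buffer layouts, without claiming which layout the actual grad buffer uses:

```cpp
// Standalone sketch, not the lietorch source. Group, N = 7 and K = 6 are
// hypothetical stand-ins (e.g. an SE3-like group: 7-dim embedding, 6-dim tangent).
#include <Eigen/Core>
#include <iostream>
#include <vector>

struct Group {
  static constexpr int N = 7;  // embedding / representation size
  static constexpr int K = 6;  // tangent-space dimension
};

using scalar_t = float;
// Same shape as in the quoted typedef: a 1 x K row vector.
using Grad = Eigen::Matrix<scalar_t, 1, Group::K>;

int main() {
  const int num = 3;  // pretend batch of 3 group elements

  // If the upstream gradient buffer is packed with K scalars per element,
  // element i starts at grad + i*Group::K; advancing by Group::N instead
  // would read into the next element's data (and past the end of the buffer).
  std::vector<scalar_t> grad_k(num * Group::K);
  for (std::size_t j = 0; j < grad_k.size(); ++j) grad_k[j] = scalar_t(j);
  for (int i = 0; i < num; ++i) {
    Grad dZ(grad_k.data() + i * Group::K);  // copies K coefficients, stride K
    std::cout << "K-packed, element " << i << ": " << dZ << "\n";
  }

  // If instead the buffer is packed with N scalars per element (a gradient
  // w.r.t. the embedding), i*Group::N is the right offset, but only the
  // first K of those N scalars end up in the 1 x K Grad, which is the
  // combination the quoted kernel line implements.
  std::vector<scalar_t> grad_n(num * Group::N);
  for (std::size_t j = 0; j < grad_n.size(); ++j) grad_n[j] = scalar_t(j);
  for (int i = 0; i < num; ++i) {
    Grad dZ(grad_n.data() + i * Group::N);  // matches: Grad dZ(grad + i*Group::N);
    std::cout << "N-packed, element " << i << ": " << dZ << "\n";
  }
  return 0;
}
```

Either way, whether i*Group::N or i*Group::K is the right offset depends on how the incoming grad buffer is laid out, which is the crux of my question.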

Please let me know whether this observation is correct, or whether I have misunderstood something and this is not actually a bug. Thanks in advance. Keep up the great work :)
