
svd_backward: more memory and computationally efficient. #50109

Closed · wants to merge 13 commits

Conversation

@nikitaved (Collaborator) commented Jan 5, 2021

As per title.

CC @IvanYashchuk (unfortunately I cannot add you as a reviewer for some reason).

@facebook-github-bot (Contributor) commented Jan 5, 2021

💊 CI failures summary and remediations

As of commit 8b34f0b (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



Comment on lines 1979 to 1980
    L = L - L.conj();
    Tensor imag_term = 0.5 * at::matmul(u * (L * sigma_inv).unsqueeze(-2), vh);
@nikitaved (Collaborator, Author) Jan 5, 2021

nit: could it be better to replace L with just 2 * L.imag(), use a float-float mul in L * sigma_inv instead of a float-complex one, and then multiply by the imaginary unit in the most efficient way?

Collaborator:

That's a simple change that might give better performance, so I think it's worth adding it to this PR.

Collaborator:

Also, you don't need the 2 * if you remove the 0.5 *, right? :D

@nikitaved (Collaborator, Author) Jan 7, 2021

Yes, so one more reason to follow the nit.
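
Taken together, the suggestions in this thread amount to: drop the 0.5 * along with the implicit factor of 2, compute Im(L) * sigma_inv with a real-real multiplication, and only go back to complex at the end. A minimal sketch of what that could look like in ATen, written as a drop-in fragment for the two lines quoted above (my illustration, assuming L is the diagonal of uh @ gu and sigma_inv is the real-valued 1 / sigma; this is not necessarily the code that was merged):

    // Real-valued product: Im(L) / sigma, via a float-float multiplication.
    Tensor im_scaled = at::imag(L) * sigma_inv;
    // One way to multiply by the imaginary unit without a complex-complex mul:
    // i * x == complex(0, x).
    Tensor i_im_scaled = at::complex(at::zeros_like(im_scaled), im_scaled);
    // Equals 0.5 * (L - L.conj()) * sigma_inv, since L - conj(L) = 2i * Im(L),
    // so neither the 0.5 nor an explicit 2 is needed.
    Tensor imag_term = at::matmul(u * i_im_scaled.unsqueeze(-2), vh);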

@smessmer added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) on Jan 6, 2021
@IvanYashchuk (Collaborator) commented

Resolves #38353.

@IvanYashchuk (Collaborator) left a comment

It looks much better now! For completeness, could you run some benchmarks so we know how much performance and memory we gain here?
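
One way to get such numbers (my sketch, not something from this PR; the sizes, repetition count, and scalar loss are arbitrary choices) is a small libtorch program that times the SVD forward plus backward, which goes through svd_backward:

    #include <torch/torch.h>
    #include <chrono>
    #include <iostream>

    int main() {
      const int64_t n = 512;
      auto a = torch::randn({n, n}, torch::requires_grad());

      const auto start = std::chrono::steady_clock::now();
      for (int i = 0; i < 10; ++i) {
        // torch::svd returns (U, S, V); a scalar loss over all three makes the
        // backward pass exercise svd_backward with gradients for U, S and V.
        auto usv = torch::svd(a);
        auto loss = std::get<0>(usv).sum()
                  + std::get<1>(usv).sum()
                  + std::get<2>(usv).sum();
        loss.backward();
        a.grad().zero_();  // reset accumulated gradients between iterations
      }
      const auto end = std::chrono::steady_clock::now();
      std::cout << "10 x (svd forward + backward): "
                << std::chrono::duration<double>(end - start).count()
                << " s\n";
      return 0;
    }

Peak memory could be compared with a similar loop, e.g. via the CUDA caching allocator statistics on the GPU path.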


@@ -1947,9 +1944,12 @@ Tensor svd_backward(const std::vector<torch::autograd::Variable> &grads, const T

   if (gu.defined()) {
     auto guh = gu.conj().transpose(-2, -1);
-    u_term = at::matmul(u, at::matmul(F.mul(at::matmul(uh, gu) - at::matmul(guh, u)), sigma_mat));
+    u_term = at::matmul(u, (F.mul(at::matmul(uh, gu) - at::matmul(guh, u)) * sigma.unsqueeze(-2)));
Collaborator:

nit: there is an extra parenthesis here that is not needed for the second argument of the outermost matmul:

    u_term = at::matmul(u, F.mul(at::matmul(uh, gu) - at::matmul(guh, u)) * sigma.unsqueeze(-2));
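
For context on why this hunk helps (my illustration, not part of the review): right-multiplying by diag(sigma) and broadcasting sigma over the columns produce the same result, but the broadcast avoids materializing sigma_mat and a full matrix-matrix product. A tiny self-contained sketch with made-up shapes:

    #include <torch/torch.h>
    #include <iostream>

    int main() {
      // Right-multiplying by diag(sigma) vs. broadcasting sigma over the columns.
      auto X = torch::randn({5, 4});
      auto sigma = torch::rand({4});

      auto via_matmul    = at::matmul(X, at::diag_embed(sigma));  // materializes a 4 x 4 matrix
      auto via_broadcast = X * sigma.unsqueeze(-2);               // just scales the columns of X

      std::cout << std::boolalpha
                << at::allclose(via_matmul, via_broadcast) << '\n';  // true
      return 0;
    }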

(Resolved conversation on torch/csrc/autograd/FunctionsManual.cpp)

@nikitaved nikitaved requested a review from albanD January 7, 2021 18:28
@albanD (Collaborator) left a comment

LGTM! Thanks for the update!

@facebook-github-bot (Contributor) left a comment

@albanD has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


@facebook-github-bot (Contributor):

@albanD merged this pull request in eb87686.

Labels: cla signed · Merged · open source · triaged

6 participants