Avoid cloning gradient tensor in embedding backward pass #2526
Conversation
This pull request was exported from Phabricator. Differential Revision: D56420646
This will break the const signature, will it not? If I recall correctly, the reason was that we moved this memory-alignment handling into the operator, inside the CUDA kernel, and the signatures require the input to be const. If we don't do it inside the operator, we need to figure out a way to adjust the meta kernel, which doesn't have access to the data pointers to check the alignment.
No, I don't think so. You're right that
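For context on the alignment discussion above, a check on the raw data pointer might look like the following illustrative C++ sketch. The helper name and the 16-byte alignment value are assumptions for illustration, not FBGEMM's actual API; the point is that a meta tensor has no storage, so such a check cannot run in the meta kernel.

```cpp
#include <cstdint>
#include <ATen/ATen.h>

// Illustrative helper (assumed name, not FBGEMM's actual API): returns true if
// the tensor's data pointer is aligned to `alignment` bytes. A meta tensor has
// no storage, so data_ptr() is unavailable there, which is the concern raised
// in the comment above.
inline bool is_aligned(const at::Tensor& t, std::size_t alignment = 16) {
  return reinterpret_cast<std::uintptr_t>(t.data_ptr()) % alignment == 0;
}
```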
Cool. I'm only on my phone and still on leave, so could you check with someone else to approve it?
@pytorchbot merge |
This PR needs to be approved by an authorized maintainer before merge. |
Summary: I found a memory spike during the embedding backward kernel `split_embedding_backward_codegen_rowwise_adagrad_unweghted_exact_cuda`, which was traced to the code below making a clone of the gradient tensor. This logic didn't seem to be present in the original code: https://github.com/pytorch/FBGEMM/pull/2347/files#diff-944ab49dcbcf54826cc3e1eab5e3c0c787b5a195f602c2d3052adae14c506d78. Reviewed By: ezyang Differential Revision: D56420646
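To illustrate the idea in the summary, here is a minimal C++ sketch of avoiding an unconditional clone of the gradient tensor. The function name, the 16-byte alignment constant, and the fallback path are assumptions for illustration, not the actual FBGEMM change.

```cpp
#include <cstdint>
#include <ATen/ATen.h>

// Illustrative sketch (assumed names/values, not the actual FBGEMM change):
// only materialize a copy of grad_output when it is non-contiguous or its
// data pointer is misaligned, instead of cloning it on every backward call.
at::Tensor prepare_grad_output(const at::Tensor& grad_output,
                               std::size_t alignment = 16) {
  const bool aligned =
      reinterpret_cast<std::uintptr_t>(grad_output.data_ptr()) % alignment == 0;
  if (grad_output.is_contiguous() && aligned) {
    return grad_output;  // fast path: reuse the incoming tensor, no extra allocation
  }
  // Slow path: a fresh contiguous copy (typically well-aligned by the allocator).
  return grad_output.clone(at::MemoryFormat::Contiguous);
}
```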
Force-pushed from b93d4fb to 3f16122
This pull request was exported from Phabricator. Differential Revision: D56420646
This pull request has been merged in a75037b.