PT2 compliant - fbgemm::jagged_dense_elementwise_add_jagged_output #2327

Closed
wants to merge 1 commit

Conversation

davidberard98
Contributor

Summary:
This op previously didn't have an autograd registration.

(a) We would see this warning:

/data/users/dberard/fbsource/buck-out/v2/gen/fbcode/6f27a84d3075b0d5/scripts/dberard/jplusd/__jagged_plus_dense__/jagged_plus_dense#link-tree/torch/autograd/graph.py:744: UserWarning: fbgemm::jagged_dense_elementwise_add_jagged_output: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at fbcode/caffe2/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:72.)

(b) Sometimes we would get aot_autograd partitioner issues because this op would not show up as an op returning a tensor.

Previous issue: a single implementation was registered for both the CPU and Autograd keys, which would call DenseToJaggedOp::apply(); a separate CUDA implementation was registered without any backward registration.
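For background, this kind of gap comes from per-dispatch-key registration. A hedged Python analogue of the old setup, using torch.library with an invented `mylib::jagged_dense_add` op (FBGEMM's real registrations live in C++ against the fbgemm namespace), might look like this:

```python
import torch

# Hypothetical namespace and op, purely for illustration of the registration
# pattern; this is not FBGEMM's schema or implementation.
lib = torch.library.Library("mylib", "DEF")
lib.define("jagged_dense_add(Tensor x_values, Tensor offsets, Tensor dense) -> Tensor")

def composite_impl(x_values, offsets, dense):
    # Stand-in body; the real op adds matching slices of `dense` to the
    # jagged `x_values`.
    return x_values.clone()

# Old shape of the problem: one function served both the CPU and Autograd
# keys, while CUDA had its own kernel with no Autograd registration, so
# backprop through the CUDA path hit the "autograd kernel was not
# registered" warning quoted above.
lib.impl("jagged_dense_add", composite_impl, "CPU")
lib.impl("jagged_dense_add", composite_impl, "Autograd")
# lib.impl("jagged_dense_add", cuda_kernel, "CUDA")  # no Autograd counterpart
```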

Updated implementation:

  • Added a CPU implementation that computes `jagged + dense_to_jagged(dense, offsets)`.
  • Added an AutogradFunction implementation that redispatches to jagged_dense_elementwise_add_jagged_output in forward and to jagged_to_dense in backward (see the sketch after this list).
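To make that forward/backward split concrete, here is a minimal, self-contained Python sketch of the pattern. It is not FBGEMM's implementation: the `dense_to_jagged` and `jagged_to_dense` helpers are simplified stand-ins operating on a flat (values, offsets) layout, and the real change registers a C++ autograd kernel through the dispatcher.

```python
import torch

def dense_to_jagged(dense, offsets):
    # Gather rows [offsets[b], offsets[b+1]) of each batch b into flat values.
    values = []
    for b in range(dense.shape[0]):
        length = int(offsets[b + 1] - offsets[b])
        values.append(dense[b, :length])
    return torch.cat(values, dim=0)

def jagged_to_dense(values, offsets, max_len):
    # Scatter flat jagged values back into a padded [B, max_len, D] tensor.
    batch_size = offsets.numel() - 1
    dense = values.new_zeros(batch_size, max_len, values.shape[-1])
    for b in range(batch_size):
        start, end = int(offsets[b]), int(offsets[b + 1])
        dense[b, : end - start] = values[start:end]
    return dense

class JaggedDenseAddJaggedOutput(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x_values, offsets, dense):
        ctx.save_for_backward(offsets)
        ctx.max_len = dense.shape[1]
        # Forward: jagged + dense_to_jagged(dense, offsets), like the CPU path.
        return x_values + dense_to_jagged(dense, offsets)

    @staticmethod
    def backward(ctx, grad_values):
        (offsets,) = ctx.saved_tensors
        # Gradient w.r.t. the jagged values passes through unchanged; gradient
        # w.r.t. the dense input scatters it back to dense layout, i.e. a
        # jagged_to_dense-style operation.
        grad_dense = jagged_to_dense(grad_values, offsets, ctx.max_len)
        return grad_values, None, grad_dense

# Example: batch of 2 with lengths 2 and 3, embedding dim 4.
offsets = torch.tensor([0, 2, 5])
x_values = torch.randn(5, 4, requires_grad=True)
dense = torch.randn(2, 3, 4, requires_grad=True)
out = JaggedDenseAddJaggedOutput.apply(x_values, offsets, dense)
out.sum().backward()
print(x_values.grad.shape, dense.grad.shape)  # torch.Size([5, 4]) torch.Size([2, 3, 4])
```

In the PR itself, both directions redispatch to existing FBGEMM ops rather than to local helpers like these; the sketch only illustrates the data flow of the forward and backward passes.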

Differential Revision: D53650907


netlify bot commented Feb 12, 2024

Deploy Preview for pytorch-fbgemm-docs ready!

🔨 Latest commit: 10fafe8
🔍 Latest deploy log: https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/65ca9682d44a44000852093f
😎 Deploy Preview: https://deploy-preview-2327--pytorch-fbgemm-docs.netlify.app

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D53650907

davidberard98 added a commit to davidberard98/FBGEMM that referenced this pull request Feb 12, 2024

@facebook-github-bot
Contributor

@davidberard98 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

davidberard98 added a commit to davidberard98/FBGEMM that referenced this pull request Feb 12, 2024

…ytorch#2327)

Reviewed By: williamwen42

Differential Revision: D53650907

Pulled By: davidberard98

@facebook-github-bot
Contributor

@davidberard98 merged this pull request in 0c43bee.
