
Lift jagged -> padded dense forward / backward kernels from fbgemm_gpu #125946

Closed
wants to merge 22 commits

Conversation

@jbschlosser (Contributor) commented May 10, 2024

Stack from ghstack (oldest at bottom):

PyTorch can't take `fbgemm_gpu` as a dependency because `fbgemm_gpu` already depends on PyTorch. So this PR copy-pastes kernels from `fbgemm_gpu`:

  • `dense_to_jagged_forward()` as the CUDA registration for the new ATen op `_padded_dense_to_jagged_forward()`
  • `jagged_to_padded_dense_forward()` as the CUDA registration for the new ATen op `_jagged_to_padded_dense_forward()`

CPU impls for these new ATen ops will be added in a follow-up PR.
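
For context, a minimal usage sketch of the two new ops (CUDA-only until the follow-up CPU PR lands). The shapes and offsets are hypothetical, and the argument order assumed for `_jagged_to_padded_dense_forward()` is an illustration based on the `fbgemm_gpu` kernel being lifted; the `_padded_dense_to_jagged_forward()` schema matches the one quoted in the review below.

```python
import torch

# Hypothetical example: 3 jagged sequences of lengths [2, 4, 3] with feature dim 8,
# packed into a single (9, 8) values tensor described by offsets [0, 2, 6, 9].
values = torch.randn(9, 8, device="cuda")
offsets = torch.tensor([0, 2, 6, 9], device="cuda", dtype=torch.int64)

# Jagged -> padded dense: pad every sequence out to length 4 with 0.0.
# Assumed schema: (Tensor values, Tensor[] offsets, SymInt[] max_lengths, float padding_value).
padded = torch.ops.aten._jagged_to_padded_dense_forward(values, [offsets], [4], 0.0)
print(padded.shape)  # torch.Size([3, 4, 8])

# Padded dense -> jagged: recover the packed values using the same offsets.
# Schema from the review below: (Tensor dense, Tensor[] offsets, SymInt? total_L=None).
jagged = torch.ops.aten._padded_dense_to_jagged_forward(padded, [offsets])
print(jagged.shape)  # torch.Size([9, 8])
```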


pytorch-bot bot commented May 10, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125946

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 96929ac with merge base f343f98:

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@jbschlosser added the `topic: not user facing` label May 14, 2024

@davidberard98 (Contributor) left a comment

Generally LGTM! I skimmed through the CUDA implementations and they look fine from what I saw.

I guess we could also split this out from the stack, to make sure there are no other failures masked by the failures introduced by the previous PR? (Or we can wait, if you prefer.)

- func: _padded_dense_to_jagged_forward(Tensor dense, Tensor[] offsets, SymInt? total_L=None) -> Tensor
  variants: function
  dispatch:
    CUDA: _fbgemm_dense_to_jagged_forward_symint

Contributor

IIRC, some variants of padded_dense_to_jagged_forward also take a padding value, in case any of your sequences are longer than the sequence dimension of the dense tensor. I guess we probably don't care too much about this right now though?

Contributor Author

This is a good point. I think for our initial op coverage purposes, we don't care too much; we may need to revisit this for more general usages, though.
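
To make the scenario concrete, a small hypothetical sketch (device, shapes, and offsets chosen purely for illustration):

```python
import torch

# A padded dense batch of 3 sequences with sequence dimension max_S = 4.
B, max_S, D = 3, 4, 8
dense = torch.randn(B, max_S, D, device="cuda")

# Offsets describing jagged lengths [2, 4, 3]; every length fits within max_S,
# so converting back to jagged needs no padding value.
offsets = torch.tensor([0, 2, 6, 9], device="cuda", dtype=torch.int64)
jagged = torch.ops.aten._padded_dense_to_jagged_forward(dense, [offsets])  # shape (9, 8)

# If a length exceeded max_S (e.g. [2, 6, 3]), the fbgemm_gpu variant that accepts a
# padding value could fill the out-of-range tail with that value; the op added in this
# PR does not expose such a parameter.
```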

@jbschlosser
Contributor Author

@pytorchbot merge

jbschlosser added a commit that referenced this pull request Jun 3, 2024
ghstack-source-id: 151b65d9db8fd2c074dffcade497b644a3081078
Pull Request resolved: #125946
@jbschlosser
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@pytorchmergebot
Collaborator

Merge failed

Reason: 1 job failed: trunk / win-vs2019-cuda11.8-py3 / build

jbschlosser added a commit that referenced this pull request Jun 3, 2024
ghstack-source-id: 8b3828fc71ce86a50a37239f69b0f2f47248a772
Pull Request resolved: #125946
@jbschlosser
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@pytorchmergebot
Collaborator

Merge failed

Reason: 1 job failed: trunk / win-vs2019-cuda11.8-py3 / build

jbschlosser added a commit that referenced this pull request Jun 3, 2024
ghstack-source-id: beab99ee926adc176e3802eadd967cfdc142c50d
Pull Request resolved: #125946
@jbschlosser
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


jbschlosser added a commit that referenced this pull request Jun 3, 2024
ghstack-source-id: c9fca7bec054de558d1184560099441b7c177dc4
Pull Request resolved: #125946
@jbschlosser
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


petrex pushed a commit to petrex/pytorch that referenced this pull request Jun 5, 2024
Pull Request resolved: pytorch#125946
Approved by: https://github.com/davidberard98
Labels
ciflow/trunk, Merged, Reverted, topic: not user facing