
Lift jagged -> padded dense forward / backward kernels from fbgemm_gpu #125946

Closed
wants to merge 22 commits

Conversation

@jbschlosser (Contributor) commented May 10, 2024

Stack from ghstack (oldest at bottom):

PyTorch can't take `fbgemm_gpu` as a dependency because `fbgemm_gpu` already depends on PyTorch. So this PR copy-pastes kernels from `fbgemm_gpu`:

  • `dense_to_jagged_forward()` as the CUDA registration for the new ATen op `_padded_dense_to_jagged_forward()`
  • `jagged_to_padded_dense_forward()` as the CUDA registration for the new ATen op `_jagged_to_padded_dense_forward()`

CPU impls for these new ATen ops will be added in a follow-up PR.
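
For context, a minimal usage sketch of the two new ops (CUDA-only until the follow-up CPU PR lands). The shapes and offsets are hypothetical, and the argument order assumed for `_jagged_to_padded_dense_forward()` is an illustration based on the `fbgemm_gpu` kernel being lifted; the `_padded_dense_to_jagged_forward()` schema matches the one quoted in the review below.

```python
import torch

# Hypothetical example: 3 jagged sequences of lengths [2, 4, 3] with feature dim 8,
# packed into a single (9, 8) values tensor described by offsets [0, 2, 6, 9].
values = torch.randn(9, 8, device="cuda")
offsets = torch.tensor([0, 2, 6, 9], device="cuda", dtype=torch.int64)

# Jagged -> padded dense: pad every sequence out to length 4 with 0.0.
# Assumed schema: (Tensor values, Tensor[] offsets, SymInt[] max_lengths, float padding_value).
padded = torch.ops.aten._jagged_to_padded_dense_forward(values, [offsets], [4], 0.0)
print(padded.shape)  # torch.Size([3, 4, 8])

# Padded dense -> jagged: recover the packed values using the same offsets.
# Schema from the review below: (Tensor dense, Tensor[] offsets, SymInt? total_L=None).
jagged = torch.ops.aten._padded_dense_to_jagged_forward(padded, [offsets])
print(jagged.shape)  # torch.Size([9, 8])
```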


pytorch-bot bot commented May 10, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125946

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 96929ac with merge base f343f98:

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@jbschlosser added the `topic: not user facing` label May 14, 2024

@davidberard98 (Contributor) left a comment

Generally LGTM! I skimmed through the CUDA implementations and they look fine from what I saw.

I guess we could also split this out from the stack, to make sure there are no other failures masked by the failures introduced by the previous PR? (Or we can wait, if you prefer.)

- func: _padded_dense_to_jagged_forward(Tensor dense, Tensor[] offsets, SymInt? total_L=None) -> Tensor
  variants: function
  dispatch:
    CUDA: _fbgemm_dense_to_jagged_forward_symint

Contributor

IIRC, some variants of padded_dense_to_jagged_forward also take a padding value, in case any of your sequences are longer than the sequence dimension of the dense tensor. I guess we probably don't care too much about this right now though?

Contributor Author

This is a good point. I think for our initial op coverage purposes, we don't care too much; we may need to revisit this for more general usages, though.
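
To make the scenario concrete, a small hypothetical sketch (device, shapes, and offsets chosen purely for illustration):

```python
import torch

# A padded dense batch of 3 sequences with sequence dimension max_S = 4.
B, max_S, D = 3, 4, 8
dense = torch.randn(B, max_S, D, device="cuda")

# Offsets describing jagged lengths [2, 4, 3]; every length fits within max_S,
# so converting back to jagged needs no padding value.
offsets = torch.tensor([0, 2, 6, 9], device="cuda", dtype=torch.int64)
jagged = torch.ops.aten._padded_dense_to_jagged_forward(dense, [offsets])  # shape (9, 8)

# If a length exceeded max_S (e.g. [2, 6, 3]), the fbgemm_gpu variant that accepts a
# padding value could fill the out-of-range tail with that value; the op added in this
# PR does not expose such a parameter.
```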

@jbschlosser
Contributor Author

@pytorchbot merge

jbschlosser added a commit that referenced this pull request Jun 3, 2024
ghstack-source-id: 151b65d9db8fd2c074dffcade497b644a3081078
Pull Request resolved: #125946
@jbschlosser
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@pytorchmergebot
Collaborator

Merge failed

Reason: 1 job failed: trunk / win-vs2019-cuda11.8-py3 / build

jbschlosser added a commit that referenced this pull request Jun 3, 2024
ghstack-source-id: 8b3828fc71ce86a50a37239f69b0f2f47248a772
Pull Request resolved: #125946
@jbschlosser
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@pytorchmergebot
Collaborator

Merge failed

Reason: 1 job failed: trunk / win-vs2019-cuda11.8-py3 / build

jbschlosser added a commit that referenced this pull request Jun 3, 2024
ghstack-source-id: beab99ee926adc176e3802eadd967cfdc142c50d
Pull Request resolved: #125946
@jbschlosser
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


jbschlosser added a commit that referenced this pull request Jun 3, 2024
ghstack-source-id: c9fca7bec054de558d1184560099441b7c177dc4
Pull Request resolved: #125946
@jbschlosser
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


petrex pushed a commit to petrex/pytorch that referenced this pull request Jun 5, 2024
Pull Request resolved: pytorch#125946
Approved by: https://github.com/davidberard98
Labels
ciflow/trunk, Merged, Reverted, topic: not user facing