Conversation

jbschlosser
Contributor

@jbschlosser jbschlosser commented Mar 27, 2024

As per title. The idea is to show how to implement MHA using NJT + torch.compile and get nice speedups.

TODO (future work):

  • Link to more thorough comparison of strided / jagged NTs (write this)
  • Update part before MHA to focus on jagged?
  • Update SDPA tutorial


pytorch-bot bot commented Mar 27, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/2813

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit a4feb18 with merge base d3cf027 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@jbschlosser
Contributor Author

@svekars is it possible to see a rendered version of the docs? I tried clicking the `Preview Python docs built from this PR` link, but I get:

<Error>
  <Code>AccessDenied</Code>
  <Message>Access Denied</Message>
  <RequestId>NJK553V5M50YSYCH</RequestId>
  <HostId>x0Kg4Bq9i51Qa+uIaXFdTbhzI6iAGdRlFYfDhjYjo1vUyaLDXESxv/8jEJMVjFHDU6LUUPFkNsI=</HostId>
</Error>

@svekars
Contributor

svekars commented Mar 28, 2024

@jbschlosser - there was an error on one of the workers. The preview only becomes available after the manager finishes building. Now it passes and the preview is available.

Contributor

@svekars svekars left a comment


An editorial pass. Looks good overall.

@drisspg
Contributor

drisspg commented Mar 29, 2024

@jbschlosser
Contributor Author

Should we also update the usage here: https://pytorch.org/tutorials/intermediate/scaled_dot_product_attention_tutorial.html?

Good call, yes.

@svekars
Contributor

svekars commented Apr 30, 2024

@jbschlosser the PR looks good - do you want to finish up the TODOs?

@jbschlosser
Contributor Author

do you want to finish up the TODOs?

This will take a fair amount of work; it might be worth landing this for now and addressing the rest later, when I can carve out some time.

@jbschlosser jbschlosser marked this pull request as ready for review April 30, 2024 17:30
@jbschlosser
Contributor Author

@pytorchbot merge

@jbschlosser
Contributor Author

@svekars what's the merging procedure for this repo?

@kkt-cohere

Hi, sorry to resurrect this. A quick question: is it still the case that fused implementations of SDPA (like the FlashAttention kernels) don't support NestedTensor for training? It was mentioned in one of the older tutorials, but it's not clear to me whether that still holds. Thanks in advance! cc @jbschlosser @drisspg

@jbschlosser
Contributor Author

@kkt-cohere sorry for the delay, I was out the last couple weeks.

is it still the case that fused implementations of sdpa (like the FA kernels) don't support NestedTensor for training?

This isn't really true anymore. Nested tensors with layout=torch.jagged (AKA NJTs) do support flash attention, etc. for training. I'll make a note to update the out-of-date SDPA docs to reflect this.
