Implementation of multiple attention mechanisms #138

Conversation
```python
"""
taken from https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html
"""
L, S = query.size(-2), key.size(-2)
```
Why do we specify L and S separately? Shouldn't the context size for query and key, i.e. `query.size(-2)` and `key.size(-2)`, always be the same?
Also, what do L and S stand for?
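For reference, in the PyTorch documentation L is the target (query) sequence length and S is the source (key/value) sequence length; the two differ, for example, in cross-attention. A minimal sketch with illustrative shapes (not the PR's actual code):

```python
import torch
import torch.nn.functional as F

# L (query length) and S (key/value length) need not be equal, e.g. in
# cross-attention, which is why the reference code tracks them separately.
# All shapes below are made up for illustration.
query = torch.rand(1, 2, 5, 8)  # (batch, heads, L=5, head_dim)
key = torch.rand(1, 2, 7, 8)    # (batch, heads, S=7, head_dim)
value = torch.rand(1, 2, 7, 8)  # (batch, heads, S=7, head_dim)

out = F.scaled_dot_product_attention(query, key, value)
print(out.shape)  # torch.Size([1, 2, 5, 8]) -- output keeps the query length L
```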
```python
head_dim = n_embd // n_head_q
AttentionConfig(qkv_transforms=[])
```

```python
q = torch.rand(batch_size, n_head_q, block_size - 1, head_dim, dtype=torch.bfloat16).cuda()
```
As an idea: instead of `torch.rand` we could also do `torch.arange(0, batch_size * n_head_q * (block_size - 1) * head_dim).reshape(batch_size, n_head_q, block_size - 1, head_dim)`. In this case, we can check for exact equality instead of approximate equality.
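A sketch of the suggestion above, with hypothetical values for the size parameters (the PR's actual test configuration may differ): a deterministic input built with `torch.arange` instead of `torch.rand`, so repeated runs receive bit-identical inputs.

```python
import torch

# Illustrative sizes only; the real test uses its own batch_size, n_head_q, etc.
batch_size, n_head_q, block_size, head_dim = 2, 4, 8, 16

numel = batch_size * n_head_q * (block_size - 1) * head_dim
# Deterministic, reproducible input tensor: 0, 1, 2, ... reshaped to the
# (batch, heads, sequence, head_dim) layout used by the attention tests.
q = torch.arange(0, numel, dtype=torch.float32).reshape(
    batch_size, n_head_q, block_size - 1, head_dim
)
print(q.shape)  # torch.Size([2, 4, 7, 16])
```

Note that even with identical inputs, two different attention implementations may still diverge in low-precision dtypes such as bfloat16, so exact equality is most realistic when comparing runs of the *same* implementation.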
```python
    output_tensor[attention_impl_2],
    atol=2.5e-3,  # default for bfloat16: 1e-5
    rtol=0.016,   # default for bfloat16: 0.016
)
```
We could add another test that compares the output of the manual attention implementation against a precomputed (pen-and-paper) output for a very short input sequence. In that case, we can be entirely sure that the implementation is correct.
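To illustrate the idea, here is a hedged sketch of such a hand-checkable case (not the PR's code): with an all-zero query, every attention score is 0, the softmax yields uniform weights, and the output is simply the mean of the value rows, which is trivial to verify on paper.

```python
import math
import torch

# (batch, head, seq, head_dim) = (1, 1, 2, 2); all values chosen by hand.
q = torch.zeros(1, 1, 2, 2)                       # zero query -> zero scores
k = torch.tensor([[[[1.0, 0.0], [0.0, 1.0]]]])
v = torch.tensor([[[[2.0, 4.0], [6.0, 8.0]]]])

scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # all zeros
weights = torch.softmax(scores, dim=-1)                   # uniform: 0.5 each
out = weights @ v                       # each row = mean of value rows
print(out)  # tensor([[[[4., 6.], [4., 6.]]]])
```

Because the expected output ([4, 6] in every position) is computed by hand, any mismatch points unambiguously at a bug in the implementation under test.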
This PR implements `manual attention` and `pytorch flash attention`, in addition to the previously implemented `dao flash attention`. Group Query Attention is supported.
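For readers unfamiliar with Group Query Attention, a minimal sketch of the mechanism (names and shapes are illustrative assumptions, not this PR's implementation): fewer key/value heads than query heads, with each key/value head shared by a group of query heads.

```python
import torch
import torch.nn.functional as F

# Illustrative sizes: 8 query heads share 2 key/value heads (groups of 4).
batch, n_head_q, n_head_kv, seq, head_dim = 2, 8, 2, 16, 32
q = torch.rand(batch, n_head_q, seq, head_dim)
k = torch.rand(batch, n_head_kv, seq, head_dim)
v = torch.rand(batch, n_head_kv, seq, head_dim)

# One simple way to realize GQA: expand the key/value heads so that each
# group of query heads attends to its shared key/value head.
repeat = n_head_q // n_head_kv
k = k.repeat_interleave(repeat, dim=1)
v = v.repeat_interleave(repeat, dim=1)

out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 8, 16, 32])
```

GQA reduces the key/value cache size roughly by the factor `n_head_q / n_head_kv` while keeping the full number of query heads.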