scaled dot product attention #2500
Conversation
Actually, the code looks good. I added a couple of comments.
(Five review threads on src/mlpack/methods/ann/layer/scaled_dot_product_attention_impl.hpp, all outdated and resolved.)
Co-authored-by: Mikhail Lozhnikov <lozhnikovma@gmail.com>
…tripathi/mlpack into scaled_dot_product_attention
@lozhnikov The PyTorch implementation and test of scaled dot product attention are here. The results match. Can you take a look?
Some minor comments.
```cpp
key = const_cast<arma::Mat<eT>&>(input);
value = const_cast<arma::Mat<eT>&>(input);
```
Why do you need `const_cast` here? Why doesn't the following code work?
```diff
- key = const_cast<arma::Mat<eT>&>(input);
- value = const_cast<arma::Mat<eT>&>(input);
+ key = input;
+ value = input;
```
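For what it's worth, a minimal standalone sketch of why the plain assignment is enough: Armadillo's copy assignment accepts a const source and makes an owning copy, so no cast is needed as long as `key` and `value` are matrices rather than references (the names below are illustrative, not the layer's actual members).

```cpp
#include <armadillo>

int main()
{
  const arma::mat input(4, 3, arma::fill::randu);

  // Copy assignment from a const source is fine: Armadillo copies the
  // data, so key and value own their own memory and no const_cast is
  // needed (assuming they are matrices, not reference members).
  arma::mat key, value;
  key = input;
  value = input;

  // A const_cast would only be needed to bind a non-const reference to
  // input, which risks silently modifying the caller's data.
  key(0, 0) += 1.0;       // Changes only the copy.
  input.print("input:");  // Unchanged.
  return 0;
}
```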
(Two more review threads on src/mlpack/methods/ann/layer/scaled_dot_product_attention_impl.hpp, both outdated and resolved.)
```cpp
//! Test Backward function with mask.
module.Backward(input, gy, g);
expGrad = arma::mat("0.00000000 0.00000000;\
```
By the way, do you have a notebook for this? I'd like to play with the values a bit.
Yup. Here it is.
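For readers without the notebook, here is a rough standalone Armadillo sketch of the forward pass, softmax(Q K^T / sqrt(d_k)) V, handy for playing with the values; the function name, layout (tokens as rows), and mask convention are assumptions for illustration, not mlpack's layer API.

```cpp
#include <armadillo>
#include <cmath>

// Illustrative forward pass: attention(Q, K, V) = softmax(Q K^T / sqrt(dk)) V.
// Assumed mask convention: mask(i, j) == 0 hides key j from query i by
// pushing its score to a large negative value, so its softmax weight
// (and hence its gradient) is effectively zero.
arma::mat ScaledDotProductAttention(const arma::mat& query,
                                    const arma::mat& key,
                                    const arma::mat& value,
                                    const arma::mat& mask)
{
  const double dk = key.n_cols;  // Embedding dimension.
  arma::mat scores = query * key.t() / std::sqrt(dk);
  scores.elem(arma::find(mask == 0)).fill(-1e9);

  // Numerically stabilized row-wise softmax.
  scores.each_row([](arma::rowvec& r)
  {
    r = arma::exp(r - r.max());
    r /= arma::accu(r);
  });

  return scores * value;
}

int main()
{
  arma::mat x(5, 4, arma::fill::randu);  // Self-attention: Q = K = V = x.
  // Lower-triangular mask, as in causal self-attention.
  arma::mat mask = arma::trimatl(arma::ones<arma::mat>(5, 5));
  ScaledDotProductAttention(x, x, x, mask).print("output:");
  return 0;
}
```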
Looks like it's almost done. Let me look it over a couple more times.
Co-authored-by: Mikhail Lozhnikov <lozhnikovma@gmail.com>
…_dot_product_attention
@lozhnikov Let me know if I need to clarify anything about the new changes. Basically, I have tried to--
Any suggestions to make it better?
…_dot_product_attention
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍
Keep open
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍
Keep open
@lozhnikov The single-head attention is working. Probably we can now use the Concat layer to implement multi-head attention (#2375). 🚀 🚀 I am really delighted that this is working. :)
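As a rough sketch of that idea (omitting the learned per-head Q/K/V projections that a full multi-head layer would include), each head could run scaled dot-product attention on its own slice of the embedding and the results could be joined side by side, which is the role a Concat-style layer would play. `ScaledDotProductAttention` below refers to the illustrative helper sketched earlier in this thread, not an mlpack class.

```cpp
// Builds multi-head attention from the single-head sketch above by
// concatenating per-head outputs (the job the Concat layer would do).
// Learned per-head projections are omitted for brevity.
arma::mat MultiHeadAttention(const arma::mat& x, const size_t numHeads)
{
  const size_t headDim = x.n_cols / numHeads;  // Assumes an even split.
  arma::mat output;
  for (size_t h = 0; h < numHeads; ++h)
  {
    // Each head attends over its own contiguous slice of the embedding.
    const arma::mat slice = x.cols(h * headDim, (h + 1) * headDim - 1);
    const arma::mat head = ScaledDotProductAttention(slice, slice, slice,
        arma::ones<arma::mat>(x.n_rows, x.n_rows));  // No masking here.
    output = (h == 0) ? head : arma::join_rows(output, head);
  }
  return output;  // (tokens x embedDim) when numHeads divides embedDim.
}
```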