
Multi Head Attention Layer #7803

Closed
grafael opened this issue Sep 1, 2017 · 8 comments
Comments

@grafael

grafael commented Sep 1, 2017

I think it is a good idea to start thinking about how to implement this sort of layer in Keras.
I know it is a very recent technique, but I believe it will be a cutting-edge approach in deep learning for the next few years.

Paper: Attention is all you need (https://arxiv.org/abs/1706.03762)

Blog showing some results: Google Research Blog
Tensor2Tensor library: tensor2tensor
PyTorch implementation: pytorch-t2t
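
For reference, here is a minimal sketch of the multi-head scaled dot-product attention described in the paper, written as a custom layer against tf.keras for concreteness. It is illustrative only, not an official implementation: masking and dropout are omitted, and the layer name and default sizes are my own.

```python
import tensorflow as tf
from tensorflow.keras import layers

class MultiHeadSelfAttention(layers.Layer):
    """Scaled dot-product multi-head self-attention (Vaswani et al., 2017)."""

    def __init__(self, d_model=64, num_heads=4, **kwargs):
        super().__init__(**kwargs)
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
        self.d_model = d_model
        self.num_heads = num_heads
        self.depth = d_model // num_heads
        # Learned projections for queries, keys, values, and the output.
        self.wq = layers.Dense(d_model)
        self.wk = layers.Dense(d_model)
        self.wv = layers.Dense(d_model)
        self.wo = layers.Dense(d_model)

    def _split_heads(self, x, batch_size):
        # (batch, seq_len, d_model) -> (batch, num_heads, seq_len, depth)
        x = tf.reshape(x, (batch_size, -1, self.num_heads, self.depth))
        return tf.transpose(x, perm=[0, 2, 1, 3])

    def call(self, inputs):
        batch_size = tf.shape(inputs)[0]
        q = self._split_heads(self.wq(inputs), batch_size)
        k = self._split_heads(self.wk(inputs), batch_size)
        v = self._split_heads(self.wv(inputs), batch_size)

        # Attention(Q, K, V) = softmax(Q K^T / sqrt(depth)) V, computed per head.
        scores = tf.matmul(q, k, transpose_b=True) / tf.math.sqrt(
            tf.cast(self.depth, tf.float32))
        attended = tf.matmul(tf.nn.softmax(scores, axis=-1), v)

        # Concatenate the heads back into (batch, seq_len, d_model).
        attended = tf.transpose(attended, perm=[0, 2, 1, 3])
        attended = tf.reshape(attended, (batch_size, -1, self.d_model))
        return self.wo(attended)
```

Applied to an input of shape `(batch, seq_len, d_model)`, e.g. `MultiHeadSelfAttention(64, 4)(tf.random.normal((2, 10, 64)))`, it returns a tensor of the same shape.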

@andhus
Contributor

andhus commented Oct 30, 2017

@grafael Please keep an eye on #8296. I'll try to have a look at this specific case to see whether it will be covered.

@grafael
Author

grafael commented Oct 30, 2017

Thanks @andhus, I'll follow the updates.

@soham97

soham97 commented May 28, 2018

Yes, the attention layer is the de facto standard for achieving state-of-the-art results in NLP problems, be they generative or classification. I have implemented an attention layer in Keras and obtained good results with it. It would be much better if the layer were added to Keras so the public can use it directly. Should I share the implementation in this thread, or what is the procedure?
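
For context, the kind of single-head attention pooling often used for NLP classification (e.g. on top of an LSTM) can be sketched roughly as below. This is a generic illustration, not soham97's actual implementation; the layer name and weight shapes are my own.

```python
import tensorflow as tf
from tensorflow.keras import layers

class AttentionPooling(layers.Layer):
    """Additive attention that pools a sequence of hidden states into one vector."""

    def build(self, input_shape):
        d = int(input_shape[-1])
        # Small learned projection used to score each timestep.
        self.w = self.add_weight(name="w", shape=(d, d), initializer="glorot_uniform")
        self.u = self.add_weight(name="u", shape=(d, 1), initializer="glorot_uniform")
        super().build(input_shape)

    def call(self, inputs):
        # inputs: (batch, seq_len, d), e.g. an LSTM with return_sequences=True
        hidden = tf.tanh(tf.tensordot(inputs, self.w, axes=1))   # (batch, seq_len, d)
        scores = tf.tensordot(hidden, self.u, axes=1)            # (batch, seq_len, 1)
        weights = tf.nn.softmax(scores, axis=1)                  # attention over timesteps
        return tf.reduce_sum(weights * inputs, axis=1)           # (batch, d)
```

A typical use is `layers.LSTM(128, return_sequences=True)` followed by this layer and a `Dense` softmax classifier.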

@utkarshshukla2912

@soham97 It would be great if you could. I was trying to implement it but couldn't get it working.

@lvapeab
Contributor

lvapeab commented Jun 11, 2018

Hi,

I implemented this some time ago in a fork. It is somewhat dirty and lacks a test suite, but it works (there is an NMT example using it).

Cheers.

@utkarshshukla2912

@lvapeab Thanks, man!

@miniwa

miniwa commented May 5, 2019

What's the status on this?
Seeing OpenAI's success with them made me want to try it.

@dynamicwebpaige

Closing, as there is now a Keras-friendly multi-head attention layer in TensorFlow Addons. Thanks for the feature request!
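
A rough usage sketch of the Addons layer follows, assuming a constructor taking `head_size` and `num_heads` and a call on a `[query, key, value]` list; check the `tfa.layers.MultiHeadAttention` documentation for the exact signature and output shape.

```python
import tensorflow as tf
import tensorflow_addons as tfa

# The constructor arguments and call convention below are assumptions; verify
# them against the TensorFlow Addons docs for your installed version.
mha = tfa.layers.MultiHeadAttention(head_size=16, num_heads=4)

query = tf.random.normal((2, 10, 64))   # (batch, query_len, features)
key = tf.random.normal((2, 12, 64))     # (batch, kv_len, features)
value = tf.random.normal((2, 12, 64))

out = mha([query, key, value])          # attention output aligned with the query positions
print(out.shape)
```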
