attention trick implementation #2525
Comments
Typically, attention is implemented with a single MLP that maps each vector to a score, then those scores are run through a softmax to get a probability distribution over the vectors. Finally, you take the Hadamard product of the probability distribution and the initial vectors and sum over the sequence dimension (dimension 1). Here's some code I wrote to do this. That said, you should definitely read up on attention further; Bahdanau et al. have a great paper on it.
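A minimal, framework-free sketch of the recipe above (score, softmax, weighted sum). The linear scorer here is a stand-in for the MLP, and all names are illustrative, not from any particular library:

```python
import math

def softmax(scores):
    # numerically stable softmax: subtract the max before exponentiating
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(vectors, score_weights):
    # vectors: the sequence, a list of equal-length vectors
    # score_weights: parameters of a linear scorer (stand-in for the MLP)
    scores = [sum(w * x for w, x in zip(score_weights, v)) for v in vectors]
    probs = softmax(scores)  # probability distribution over the sequence
    dim = len(vectors[0])
    # multiply each vector by its probability, then sum over the sequence dim
    return [sum(p * v[i] for p, v in zip(probs, vectors)) for i in range(dim)]
```

The output is a convex combination of the input vectors, which is exactly what the softmax weighting guarantees.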
Have a look at this: https://github.com/philipperemy/keras-simple-attention-mechanism It's a very simple "Hello world" attention mechanism, but it might address your needs!
@braingineer Thanks for your code. Do you have a toy example of how to use it?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.
@braingineer Does your code include Bahdanau's attention?
Hey guys!
Inspired by the attention LSTM: each time I have 3 input vectors, say x_1, x_2 and x_3. I wish to first make a linear combination layer_I = a_1*x_1 + a_2*x_2 + a_3*x_3, then merge this layer with some other sequential layers. I wish to learn a_1, a_2 and a_3.
How do I do this in Keras?
Thanks, with love!
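A framework-free sketch of what the question asks for: learn scalar coefficients a_1, a_2, a_3 so that a_1*x_1 + a_2*x_2 + a_3*x_3 matches a target. In Keras this would be a custom layer holding a trainable weight vector of shape (3,); the plain gradient-descent loop below (all names hypothetical) just makes the mechanics explicit:

```python
def combine(a, xs):
    # linear combination a_1*x_1 + a_2*x_2 + a_3*x_3, element-wise
    dim = len(xs[0])
    return [sum(a[k] * xs[k][i] for k in range(3)) for i in range(dim)]

def fit_coefficients(xs, target, lr=0.1, steps=200):
    a = [0.0, 0.0, 0.0]  # the learnable a_1, a_2, a_3
    for _ in range(steps):
        pred = combine(a, xs)
        err = [p - t for p, t in zip(pred, target)]  # residual of the fit
        for k in range(3):
            # gradient of the mean squared error w.r.t. a_k
            grad = sum(e * xs[k][i] for i, e in enumerate(err))
            a[k] -= lr * grad / len(err)
    return a

xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [2.0, 3.0]  # any target reachable as a combination of xs
a = fit_coefficients(xs, target)
```

In a real Keras model you would register a, the (3,) weight, as trainable and let the optimizer update it alongside the rest of the network; the loop above is just the same update written out by hand.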