
Some question about ATRank #11

Open
crazygirlfym opened this issue Sep 2, 2019 · 1 comment

Comments

@crazygirlfym

It seems that your ATRank implementation differs from the one described in the paper. For example, the paper uses bilinear attention, but the code here uses scaled dot-product attention; the paper uses vanilla attention, but the code uses multi-head attention.
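For reference, here is a minimal sketch of the two scoring functions being contrasted, using toy NumPy tensors (the dimensions, the weight matrix `W`, and the random data are illustrative only, not taken from the repo):

```python
import numpy as np

d = 8                          # hidden size (illustrative)
q = np.random.randn(d)         # one query vector
K = np.random.randn(5, d)      # five key vectors

# Bilinear attention (as described in the paper): score(q, k) = q^T W k,
# with a learned d x d parameter matrix W.
W = np.random.randn(d, d)
bilinear_scores = K @ W.T @ q            # shape (5,)

# Scaled dot-product attention (as in this repo):
# score(q, k) = q^T k / sqrt(d), with no extra parameter matrix.
scaled_dot_scores = K @ q / np.sqrt(d)   # shape (5,)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

print(softmax(bilinear_scores))
print(softmax(scaled_dot_scores))
```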

@lizy124

lizy124 commented Oct 11, 2019

When the query comes from the decoding layer and the key and value come from the encoding layer, that is called vanilla attention (which the paper doesn't state explicitly); it is the most basic form of attention. When the query, key, and value all come from the encoding layer, it is called self-attention.
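A small sketch of that distinction, again with toy NumPy tensors (shapes and data are illustrative; the projection matrices a real model would apply to Q, K, and V are omitted for brevity):

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention, shared by both variants below.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

d = 8
enc_out = np.random.randn(6, d)   # encoding-layer outputs, 6 positions
dec_out = np.random.randn(3, d)   # decoding-layer outputs, 3 positions

# Vanilla (encoder-decoder) attention: query from the decoding layer,
# key and value from the encoding layer.
vanilla = attention(dec_out, enc_out, enc_out)    # shape (3, d)

# Self-attention: query, key, and value all from the encoding layer.
self_attn = attention(enc_out, enc_out, enc_out)  # shape (6, d)
```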
