Current querying doesn't make much sense #18

Closed
mtanghu opened this issue Sep 5, 2022 · 1 comment

Comments

@mtanghu
Owner

mtanghu commented Sep 5, 2022

The current querying system in LEAP seems quite inflexible. Take, for example, the Winograd task of resolving what a pronoun refers to. For the pronoun token's information to reach the prediction token, the query vector for that prediction token would need to match the focus vector, which requires the prediction token to already "know" the pronoun a priori.

Replacing the querying with a sigmoid gate would make sense, though I'm concerned about explainability and about whether the sigmoid will saturate at large scales.
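To make the saturation concern concrete: the gradient of a sigmoid is σ(x)(1 − σ(x)), which peaks at 0.25 and shrinks toward zero as |x| grows. The sketch below is not LEAP code. It assumes, purely for illustration, a gate pre-activation that grows with the hidden dimension (as an unscaled dot product would), and prints how quickly the gradient vanishes.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)


# Assumption for illustration only: the gate pre-activation equals the
# hidden dimension, as if it were an unscaled dot product of all-ones vectors.
for dim in (1, 4, 16, 64):
    pre_activation = float(dim)
    print(dim, sigmoid(pre_activation), sigmoid_grad(pre_activation))
```

If the gate were used, scaling the pre-activation (for example by the square root of the dimension, as dot-product attention does) would be one possible way to keep it out of the flat region, though whether that suffices at scale is an open question here.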

A different tempting option would be to create focus-weighted keys and focus-weighted values, which would then be queried. This would require a total of 5 linear projections though...

I think one of the Fs can be shared though, i.e. LEAP = Q \cdot w-Focus(F, K, K) * w-Focus(F, K, V). Is this too complicated?
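A rough sketch of how the shared-F formula could be read, in plain Python. The definition of `w_focus` here is an assumption, not taken from the LEAP code: it is treated as a causal running average of its third argument, weighted by exp(F_i · K_i). The names `w_focus` and `leap_sketch` are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def w_focus(F, K, X):
    """ASSUMED definition: causal focus-weighted running average of X.

    out[t] = sum_{i<=t} exp(F[i] . K[i]) * X[i] / sum_{i<=t} exp(F[i] . K[i])
    Runs in linear time; no max-subtraction, so not numerically hardened.
    """
    out = []
    num = [0.0] * len(X[0])
    den = 0.0
    for f, k, x in zip(F, K, X):
        w = math.exp(dot(f, k))
        num = [n + w * xi for n, xi in zip(num, x)]
        den += w
        out.append([n / den for n in num])
    return out


def leap_sketch(Q, F, K, V):
    """Q . w_focus(F, K, K) gives a scalar per position that scales w_focus(F, K, V)."""
    k_summary = w_focus(F, K, K)  # focus-weighted keys
    v_summary = w_focus(F, K, V)  # focus-weighted values (same F, so 4 projections)
    return [[dot(q, ks) * v for v in vs]
            for q, ks, vs in zip(Q, k_summary, v_summary)]
```

Under this reading the query no longer has to match the focus vector directly: it is compared against a summary of the keys that the focus weights selected.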

@mtanghu
Owner Author

mtanghu commented Sep 26, 2022

Fixed in #19

@mtanghu mtanghu closed this as completed Sep 26, 2022