Current querying doesn't make much sense #18

mtanghu · 2022-09-05T07:13:33Z

The current querying system of LEAP seems quite inflexible. Take, for example, the Winograd task of working out what a pronoun refers to. For the pronoun token's information to reach the prediction token, the query vector for that prediction token would need to match the focus vector, which requires the prediction to already "know" the pronoun a priori.

Using a sigmoid gate to replace the querying would make sense, though I'm concerned about explainability and about whether the sigmoid will saturate at large scales.
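To make the saturation concern concrete, here is a minimal sketch (plain Python, not LEAP code) of how the sigmoid's output and gradient behave as the pre-activation grows. The helper names `sigmoid` and `sigmoid_grad` are just for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)


if __name__ == "__main__":
    # As |x| grows the gate pins near 0 or 1 and its gradient vanishes.
    for x in (0.0, 2.0, 10.0, 30.0):
        print(f"x={x:5.1f}  gate={sigmoid(x):.6f}  grad={sigmoid_grad(x):.3e}")
```

The gradient peaks at 0.25 for an input of 0 and shrinks toward zero as the input grows in magnitude, so if gate pre-activations grow with model scale, the gate could stop receiving a useful training signal.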

A different tempting option would be to create focus-weighted keys and focus-weighted values, which would then be queried. This would require a total of 5 linear projections though...

I think one of the Fs can be shared though, i.e. LEAP = Q \cdot w-Focus(F, K, K) * w-Focus(F, K, V). Is this too complicated?
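One possible reading of that formula, as a rough pure-Python sketch. Everything here is an assumption for illustration: `w_focus` is a hypothetical stand-in that treats w-Focus(F, K, X) as a causal running average of X with weights exp(F_i · K_i), and `leap_sketch` treats Q \cdot (...) as a per-position dot product that scales the focus-weighted values. It is not the definition used in the repository.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def w_focus(F, K, X):
    """Hypothetical w-Focus: for each position t, average X[0..t]
    with unnormalized weights exp(F[i] . K[i]) (an assumption)."""
    out = []
    norm = 0.0
    acc = [0.0] * len(X[0])
    for f, k, x in zip(F, K, X):
        w = math.exp(dot(f, k))
        norm += w
        acc = [a + w * xi for a, xi in zip(acc, x)]
        out.append([a / norm for a in acc])
    return out


def leap_sketch(Q, F, K, V):
    """Assumed reading: (Q . focus-weighted keys) * focus-weighted values,
    with the same F shared between both w-Focus terms."""
    fk = w_focus(F, K, K)  # focus-weighted keys
    fv = w_focus(F, K, V)  # focus-weighted values
    return [
        [dot(q, k_bar) * v for v in v_bar]
        for q, k_bar, v_bar in zip(Q, fk, fv)
    ]


if __name__ == "__main__":
    Q = [[1.0, 0.0], [0.0, 1.0]]
    F = [[0.0, 0.0], [0.0, 0.0]]
    K = [[2.0, 3.0], [4.0, 1.0]]
    V = [[5.0, 7.0], [1.0, 1.0]]
    print(leap_sketch(Q, F, K, V))
```

Under this reading the query no longer has to match the focus vector directly: it is compared against a summary of the keys, and F only decides how much each past position contributes to that summary.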

mtanghu · 2022-09-26T03:04:38Z

Fixed in #19

mtanghu closed this as completed Sep 26, 2022
