Triangular matrices ? #42

jeremycochoy · 2020-12-07T18:44:07Z

Does the current implementation provide triangular matrices (to constrain the attention always on the "left" of the sequence, both for input and encoded values) as described in the last section of the original paper?

lucidrains · 2020-12-07T18:56:07Z

@jeremycochoy Hi Jeremy, do you mean in the autoregressive (unidirectional) case? I only see triangular matrices being mentioned in that context

lucidrains · 2020-12-07T19:32:22Z

@jeremycochoy can you point me at this passage in the paper?

jeremycochoy · 2020-12-07T19:43:13Z

Yes, it's page 17, Appendix B.1. I don't know how complex this would be to implement, if it's not already there.

lucidrains · 2020-12-07T19:51:15Z

@jeremycochoy ohh I see, yeah, that is for the unidirectional case, and it is already taken care of, through a cumulative sum actually (no masking needed)

lucidrains · 2020-12-07T19:51:37Z

@jeremycochoy you don't need to worry about that detail, you just need to set causal = True and you are good to go
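To illustrate the cumulative-sum point above, here is a minimal NumPy sketch (not the library's actual code). It compares causal linear attention computed with running prefix sums against the same computation done with an explicit lower-triangular mask. The `feature_map` (elu(x) + 1) and both function names are assumptions made for this illustration; Performer itself uses a random-feature map.

```python
import numpy as np

def feature_map(x):
    # Hypothetical positive feature map, elu(x) + 1 (an assumption, not Performer's FAVOR+ features).
    return np.where(x > 0, x + 1.0, np.exp(x))

def causal_linear_attention_cumsum(q, k, v):
    # q, k: (n, d); v: (n, e). No n x n matrix and no mask is ever built.
    q, k = feature_map(q), feature_map(k)
    # Prefix sums over positions: kv[i] = sum_{j<=i} k_j v_j^T, z[i] = sum_{j<=i} k_j
    kv = np.cumsum(k[:, :, None] * v[:, None, :], axis=0)  # (n, d, e)
    z = np.cumsum(k, axis=0)                                # (n, d)
    num = np.einsum('nd,nde->ne', q, kv)
    den = np.einsum('nd,nd->n', q, z)
    return num / den[:, None]

def causal_linear_attention_masked(q, k, v):
    # Reference version: explicit n x n scores with a lower-triangular mask.
    q, k = feature_map(q), feature_map(k)
    scores = np.tril(q @ k.T)
    return (scores @ v) / scores.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(6, 4))
v = rng.normal(size=(6, 3))
out_cumsum = causal_linear_attention_cumsum(q, k, v)
out_masked = causal_linear_attention_masked(q, k, v)
```

Under these assumptions the two outputs agree up to floating-point error, which is why the triangular matrix never needs to be materialised in the unidirectional case.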

lucidrains · 2020-12-07T19:53:23Z

just to make sure we are looking at the same thing lol

jeremycochoy · 2020-12-07T19:58:12Z

There are no words to say how happy I am to learn that, that's awesome (yes, we are looking at the same thing). I can't wait to test it. :)

lucidrains · 2020-12-07T20:06:46Z

@jeremycochoy good timing, since @Sleepychord just caught and fixed a big bug in that part of the code loll

Muennighoff · 2021-03-03T12:28:12Z

> @jeremycochoy good timing, since @Sleepychord just caught and fixed a big bug in that part of the code loll

Am I understanding correctly that, because of the pretty neat cumsum, we could even run the EncDec version without a decoder mask and still not leak the ground truth to the model?

Muennighoff · 2021-03-03T16:53:36Z

& so in practice we can construct attn masks the same way for inputs & outputs and they are treated the same way by the model? @lucidrains
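One way to probe the "no leak" property asked about here is to perturb the future positions and check that earlier outputs stay the same. The sketch below does this for a toy causal prefix-sum attention only. The function name `causal_prefix_attention` and the `exp` feature map are assumptions for illustration. It does not exercise the library's EncDec wrapper, cross-attention, or padding masks, so it does not settle how those should be constructed.

```python
import numpy as np

def causal_prefix_attention(q, k, v):
    # Toy causal linear attention; exp is a stand-in positive feature map (an assumption).
    q, k = np.exp(q), np.exp(k)
    kv = np.cumsum(k[:, :, None] * v[:, None, :], axis=0)  # running sum of k_j v_j^T
    z = np.cumsum(k, axis=0)                                # running sum of k_j
    return np.einsum('nd,nde->ne', q, kv) / np.einsum('nd,nd->n', q, z)[:, None]

rng = np.random.default_rng(0)
n, d, t = 8, 4, 5
q, k, v = rng.normal(size=(3, n, d))
out = causal_prefix_attention(q, k, v)

# Overwrite everything from position t onward, as if the "future" tokens were different.
q2, k2, v2 = q.copy(), k.copy(), v.copy()
q2[t:], k2[t:], v2[t:] = rng.normal(size=(3, n - t, d))
out2 = causal_prefix_attention(q2, k2, v2)

# Outputs before position t depend only on positions < t, so they should match.
leak_free = np.allclose(out[:t], out2[:t])
```

In this toy setting `leak_free` should hold, since each output row only reads prefix sums up to its own position.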

lucidrains closed this as completed Dec 8, 2020
