Why are there so many position embeddings? #34
Comments
One is an absolute positional embedding; the other is a relative positional embedding (you need the relative positional embeddings for the CCA to work well). Rotary embeddings are one of the strongest relative positional embeddings out there.
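To make the "relative" claim concrete, here is a minimal NumPy sketch of rotary embeddings (not the repo's actual code): each (even, odd) feature pair is rotated by a position-dependent angle, and the dot product between a rotated query and key then depends only on the *difference* of their positions.

```python
import numpy as np

def rotary_embed(x, pos, base=10000.0):
    """Apply rotary position embedding to vectors x at integer positions pos.

    x:   (seq, dim) array, dim must be even
    pos: (seq,) integer positions
    """
    dim = x.shape[-1]
    # per-pair rotation frequencies theta_i = base^(-2i/dim)
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)   # (dim/2,)
    angles = np.outer(pos, inv_freq)                   # (seq, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    # rotate each (x1, x2) pair by its position-dependent angle
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Relative property: q at pos 3 vs k at pos 7 scores the same as
# q at pos 13 vs k at pos 17 (both have offset 4).
rng = np.random.default_rng(0)
q, k = rng.normal(size=(1, 8)), rng.normal(size=(1, 8))
a = rotary_embed(q, np.array([3])) @ rotary_embed(k, np.array([7])).T
b = rotary_embed(q, np.array([13])) @ rotary_embed(k, np.array([17])).T
assert np.allclose(a, b)
```

This shift-invariance of the attention score is why rotary embeddings count as relative, even though they are applied to each position individually.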
Makes sense, thank you! Meanwhile, only the sequence being modeled gets the absolute position embedding (the context/retrieved chunks do not). Is that also deliberate?
Also, another unrelated question, just to confirm: sequences are already retrieved before training (both the retrieval corpus and the training sequences are encoded by a frozen BERT). Is this correct?
@jasperhyp Yup, that is correct. The retrieved content undergoes relative positional embedding during cross-attention, IIRC. And yes, the retrieval is done prior to training, for efficiency.
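A hedged sketch of what "retrieval done prior to training" could look like: with chunk embeddings already produced by a frozen encoder (BERT in the paper), nearest neighbors can be precomputed once with a similarity search. The function name and cosine-similarity choice here are illustrative, not the repo's exact pipeline (which uses an approximate-nearest-neighbor index).

```python
import numpy as np

def precompute_neighbors(corpus_emb, query_emb, k=2):
    """Top-k nearest corpus chunks per training chunk, computed once
    before training. Embeddings are assumed to come from a frozen encoder."""
    # normalize so the dot product is cosine similarity
    c = corpus_emb / np.linalg.norm(corpus_emb, axis=-1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb, axis=-1, keepdims=True)
    sims = q @ c.T                              # (num_queries, num_corpus)
    # indices of the k most similar corpus chunks for each query chunk
    return np.argsort(-sims, axis=-1)[:, :k]

corpus = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([[0.9, 0.1]])
print(precompute_neighbors(corpus, query, k=2))  # nearest is chunk 0, then 2
```

Because the encoder is frozen, these neighbor indices never change during training, so the lookup table can be built once and reused every epoch.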
Thank you!
Hi! Thanks for your great work; it's very helpful for my project. I was just curious why there are so many position embeddings. Essentially, it looks like the sequence gets a (1 to n) positional embedding added initially in the RETRO class, and then rotary embeddings are applied again in each attention module. I thought just the two, in Attention and CCA, would be quite enough. Thanks in advance!
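For readers following the question, the two embedding sites can be sketched in a self-contained toy (names and shapes here are illustrative, not the repo's code): an absolute embedding is added once to the input, while rotary embeddings are re-applied to the queries and keys inside every attention layer.

```python
import numpy as np

def rope(x, pos, base=10000.0):
    # rotary embedding: rotate each (even, odd) feature pair by a
    # position-dependent angle
    dim = x.shape[-1]
    angles = np.outer(pos, base ** (-np.arange(0, dim, 2) / dim))
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

def toy_attention(token_emb, abs_pos_emb, w_q, w_k, w_v):
    """Single-head attention showing both embedding sites."""
    seq, dim = token_emb.shape
    x = token_emb + abs_pos_emb[:seq]        # site 1: absolute, input only
    pos = np.arange(seq)
    q = rope(x @ w_q, pos)                   # site 2: rotary, every layer
    k = rope(x @ w_k, pos)
    v = x @ w_v
    scores = (q @ k.T) / np.sqrt(dim)
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)            # softmax over keys
    return w @ v
```

The two sites are not redundant: the absolute embedding tells the model *where* in the sequence it is, while rotary injects *relative offsets* directly into each attention score, which is what CCA relies on.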