[Feature] Adding fixed positional embeddings as an option #47

Closed
gulnazaki opened this issue Dec 17, 2020 · 3 comments

Comments

@gulnazaki
Contributor

gulnazaki commented Dec 17, 2020

I believe that, although learnable positional embeddings are the current trend, fixed embeddings (sinusoidal, as in the original Transformer implementation) would help in relatively small dataset scenarios, where it is hard to learn a meaningful embedding. At the very least, it would be interesting to compare the two methods.

I see you included fixed embeddings in the Reformer implementation, but don't you think it would be more efficient to calculate them once during initialization? (like here)
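
To illustrate the "calculate them once during initialization" idea, here is a minimal sketch in plain Python. It is not the code from this repository or from the linked tutorial; the class name `FixedPositionalEmbedding` and the parameters `dim` and `max_seq_len` are hypothetical names chosen for the example. It assumes the standard sinusoidal formulation (sine on even channels, cosine on odd channels, base 10000).

```python
import math


class FixedPositionalEmbedding:
    """Sinusoidal position table, built once at construction time.

    Hypothetical helper for illustration only, not part of any library.
    """

    def __init__(self, dim, max_seq_len):
        if dim % 2 != 0:
            raise ValueError("dim must be even")
        # Frequencies 1 / 10000^(2i/dim) for i = 0 .. dim/2 - 1
        inv_freq = [1.0 / (10000 ** (2 * i / dim)) for i in range(dim // 2)]
        self.table = []
        for pos in range(max_seq_len):
            row = [0.0] * dim
            for i, freq in enumerate(inv_freq):
                row[2 * i] = math.sin(pos * freq)      # even channels
                row[2 * i + 1] = math.cos(pos * freq)  # odd channels
            self.table.append(row)

    def __call__(self, seq_len):
        # Lookup is just a slice; no trigonometry is done at call time
        return self.table[:seq_len]


emb = FixedPositionalEmbedding(dim=4, max_seq_len=8)
first_three = emb(3)  # 3 rows of 4 values each
```

In a PyTorch module, the same table would typically be stored as a tensor with `register_buffer`, so that it moves with the model across devices without becoming a trainable parameter.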

Btw, I read a cool paper that compares fixed positional embeddings with the ones learned by BERT, GPT-2 and RoBERTa.

If you prefer, I could open a PR for this that adds the implementation from the PyTorch tutorial above, but it is no big deal.

@lucidrains
Owner

@gulnazaki yea sure, I would welcome a PR on that :D I'll check out the paper you recommended tonight, thank you! Another good one I read recently is https://arxiv.org/abs/2006.15595

@gulnazaki
Contributor Author

Seems pretty interesting, I'll check it out, thanks.

Ok, I'll give it a look later. Do you think axial would also be a good embedding option I should include?

@lucidrains
Owner

@gulnazaki yea, axial is great! :)
