Relative positional encoding for continuous positions #161

Closed
cswinter opened this issue Jan 25, 2022 · 2 comments

@cswinter
Collaborator

The current implementation of relative positional encoding (#139) only supports discrete positions (technically, you can already give it continuous positions, but they then get truncated by tensor.long). The most basic version of relative positional encoding for vector spaces would apply a function that discretizes all positions, e.g. by dividing by a specified width and then rounding to the nearest integer. There are various extensions to this that might improve performance:

  • Discretize positions in a way that has a higher resolution for nearby positions and lower resolution for distant positions (only skimmed this very briefly, but https://arxiv.org/pdf/2107.14222.pdf seems to be doing something similar). The idea is that you might care a lot more about small changes in position for nearby entities, but only care about the rough location for more distant entities. Lots of different possibilities for the discretization function.
  • When an entity position doesn't fall exactly onto one of the discrete locations, interpolate between nearby positions.
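To make the options above concrete, here is a minimal one-dimensional sketch in plain Python. It is not the code from #139 or any later PR; the function names (`uniform_bucket`, `log_bucket`, `interpolation_weights`) and the `linear_buckets` parameter are hypothetical, and the logarithmic spacing is just one of many possible non-uniform discretizations.

```python
import math
from typing import List, Tuple


def uniform_bucket(delta: float, width: float) -> int:
    """Basic version: divide by a bucket width, round to the nearest integer."""
    return math.floor(delta / width + 0.5)


def log_bucket(delta: float, width: float, linear_buckets: int = 4) -> int:
    """Non-uniform version (an assumed scheme): unit-width buckets close to
    zero, then buckets whose width doubles with distance."""
    x = abs(delta) / width
    if x < linear_buckets:
        idx = math.floor(x + 0.5)
    else:
        idx = linear_buckets + math.floor(math.log2(x / linear_buckets))
    # Restore the sign so that left/right offsets get different buckets.
    return int(math.copysign(idx, delta))


def interpolation_weights(delta: float, width: float) -> List[Tuple[int, float]]:
    """Split a continuous offset between its two neighbouring grid points.

    Returns (bucket index, weight) pairs whose weights sum to 1.
    """
    x = delta / width
    lo = math.floor(x)
    frac = x - lo
    return [(lo, 1.0 - frac), (lo + 1, frac)]


if __name__ == "__main__":
    print(uniform_bucket(2.6, 1.0))
    print(log_bucket(8.0, 1.0), log_bucket(-16.0, 1.0))
    print(interpolation_weights(2.5, 1.0))
```

With interpolation, the positional embedding for an offset would be the weighted sum of the embedding rows at the two returned indices, so the encoding changes smoothly as an entity moves between grid points instead of jumping at bucket boundaries.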

This could be prototyped with the Minefield task (#50), on which relative positional encoding should achieve the same performance as translation.

@cswinter
Collaborator Author

One alternative to discretizing positions to a square grid would be to go with a radial grid.
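As a rough illustration of what a radial grid could look like in 2D, the sketch below maps a relative offset to a (ring, sector) pair. The function name `radial_bucket` and the parameters `ring_width` and `num_sectors` are hypothetical, and equal-width rings with equal-angle sectors are an assumption; rings could just as well grow wider with distance.

```python
import math
from typing import Tuple


def radial_bucket(
    dx: float, dy: float, ring_width: float = 1.0, num_sectors: int = 8
) -> Tuple[int, int]:
    """Map a 2D relative offset to (ring index, sector index)."""
    # Ring: which distance band the offset falls into.
    r = math.hypot(dx, dy)
    ring = int(r // ring_width)
    # Sector: which angular slice, with angles measured in [0, 2*pi).
    angle = math.atan2(dy, dx) % (2 * math.pi)
    sector = int(angle / (2 * math.pi) * num_sectors) % num_sectors
    return ring, sector


if __name__ == "__main__":
    print(radial_bucket(1.5, 0.0))
    print(radial_bucket(0.0, 2.5))
    print(radial_bucket(-3.0, 0.0))
```

A single index into an embedding table could then be formed as `ring * num_sectors + sector`, with the ring index clamped to some maximum so that the table stays finite.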

@cswinter cswinter mentioned this issue Mar 14, 2022
@cswinter
Collaborator Author

cswinter commented Apr 9, 2022

Interpolation implemented in #203 and appears to work quite well.

@cswinter cswinter closed this as completed Apr 9, 2022