Both embedding (e.g., Word2Vec) and encoding (e.g., bag of words) are about representing data in a different space. Embedding usually refers to continuous vector spaces, typically capturing semantic relationships, whereas encoding also covers compression and dimensionality reduction.
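As a minimal sketch (assuming the `candle-core` and `candle-nn` crates), a learned embedding layer maps discrete token ids into such a continuous vector space:

```rust
use candle_core::{DType, Device, Result, Tensor};
use candle_nn::{embedding, Module, VarBuilder, VarMap};

fn main() -> Result<()> {
    let device = Device::Cpu;
    let varmap = VarMap::new();
    let vb = VarBuilder::from_varmap(&varmap, DType::F32, &device);

    // Map a vocabulary of 10 discrete tokens into a 4-dimensional continuous space.
    let emb = embedding(10, 4, vb)?;

    // Token ids (discrete) -> dense vectors (continuous).
    let ids = Tensor::new(&[1u32, 3, 7], &device)?;
    let vectors = emb.forward(&ids)?;
    println!("{:?}", vectors.shape()); // [3, 4]
    Ok(())
}
```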
The following embeddings shall be integrated from Candle:
Type | Integration status |
---|---|
Embedding | Integrated - standard layer |
Timestep Embedding | Not integrated yet - used for relative (i.e., local) positional encoding |
Positional Embedding | Not integrated yet - used for absolute positional encoding |
Falcon Rotary Positional Embedding | Not integrated yet - used for absolute and relative positional encoding |
Sinusoidal Positional Embedding | Not integrated yet - used for absolute and relative positional encoding (see the sketch below) |
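Since the sinusoidal variant is not integrated yet, here is a minimal from-scratch sketch of the classic sin/cos construction built from Candle tensor ops; the function `sinusoidal_positional_embedding` is illustrative, not an existing Candle API:

```rust
use candle_core::{DType, Device, Result, Tensor};

/// Fixed sinusoidal positional embedding (Vaswani et al., 2017).
/// angles[pos, i] = pos / 10000^(2i/dim); this variant places all sin
/// features in the first half and all cos features in the second half
/// (a common alternative to strict even/odd interleaving).
/// Returns a (seq_len, dim) tensor; `dim` is assumed to be even.
fn sinusoidal_positional_embedding(seq_len: usize, dim: usize, device: &Device) -> Result<Tensor> {
    let half = dim / 2;
    // Positions 0..seq_len as a column vector of shape (seq_len, 1).
    let pos = Tensor::arange(0u32, seq_len as u32, device)?
        .to_dtype(DType::F32)?
        .unsqueeze(1)?;
    // Inverse frequencies 1 / 10000^(2i/dim) as a row vector of shape (1, half).
    let inv_freq: Vec<f32> = (0..half)
        .map(|i| 1f32 / 10000f32.powf(2.0 * i as f32 / dim as f32))
        .collect();
    let inv_freq = Tensor::new(inv_freq.as_slice(), device)?.unsqueeze(0)?;
    // Outer product: angle matrix of shape (seq_len, half).
    let angles = pos.broadcast_mul(&inv_freq)?;
    // Concatenate the sin and cos blocks into (seq_len, dim).
    Tensor::cat(&[angles.sin()?, angles.cos()?], 1)
}

fn main() -> Result<()> {
    let pe = sinusoidal_positional_embedding(16, 8, &Device::Cpu)?;
    println!("{:?}", pe.shape()); // [16, 8]
    Ok(())
}
```

Because the table is precomputed and deterministic, it can be built once per sequence length and added to (or concatenated with) the token embeddings without any trainable parameters.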
Notes:
- Dimensionality reduction techniques such as PCA tend to perform poorly. See, e.g., here.
- Next steps will be