<a href="https://colab.research.google.com/github/adnaen/machine-learning-notes/blob/main/llm/transformers/positional_encoding/positional_encoding.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Positional Encoding**

- Positional encoding adds information about sequence order to the input embeddings.
- Unlike RNNs and LSTMs, the Transformer architecture does not process tokens one after another, so it has no built-in notion of order.
- In natural language, word order matters.
- So we explicitly inject position information into the embeddings.
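
A small sketch of why this matters, assuming a freshly initialised `torch.nn.Embedding` (the seed and the reversed sentence are illustrative choices, not part of the original notebook): the embedding of a token is the same wherever it appears, so any order-agnostic summary of the sequence cannot tell two orderings apart.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
emb = torch.nn.Embedding(num_embeddings=3, embedding_dim=5)

# Two orderings of the same tokens: "i love biriyani" vs "biriyani love i"
forward_ids = torch.tensor([0, 1, 2])
backward_ids = torch.tensor([2, 1, 0])

# Token 0 gets the same vector whether it is first or last
same_token_same_vector = torch.equal(emb(forward_ids)[0], emb(backward_ids)[2])

# An order-agnostic summary (sum over tokens) is identical for both orderings
same_sum = torch.allclose(emb(forward_ids).sum(dim=0), emb(backward_ids).sum(dim=0))

print(same_token_same_vector, same_sum)
```

Both checks hold because the lookup itself carries no position information; that is the gap positional encoding fills.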

In [1]:
import torch

In [14]:
vocab: list[str] = ["i", "love", "biriyani"]
text: str = "i love biriyani"

vocab_idx = [vocab.index(each) for each in text.split(" ")]
vocab_idx = torch.tensor(vocab_idx)
vocab_idx

tensor([0, 1, 2])

In [104]:
embedd = torch.nn.Embedding(num_embeddings=3, embedding_dim=5)
embedded_ip = embedd(vocab_idx)
embedded_ip

tensor([[ 2.0258, -0.5164,  2.2561,  0.5965, -1.0522],
        [ 1.2649, -0.3838,  0.0201,  0.3150,  1.7188],
        [-0.5363, -1.1554,  2.0050,  1.3424, -0.6690]],
       grad_fn=<EmbeddingBackward0>)

In [105]:
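# Sinusoidal positional encoding (standard form):
#   PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))  : even dimensions
#   PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))  : odd dimensions
#
#  pos     : position of the token in the sequence
#  i       : index of the (sin, cos) pair; dimensions 2i and 2i+1 share one frequency
#  d_model : embedding dimension (same as embedding_dim)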

In [106]:
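def sin_encoder(
        pos: int,
        d_model: int,
        i: int) -> torch.Tensor:
    """Return sin(pos / 10000^(2i / d_model)) as a 0-dim tensor."""
    # Denominator: 10000 raised to the power 2i / d_model
    val = pos / torch.pow(torch.tensor(10000.0), torch.tensor(2 * i) / d_model)

    return torch.sin(val)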

In [107]:
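def cos_encoder(
        pos: int,
        d_model: int,
        i: int) -> torch.Tensor:
    """Return cos(pos / 10000^(2i / d_model)) as a 0-dim tensor."""
    # Denominator: 10000 raised to the power 2i / d_model
    val = pos / torch.pow(torch.tensor(10000.0), torch.tensor(2 * i) / d_model)

    return torch.cos(val)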

In [115]:
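d_model = 5  # d_model is the embedding dimension (embedding_dim)
seq_len = embedded_ip.shape[0]
result = []
for pos in range(seq_len):
    # i is the raw dimension index (0..d_model-1), so each dimension gets its
    # own exponent 2*i/d_model; the standard formulation instead shares one
    # exponent per (sin, cos) pair
    for i in range(d_model):
        if i % 2 == 0:
            result.append(sin_encoder(pos, d_model, i))  # even dimension
        else:
            result.append(cos_encoder(pos, d_model, i))  # odd dimension

positional_encoding = torch.stack(result).reshape(-1, d_model)
positional_encoding  # same shape as the embedding: (seq_len, d_model)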

tensor([[0.0000e+00, 1.0000e+00, 0.0000e+00, 1.0000e+00, 0.0000e+00],
        [8.4147e-01, 9.9968e-01, 6.3096e-04, 1.0000e+00, 3.9811e-07],
        [9.0930e-01, 9.9874e-01, 1.2619e-03, 1.0000e+00, 7.9621e-07]])
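
The loop above passes the raw dimension index as `i`. For comparison, here is a vectorized sketch of the standard paired formulation, in which dimensions `2i` and `2i+1` share one frequency. The function name `sinusoidal_pe` and the use of `torch.where` are illustrative choices, not part of the original notebook, and the values it produces differ from the matrix printed above from the second column onward.

```python
import torch

def sinusoidal_pe(seq_len: int, d_model: int) -> torch.Tensor:
    """Sketch of the paired sinusoidal encoding; returns (seq_len, d_model)."""
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)  # (seq_len, 1)
    dim = torch.arange(d_model)                                    # (d_model,)
    # Pair index: dimensions 0,1 -> 0; 2,3 -> 1; 4,5 -> 2; ...
    pair = torch.div(dim, 2, rounding_mode="floor")
    # pos / 10000^(2 * pair / d_model), broadcast to (seq_len, d_model)
    angle = pos / torch.pow(torch.tensor(10000.0), 2 * pair / d_model)
    # sin on even dimensions, cos on odd dimensions
    return torch.where(dim % 2 == 0, torch.sin(angle), torch.cos(angle))

pe = sinusoidal_pe(seq_len=3, d_model=5)
print(pe.shape)
```

Position 0 always encodes to alternating 0 and 1 (sin(0) and cos(0)), and this version also works when `d_model` is odd, as it is here.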

In [120]:
# add the positional encoding to the embeddings element-wise

print(f"Embedding : \n{embedded_ip}")
pe_embedding = embedded_ip + positional_encoding
print(f"\n\nEmbedding after adding positional encoding : \n{pe_embedding}")

Embedding : 
tensor([[ 2.0258, -0.5164,  2.2561,  0.5965, -1.0522],
        [ 1.2649, -0.3838,  0.0201,  0.3150,  1.7188],
        [-0.5363, -1.1554,  2.0050,  1.3424, -0.6690]],
       grad_fn=<EmbeddingBackward0>)


Embedding after adding positional encoding : 
tensor([[ 2.0258,  0.4836,  2.2561,  1.5965, -1.0522],
        [ 2.1063,  0.6159,  0.0208,  1.3150,  1.7188],
        [ 0.3730, -0.1567,  2.0062,  2.3424, -0.6690]], grad_fn=<AddBackward0>)
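
The addition above can be packaged into a reusable layer. The following is only a sketch under stated assumptions: the class name `EmbeddingWithPosition` and the `max_len` parameter are hypothetical, and the encoding is precomputed with the standard paired formulation rather than the per-dimension loop used earlier. Storing the table with `register_buffer` keeps it out of the trainable parameters, so only the embedding weights receive gradients.

```python
import torch

class EmbeddingWithPosition(torch.nn.Module):  # hypothetical name
    def __init__(self, vocab_size: int, d_model: int, max_len: int = 128):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab_size, d_model)

        # Precompute the encoding table once: (max_len, d_model)
        pos = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)
        dim = torch.arange(d_model)
        pair = torch.div(dim, 2, rounding_mode="floor")
        angle = pos / torch.pow(torch.tensor(10000.0), 2 * pair / d_model)
        pe = torch.where(dim % 2 == 0, torch.sin(angle), torch.cos(angle))

        # A buffer follows the module across devices but is not trained
        self.register_buffer("pe", pe)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (seq_len,) -> output: (seq_len, d_model)
        seq_len = token_ids.shape[0]
        return self.embed(token_ids) + self.pe[:seq_len]

layer = EmbeddingWithPosition(vocab_size=3, d_model=5)
ids = torch.tensor([0, 1, 2])
out = layer(ids)
print(out.shape)
```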


**Once the encoding is added, there is no inverse equation that recovers the position from the summed vector.**

**The model learns to use the positional features during training.**
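
A minimal sketch of why the sum cannot be inverted, using made-up illustrative vectors rather than values from the cells above: two different (embedding, position) combinations can produce exactly the same summed vector, so the sum alone does not determine the position.

```python
import torch

# Illustrative encodings for two positions (any two distinct rows would do)
pe_pos0 = torch.tensor([0.0, 1.0, 0.0, 1.0])
pe_pos1 = torch.tensor([0.8415, 0.5403, 0.0100, 0.9999])

emb_a = torch.tensor([0.5, -0.2, 1.0, 0.3])  # some token embedding
emb_b = emb_a + pe_pos0 - pe_pos1            # a different embedding

sum_a = emb_a + pe_pos0  # token a at position 0
sum_b = emb_b + pe_pos1  # token b at position 1

# The two sums coincide although both the token and the position differ
print(torch.allclose(sum_a, sum_b))
```

In practice the learned embeddings are not arranged to collide like this, which is why the model can still pick up positional structure from the summed vectors during training.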