<a href="https://colab.research.google.com/github/liangli217/PyTorch_ML/blob/main/Self_Attention_solution.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Implement Attention from scratch**



# Problem Statement
Implement the self-attention mechanism from scratch.

# Requirements
1. Instantiate the linear layers in the following order: Key, Query, and Value



*   Biases are not used in Attention, so for all 3 nn.Linear() instances, pass in bias=False
*   Use the function torch.nn.functional.softmax
*   Apply the masking to the T x T scores BEFORE calling softmax() so that the future tokens don't get factored in at all. To implement masking, look into using torch.ones(), torch.tril(), and tensor.masked_fill()
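The masking requirement can be sketched on its own before wiring it into the module. This is a minimal illustration, assuming a hypothetical sequence length `T = 4` and placeholder scores of all zeros (real scores would come from the query-key product):

```python
import torch
import torch.nn.functional as F

T = 4  # hypothetical sequence length, for illustration only
scores = torch.zeros(T, T)  # placeholder scores; real ones come from query @ key^T

# Lower-triangular matrix of ones: position i may attend to positions 0..i
mask = torch.tril(torch.ones(T, T))

# Replace entries above the diagonal (future positions) with -inf
masked_scores = scores.masked_fill(mask == 0, float("-inf"))

# Softmax over the last dimension gives the -inf entries exactly zero weight
weights = F.softmax(masked_scores, dim=-1)
print(weights)
```

Because the placeholder scores are all equal, row `i` of `weights` spreads its weight uniformly over positions `0..i`, and every entry above the diagonal is zero.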



In [None]:
# Instantiate the linear layers in the following order: Key, Query, Value.
# 1. Biases are not used in Attention, so for all 3 nn.Linear() instances, pass in bias=False.
# 2. torch.transpose(tensor, 1, 2) returns a B x T x A tensor as a B x A x T tensor.
# 3. This function is useful:
#    https://pytorch.org/docs/stable/generated/torch.nn.functional.softmax.html
# 4. Apply the masking to the TxT scores BEFORE calling softmax() so that the future
#    tokens don't get factored in at all.
#    To do this, set the "future" indices to float('-inf') since e^(-infinity) is 0.
# 5. To implement masking, note that in PyTorch, tensor == 0 returns a same-shape tensor
#    of booleans. Also look into utilizing torch.ones(), torch.tril(), and tensor.masked_fill(),
#    in that order.
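As a rough sketch of hint 2, the snippet below shows how `torch.transpose` lines up the shapes for the score computation. The sizes `B`, `T`, `A` and the random tensors are assumptions chosen only for illustration; dividing by the square root of the attention dimension is the usual scaling for dot-product attention.

```python
import torch

B, T, A = 2, 3, 4  # hypothetical batch size, sequence length, attention dim
torch.manual_seed(0)
query = torch.randn(B, T, A)
key = torch.randn(B, T, A)

# Swap the last two dimensions so the matrix product is defined
key_t = torch.transpose(key, 1, 2)   # B x A x T

# scores[b, i, j] = dot(query[b, i], key[b, j]) / sqrt(A)
scores = query @ key_t / (A ** 0.5)  # B x T x T

print(key_t.shape)   # torch.Size([2, 4, 3])
print(scores.shape)  # torch.Size([2, 3, 3])
```

Note that `A` is a plain Python int here, so `A ** 0.5` is used for the scale; `torch.sqrt` expects a tensor argument.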

In [5]:
!pip install torchtyping

Collecting torchtyping
  Downloading torchtyping-0.1.5-py3-none-any.whl.metadata (9.5 kB)
Collecting typeguard<3,>=2.11.1 (from torchtyping)
  Downloading typeguard-2.13.3-py3-none-any.whl.metadata (3.6 kB)
Downloading torchtyping-0.1.5-py3-none-any.whl (17 kB)
Downloading typeguard-2.13.3-py3-none-any.whl (17 kB)
Installing collected packages: typeguard, torchtyping
  Attempting uninstall: typeguard
    Found existing installation: typeguard 4.4.4
    Uninstalling typeguard-4.4.4:
      Successfully uninstalled typeguard-4.4.4
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
inflect 7.5.0 requires typeguard>=4.0.1, but you have typeguard 2.13.3 which is incompatible.
Successfully installed torchtyping-0.1.5 typeguard-2.13.3


In [6]:
import torch
import torch.nn as nn
from torchtyping import TensorType

In [7]:
class SingleHeadAttention(nn.Module):

    def __init__(self, embedding_dim: int, attention_dim: int):
        super().__init__()
        torch.manual_seed(0)
        # input is B x T x E
        self.key_generate = nn.Linear(embedding_dim, attention_dim, bias=False)
        self.query_generate = nn.Linear(embedding_dim, attention_dim, bias=False)
        self.value_generate = nn.Linear(embedding_dim, attention_dim, bias=False)

    def forward(self, embedded: TensorType[float]) -> TensorType[float]:
        # Return your answer to 4 decimal places
        key = self.key_generate(embedded)
        query = self.query_generate(embedded)
        value = self.value_generate(embedded)

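        # key, query and value are of shape B x T x A
        B, T, A = key.shape
        # score[b, i, j] is the scaled dot product of query i with key j (B x T x T).
        # A is a Python int, so use A ** 0.5 rather than torch.sqrt(A).
        score = query @ torch.transpose(key, 1, 2) / (A ** 0.5)

        # Causal mask: set future positions (j > i) to -inf BEFORE softmax
        mask = torch.tril(torch.ones(T, T, device=score.device))
        score = score.masked_fill(mask == 0, float('-inf'))
        score = nn.functional.softmax(score, dim=2)

        return torch.round(score @ value, decimals=4)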
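One way to sanity-check a from-scratch implementation is to compare it against PyTorch's built-in attention. The sketch below assumes PyTorch 2.0 or later (for `torch.nn.functional.scaled_dot_product_attention`). The helper `causal_attention` and the tensor sizes are hypothetical, and it works on already-projected query, key, and value tensors rather than on the module above.

```python
import torch
import torch.nn.functional as F

def causal_attention(query, key, value):
    """Hypothetical helper: masked scaled dot-product attention on B x T x A tensors."""
    T, A = query.shape[1], query.shape[2]
    scores = query @ torch.transpose(key, 1, 2) / (A ** 0.5)
    mask = torch.tril(torch.ones(T, T, device=query.device))
    scores = scores.masked_fill(mask == 0, float("-inf"))
    return F.softmax(scores, dim=-1) @ value

torch.manual_seed(0)
q, k, v = (torch.randn(2, 5, 8) for _ in range(3))

manual = causal_attention(q, k, v)

# The built-in is given an explicit heads dimension of size 1, then squeezed back
reference = F.scaled_dot_product_attention(
    q.unsqueeze(1), k.unsqueeze(1), v.unsqueeze(1), is_causal=True
).squeeze(1)

print(torch.allclose(manual, reference, atol=1e-5))
```

A second quick check: with a causal mask the first token can only attend to itself, so the first output row should equal the first value row.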

