In [1]:
! pip install torch


# What is a transformer?
A Transformer is a neural network architecture that processes an entire input sequence at once and uses attention to relate each element of the sequence to every other element. The analogy below treats it as a tool for following and summarizing a conversation: the input sequence passes through a stack of layers that extract the important information and produce a concise output.

Imagine you're in a bustling classroom where students are engaged in conversations. Your task is to understand these conversations and provide a summary of what's being discussed.

1. **Positional Encoding**:
   - You have a special notebook to write down what each student says, but you also need to remember the order in which things were said. So you tag each message with a code that depends only on its position in the conversation. The code below builds these tags from sine and cosine waves of different frequencies. This way, you can keep track of the conversation's flow and structure.

2. **Transformer Model**:
   - Now, you start processing the conversation using a special device called a "Transformer". This device helps you understand and summarize the conversation more effectively.
   - **Embedding Layer**: You first listen to each student's message and translate it into a common representation that the Transformer can work with. In the code below, this is a linear layer that maps each input vector of size `input_dim` to a vector of size `d_model`.
   - **Positional Encoding**: Next, you add position information to your notes to record the order in which the messages were spoken. Without it, the Transformer would see the messages as an unordered set.
   - **Transformer Encoder Layers**: You then pass your notes through a stack of encoder layers. Each layer uses self-attention to relate every message to every other message, which helps identify how the messages connect and which points matter most.
   - **Linear Output Layer**: Finally, you condense the processed notes into the model's output. In the code below, a linear layer maps each position to `output_dim` scores, which `log_softmax` converts to log-probabilities.
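
The "relationships between different students' messages" are computed by self-attention inside each encoder layer. The following is a minimal sketch of scaled dot-product attention, not the full layer: it assumes a single head and uses the raw message vectors as queries, keys, and values, whereas a real encoder layer applies learned projections and several heads. The function name and the tensor sizes are illustrative choices.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """Single-head attention over tensors of shape (seq_len, d_k)."""
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    # Each row becomes a probability distribution over the messages.
    weights = torch.softmax(scores, dim=-1)
    # Each output row is a weighted average of the value vectors.
    return weights @ v, weights

torch.manual_seed(0)
messages = torch.randn(4, 8)  # 4 hypothetical "messages", 8 features each
attended, weights = scaled_dot_product_attention(messages, messages, messages)

print(attended.shape)  # torch.Size([4, 8])
print(weights.shape)   # torch.Size([4, 4])
```

Row `i` of `weights` says how much message `i` attends to each of the other messages, and every row sums to 1.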

In [2]:
import math  # needed for math.log in PositionalEncoding

import torch
import torch.nn as nn
import torch.nn.functional as F

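class PositionalEncoding(nn.Module):
    """Adds fixed sinusoidal position information to a (seq_len, batch, d_model) tensor."""

    def __init__(self, d_model, max_len=512):
        super().__init__()
        # Assumes d_model is even, so sine and cosine columns pair up.
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        # Frequencies 1 / 10000^(2i / d_model), one per sin/cos column pair.
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)  # even columns
        pe[:, 1::2] = torch.cos(position * div_term)  # odd columns
        # Reshape to (max_len, 1, d_model) so it broadcasts over the batch dimension.
        pe = pe.unsqueeze(0).transpose(0, 1)
        # A buffer follows the module across devices but is not a trained parameter.
        self.register_buffer('pe', pe)

    def forward(self, x):
        # x: (seq_len, batch, d_model); add the encoding for the first seq_len positions.
        return x + self.pe[:x.size(0), :]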

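class Transformer(nn.Module):
    """Encoder-only model: linear embedding, positional encoding, encoder stack, linear head."""

    def __init__(self, input_dim, output_dim, d_model, nhead, num_layers):
        super().__init__()
        # Project each input feature vector to the model width.
        self.embedding = nn.Linear(input_dim, d_model)
        self.positional_encoding = PositionalEncoding(d_model)
        # batch_first defaults to False, so inputs are (seq_len, batch, feature).
        self.transformer_encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(d_model, nhead), num_layers)
        self.fc = nn.Linear(d_model, output_dim)

    def forward(self, src):
        # src: (seq_len, batch, input_dim)
        src = self.embedding(src)
        src = self.positional_encoding(src)
        output = self.transformer_encoder(src)
        output = self.fc(output)
        # Log-probabilities over output_dim classes at every position.
        return F.log_softmax(output, dim=-1)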
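
To check the tensor shapes this pipeline expects, the sketch below rebuilds the same steps inline from the same `torch.nn` building blocks, so that it runs on its own. All sizes are hypothetical values picked for illustration, and the input is random noise, so the output has no meaning beyond its shape.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes, chosen only for this illustration.
seq_len, batch_size, input_dim = 10, 2, 16
d_model, nhead, num_layers, output_dim = 32, 4, 2, 5

embedding = nn.Linear(input_dim, d_model)
encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(d_model, nhead), num_layers)
fc = nn.Linear(d_model, output_dim)
encoder.eval()  # turn off dropout so the forward pass is deterministic

# Sinusoidal positional encoding of shape (seq_len, 1, d_model).
position = torch.arange(seq_len, dtype=torch.float).unsqueeze(1)
div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
pe = torch.zeros(seq_len, d_model)
pe[:, 0::2] = torch.sin(position * div_term)
pe[:, 1::2] = torch.cos(position * div_term)
pe = pe.unsqueeze(1)

# With the default batch_first=False, the input is (seq_len, batch, input_dim).
src = torch.randn(seq_len, batch_size, input_dim)
with torch.no_grad():
    hidden = encoder(embedding(src) + pe)
    log_probs = F.log_softmax(fc(hidden), dim=-1)

print(log_probs.shape)  # torch.Size([10, 2, 5])
```

The sequence dimension comes first because `nn.TransformerEncoderLayer` defaults to `batch_first=False`. Exponentiating `log_probs` gives a probability distribution over the `output_dim` classes at every position.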
