Bidirectional Encoder Representations from Transformers (BERT): here are the things I need to cover to build one
- Implement the multi-head self-attention mechanism
- The input is projected into queries, keys, and values with linear transformations
- Split into heads and compute scaled dot-product attention for each head
- Concatenate the heads and project back
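
A minimal PyTorch sketch of this multi-head self-attention, not the exact implementation; the default sizes (768 hidden, 12 heads) and the mask handling are my assumptions, and `attention_probs` is cached so the visualization step later has something to read:

```python
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Project to Q/K/V, split heads, scaled dot-product attention, merge heads."""

    def __init__(self, hidden_size=768, num_heads=12, dropout=0.1):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        # Linear transformations for queries, keys, and values
        self.q_proj = nn.Linear(hidden_size, hidden_size)
        self.k_proj = nn.Linear(hidden_size, hidden_size)
        self.v_proj = nn.Linear(hidden_size, hidden_size)
        self.out_proj = nn.Linear(hidden_size, hidden_size)
        self.dropout = nn.Dropout(dropout)
        self.attention_probs = None  # cached here so it can be visualized later

    def forward(self, x, mask=None):
        B, T, H = x.shape

        def split(t):  # (B, T, H) -> (B, num_heads, T, head_dim)
            return t.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)

        q, k, v = split(self.q_proj(x)), split(self.k_proj(x)), split(self.v_proj(x))
        # Scaled dot-product attention per head
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.head_dim)
        if mask is not None:
            if mask.dim() == 2:                # (B, T) padding mask
                mask = mask[:, None, None, :]
            scores = scores.masked_fill(mask == 0, float("-inf"))
        probs = torch.softmax(scores, dim=-1)
        self.attention_probs = probs.detach()
        context = self.dropout(probs) @ v      # (B, num_heads, T, head_dim)
        # Concatenate the heads and project back
        context = context.transpose(1, 2).contiguous().view(B, T, H)
        return self.out_proj(context)
```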
- Implement the position-wise feed-forward network
- Two linear transformations with a GELU activation in between
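
A possible sketch of that feed-forward block; the `intermediate_size=3072` default mirrors bert-base but is an assumption here:

```python
import torch.nn as nn

class FeedForward(nn.Module):
    """Position-wise FFN: Linear -> GELU -> Linear, applied independently at each position."""

    def __init__(self, hidden_size=768, intermediate_size=3072, dropout=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_size, intermediate_size),
            nn.GELU(),
            nn.Linear(intermediate_size, hidden_size),
            nn.Dropout(dropout),
        )

    def forward(self, x):
        return self.net(x)
```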
Math behind the `TransformerEncoderLayer`:
- Attention sub-layer: $x' = \mathrm{LayerNorm}(x + \mathrm{Dropout}(\mathrm{Attention}(x)))$
- Implement layer normalization and residual connections
- Stack of `num_hidden_layers` encoder layers
- Combining these into a single encoder layer
- Stacking multiple encoder layers to form the BERT encoder
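
A sketch of how these pieces could combine, reusing the `MultiHeadAttention` and `FeedForward` sketches above; the token + position embeddings and all default sizes are assumptions, and the FFN sub-layer mirrors the attention one, $x'' = \mathrm{LayerNorm}(x' + \mathrm{Dropout}(\mathrm{FFN}(x')))$:

```python
import torch
import torch.nn as nn

class TransformerEncoderLayer(nn.Module):
    """One encoder layer:
       x'  = LayerNorm(x  + Dropout(Attention(x)))
       x'' = LayerNorm(x' + Dropout(FFN(x')))"""

    def __init__(self, hidden_size=768, num_heads=12, intermediate_size=3072, dropout=0.1):
        super().__init__()
        self.attention = MultiHeadAttention(hidden_size, num_heads, dropout)
        self.ffn = FeedForward(hidden_size, intermediate_size, dropout)
        self.norm1 = nn.LayerNorm(hidden_size)
        self.norm2 = nn.LayerNorm(hidden_size)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, mask=None):
        x = self.norm1(x + self.dropout(self.attention(x, mask)))  # attention sub-layer
        x = self.norm2(x + self.dropout(self.ffn(x)))              # feed-forward sub-layer
        return x

class BERTEncoder(nn.Module):
    """Embeddings followed by a stack of num_hidden_layers encoder layers."""

    def __init__(self, vocab_size=30522, hidden_size=768, num_hidden_layers=12,
                 num_heads=12, intermediate_size=3072, max_len=512, dropout=0.1):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, hidden_size)
        self.pos_emb = nn.Embedding(max_len, hidden_size)
        self.layers = nn.ModuleList([
            TransformerEncoderLayer(hidden_size, num_heads, intermediate_size, dropout)
            for _ in range(num_hidden_layers)
        ])

    def forward(self, input_ids, mask=None):
        positions = torch.arange(input_ids.size(1), device=input_ids.device)
        x = self.token_emb(input_ids) + self.pos_emb(positions)  # (B, T, hidden_size)
        for layer in self.layers:
            x = layer(x, mask)
        return x
```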
Extract the `attention_probs` from the `MultiHeadAttention` module and create a heatmap to show how tokens attend to each other
My Approach
- Take the first sequence in the batch and then plot the heatmap
How does the visualization part work?
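
Roughly like this, assuming each layer's attention module has cached `attention_probs` (shape `(batch, num_heads, seq_len, seq_len)`) as in the sketch above; the `plot_attention` helper and its arguments are hypothetical names:

```python
import matplotlib.pyplot as plt

def plot_attention(encoder, tokens, layer=0, head=0):
    """Heatmap of attention for the first sequence in the batch, one layer and one head.

    Assumes a forward pass has already run, so attention_probs is populated.
    """
    probs = encoder.layers[layer].attention.attention_probs
    attn = probs[0, head].cpu().numpy()  # first sequence in the batch
    fig, ax = plt.subplots(figsize=(6, 6))
    im = ax.imshow(attn, cmap="viridis")
    ax.set_xticks(range(len(tokens)))
    ax.set_xticklabels(tokens, rotation=90)
    ax.set_yticks(range(len(tokens)))
    ax.set_yticklabels(tokens)
    ax.set_xlabel("Key (attended-to) token")
    ax.set_ylabel("Query (attending) token")
    fig.colorbar(im, ax=ax)
    plt.tight_layout()
    plt.show()
```

Rows are the attending (query) tokens and columns the attended-to (key) tokens, so each row shows where one token distributes its attention weight.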
Approach:
- Classification head (`BERTClassifier`): the `BERTClassifier` class wraps the `BERTEncoder` and adds a linear layer to predict class probabilities
How are we doing it?
- Extract the [CLS] token's embedding (`output[:, 0, :]`) from the encoder's output, as BERT uses this for classification tasks.
- Apply dropout for regularization.
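
A sketch of that classification head on top of the `BERTEncoder` sketch above; the linear layer produces logits, so class probabilities come from a softmax applied afterwards (or inside the loss):

```python
import torch.nn as nn

class BERTClassifier(nn.Module):
    """Wraps the BERTEncoder and adds a linear head on the [CLS] embedding."""

    def __init__(self, encoder, hidden_size=768, num_classes=2, dropout=0.1):
        super().__init__()
        self.encoder = encoder
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, input_ids, mask=None):
        output = self.encoder(input_ids, mask)      # (B, T, hidden_size)
        cls = output[:, 0, :]                       # [CLS] token embedding
        return self.classifier(self.dropout(cls))   # (B, num_classes) logits
```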
- Tokenizer integration: using `transformers.BertTokenizer` from Hugging Face, pre-trained on `bert-base-uncased`.
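
A hedged usage sketch tying the pre-trained tokenizer to the from-scratch classes above; the example sentence, sequence length, and the small model sizes are arbitrary choices:

```python
import torch
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Tokenize one example sentence; padding/length settings are arbitrary here
text = "building bert from scratch"
enc = tokenizer(text, return_tensors="pt", padding="max_length",
                truncation=True, max_length=32)

# Small model sizes just to keep the example light
encoder = BERTEncoder(vocab_size=tokenizer.vocab_size, hidden_size=128,
                      num_hidden_layers=2, num_heads=4, intermediate_size=512)
model = BERTClassifier(encoder, hidden_size=128, num_classes=2)

logits = model(enc["input_ids"], enc["attention_mask"])  # mask keeps padding out of attention
probs = torch.softmax(logits, dim=-1)                     # class probabilities
```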
- Convert the whole thing into a hybrid model
Maths here
BERT encoder: input token IDs
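
A summary of the maths built up so far, under the assumption that the embedding step is a token + position lookup (segment embeddings omitted), with $L$ = `num_hidden_layers` and position 0 being [CLS]:

$$
\begin{aligned}
h^{(0)}_i &= E_{\text{tok}}(t_i) + E_{\text{pos}}(i) \\
\tilde{h}^{(l)} &= \mathrm{LayerNorm}\big(h^{(l-1)} + \mathrm{Dropout}(\mathrm{Attention}(h^{(l-1)}))\big) \\
h^{(l)} &= \mathrm{LayerNorm}\big(\tilde{h}^{(l)} + \mathrm{Dropout}(\mathrm{FFN}(\tilde{h}^{(l)}))\big), \quad l = 1, \dots, L \\
p &= \mathrm{softmax}\big(W \, \mathrm{Dropout}(h^{(L)}_{0}) + b\big)
\end{aligned}
$$

where $t_i$ is the $i$-th input token ID and $h^{(L)}_0$ is the final [CLS] embedding fed to the classification head.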