This repository implements a standard Transformer architecture, based on the "Attention Is All You Need" paper, using the PyTorch library.
It accompanies a series of Transformer blog posts on Medium that aim to provide a deep understanding of the architecture. You can view a roadmap of the repository below.
We divide the roadmap into two sections: components and demos.
The components section covers the parts of the architecture that are progressively added to the repository. They are housed in the `/model` folder in their respective `.py` files.
- Self-Attention | `model/attention.py` (see the sketch after this list)
- Multi-Headed Attention | `model/attention.py`
- Mask Generation | `model/mask.py`
- Embedding Methods | `model/embed.py`
- Absolute (Sinusoidal) Positional Encoding | `model/encoding.py`
- Position-Wise Feed-Forward Networks | `model/ffn.py`
- Residual Connections | `model/normalize.py`
- Layer Normalisation | `model/normalize.py`
- Encoders | `model/transformer.py`
- Decoders | `model/transformer.py`
- Transformer | `model/transformer.py`
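As a primer for the attention components above, here is a minimal scaled dot-product attention sketch in PyTorch. It is illustrative only: the function name, tensor shapes, and optional boolean mask are assumptions, not the exact API of `model/attention.py`.

```python
import math

import torch


def scaled_dot_product_attention(q, k, v, mask=None):
    """Illustrative scaled dot-product attention (not the repo's exact API).

    q, k, v: (batch, seq_len, d_k) tensors.
    mask: optional boolean tensor broadcastable to (batch, seq_len, seq_len),
    where True marks positions to hide from attention.
    """
    d_k = q.size(-1)
    # Similarity scores, scaled by sqrt(d_k) to keep softmax gradients stable.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)  # attention distribution over keys
    return weights @ v


# Quick shape check with random tensors (self-attention: q = k = v).
x = torch.randn(2, 5, 64)
print(scaled_dot_product_attention(x, x, x).shape)  # torch.Size([2, 5, 64])
```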
Additional components are found in the `/utils` folder in their respective `.py` files.
- Raw Text File Reader | `utils/reader.py`
- Word Tokenizer | `utils/tokenize.py`
- Logger | `utils/logger.py`
- Latent Semantic Analysis (LSA) | `utils/lsa.py`
- Pairwise Inner Product (PIP) Loss | `utils/pip.py` (see the sketch after this list)
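For the last item: the PIP loss compares two embedding matrices for the same vocabulary through their pairwise inner products, which is what makes it useful for choosing an embedding dimension. A minimal sketch, assuming `(vocab, dim)` matrices; the function name and signature are illustrative, not the exact API of `utils/pip.py`.

```python
import torch


def pip_loss(e1, e2):
    """Illustrative PIP loss between two embedding matrices.

    e1: (vocab, d1) and e2: (vocab, d2) embeddings for the same vocabulary.
    E @ E.T holds all pairwise inner products; the loss is the Frobenius
    norm of the difference between the two PIP matrices.
    """
    return torch.linalg.norm(e1 @ e1.T - e2 @ e2.T)


# Compare a 64-dimensional embedding against a 32-dimensional one.
e_full = torch.randn(100, 64)
e_small = torch.randn(100, 32)
print(pip_loss(e_full, e_small))
```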
The demos are found in the `/examples` folder and consist of small tutorials for specific components. They help with debugging the code (e.g., checking tensor dimensions) and act as a quickstart guide. Each demo is a `[name].py` file containing commented code only.
- Attention | `examples/attention.py`
- Masking Attention | `examples/attention_masking.py`
- Layer Normalisation | `examples/layer_norm.py`
- Word Embeddings | `examples/embed_vocab.py`
- Image Embeddings | `examples/embed_imgs.py`
- Finding Optimal Embedding Dimension | `examples/optimal_embed_dim.py`
- Visualising Positional Encoding | `examples/pos_encoding.py` (see the sketch after this list)
- Transformer Creation | `examples/transformer_demo.py`
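As a taster for the positional encoding demo, here is a minimal sinusoidal encoding sketch following the formulation from the original paper. The function name and shapes are assumptions, not the exact API of `model/encoding.py` or `examples/pos_encoding.py`.

```python
import math

import torch


def sinusoidal_encoding(seq_len, d_model):
    """Illustrative sinusoidal positional encoding (d_model assumed even)."""
    position = torch.arange(seq_len).unsqueeze(1).float()  # (seq_len, 1)
    # Frequencies fall geometrically from 1 to 1/10000 across the dimensions.
    div_term = torch.exp(
        torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model)
    )
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
    return pe


print(sinusoidal_encoding(50, 128).shape)  # torch.Size([50, 128])
```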
Huang, A., Subramanian, S., Sum, J., Almubarak, K., and Biderman, S., 2022. The Annotated Transformer. [online] Harvard University. Available from: http://nlp.seas.harvard.edu/annotated-transformer/.
Sarkar, A., 2023. Build Your Own Transformer From Scratch Using PyTorch. [online] Medium. Available from: https://towardsdatascience.com/build-your-own-transformer-from-scratch-using-pytorch-84c850470dcb.