Welcome to **Transformers**, a repository for understanding how state-of-the-art Transformer models work. In this repository you will find how to build a Transformer from scratch, as well as how to use pretrained SOTA Transformers from 🤗 Hugging Face.
PS: This was a weekend project I did a year ago and decided to publish now!
The Transformer architecture is a deep neural network architecture that has revolutionized the field of natural language processing (NLP) and has since been extended to many other domains, such as computer vision, speech recognition, and recommender systems. It was introduced by Vaswani et al. in their seminal 2017 paper "Attention Is All You Need" and has become the state-of-the-art architecture for a wide range of NLP tasks. This repository provides a detailed overview of the Transformer architecture, its components, and its working mechanism.
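At the core of the working mechanism is scaled dot-product self-attention, where each token attends to every other token. As a rough illustration (not the repository's implementation — a minimal NumPy sketch, with shapes chosen arbitrarily for the example):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V"""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (seq_len, seq_len) similarities
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                            # weighted sum of value vectors

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                      # 4 tokens, model dimension 8
out = scaled_dot_product_attention(x, x, x)      # self-attention: Q = K = V = x
print(out.shape)                                 # (4, 8)
```

In multi-head attention, this operation is applied several times in parallel on learned linear projections of the input, and the results are concatenated.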
## Building a Transformer from scratch
| Topic | Link |
|---|---|
| Transformers Overview | Link |
| Sentence Tokenizer | Code |
| Positional Encodings | Code |
| Layer Normalization | Code |
| Self Attention | Code |
| Multi-Head Attention | Code |
| Encoder | Code |
| Decoder | Code |
| Transformer | Code |
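To give a flavour of one of the components above: since self-attention is order-invariant, the Transformer injects position information via sinusoidal positional encodings. A minimal NumPy sketch of the formula from "Attention Is All You Need" (this is an illustration, not the exact code linked in the table; it assumes an even `d_model`):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
       PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))"""
    pos = np.arange(seq_len)[:, None]        # (seq_len, 1) token positions
    i = np.arange(0, d_model, 2)[None, :]    # (1, d_model / 2) even dimension indices
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)             # even dimensions use sine
    pe[:, 1::2] = np.cos(angles)             # odd dimensions use cosine
    return pe

pe = positional_encoding(seq_len=10, d_model=16)
print(pe.shape)                              # (10, 16)
```

Each position gets a unique pattern of sines and cosines at different frequencies, and the encoding is simply added to the token embeddings before the first encoder layer.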
Contributions to this repo are welcome and encouraged! If you would like to contribute, please follow the guidelines in contribution.md.
If you have any questions about how to contribute, please open an issue in the repo or reach out to the project maintainers.
Thank you for your contributions!
This repository would not have been possible without Ajay Halthor, and full credit for the material goes to him. It is simply my personal take on Transformers after learning from him.
@misc{SRDdev/Transformer,
  author = {Shreyas Dixit},
  year   = {2023},
  url    = {https://huggingface.co/SRDdev/Transformer}
}