
Applied Transformers (PyTorch)

A playground-like experimental project to explore various transformer architectures from scratch.

Resources:

Intuitions:

  1. Intuition behind the Attention Mechanism | Notebook (a minimal sketch of the mechanism follows this list)
  2. Intuition behind individual Transformer Blocks | Notebook
  3. Intuition behind Chunked Cross-Attention from DeepMind's RETRO | Notebook
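
For orientation, here is a minimal scaled dot-product attention function in PyTorch. It is a sketch of the mechanism the first notebook covers; the function and variable names are illustrative and are not taken from the notebooks themselves.

import torch
import torch.nn.functional as F

def scaled_dot_product_attention(query, key, value, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V for batched inputs."""
    d_k = query.size(-1)
    # Similarity of every query against every key, scaled by sqrt(d_k)
    # to keep softmax gradients stable for large head dimensions.
    scores = query @ key.transpose(-2, -1) / d_k**0.5
    if mask is not None:
        # Positions where mask == 0 are excluded from attention.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # attention distribution over keys
    return weights @ value, weights

# Toy usage: batch of 2 sequences, length 5, model dimension 16.
q = k = v = torch.randn(2, 5, 16)
out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape, attn.shape)  # torch.Size([2, 5, 16]) torch.Size([2, 5, 5])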

Implementations from Scratch:

Create virtual environment:

conda create -n applied-transformers python=3.10
conda activate applied-transformers

Install Dependencies:

pip install -r requirements.txt

  1. Transformer Model from Scratch {Vaswani et al., 2017} | Dataset Sample | Python Code

# example training run
python transformer_architectures/vanilla/run.py --num_layers=5 \
  --d_model=256 --d_ff=1024 --num_heads=4 --dropout=0.2 \
  --train_path=<PATH_TO_TRAIN_DATASET>.csv --valid_path=<PATH_TO_VALIDATION_DATASET>.csv

  2. GPT Model from Scratch {Radford et al., 2018} | Coming Soon
  3. BERT Model from Scratch {Devlin et al., 2019} | Coming Soon
  4. RETRO Model from Scratch {Borgeaud et al., 2021} | Coming Soon
  5. BART Model from Scratch {Lewis et al., 2019} | Coming Soon
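
To complement item 1 above, here is a minimal sketch of a single post-norm encoder block in the style of Vaswani et al., 2017, using the same hyperparameter values as the example training run (d_model=256, d_ff=1024, num_heads=4, dropout=0.2). The class name and interface are illustrative assumptions, not the repo's actual code.

import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    """One post-norm Transformer encoder block (illustrative sketch)."""

    def __init__(self, d_model=256, num_heads=4, d_ff=1024, dropout=0.2):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads,
                                          dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, key_padding_mask=None):
        # Self-attention sublayer with residual connection + LayerNorm.
        attn_out, _ = self.attn(x, x, x, key_padding_mask=key_padding_mask)
        x = self.norm1(x + self.dropout(attn_out))
        # Position-wise feed-forward sublayer, same residual pattern.
        x = self.norm2(x + self.dropout(self.ff(x)))
        return x

# Toy usage: batch of 2 sequences, length 10, model dimension 256.
block = EncoderBlock()
x = torch.randn(2, 10, 256)  # (batch, sequence length, d_model)
print(block(x).shape)        # torch.Size([2, 10, 256])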

TODO:

  • Text Generation Schemes (see the greedy-decoding sketch after this list)
  • Text Generation Eval Metrics
  • Sequence Tokenization Algorithms
  • Optimized Einsum Implementation
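
For the first TODO item, a generation scheme can be as simple as greedy decoding; a minimal sketch follows. The model interface assumed here (a callable returning next-token logits of shape (batch, seq, vocab)) is an illustration, not this repository's API.

import torch

@torch.no_grad()
def greedy_decode(model, start_tokens, max_new_tokens=20, eos_id=None):
    """Greedy decoding: at each step, append the highest-probability token.
    Assumes `model(tokens)` returns logits of shape (batch, seq, vocab)."""
    tokens = start_tokens
    for _ in range(max_new_tokens):
        logits = model(tokens)
        # Take the argmax over the vocabulary at the last position.
        next_token = logits[:, -1, :].argmax(-1, keepdim=True)
        tokens = torch.cat([tokens, next_token], dim=-1)
        # Stop early once every sequence in the batch emits EOS.
        if eos_id is not None and (next_token == eos_id).all():
            break
    return tokens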

References

  1. The Annotated Transformer (Harvard NLP): http://nlp.seas.harvard.edu/annotated-transformer/
  2. labml.ai annotated PyTorch implementations: https://nn.labml.ai/transformers/models.html
  3. Transformers from scratch | CodeEmporium (YouTube)
