PyTorch Transformer

This repository implements the standard Transformer architecture from the "Attention Is All You Need" paper, built from scratch with the PyTorch library.

It accompanies a series of Transformer blog posts on Medium that aim to provide a deep understanding of the architecture. You can view a roadmap of the repository below.

Roadmap

We divide the roadmap into two sections: components and demos.

Components

This section covers the components of the architecture, which are added to the repository progressively. They are housed in the /model folder, each in its respective .py file (a minimal attention sketch follows the list).

  • Self-Attention | model/attention.py
  • Multi-Headed Attention | model/attention.py
  • Mask Generation | model/mask.py
  • Embedding Methods | model/embed.py
  • Absolute (Sinusoidal) Positional Encoding | model/encoding.py
  • Position-Wise Feed-Forward Networks | model/ffn.py
  • Residual Connections | model/normalize.py
  • Layer Normalisation | model/normalize.py
  • Encoders | model/transformer.py
  • Decoders | model/transformer.py
  • Transformer | model/transformer.py
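
For orientation before diving into those files, here is a minimal sketch of scaled dot-product attention with an optional causal mask, the operation at the heart of the attention and masking components above. The function name and tensor shapes are illustrative, not the repository's actual API.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k**0.5   # (batch, seq, seq)
    if mask is not None:
        # Positions where mask == 0 receive zero attention weight.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

# Self-attention: queries, keys, and values all come from the same sequence.
x = torch.randn(2, 10, 64)                 # (batch, seq_len, d_model)
causal = torch.tril(torch.ones(10, 10))    # lower-triangular decoder mask
out = scaled_dot_product_attention(x, x, x, mask=causal)
print(out.shape)                           # torch.Size([2, 10, 64])
```

Multi-headed attention repeats this operation across several learned projections of Q, K, and V in parallel, then concatenates the results.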

Additional components live in the /utils folder, again one .py file each (a toy tokenizer sketch follows the list).

  • Raw Text File Reader | utils/reader.py
  • Word Tokenizer | utils/tokenize.py
  • Logger | utils/logger.py
  • Latent Semantic Analysis (LSA) | utils/lsa.py
  • Pairwise Inner Product (PIP) Loss | utils/pip.py
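
As a flavour of the word-level tokenisation step, below is a toy word tokenizer that lowercases text, splits on words and punctuation, and maps the most frequent words to integer ids. It is a hypothetical stand-in for utils/tokenize.py, not its actual interface.

```python
import re
from collections import Counter

class ToyWordTokenizer:
    """Hypothetical word tokenizer: build a fixed vocabulary from a corpus
    and encode text to integer ids (0 = <pad>, 1 = <unk>)."""

    def __init__(self, texts, vocab_size=10_000):
        words = [w for t in texts for w in re.findall(r"\w+|[^\w\s]", t.lower())]
        most_common = Counter(words).most_common(vocab_size - 2)
        self.word2id = {"<pad>": 0, "<unk>": 1}
        self.word2id.update({w: i + 2 for i, (w, _) in enumerate(most_common)})

    def encode(self, text):
        tokens = re.findall(r"\w+|[^\w\s]", text.lower())
        return [self.word2id.get(t, 1) for t in tokens]  # unknown -> <unk>

tok = ToyWordTokenizer(["Attention is all you need."])
print(tok.encode("Attention is fun"))  # e.g. [2, 3, 1] -- "fun" is out of vocabulary
```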

Demos

These are found in the /examples folder and consist of simple demos (small tutorials) for specific components. They help with debugging the code (e.g., checking tensor dimensions) and act as a quickstart guide for each component. Each demo is a single [name].py file containing only code and comments (a representative sketch follows the list).

  • Attention | examples/attention.py
  • Masking Attention | examples/attention_masking.py
  • Layer Normalisation | examples/layer_norm.py
  • Word Embeddings | examples/embed_vocab.py
  • Image Embeddings | examples/embed_imgs.py
  • Finding Optimal Embedding Dimension | examples/optimal_embed_dim.py
  • Visualising Positional Encoding | examples/pos_encoding.py
  • Transformer Creation | examples/transformer_demo.py
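
As an example of their scope, here is a minimal sketch in the spirit of examples/pos_encoding.py: it builds the sinusoidal positional encoding table from the paper so the characteristic stripe pattern can be plotted. Variable names are illustrative.

```python
import math
import torch

max_len, d_model = 100, 64

position = torch.arange(max_len).unsqueeze(1)    # (max_len, 1)
div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))

pe = torch.zeros(max_len, d_model)
pe[:, 0::2] = torch.sin(position * div_term)     # even dimensions: sin
pe[:, 1::2] = torch.cos(position * div_term)     # odd dimensions: cos

print(pe.shape)  # torch.Size([100, 64]); try plt.imshow(pe) to visualise the pattern
```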

References

Huang, A., Subramanian, S., Sum, J., Almubarak, K., and Biderman, S., 2022. The Annotated Transformer. [online] Harvard University. Available from: http://nlp.seas.harvard.edu/annotated-transformer/.

Sarkar, A., 2023. Build Your Own Transformer From Scratch Using PyTorch. [online] Medium. Available from: https://towardsdatascience.com/build-your-own-transformer-from-scratch-using-pytorch-84c850470dcb.
