🤖 A variety of resources that helped us understand and implement the Transformer model:
- Visualizing Attention, a Transformer's Heart by 3Blue1Brown
- Understanding and Coding the Self-Attention Mechanism of Large Language Models From Scratch by Sebastian Raschka
- Self Attention in Transformer Neural Networks (with Code!) by CodeEmporium
- Visualizing the attention matrix using BertViz
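
For quick reference, below is a minimal sketch of scaled dot-product self-attention, the mechanism the resources above walk through. It is an illustrative PyTorch example under assumed shapes and names, not code from this repository.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q / w_k / w_v: (d_model, d_head). Illustrative only."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v       # project inputs to queries, keys, values
    scores = q @ k.T / k.shape[-1] ** 0.5     # scaled dot-product similarity
    weights = F.softmax(scores, dim=-1)       # attention matrix (what BertViz visualizes)
    return weights @ v, weights               # weighted sum of values, plus the weights

seq_len, d_model, d_head = 4, 8, 8
x = torch.randn(seq_len, d_model)
out, attn = self_attention(x, *(torch.randn(d_model, d_head) for _ in range(3)))
print(attn.shape)  # (4, 4): one weight per (query token, key token) pair
```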