
# Nano-GPT: Decoder-only Transformer

A simple GPT with multi-head attention over character-level tokens, inspired by Andrej Karpathy's video lectures: https://github.com/karpathy/ng-video-lecture
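Character-level tokenization means the vocabulary is simply the set of unique characters in the training text, with each character mapped to an integer id. Below is a minimal, self-contained sketch of that idea (not the repository's exact code; the `text` string stands in for the real training corpus):

```python
# A minimal sketch of char-level tokenization (illustrative, not the repo's code).
text = "hello world"                      # stand-in for the training corpus
chars = sorted(set(text))                 # vocabulary = unique characters
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> token id
itos = {i: ch for ch, i in stoi.items()}       # token id -> char

encode = lambda s: [stoi[c] for c in s]             # string -> list of ids
decode = lambda ids: "".join(itos[i] for i in ids)  # list of ids -> string

print(decode(encode("hello")))            # round-trips back to "hello"
```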

## Features

  1. Multi-head self-attention
  2. Layer normalization
  3. Skip (residual) connections
  4. Feed-forward layers
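
These four pieces combine into a single decoder block: pre-LayerNorm, causal multi-head self-attention with a skip connection, then a feed-forward layer with another skip connection. The sketch below illustrates that arrangement using PyTorch's built-in `nn.MultiheadAttention` rather than the hand-rolled attention in the lecture code, and the dimensions (`n_embd=64`, `n_head=4`, `block_size=32`) are assumed values for illustration:

```python
# A hedged sketch of one decoder block, assuming PyTorch and the hyperparameters above.
import torch
import torch.nn as nn

class Block(nn.Module):
    """Multi-head self-attention + feed-forward, each with pre-LayerNorm and a skip connection."""

    def __init__(self, n_embd=64, n_head=4, block_size=32, dropout=0.1):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.ln2 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, dropout=dropout, batch_first=True)
        self.ffwd = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.ReLU(),
            nn.Linear(4 * n_embd, n_embd),
            nn.Dropout(dropout),
        )
        # Causal mask: True entries are blocked, so each position only attends to earlier ones.
        mask = torch.triu(torch.ones(block_size, block_size, dtype=torch.bool), diagonal=1)
        self.register_buffer("causal_mask", mask)

    def forward(self, x):
        T = x.size(1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=self.causal_mask[:T, :T])
        x = x + attn_out                  # skip connection around attention
        x = x + self.ffwd(self.ln2(x))    # skip connection around feed-forward
        return x

# Quick shape check: (batch=2, time=16, channels=64) in, same shape out.
x = torch.randn(2, 16, 64)
print(Block()(x).shape)  # torch.Size([2, 16, 64])
```

A full model would stack several of these blocks between a token/position embedding layer and a final linear head that projects back to vocabulary-size logits.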