Based on Andrej Karpathy's 'NanoGPT' lecture (training a small transformer on a Shakespeare dataset), refactored for training with PyTorch Lightning.

The model uses a character-level tokenizer and a decoder-only transformer architecture with masked (causal) self-attention, so each position attends only to itself and earlier positions.
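
As a rough illustration of these two ideas, here is a minimal pure-Python sketch. It is not the code from this repository: `CharTokenizer` and `causal_attention_weights` are hypothetical names chosen for the example, and the attention function works on plain nested lists rather than PyTorch tensors.

```python
import math


class CharTokenizer:
    """Hypothetical character-level tokenizer: one token id per distinct character."""

    def __init__(self, text):
        chars = sorted(set(text))  # vocabulary = sorted unique characters
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for ch, i in self.stoi.items()}

    @property
    def vocab_size(self):
        return len(self.stoi)

    def encode(self, s):
        # Raises KeyError for characters that were not in the training text.
        return [self.stoi[ch] for ch in s]

    def decode(self, ids):
        return "".join(self.itos[i] for i in ids)


def causal_attention_weights(scores):
    """Row-wise softmax over a square score matrix with future positions masked.

    scores[i][j] is the raw score of query position i for key position j.
    Position i may only attend to positions j <= i; later positions get weight 0.
    """
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]  # keys at or before position i
        peak = max(visible)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in visible]
        total = sum(exps)
        masked_tail = [0.0] * (len(row) - i - 1)
        weights.append([e / total for e in exps] + masked_tail)
    return weights


tok = CharTokenizer("to be or not to be")
ids = tok.encode("to be")
assert tok.decode(ids) == "to be"  # encoding then decoding round-trips

# With equal scores, each row spreads its weight evenly over the positions it can see.
w = causal_attention_weights([[0.0] * 3 for _ in range(3)])
```

In a tensor-based implementation the same effect is usually obtained by setting the masked scores to negative infinity before the softmax, which is the approach taken in the lecture this project is based on.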

Training was tested on an A100 (40 GB) GPU and an M2 MacBook.