Based on Andrej Karpathy's nanoGPT lecture (training a small transformer on a Shakespeare dataset), refactored for training with PyTorch Lightning.

Character-level tokenizer and decoder-only transformer architecture, trained with masked (causal) self-attention.
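To illustrate the two ideas above, here is a minimal pure-Python sketch of a character-level tokenizer and the causal mask used in masked self-attention. The names `build_vocab`, `encode`, `decode`, and `causal_mask` are hypothetical and chosen for illustration only; the code in this repository may be organized differently and operates on PyTorch tensors rather than lists.

```python
# Illustrative sketch only: helper names are assumptions, not this repo's API.

def build_vocab(text):
    """Map each distinct character to an integer id, and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def encode(text, stoi):
    """Turn a string into a list of integer token ids, one per character."""
    return [stoi[ch] for ch in text]


def decode(ids, itos):
    """Turn a list of token ids back into a string."""
    return "".join(itos[i] for i in ids)


def causal_mask(block_size):
    """mask[i][j] is True when position i may attend to position j (j <= i).

    A decoder-only model uses this lower-triangular pattern so that a
    token never sees characters that come after it in the sequence.
    """
    return [[j <= i for j in range(block_size)] for i in range(block_size)]


sample = "To be, or not to be"
stoi, itos = build_vocab(sample)
ids = encode(sample, stoi)
```

Round-tripping `decode(encode(...))` should return the original string, and the vocabulary size equals the number of distinct characters in the training text.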

Training has been tested on an A100 (40 GB) and an M2 MacBook.
