This repo contains experiments I ran on the simple transformer developed in Andrej Karpathy’s excellent video, “Let’s build GPT: from scratch, in code, spelled out.” The transformer architecture, training, and basic inference code come from that video. The experiments and analyses are my own.
- Create a virtual environment
python3 -m venv ~/venv/venv-transformer-experiments
source ~/venv/venv-transformer-experiments/bin/activate
- Install dependencies
pip install -r requirements.txt
pip install -r requirements.dev.txt
- Install this library in editable mode
pip install -e '.[dev]'
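Before moving on, it can help to confirm that the virtual environment from the steps above is the one actually in use. The one-liner below is a minimal sketch, not part of this repo: it assumes `python3` is on your `PATH` and relies on the fact that, inside an environment created with `python3 -m venv`, `sys.prefix` differs from `sys.base_prefix`.

```shell
# Sanity check (illustrative, not part of the repo's tooling):
# prints "venv active" if the current python3 runs inside a virtual environment,
# otherwise prints "no venv".
python3 -c 'import sys; print("venv active" if sys.prefix != sys.base_prefix else "no venv")'
```

If this prints `no venv`, re-run the `source` command from the first step in the current shell before installing anything.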
Run the following to ensure that all dependencies are generated from the notebooks and that the notebooks are cleaned:
make all