A simple Large Language Model built from scratch for text completion.
VaniLLM is a small educational LLM implemented entirely from scratch in Python and PyTorch. It is trained and tested on a single poem — Shakespeare’s Sonnet 18 — and can generate the continuation of the text given a prompt.
This project is a toy example designed to help you understand how transformers, embeddings, attention mechanisms, and autoregressive text generation work internally.
pip install -r install.txt
All common hyperparameters (block size, embedding dimension, paths, etc.) are defined in:
shared.py
You can modify them freely.
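The exact names and defaults live in shared.py; the following is only an illustrative sketch of the kind of values such a config module typically defines (every name and number here is an assumption, not the repository's actual content):

```python
# Hypothetical sketch of shared.py-style hyperparameters.
# The real names and values are defined in the repository's shared.py.
BLOCK_SIZE = 32        # context window: how many characters the model sees at once
EMBED_DIM = 64         # embedding dimension
N_HEADS = 4            # attention heads (EMBED_DIM must be divisible by this)
N_LAYERS = 2           # number of transformer blocks
LEARNING_RATE = 3e-4   # optimizer step size
MODEL_PATH = "vanillm.pth"  # checkpoint name, matching the one saved by training.py
```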
python training.py
This script will:
- tokenize the poem
- build the training batches
- train the transformer from scratch
- save the model to vanillm.pth
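The tokenize-and-batch steps above can be sketched roughly as follows; the function and variable names are illustrative, not the repository's actual API:

```python
import torch

# Character-level tokenization: each distinct character gets an integer id.
text = "Shall I compare thee to a summer's day?"
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> id
itos = {i: ch for ch, i in stoi.items()}       # id -> char

data = torch.tensor([stoi[ch] for ch in text], dtype=torch.long)

def get_batch(data, block_size=8, batch_size=4):
    # Targets are the inputs shifted one character ahead,
    # which is exactly what next-character prediction trains on.
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])
    return x, y

x, y = get_batch(data)
```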
python testing.py
The script will ask for:
- the number of characters to generate
- a starting sentence or a few words from the poem
The model will then generate the continuation of the text directly in the terminal, character by character.
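Character-by-character generation of this kind is usually implemented as an autoregressive sampling loop. A minimal sketch, assuming a model that maps `(B, T)` token ids to `(B, T, vocab)` logits (the real interface lives in testing.py and the saved checkpoint):

```python
import torch

@torch.no_grad()
def generate(model, idx, n_new, block_size):
    # idx: (B, T) tensor of token ids serving as the prompt.
    for _ in range(n_new):
        idx_cond = idx[:, -block_size:]        # crop to the context window
        logits = model(idx_cond)               # (B, T, vocab)
        logits = logits[:, -1, :]              # keep only the last position
        probs = torch.softmax(logits, dim=-1)  # logits -> probabilities
        next_id = torch.multinomial(probs, 1)  # sample one character id
        idx = torch.cat([idx, next_id], dim=1) # append and repeat
    return idx
```

Each new character is fed back in as context for the next one, which is why the output appears in the terminal one character at a time.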
- This is not a full LLM; it is a minimal, educational implementation of a transformer for next-character prediction.
- The network is intentionally small and trained on a tiny dataset to keep the entire pipeline understandable.
- Ideal for learning the mechanics of LLMs: embeddings, positional encodings, attention, and autoregressive sampling.
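The attention mechanism these notes refer to can be sketched in a few lines. This is a generic single-head causal self-attention, not the repository's exact module; shapes and names are illustrative:

```python
import torch
import torch.nn.functional as F

def causal_self_attention(x, wq, wk, wv):
    # x: (B, T, C) embeddings (token + positional); wq/wk/wv: (C, C) projections.
    B, T, C = x.shape
    q, k, v = x @ wq, x @ wk, x @ wv
    att = (q @ k.transpose(-2, -1)) / (k.shape[-1] ** 0.5)  # scaled dot-product
    mask = torch.tril(torch.ones(T, T, dtype=torch.bool))
    att = att.masked_fill(~mask, float("-inf"))  # no peeking at future characters
    att = F.softmax(att, dim=-1)
    return att @ v  # (B, T, C) weighted mix of past positions
```

The causal mask is what makes the model autoregressive: position `t` can only attend to positions `0..t`, so changing a later character never affects earlier outputs.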
