How do transformers relate to short context statistical patterns such as N-grams?
Use Python 3.12. Make sure python3.12-dev and a C compiler are installed (both are needed for torch.compile).
First, install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
Then running any script with uv will automatically install the dependencies. Optionally, you can install the dependencies yourself with:
uv sync
Add data to data/
Given a dataset in data/ (tinystories_1gb in the commands below; substitute your own dataset name, e.g. wikipedia):
uv run utils/tokenizer.py tinystories_1gb --name tinystories_1gb
#uv run utils/dataset.py [dataset] [tokenizer] --batch_size 100000
uv run utils/dataset.py tinystories_1gb tinystories_1gb --delineate
uv run utils/ngram.py tinystories_1gb --tokenizer_name tinystories_1gb --ngram_file tinystories_1gb --ngram_size 8
cargo run
uv run train.py --config llama_medium --dataset tinystories_1gb
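For orientation, here is a minimal sketch of the kind of short-context statistic an N-gram pass collects: counts of contiguous token windows, and the empirical next-token distribution for each context. It is not the code in utils/ngram.py. The function names `count_ngrams` and `next_token_distribution` and the toy token ids are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def count_ngrams(tokens, n):
    """Count every contiguous window of n tokens (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def next_token_distribution(tokens, n):
    """Map each (n-1)-token context to the empirical distribution of the next token."""
    followers = defaultdict(Counter)
    for gram, count in count_ngrams(tokens, n).items():
        # Split each n-gram into its context (first n-1 tokens) and its final token.
        followers[gram[:-1]][gram[-1]] += count
    return {
        context: {tok: c / sum(counts.values()) for tok, c in counts.items()}
        for context, counts in followers.items()
    }


# Toy token ids, not real tokenizer output.
tokens = [1, 2, 3, 1, 2, 4, 1, 2, 3]

bigram_counts = count_ngrams(tokens, 2)
trigram_dist = next_token_distribution(tokens, 3)

print(bigram_counts[(1, 2)])   # how often the pair (1, 2) occurs
print(trigram_dist[(1, 2)])    # what tends to follow the context (1, 2)
```

A distribution like `trigram_dist` is the sort of baseline a transformer's next-token predictions could be compared against; with --ngram_size 8 the context would be up to seven tokens rather than two.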