# GPT-2 experiments

This repo is a fork of llm.c.

## Features

- HellaSwag benchmarking and model sampling while using `torch.compile`
- Faster downloading of fineweb10B using aria2
- Visualization with TensorBoard
- Checkpointing that allows training to be resumed
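
As an illustration of the aria2-based download mentioned above, here is a minimal sketch of how a multi-connection `aria2c` invocation could be assembled from Python. It is not taken from this repo's scripts: the helper names `build_aria2_command` and `download`, the `example.com` shard URL, and the `data/fineweb10B` directory are all hypothetical placeholders. Only the `aria2c` options themselves (`--dir`, `--max-connection-per-server`, `--split`, `--continue`, `--force-sequential`) are real flags.

```python
import subprocess


def build_aria2_command(urls, out_dir, connections=16):
    """Build an aria2c argument list (hypothetical helper, not from the repo).

    Each URL is fetched as a separate file, split across `connections`
    parallel connections to the server.
    """
    # aria2c only accepts 1 to 16 connections per server.
    if not 1 <= connections <= 16:
        raise ValueError("connections must be between 1 and 16")
    return [
        "aria2c",
        f"--dir={out_dir}",                            # where files are saved
        f"--max-connection-per-server={connections}",  # parallel connections per host
        f"--split={connections}",                      # pieces per file
        "--continue=true",                             # resume partial downloads
        "--force-sequential=true",                     # treat each URL as its own file
        *urls,
    ]


def download(urls, out_dir, connections=16):
    """Run aria2c; raises CalledProcessError if the download fails."""
    subprocess.run(build_aria2_command(urls, out_dir, connections), check=True)


if __name__ == "__main__":
    # Placeholder URL: substitute the real location of the dataset shards.
    shard_urls = ["https://example.com/fineweb10B/shard_000001.bin"]
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_aria2_command(shard_urls, "data/fineweb10B")))
```

Without `--force-sequential`, `aria2c` treats several URLs given on the command line as mirrors of one file, which is why the flag is set here. Calling `download(...)` requires `aria2c` to be installed and on the `PATH`.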