This is the GitHub repo for the paper "Language Models Use Trigonometry to Do Addition." We find that LLMs represent numbers on a helix and manipulate that helix to do addition.
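To make the helix idea concrete, here is a minimal sketch that is not code from this repo. It assumes a number `a` is encoded as a linear term plus cosine/sine pairs at a few periods, and that two such encodings can be combined with the angle-addition identities. The function names `helix` and `add_on_helix` and the period values in `PERIODS` are illustrative assumptions; see the paper and paper_figures.ipynb for the actual fits.

```python
import math

# Illustrative periods only (an assumption for this sketch, not read from the repo).
PERIODS = (2, 5, 10, 100)

def helix(a, periods=PERIODS):
    """Encode a as [a, cos(2*pi*a/T), sin(2*pi*a/T), ...] over the given periods."""
    feats = [float(a)]
    for T in periods:
        angle = 2 * math.pi * a / T
        feats.extend([math.cos(angle), math.sin(angle)])
    return feats

def add_on_helix(ha, hb):
    """Combine helix(a) and helix(b) into helix(a + b) without decoding a or b.

    The linear terms add directly; each (cos, sin) pair is combined with
    cos(x + y) = cos x cos y - sin x sin y and
    sin(x + y) = sin x cos y + cos x sin y.
    """
    out = [ha[0] + hb[0]]
    for i in range(1, len(ha), 2):
        ca, sa = ha[i], ha[i + 1]
        cb, sb = hb[i], hb[i + 1]
        out.extend([ca * cb - sa * sb, sa * cb + ca * sb])
    return out

combined = add_on_helix(helix(23), helix(58))
expected = helix(81)
# Check that combining the two encodings matches the direct encoding of 23 + 58.
print(all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(combined, expected)))
```

This only illustrates the trigonometric identity behind the claim; it says nothing about how a model implements the computation internally.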
All figures from the main body of the paper can be reproduced using GPT-J in paper_figures.ipynb. For details on the experiments used to generate these figures, please refer to the experimentation/ directory. Instructions for each specific experiment can be found within paper_figures.ipynb.
All required libraries are listed in requirements.txt.
There are two pre-generated data folders downloadable on Dropbox for reproducing all results in paper_figures.ipynb. These include model activations and helical fits.
For questions, please reach out to me at subhashk@mit.edu.
For now, please cite this paper as:
@misc{KantamneniAddition,
Author = {Subhash Kantamneni and Max Tegmark},
Title = {Language Models Use Trigonometry to Do Addition},
Year = {2025},
Eprint = {arXiv:2502.00873},
}