
borjanG/2023-transformers-rotf


A mathematical perspective on Transformers

Python code for the paper "A mathematical perspective on Transformers" by Borjan Geshkovski, Cyril Letrouit, Yury Polyanskiy, and Philippe Rigollet.


Abstract

Transformers play a central role in the inner workings of large language models. We develop a mathematical framework for analyzing Transformers based on their interpretation as interacting particle systems, which reveals that clusters emerge in long time. Our study explores the underlying theory and offers new perspectives for mathematicians as well as computer scientists.
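To make the interacting-particle picture concrete, here is a minimal sketch of self-attention dynamics on the unit sphere. It is not taken from this repository. It assumes a simplified model with identity query, key, and value matrices, a forward Euler step, and renormalization after each step. The function names and the parameters `beta`, `dt`, and `steps` are illustrative choices.

```python
import numpy as np


def init_particles(n=32, d=3, seed=0):
    """Draw n points uniformly at random on the unit sphere in R^d."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def evolve(x, beta=1.0, dt=0.1, steps=2000):
    """Evolve particles under simplified self-attention dynamics.

    Assumption: each particle moves toward the softmax-weighted average
    of all particles, projected onto the tangent space of the sphere.
    """
    x = x.copy()
    for _ in range(steps):
        # Attention weights: row-wise softmax of beta * <x_i, x_j>.
        logits = beta * (x @ x.T)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        weights = np.exp(logits)
        weights /= weights.sum(axis=1, keepdims=True)

        # Attention output for each particle.
        drift = weights @ x
        # Remove the radial component so the drift is tangent to the sphere.
        drift -= np.sum(drift * x, axis=1, keepdims=True) * x

        # Forward Euler step, then renormalize back onto the sphere.
        x = x + dt * drift
        x /= np.linalg.norm(x, axis=1, keepdims=True)
    return x


def min_pairwise_inner_product(x):
    """Smallest inner product between any two particles (1.0 means one cluster)."""
    return float((x @ x.T).min())


if __name__ == "__main__":
    x0 = init_particles()
    x1 = evolve(x0)
    print("initial min <x_i, x_j>:", min_pairwise_inner_product(x0))
    print("final   min <x_i, x_j>:", min_pairwise_inner_product(x1))
```

Under these assumptions, a minimum pairwise inner product that moves toward 1 as `steps` grows indicates the particles collapsing into a single cluster. Varying `beta` changes how sharply each particle attends to its nearest neighbors.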

Citing

@article{geshkovski2023perspective,
      title={A mathematical perspective on Transformers}, 
      author={Borjan Geshkovski and Cyril Letrouit and Yury Polyanskiy and Philippe Rigollet},
      year={2023},
      eprint={2312.10794},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
