Looped Transformers are Better at Learning Learning Algorithms

Liu Yang, Kangwook Lee, Robert D. Nowak, Dimitris Papailiopoulos

You can find the paper on arXiv (arXiv:2311.12424).

Overview

This codebase contains the implementation of the looped transformer, which is trained on prompts generated from (1) linear functions, (2) sparse linear functions, (3) decision trees, and (4) 2-layer ReLU neural networks. Beyond these function classes, we also include code to generate data from OpenML datasets and train on it, as well as code for model probing. The backbone transformer code is based on NanoGPT, while the prompt generation code is based on Garg et al.'s codebase.
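As a rough illustration of the looping pattern, a looped transformer applies one transformer block repeatedly and re-injects the prompt embedding at each iteration. The following is a minimal sketch only, not the repository's NanoGPT-based implementation; the class, layer, and argument names here are assumptions for illustration.

import torch
import torch.nn as nn

class LoopedTransformerSketch(nn.Module):
    # Minimal sketch: one transformer block applied repeatedly, with the
    # prompt embedding re-injected at every loop iteration.
    def __init__(self, d_model=256, n_head=8, n_iters=12):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model, n_head, batch_first=True)
        self.n_iters = n_iters

    def forward(self, x_emb):
        h = torch.zeros_like(x_emb)       # looped hidden state, initialized to zero
        for _ in range(self.n_iters):
            h = self.block(h + x_emb)     # input injection at every iteration
        return h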

@article{yang2023looped,
  title={Looped Transformers are Better at Learning Learning Algorithms},
  author={Yang, Liu and Lee, Kangwook and Nowak, Robert and Papailiopoulos, Dimitris},
  journal={arXiv preprint arXiv:2311.12424},
  year={2023}
}

Setup

Please install and activate the environment with:

conda env create -f environment.yml
conda activate loop_tf

Running Experiments

  • For standard transformer training, refer to and execute bash exec/script_baseline.sh.
  • For looped transformer training, refer to and execute bash exec/script_loop.sh.
    • The parameter b determines the maximum number of loop iterations during training.
    • The parameter T sets the loop window size (see the sketch after this list).
  • To probe a trained model, refer to and execute bash exec/script_probe.sh.
  • To work with the OpenML dataset for both standard and looped transformers, refer to and execute bash exec/script_openml.sh.
  • To plot and compare with baseline methods, refer to notebooks in the jupyter_notebooks folder.
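One plausible reading of how b and T interact during looped training is sketched below. This is a hypothetical sketch under the assumption that the loop runs for a sampled number of iterations bounded by b and that only the last T iterations (the loop window) receive gradients; the actual training scripts in this repo may differ, and the function and argument names are assumptions.

import random
import torch

def looped_train_step(model, x_emb, y, readout, loss_fn, optimizer, b=20, T=5):
    # Hypothetical loop-window training step:
    # run up to b loop iterations in total, backpropagate only through the last T.
    n_iters = random.randint(T, b)        # sampled total iterations this step (assumption)
    h = torch.zeros_like(x_emb)
    with torch.no_grad():                 # iterations outside the window carry no gradients
        for _ in range(n_iters - T):
            h = model.block(h + x_emb)
    for _ in range(T):                    # the loop window: gradients flow through these
        h = model.block(h + x_emb)
    loss = loss_fn(readout(h), y)         # readout maps hidden states to predictions (assumption)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()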
