- fast_transformers: Contains the actual implementation of the transformer models. Most of the code in this folder is adapted from https://github.com/idiap/fast-transformers
- utils: Contains code shared between all experiments, not pertaining to the underlying transformer
- text: Contains code to replicate the Text Experiment
- retrieval: Contains code to replicate the Retrieval Experiment
- listops: Contains code to replicate the ListOps Experiment
- Python (tested with 3.6.12)
- PyTorch (tested with 1.5.0)
- TensorFlow (tested with 2.2.0)
- TensorBoard (tested with 2.4.0)
- TensorFlow Datasets (tested with 1.2.0)
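A minimal sketch of installing these dependencies with pip, assuming the standard PyPI package names (`torch`, `tensorflow`, `tensorboard`, `tensorflow-datasets`); other versions may work but are untested:

```bash
# Sketch: install the tested versions listed above (PyPI package names assumed).
# Note: tensorflow 2.2.0 may pin an older tensorboard, so pip can warn about the 2.4.0 pin.
pip install torch==1.5.0 tensorflow==2.2.0 tensorboard==2.4.0 tensorflow-datasets==1.2.0
```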
- Install the requirements listed above
- Install fast_transformers from the main directory, preferably in editable mode, with `pip install -e .`
- Download the LRA datasets for ListOps and Retrieval, and fix the "DATA_PATH" variables in retrieval/train.py and listops/train.py to point to the folder containing the tsv files.
- For the FastFood methods to work, one additionally needs to install the CUDA kernels for fastfood
- Experiments can now be run with `<experiment_name>/train.py <attention_type>`, where "experiment_name" is one of "text", "retrieval" or "listops", and "attention_type" is one of the following (see the example commands after the table):
| Attention Name in Paper | attention_type to pass to code |
|---|---|
| Softmax Transformer | softmax |
| GMM-PRF | mix-gauss-positive |
| GMM-RKS | mix-gauss-fourier |
| FastFood-RKS | fsgb-fastfood |
| FastFood-PRF | fsgb-positive-fastfood |
| Generator-RKS | generative-fourier |
| Generator-PRF | generative-positive |
For text experiments, the maximum sequence length can additionally be passed as a second parameter to replicate Figure 4 from our paper.
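For example, a sketch of launching runs from the repository root (the `python` invocation and the maximum length value 1024 below are illustrative assumptions, not settings from the paper):

```bash
# Text experiment with the standard softmax transformer
python text/train.py softmax

# Retrieval experiment with GMM-RKS attention
python retrieval/train.py mix-gauss-fourier

# Text experiment with an illustrative maximum sequence length of 1024
# (second positional argument; text experiment only)
python text/train.py softmax 1024
```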