Source code for TransJect: Transformer with enforced injectivity

TransJect is an encoder-decoder model that guarantees a theoretical bound on layer-wise distance preservation between pairs of tokens. It replaces dot-product attention with a simple alternative that ensures Lipschitz continuity, allowing TransJect to learn injective mappings that transform token representations onto different manifolds with similar topology while preserving the Euclidean distance between every pair of tokens in subsequent layers.
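
The distance-preservation idea can be illustrated with a minimal PyTorch sketch. This is not the repository's implementation (the class name OrthogonalMixer and the shapes are hypothetical, and the link to the --use_ortho option is an assumption): an orthogonally constrained linear map is an isometry, so pairwise Euclidean distances between token representations are unchanged after the transform.

# Minimal sketch, not the authors' code: an orthogonally parameterized
# token-mixing layer that preserves pairwise Euclidean distances exactly.
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

class OrthogonalMixer(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        # Constrain the weight to the orthogonal group, so ||Wx - Wy|| = ||x - y||.
        self.proj = orthogonal(nn.Linear(d_model, d_model, bias=False))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        return self.proj(x)

x = torch.randn(2, 16, 512)
y = OrthogonalMixer(512)(x)
# Pairwise token distances are preserved up to floating-point error.
print(torch.allclose(torch.cdist(x, x), torch.cdist(y, y), atol=1e-4))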

How to run

Run TransJect on IMDb classification

python imdb_classification.py \
	--use_ortho \
	--epochs 10

Run TransJect on PTB language modelling

python language_modeling.py \
	--use_ortho \
	--use_rezero \
	--n_head 4 \
	--dataset ptb \
	--d_model 512 \
	--epochs 10

Citation

If you find this repo useful, please cite our paper:

@inproceedings{,
  author    = {Ayan Sengupta and
               Md. Shad Akhtar and
               Tanmoy Chakraborty},
  title     = {Manifold-Preserving Transformers are Effective for Short-Long Range Encoding},
  booktitle = {},
  publisher = {},
  year      = {},
  url       = {},
  doi       = {},
}
