SQ-Transformer: Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings
This repository hosts the code for the paper Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings by Yichen Jiang, Xiang Zhou, and Mohit Bansal.
- This project is built on Python 3.6.8, PyTorch 1.10.1, and fairseq 0.10.2. All dependencies can be installed via `pip install -r requirements.txt` (a minimal environment-setup sketch is shown below).
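For a clean setup, one option is to create a fresh environment pinned to the versions above before installing the requirements. This is only a sketch; the environment name `sq-transformer` is a placeholder, not part of the repo.

```bash
# Hypothetical fresh environment matching the versions listed above;
# the environment name "sq-transformer" is just a placeholder.
conda create -n sq-transformer python=3.6.8
conda activate sq-transformer

# Install the pinned dependencies from the repository root.
pip install -r requirements.txt

# Sanity-check that the expected packages are importable.
python -c "import torch, fairseq; print(torch.__version__, fairseq.__version__)"
```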
- The vector quantization code in `./ar_seq2seq/vector_quantization.py` is adapted from an older version of the vector-quantize-pytorch repo.
- The main components of the model are implemented in `./ar_seq2seq`.
- We also use some basic Transformer layers and modules from Latent-Glat in `./nat`.
- In this work, we use the SCAN, COGS, CoGnition, WMT17 En-De, and WMT14 En-Fr datasets to train and evaluate our models.
- Since SCAN, COGS, and CoGnition do not have validation sets that require compositional generalization, we randomly select 20% of the examples from their test sets to serve as the corresponding validation sets (a sketch of such a split is shown after this list).
- The processed data binaries for SCAN, COGS, and CoGnition can be downloaded from this Google Drive.
- Please follow the official fairseq documentation to process the WMT data (a sketch of a typical binarization command is also shown below).
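For reference, a minimal sketch of the kind of 80/20 test/validation split described above, keeping source and target lines aligned. The file names are placeholders and this is not one of the repo's actual scripts; the provided data binaries already contain such splits.

```bash
# Hypothetical reproducible 80/20 split of a parallel test set into new
# test/validation portions; all file names are placeholders.
paste test.src test.tgt | shuf --random-source=<(yes 42) > test.shuf
n_valid=$(( $(wc -l < test.shuf) / 5 ))   # 20% of the examples
head -n "$n_valid" test.shuf | cut -f1 > valid.src
head -n "$n_valid" test.shuf | cut -f2 > valid.tgt
tail -n +"$(( n_valid + 1 ))" test.shuf | cut -f1 > test_remaining.src
tail -n +"$(( n_valid + 1 ))" test.shuf | cut -f2 > test_remaining.tgt
```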
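For the WMT data, the binarization step would typically look something like the fairseq-preprocess call below. The file prefixes and destination directory are placeholders, and the upstream tokenization/BPE steps should follow the official fairseq WMT recipes.

```bash
# Hypothetical binarization of already-tokenized, BPE-applied WMT17 En-De
# splits; the file prefixes and destination directory are placeholders.
fairseq-preprocess \
    --source-lang en --target-lang de \
    --trainpref wmt17_en_de/train \
    --validpref wmt17_en_de/valid \
    --testpref wmt17_en_de/test \
    --destdir data/wmt17_en_de \
    --joined-dictionary \
    --workers 8
```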
The repository and data directories are organized as follows:
```
SQ-Transformer/
├── data
│   ├── scan_jump_x2_v2_1000prim/
│   │   ├── dict.src.txt
│   │   ├── dict.tgt.txt
│   │   ├── preprocess.log
│   │   ├── test.src-tgt.src.bin
│   │   ├── test.src-tgt.src.idx
│   │   ├── test.src-tgt.tgt.bin
│   │   ├── test.src-tgt.tgt.idx
│   │   ├── train.src-tgt.src.bin
│   │   ├── train.src-tgt.src.idx
│   │   ├── train.src-tgt.tgt.bin
│   │   ├── train.src-tgt.tgt.idx
│   │   ├── valid.src-tgt.src.bin
│   │   ├── valid.src-tgt.src.idx
│   │   ├── valid.src-tgt.tgt.bin
│   │   └── valid.src-tgt.tgt.idx
│   ├── scan_around_right/
│   │   ├── dict.src.txt
│   │   ├── dict.tgt.txt
│   │   ├── preprocess.log
│   │   ├── test.src-tgt.src.bin
│   │   ├── test.src-tgt.src.idx
│   │   ├── test.src-tgt.tgt.bin
│   │   ├── test.src-tgt.tgt.idx
│   │   ├── train.src-tgt.src.bin
│   │   ├── train.src-tgt.src.idx
│   │   ├── train.src-tgt.tgt.bin
│   │   ├── train.src-tgt.tgt.idx
│   │   ├── valid.src-tgt.src.bin
│   │   ├── valid.src-tgt.src.idx
│   │   ├── valid.src-tgt.tgt.bin
│   │   └── valid.src-tgt.tgt.idx
│   ├── cogs/
│   │   ├── dict.src.txt
│   │   ├── dict.tgt.txt
│   │   ├── preprocess.log
│   │   ├── test.src-tgt.src.bin
│   │   ├── test.src-tgt.src.idx
│   │   ├── test.src-tgt.tgt.bin
│   │   ├── test.src-tgt.tgt.idx
│   │   ├── train.src-tgt.src.bin
│   │   ├── train.src-tgt.src.idx
│   │   ├── train.src-tgt.tgt.bin
│   │   ├── train.src-tgt.tgt.idx
│   │   ├── valid.src-tgt.src.bin
│   │   ├── valid.src-tgt.src.idx
│   │   ├── valid.src-tgt.tgt.bin
│   │   └── valid.src-tgt.tgt.idx
│   ├── cognition_cg/
│   │   ├── dict.en.txt
│   │   ├── dict.zh.txt
│   │   ├── preprocess.log
│   │   ├── test.en-zh.en
│   │   ├── test.en-zh.zh
│   │   ├── train.en-zh.en
│   │   ├── train.en-zh.zh
│   │   ├── valid.en-zh.en
│   │   └── valid.en-zh.zh
│   ├── wmt14_en_fr/
│   └── wmt17_en_de
├── raw_data
│   ├── cognition
│   │   ├── cg-test
│   │   │   ├── cg-test.compound
│   │   │   ├── cg-test.en
│   │   │   ├── cg-test.zh
│   │   │   ├── NP
│   │   │   ├── PP
│   │   │   └── VP
│   │   ├── processed
│   │   │   ├── test.en
│   │   │   ├── test.zh
│   │   │   ├── train.en
│   │   │   ├── train.zh
│   │   │   ├── valid.en
│   │   │   └── valid.zh
```
- To train on SCAN AddJump 2x (augmented), run `./train_scripts/train_vq_seq2seq_scan_jump.sh`.
- To train on SCAN AroundRight, run `./train_scripts/train_vq_seq2seq_scan_aroundright.sh`.
- To train on COGS, run `./train_scripts/train_vq_seq2seq_cogs.sh`.
- To train on CoGnition, run `./train_scripts/train_vq_seq2seq_cognition.sh` (a sketch of a typical underlying training invocation follows this list).
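Since the project is built on fairseq, the training scripts presumably wrap a fairseq training command resembling the call below. This is only a hedged sketch: the architecture name `sq_transformer`, the `--user-dir ./ar_seq2seq` setting, and all hyperparameters are placeholders rather than the repo's actual configuration; consult the scripts for the real flags.

```bash
# Hypothetical fairseq-train invocation; the architecture name, user-dir,
# and all hyperparameters are placeholders -- see ./train_scripts/ for the
# actual commands.
fairseq-train data/cogs \
    --user-dir ./ar_seq2seq \
    --task translation --source-lang src --target-lang tgt \
    --arch sq_transformer \
    --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
    --lr 5e-4 --lr-scheduler inverse_sqrt --warmup-updates 4000 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --max-tokens 4096 --max-update 50000 \
    --save-dir checkpoints/cogs
```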
- To evaluate on SCAN AddJump 2x (augmented), run `./eval_scripts/eval_vq_seq2seq_scan_jump.sh`.
- To evaluate on SCAN AroundRight, run `./eval_scripts/eval_vq_seq2seq_scan_aroundright.sh`.
- To evaluate on COGS, run `./eval_scripts/eval_vq_seq2seq_cogs.sh`.
- To evaluate on CoGnition, run `./eval_scripts/eval_vq_seq2seq_cognition.sh` (a sketch of a typical decoding step follows this list).
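The evaluation scripts decode from a trained checkpoint; a sketch of what that step might look like with fairseq-generate is shown below. The data directory, checkpoint path, and decoding options are placeholders; the actual commands and metric computation live in `./eval_scripts/`.

```bash
# Hypothetical decoding call; the data directory, checkpoint path, and
# decoding options are placeholders -- see ./eval_scripts/ for the actual
# evaluation commands.
fairseq-generate data/cogs \
    --user-dir ./ar_seq2seq \
    --gen-subset test \
    --path checkpoints/cogs/checkpoint_best.pt \
    --batch-size 128 --beam 5
```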
If you find this work useful, please cite:
```
@article{jiang2024SQTransformer,
  title={Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings},
  author={Jiang, Yichen and Zhou, Xiang and Bansal, Mohit},
  journal={arXiv preprint arXiv:2402.06492},
  year={2024}
}
```