
Syntactic Attention

This repository contains the code associated with the following paper:

Title: Compositional generalization in a deep seq2seq model by separating syntax and semantics

Abstract: Standard methods in deep learning for natural language processing fail to capture the compositional structure of human language that allows for systematic generalization outside of the training distribution. However, human learners readily generalize in this way, e.g. by applying known grammatical rules to novel words. Inspired by work in neuroscience suggesting separate brain systems for syntactic and semantic processing, we implement a modification to standard approaches in neural machine translation, imposing an analogous separation. The novel model, which we call Syntactic Attention, substantially outperforms standard methods in deep learning on the SCAN dataset, a compositional generalization task, without any hand-engineered features or additional supervision. Our work suggests that separating syntactic from semantic learning may be a useful heuristic for capturing compositional structure.
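
To make the mechanism concrete, here is a minimal PyTorch sketch of the separation (an illustration under simplified assumptions, not the repository's implementation; see SyntacticAttention.py for the real model): attention weights are computed from an order-sensitive "syntactic" encoding of the input, while the values being mixed come from a separate, order-blind "semantic" embedding.

import torch
import torch.nn as nn

class SeparatedAttention(nn.Module):
    """Toy illustration of the syntax/semantics split: attention weights
    come from a recurrent (order-sensitive) syntactic stream, while the
    attended values come from a context-free semantic embedding of each
    word. Dimensions and names here are illustrative only."""

    def __init__(self, vocab_size, syn_dim=200, sem_dim=120):
        super().__init__()
        self.syn_embed = nn.Embedding(vocab_size, syn_dim)
        # Syntactic stream: a BiLSTM that sees word order
        self.syn_rnn = nn.LSTM(syn_dim, syn_dim // 2,
                               bidirectional=True, batch_first=True)
        # Semantic stream: a plain embedding, blind to word order
        self.sem_embed = nn.Embedding(vocab_size, sem_dim)

    def forward(self, tokens, query):
        # tokens: (batch, seq_len) word indices
        # query: (batch, syn_dim) decoder state driving the attention
        syn, _ = self.syn_rnn(self.syn_embed(tokens))             # (B, T, syn_dim)
        scores = torch.bmm(syn, query.unsqueeze(2)).squeeze(2)    # (B, T)
        weights = torch.softmax(scores, dim=1)
        values = self.sem_embed(tokens)                           # (B, T, sem_dim)
        # Weights from the syntactic stream, values from the semantic one
        return torch.bmm(weights.unsqueeze(1), values).squeeze(1)  # (B, sem_dim)

The default dimensions 200 and 120 echo the x_hidden_dim and m_hidden_dim defaults listed under Usage below, but the correspondence is only suggestive.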

Summary

This directory contains the following:

  • data: directory containing the add-jump split of the SCAN dataset
  • data.py: classes for building a PyTorch dataset and dataloader for SCAN (the file format is sketched below)
  • SyntacticAttention.py: the Syntactic Attention model
  • train_test.py: training script with periodic validation and testing
  • results: results of one run of Syntactic Attention on the add-jump split
  • utils.py: helper functions
  • vocab.json: vocabulary for SCAN
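
Each line of a SCAN data file pairs a command with its target action sequence in the form IN: <command> OUT: <actions>. A minimal standalone parser (independent of the classes in data.py) looks like this:

def parse_scan_line(line):
    """Split one SCAN line, e.g. 'IN: jump twice OUT: JUMP JUMP',
    into (command_tokens, action_tokens)."""
    command, actions = line.strip()[len("IN: "):].split(" OUT: ")
    return command.split(), actions.split()

with open("data/tasks_train_addprim_jump.txt") as f:
    pairs = [parse_scan_line(line) for line in f]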

Requirements

conda install pytorch torchvision -c pytorch

  • The add-jump split is already in the data directory; the rest of the SCAN dataset can be downloaded from the SCAN repository (https://github.com/brendenlake/SCAN)

Training and testing

To train and test the Syntactic Attention model on the add-jump split with the default hyperparameters:

python train_test.py

This will train the model and produce a JSON file in the results directory containing loss data, along with the results of periodic evaluations on the full training, validation, and test sets. The default hyperparameters and their descriptions can be found in train_test.py.
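
To inspect what was recorded, you can load the results file directly; this assumes you kept the output name shown in the usage string below (adjust the path for your own --out_data_file):

import json

with open("results/RESULTS.json") as f:
    results = json.load(f)

# The exact schema is defined in train_test.py; listing the top-level
# keys shows what was recorded (loss curve, periodic accuracies, etc.)
print(list(results.keys()))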

Usage

usage: train_test.py [--train_data_file data/tasks_train_addprim_jump.txt]
                     [--test_data_file data/tasks_test_addprim_jump.txt]
                     [--load_vocab_json vocab.json] [--batch_size 1]
                     [--num_iters 200000] [--rnn_type LSTM]
                     [--m_hidden_dim 120] [--x_hidden_dim 200] [--n_layers 1]
                     [--dropout_p 0.5]
                     [--seq_sem False] [--syn_act False] [--sem_mlp False]
                     [--load_weights_from CHECKPOINT.pt] [--learning_rate 0.001]
                     [--clip_norm 5.0]
                     [--results_dir results] [--out_data_file RESULTS.json]
                     [--checkpoint_path CHECKPOINT.pt] [--checkpoint_every 5]
                     [--record_loss_every 400]
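
For example, to run a shorter experiment and write results under a custom name (the flag values here are illustrative, not recommendations):

python train_test.py --num_iters 50000 --learning_rate 0.001 --out_data_file jump_run.json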
