
syntactic-augmentation-nli

This repository contains the syntactic augmentation datasets used to improve robustness in NLI in our ACL 2020 paper, Syntactic Data Augmentation Increases Robustness to Inference Heuristics, by Junghyun Min¹, Tom McCoy¹, Dipanjan Das², Emily Pitler², and Tal Linzen¹. A 7-minute presentation on the paper can be accessed here.

¹Department of Cognitive Science, Johns Hopkins University, Baltimore, MD

²Google Research, New York, NY

Data

Augmentation datasets are in the datasets folder. Each file is named using the following abbreviations:

Transformation strategies:

  • inv: inversion
  • pass: passivization
  • comb: combination of inversion and passivization
  • chaos: random shuffling condition

Sentence pair:

  • orig: original premise as premise, transformed hypothesis as hypothesis
  • trsf: original hypothesis as premise, transformed hypothesis as hypothesis

Label:

  • pos: augmentation examples whose label is entailment
  • neg: augmentation examples whose label is nonentailment

Size:

  • small: 101 examples
  • medium: 405 examples
  • large: 1215 examples

For example, pass_trsf_pos_small.tsv is a set of 101 passivization examples in the transformed-hypothesis (trsf) configuration, whose labels are entailment. Also, please note that the combined transformed-hypothesis nonentailment datasets (comb_trsf_neg_large.tsv, etc.) are not discussed or reported in our paper.

Fields within each file are equivalent to those in the MNLI datasets downloadable from GLUE. However, only four fields are populated: index, sentence1 (premise), sentence2 (hypothesis), and gold_label.
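
To take a quick look at one of these files, the examples can be loaded like any MNLI-style TSV. A minimal sketch in Python, assuming pandas is available, that the file carries the same header row as MNLI's train.tsv, and that the path below points at your local copy of the datasets folder:

import pandas as pd

# Load one augmentation set; only index, sentence1, sentence2,
# and gold_label carry data (the path is illustrative).
# quoting=3 is csv.QUOTE_NONE, since MNLI-style TSVs are unquoted.
df = pd.read_csv("datasets/pass_trsf_pos_small.tsv", sep="\t", quoting=3)
print(df[["index", "sentence1", "sentence2", "gold_label"]].head())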

Script

The attached .tsv data files were used to augment the MultiNLI training set in our experiments. They are randomly selected subsets, or unions of subsets, of the transformations created by running generate_dataset.py, which requires MultiNLI's multinli_1.0_train.jsonl to run. Simply modify the MNLI path argument before running python2 generate_dataset.py.

This will create four files: inv_orig.tsv, inv_trsf.tsv, pass_orig.tsv, and pass_trsf.tsv. From these four files, individual augmentation sets similar to those included in the datasets folder can be created by subsetting and/or concatenating.
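
As a sketch of that subsetting step, the random sampling can be done directly in Python (the file names mirror the shell example further below; the seed is illustrative, and the generated files are assumed to have no header row):

import random

# Draw a random "large"-sized subset (1215 examples) of the
# generated inversion examples, matching the paper's size conditions.
random.seed(0)  # illustrative seed, for reproducibility
with open("inv_trsf.tsv") as f:
    lines = f.readlines()
with open("inv_trsf_large.tsv", "w") as f:
    f.writelines(random.sample(lines, 1215))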

Config

In the config folder, bert_config.json contains BERT configurations, while train.sh and hans_pred.sh contain training, evaluation, and prediction parameters for running BERT's run_classifier.py.

Training and evaluating on MNLI and HANS

If you haven't already downloaded BERT and the MNLI data, now is the time. You can download BERT from its repository, and the MNLI data by running download_glue_data.py. The download includes the files mentioned below, such as train.tsv and test_matched.tsv:

python download_glue_data.py --data_dir ~/download/path --tasks MNLI

To finetune BERT with an augmented training set, you can concatenate an augmentation set to the training set train.tsv:

shuf -n1215 inv_trsf.tsv > inv_trsf_large.tsv
mv train.tsv train_orig.tsv
cat train_orig.tsv inv_trsf_large.tsv > train.tsv

and finetune BERT as you would on an unaugmented set by running train.sh.

Once the model is trained, it will also be evaluated on MNLI, and the results will be recorded in eval_results.txt in your output folder. It'll look something like this:

eval_accuracy = 0.8471727
eval_loss = 0.481841
global_step = 36929
loss = 0.48185167

Along with the results file, you'll also see checkpoint files starting with model.ckpt-some-number. These are model weights saved at particular points in training; the higher the number, the closer the checkpoint is to the end of training. If you used the large augmentation set, your trained model will be model.ckpt-36929.
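
As a sanity check on that number, assuming the usual BERT MNLI recipe of 3 training epochs and a batch size of 32: (392,702 MNLI training examples + 1,215 augmentation examples) × 3 / 32 ≈ 36,929 steps, which matches the final checkpoint above.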

To evaluate the model on HANS, you'll need to have downloaded the scripts and dataset from the HANS repository. Then, format heuristics_evaluation_set.txt to resemble test_matched.tsv, with the fields sentence1 (premise) and sentence2 (hypothesis) as the 9th and 10th fields; the other fields can be filled with dummy values. The formatted file will also need to be named test_matched.tsv, so it is a good idea to keep the MNLI and HANS directories separate.
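
A minimal sketch of that reformatting step in Python, assuming the released HANS file names its premise and hypothesis columns sentence1 and sentence2 in its header (it does in the official release); the output path, the dummy filler string, and the reuse of HANS pairIDs are illustrative choices:

import csv

# Rewrite HANS examples into an MNLI-style test_matched.tsv so that
# sentence1 and sentence2 land in the 9th and 10th fields.
header = ["index", "promptID", "pairID", "genre",
          "sentence1_binary_parse", "sentence2_binary_parse",
          "sentence1_parse", "sentence2_parse",
          "sentence1", "sentence2"]
with open("heuristics_evaluation_set.txt") as src, \
        open("hans/test_matched.tsv", "w") as dst:
    reader = csv.DictReader(src, delimiter="\t")
    dst.write("\t".join(header) + "\n")
    for i, row in enumerate(reader):
        fields = ([str(i), "dummy", row["pairID"]] + ["dummy"] * 5
                  + [row["sentence1"], row["sentence2"]])
        dst.write("\t".join(fields) + "\n")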

Then, you can create the model's predictions on HANS with hans_pred.sh.

Once prediction is finished, it will produce test_results.tsv in your output folder. To analyze it, process the results:

python process_results.py
python evaluate_heur_output.py preds.txt

This will output HANS performance by heuristic, by subcase, and by template.
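
For reference, a rough sketch of what that processing amounts to, assuming test_results.tsv holds one tab-separated probability triple per line in BERT's MNLI label order (contradiction, entailment, neutral), that HANS pairIDs run ex0, ex1, ..., and that evaluate_heur_output.py expects a pairID,gold_label header; the repository's process_results.py is the authoritative version:

# Collapse three-way MNLI predictions into the binary
# entailment / non-entailment labels that HANS uses.
labels = ["contradiction", "entailment", "neutral"]  # BERT's MNLI label order
with open("test_results.tsv") as src, open("preds.txt", "w") as dst:
    dst.write("pairID,gold_label\n")
    for i, line in enumerate(src):
        probs = [float(p) for p in line.split("\t")]
        pred = labels[probs.index(max(probs))]
        if pred != "entailment":
            pred = "non-entailment"
        dst.write("ex%d,%s\n" % (i, pred))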

License

This repository is licensed under the MIT license.
