# FastFusionNet

A PyTorch implementation of FastFusionNet on SQuAD 1.1.

## Overview

This repo contains the code for *FastFusionNet: New State-of-the-Art for DAWNBench SQuAD*.

## News

We now support PyTorch version >=0.4.1 in a new branch. However, it is slightly slower.

## Requirements

```
torch==0.3.1
spacy==1.9.0
numpy
pandas
tqdm
tensorboardX
oldsru
```

Please also install SRU version 1 (oldsru) from here. Then download GloVe (Pennington et al., EMNLP 2014) and CoVe (McCann et al., NIPS 2017) with:

```sh
bash download.sh
```
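The downloaded GloVe embeddings are plain text, one token per line followed by its vector components. A minimal parsing sketch, using an inline example line rather than the downloaded file (`parse_glove_line` is a hypothetical helper, not part of this repo):

```python
def parse_glove_line(line):
    """Split one GloVe-format text line into (token, list of floats)."""
    parts = line.rstrip().split(" ")
    return parts[0], [float(x) for x in parts[1:]]

# Tiny inline example in GloVe's format (real vectors have 300 dimensions).
token, vec = parse_glove_line("the 0.418 0.24968 -0.41242")
print(token, len(vec))  # the 3
```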

## Preprocessing

Preprocess the data set; this takes about 10 minutes. `PATH_TO_SQUAD_TRAIN` should be the path to `train-v1.1.json` and `PATH_TO_SQUAD_DEV` the path to `dev-v1.1.json`. This will generate the preprocessed data file at `data/squad/data-fusion.pth`.

```sh
mkdir -p data/squad
python prepro.py --train PATH_TO_SQUAD_TRAIN --dev PATH_TO_SQUAD_DEV
```
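`prepro.py` reads the raw SQuAD v1.1 JSON, which nests articles → paragraphs → question-answer pairs. A minimal sketch of walking that structure, over a tiny inline example instead of the real `train-v1.1.json` (`count_qas` is a hypothetical helper for illustration, not part of `prepro.py`):

```python
import json

def count_qas(squad):
    """Count question-answer pairs in SQuAD-format data."""
    n = 0
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            n += len(paragraph["qas"])
    return n

# Tiny inline example mirroring the train-v1.1.json layout.
raw = json.dumps({"data": [{"title": "Example", "paragraphs": [{
    "context": "FastFusionNet was evaluated on SQuAD 1.1.",
    "qas": [{"id": "0", "question": "What was it evaluated on?",
             "answers": [{"text": "SQuAD 1.1", "answer_start": 31}]}]}]}]})
print(count_qas(json.loads(raw)))  # 1
```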

## Training

To train FastFusionNet (Wu et al., arXiv 2019):

```sh
SAVE='save/fastfusionnet'
mkdir -p $SAVE
python train.py --model_type fusionnet --hidden_size 125 --end_gru \
    --dropout_rnn 0.2 --data_suffix fusion --save_dir $SAVE \
    -lr 0.001 -gc 20 -e 100 --batch_size 32 \
    --rnn_type sru --fusion_reading_layers 2 --fusion_understanding_layers 2 --fusion_final_layers 2
```

To train FusionNet (Huang et al., ICLR 2018):

```sh
SAVE='save/fusionnet'
mkdir -p $SAVE
python train.py --model_type fusionnet --hidden_size 125 --end_gru \
    --dropout_rnn 0.4 --data_suffix fusion --save_dir $SAVE \
    -lr 0.001 -gc 20 -e 100 --batch_size 32 \
    --rnn_type lstm --fusion_reading_layers 1 --fusion_understanding_layers 1 --fusion_final_layers 1 --use_cove
```

To train GLDR-DrQA (Wu et al., arXiv 2017):

```sh
SAVE='save/gldr-drqa'
mkdir -p $SAVE
python train.py --model_type gldr-drqa --hidden_size 128 \
    --dropout_rnn 0.2 --data_suffix fusion --save_dir $SAVE \
    -lr 0.001 -gc 20 -e 100 --batch_size 32 \
    --doc_layers 17 --question_layers 9
```

## Evaluation

To evaluate the best trained model in `save/fastfusionnet` and measure latency (batch size = 1):

```sh
python eval.py --save_dir save/fastfusionnet --resume best_model.pt --eval_batch_size 1
```
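`eval.py` reports end-to-end latency at batch size 1. The measurement pattern behind such a number is roughly the following sketch, shown with a dummy `predict` function standing in for the model's forward pass (so the timing itself is meaningless here):

```python
import time

def predict(example):
    # Stand-in for one forward pass at batch size 1.
    return sum(example)

examples = [[i, i + 1] for i in range(100)]

start = time.perf_counter()
for ex in examples:
    predict(ex)
elapsed = time.perf_counter() - start
print(f"avg latency: {elapsed / len(examples) * 1000:.3f} ms/example")
```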

## Pre-trained model

| Model | Download | dev EM | dev F1 |
| --- | --- | --- | --- |
| FastFusionNet | link | 73.58 | 82.42 |
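The dev EM and F1 figures above are the official SQuAD metrics: exact match after answer normalization, and token-overlap F1 between the predicted and gold answer. A simplified sketch of both (the official `evaluate-v1.1.py` normalization additionally strips punctuation and the articles a/an/the):

```python
from collections import Counter

def normalize(s):
    # Simplified: lowercase and collapse whitespace.
    return " ".join(s.lower().split())

def exact_match(pred, gold):
    return float(normalize(pred) == normalize(gold))

def f1_score(pred, gold):
    p, g = normalize(pred).split(), normalize(gold).split()
    common = Counter(p) & Counter(g)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Denver Broncos", "denver  broncos"))       # 1.0
print(round(f1_score("the Denver Broncos", "Denver Broncos"), 2))  # 0.8
```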

## Reference

```bibtex
@article{wu2019fastfusionnet,
  title={FastFusionNet: New State-of-the-Art for DAWNBench SQuAD},
  author={Wu, Felix and Li, Boyi and Wang, Lequn and Lao, Ni and Blitzer, John and Weinberger, Kilian Q.},
  journal={arXiv preprint arXiv:1902.11291},
  url={https://arxiv.org/abs/1902.11291},
  year={2019}
}
```

## Acknowledgement

This repo is based on the v0.3.1 version of Runqi Yang's excellent DrQA code base as well as the official FusionNet on NLI implementation. Much of Runqi's code is borrowed from Facebook/ParlAI under an MIT license.
