This repository contains the code and data for the implementation of our paper.
To be consistent with the previous work TANDA, we base our implementation on the transformers package and apply the following patch to enable the scan option for the package.
git clone https://github.com/huggingface/transformers.git
cd transformers
git checkout f3386 -b scan
git apply scan.diff
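If the patched checkout is not installed yet, installing it in editable mode (a standard step for a transformers source checkout, though not stated here) makes the modified code importable:
pip install -e .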
Then the model can be trained with the script train.sh.
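As a rough illustration only, an invocation might look like the following; the flag names and paths are assumptions and may not match the actual interface of train.sh.
# Hypothetical invocation; check the flags and paths against train.sh before use.
bash train.sh \
  --model_name_or_path ./checkpoints/roberta-base-asnq \
  --data_dir ./data/processed \
  --output_dir ./output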
We initialize our pre-trained language model with checkpoints from TANDA. Specifically, we use 'RoBERTa-Base ASNQ'.
Download from here; questions without correct answers are removed.
Download from here
Download from here
All datasets are processed as described in our paper.
Note: some hyper-parameters (such as lrbl, epochb, lambdap) may vary across environments, software, hardware, and random seeds, so careful tuning is needed.
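As one hedged example, a small sweep could be scripted as follows; it assumes train.sh accepts these flags, which should be verified against the script, and the values shown are placeholders only.
# Illustrative sweep; flag names and values are assumptions, not taken from train.sh.
for lrbl in 0.5 1.0 2.0; do
  bash train.sh --lrbl "$lrbl" --epochb 5 --lambdap 0.1
done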