This repository has been archived by the owner on Jan 22, 2020. It is now read-only.

This repository provides the implementation for the paper "Combining Fact Extraction and Verification with Neural Semantic Matching Networks".

j6mes/fever-unc-system

Combine-FEVER-NSMN

This repository provides the implementation for the paper Combining Fact Extraction and Verification with Neural Semantic Matching Networks (AAAI 2019 and EMNLP-FEVER Shared Task Rank-1 System).

Requirements

  • Python 3.6
  • pytorch 0.4.1
  • allennlp 0.7.1
  • sqlitedict
  • wget
  • flashtext
  • pexpect
  • fire
  • inflection

Install the packages in the order listed above. Previous versions of pytorch can be found at legacy pytorch.
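
The version pins above can be checked before running anything else. The following is a minimal sketch and not part of the repository: `find_problems`, `release_of`, and `installed_versions` are hypothetical helper names, and it assumes pytorch is installed under its pip distribution name `torch`.

```python
# Version pins copied from the requirement list above.
# Assumption: "torch" is the pip distribution name for pytorch.
PINNED = {"torch": "0.4.1", "allennlp": "0.7.1"}
UNPINNED = ["sqlitedict", "wget", "flashtext", "pexpect", "fire", "inflection"]


def release_of(version):
    """Keep the first three numeric components, e.g. '0.4.1.post2' -> '0.4.1'."""
    return ".".join(version.split("+")[0].split(".")[:3])


def find_problems(installed):
    """Compare a {package: version-or-None} mapping against the list above."""
    problems = []
    for name, wanted in PINNED.items():
        have = installed.get(name)
        if have is None:
            problems.append("{}: not installed (need {})".format(name, wanted))
        elif release_of(have) != wanted:
            problems.append("{}: found {} (need {})".format(name, have, wanted))
    for name in UNPINNED:
        if installed.get(name) is None:
            problems.append("{}: not installed".format(name))
    return problems


def installed_versions():
    """Look up installed versions via pkg_resources (ships with setuptools)."""
    import pkg_resources

    found = {}
    for name in list(PINNED) + UNPINNED:
        try:
            found[name] = pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            found[name] = None
    return found
```

Calling `find_problems(installed_versions())` in the target environment returns an empty list when every listed package is present and the two pinned versions match.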

Preparation

  1. Set up the Python environment and install the required packages listed above.
  2. Run the preparation script.
source setup.sh
bash ./scripts/prepare.sh

The script will download all the required data, auxiliary packages, and files.

  3. Tokenize the dataset and build the wiki document database for fast access and querying.
python src/pipeline/prepare_data.py tokenization        # Tokenization
python src/pipeline/prepare_data.py build_database      # Build document database. (This might take a while)
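
To illustrate why a document database gives fast lookups, here is a minimal sketch of the idea using the standard-library `sqlite3` module. It is not the repository's `build_database` code: the table schema and the function names `build_doc_db` and `fetch_doc` are hypothetical, and the `id`, `text`, and `lines` field names are an assumption about the wiki-pages jsonl records.

```python
import json
import sqlite3


def build_doc_db(jsonl_paths, db_path):
    """Load wiki pages from jsonl files into one SQLite table keyed by page id."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents "
        "(id TEXT PRIMARY KEY, text TEXT, lines TEXT)"
    )
    count = 0
    for path in jsonl_paths:
        with open(path, encoding="utf-8") as handle:
            for raw in handle:
                if not raw.strip():
                    continue
                page = json.loads(raw)
                if not page.get("id"):
                    continue  # skip records without a usable id
                conn.execute(
                    "INSERT OR REPLACE INTO documents VALUES (?, ?, ?)",
                    (page["id"], page.get("text", ""), page.get("lines", "")),
                )
                count += 1
    conn.commit()
    conn.close()
    return count


def fetch_doc(db_path, doc_id):
    """Return one page as a dict, or None if the id is unknown."""
    conn = sqlite3.connect(db_path)
    row = conn.execute(
        "SELECT id, text, lines FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    conn.close()
    if row is None:
        return None
    return {"id": row[0], "text": row[1], "lines": row[2]}
```

Because `id` is the primary key, each lookup uses an index instead of scanning the jsonl files, which is the main benefit of building the database once up front.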

After preparation, the following folders should contain files similar to those listed below.

data
├── fever
│   ├── license.html
│   ├── shared_task_dev.jsonl
│   ├── shared_task_test.jsonl
│   └── train.jsonl
├── fever.db
├── id_dict.jsonl
├── license.html
├── sentence_tokens.json
├── tokenized_doc_id.json
├── tokenized_fever
│   ├── shared_task_dev.jsonl
│   └── train.jsonl
├── vocab_cache
│   └── nli_basic
│       ├── labels.txt
│       ├── non_padded_namespaces.txt
│       ├── tokens.txt
│       ├── unk_count_namespaces.txt
│       └── weights
│           └── glove.840B.300d
├── wiki-pages
│   ├── wiki-001.jsonl
│   ├── ... ...
│   └── wiki-109.jsonl
└── wn_feature_p
    ├── ant_dict
    ├── em_dict
    ├── em_lemmas_dict
    ├── hyper_lvl_dict
    ├── hypernym_stems_dict
    ├── hypo_lvl_dict
    └── hyponym_stems_dict
dep_packages
├── DrQA
└── stanford-corenlp-full-2017-06-09
results
└── chaonan99
saved_models
├── saved_nli_m
├── nn_doc_selector
└── saved_sselector
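
The layout above can be verified with a short script before starting the pipeline. This is a minimal sketch with a hypothetical helper name (`missing_paths`); the path list is a subset copied from the tree above and assumes the script is pointed at the repository root.

```python
import os

# A subset of the entries shown in the tree above.
EXPECTED = [
    "data/fever/train.jsonl",
    "data/fever/shared_task_dev.jsonl",
    "data/fever/shared_task_test.jsonl",
    "data/fever.db",
    "data/tokenized_fever/train.jsonl",
    "data/tokenized_fever/shared_task_dev.jsonl",
    "data/wiki-pages",
    "data/wn_feature_p",
    "dep_packages/DrQA",
    "dep_packages/stanford-corenlp-full-2017-06-09",
    "saved_models/saved_nli_m",
    "saved_models/nn_doc_selector",
    "saved_models/saved_sselector",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected files or folders that do not exist under root."""
    return [
        rel for rel in expected
        if not os.path.exists(os.path.join(root, *rel.split("/")))
    ]
```

For example, `missing_paths(".")` run from the repository root returns an empty list when every listed entry is in place.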

Automatic pipeline procedure

Run the pipeline system on the dev set with the command below:

python src/pipeline/auto_pipeline.py

Note that this pipeline is the (SotA) model from the AAAI paper. For the EMNLP-FEVER Shared Task version, please refer to src/nli/mesim_wn_simi_v1_3.py and src/pipeline/pipeline_process.py.
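
The saved model folders listed earlier (`nn_doc_selector`, `saved_sselector`, `saved_nli_m`) suggest three stages: document selection, sentence selection, and claim classification. The following is a minimal sketch of how such stages compose. It does not reproduce `auto_pipeline.py`: every function name is hypothetical, and the toy stand-ins use simple word overlap instead of neural models.

```python
def run_pipeline(claim, select_docs, select_sentences, classify):
    """Chain the three stages; each stage is passed in as a plain callable."""
    docs = select_docs(claim)
    evidence = select_sentences(claim, docs)
    return {"claim": claim, "evidence": evidence, "label": classify(claim, evidence)}


# Toy stand-ins below, for illustration only.
TOY_CORPUS = {
    "Paris": ["Paris is the capital of France.", "It lies on the Seine."],
    "Berlin": ["Berlin is the capital of Germany."],
}


def _tokens(text):
    """Lowercase, drop full stops, and split on whitespace."""
    return set(text.lower().replace(".", "").split())


def toy_select_docs(claim):
    """Pick pages whose title appears as a word in the claim."""
    return [title for title in TOY_CORPUS if title.lower() in _tokens(claim)]


def toy_select_sentences(claim, docs):
    """Keep (title, sentence index) pairs sharing at least two words with the claim."""
    claim_tokens = _tokens(claim)
    return [
        (title, index)
        for title in docs
        for index, sentence in enumerate(TOY_CORPUS[title])
        if len(claim_tokens & _tokens(sentence)) >= 2
    ]


def toy_classify(claim, evidence):
    """Crude rule: any evidence means SUPPORTS; this toy never predicts REFUTES."""
    return "SUPPORTS" if evidence else "NOT ENOUGH INFO"
```

Passing the stages as callables keeps each one replaceable, so a trained model can be swapped in for any toy function without changing `run_pipeline`.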

Citation

If you find this implementation helpful, please consider citing:

@inproceedings{nie2019combining,
  title={Combining Fact Extraction and Verification with Neural Semantic Matching Networks},
  author={Yixin Nie and Haonan Chen and Mohit Bansal},
  booktitle={Association for the Advancement of Artificial Intelligence ({AAAI})},
  year={2019}
}
