This repository contains code to derive the data generated by the pipeline described in the paper "Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization".

joshuabambrick/Falsesum

Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization

Authors: Prasetya Ajie Utama, Joshua Bambrick, Nafise Sadat Moosavi, and Iryna Gurevych.

Purpose

This repository contains code to derive the data generated by the pipeline described in the paper "Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization", to appear at the 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022). This code is not intended to be modified or reused.

Usage

The code in this repository uses Python 3.

To derive the dataset:

  1. Download the CNN Stories and the Daily Mail Stories from https://cs.nyu.edu/~kcho/DMQA/
  2. Unpack the downloaded files into a new directory
  3. Run generate_falsesum_data.py to generate the dataset
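Step 2 can also be scripted. The following is a minimal sketch, not part of this repository: the helper name `unpack_archives` is hypothetical, and it assumes the downloads are gzip-compressed tar archives with a `.tgz` extension sitting in a single directory. It does not check that the resulting layout is the one `generate_falsesum_data.py` expects.

```python
import tarfile
from pathlib import Path


def unpack_archives(download_dir, target_dir):
    """Extract every .tgz archive found in download_dir into target_dir.

    Hypothetical helper: the .tgz extension and the flat download
    directory are assumptions, not something this repository specifies.
    Returns the names of the archives that were unpacked.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    unpacked = []
    for archive in sorted(Path(download_dir).glob("*.tgz")):
        # Only extract archives obtained from a source you trust.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(target)
        unpacked.append(archive.name)
    return unpacked
```

Calling `unpack_archives("downloads", "cnndm")` would place the contents of each archive under `cnndm`, which can then be passed as the second argument to the script.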

Example execution:

python generate_falsesum_data.py <dir-with-falsesum-jsonl-data> <dir-with-unpacked-cnndm-data> <target-output-dir>

Citation

If you find this work useful, please consider citing our paper as:

@inproceedings{utama-etal-2022-falsesum,
  author    = {Utama, Prasetya Ajie and Bambrick, Joshua and Moosavi, Nafise Sadat and Gurevych, Iryna},
  title     = {Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization},
  booktitle = {Proceedings of the 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  month     = jul,
  year      = {2022},
  publisher = {Association for Computational Linguistics}
}

License

Please read the LICENSE file.
