This repository was archived by the owner on Aug 19, 2025. It is now read-only.

factula/sumtool

sumtool

A toolkit for understanding factuality & consistency errors in summarization models.

Core Features

  • A harness for generating text summaries with automated factuality evaluations

    • NLI (textual entailment)
    • Question answering
    • Other metrics (BERT-Score, Rouge Score, etc.)
  • An interactive query interface for exploring generated summaries (e.g. XSum or a custom dataset)

    • Search for common factuality errors across your dataset (e.g. find all numerical errors)
    • Explore faithfulness & factuality annotations (if available)
  • An interactive query interface for n-gram lookup

    • Search the dataset for an n-gram query
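
As a rough illustration of what an n-gram lookup does, here is a minimal sketch. It is not sumtool's implementation: the function name `find_ngram`, the whitespace tokenization, and the example documents are all assumptions made for this example.

```python
from typing import Dict, List


def find_ngram(documents: Dict[str, str], query: str) -> List[str]:
    """Return the ids of documents whose tokens contain the query n-gram.

    Hypothetical helper: lowercases and splits on whitespace, which is
    cruder than a real tokenizer.
    """
    query_tokens = query.lower().split()
    n = len(query_tokens)
    if n == 0:
        return []
    matches = []
    for doc_id, text in documents.items():
        tokens = text.lower().split()
        # Slide a window of length n over the tokens and compare to the query.
        if any(tokens[i:i + n] == query_tokens for i in range(len(tokens) - n + 1)):
            matches.append(doc_id)
    return matches


# Made-up documents keyed by a made-up document id.
docs = {
    "doc-1": "The council approved the new budget on Monday",
    "doc-2": "A new stadium will open next year",
}
print(find_ngram(docs, "the new budget"))  # ['doc-1']
```

Matching on token windows rather than raw substrings avoids partial-word hits (for example, "new" matching inside "renewed").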

References

References

Setup

Setup (Python 3.8):

pip install -r requirements.txt
pip install .

Run Streamlit app

streamlit run interface/app.py

You can also run an individual interface, e.g.

streamlit run interface/summary_interface.py

Contributors

Setup (Python 3.8):

pip install -r requirements.dev.txt
pip install -Ue .

Before committing:

black sumtool/ interface/ scripts/
flake8 sumtool/ interface/ scripts/

Run on Google Colab for GPU

  1. Create a GitHub token to access your private repositories, following the steps in GitHub: Creating a Personal Access Token

  2. Create a new Colab notebook and set the runtime type to GPU

  3. Add the following commands in the first cell to clone the repository and install the requirements

!git clone https://[your-git-token]@github.com/cs6741/summary-analysis.git
!pip install -r /content/summary-analysis/requirements.txt

  4. Add the following command to run the text generation script

!python /content/generate_xsum_summary.py --bbc_ids [idx1,idx2] --data_split [train|test]

Storage documentation

Pipeline for storage:

  1. Store generated summaries
    • by generating them using a custom model (example)
    • by loading them from an external dataset/paper (example)
  2. Compute summary metrics for stored summaries using sumtool.

/data/<dataset>/<model-id>-summaries.json

<document_id>: 
	summary: the generated summary,
	metadata: ...metadata for the generated summary, e.g. annotations / score / entropy

/data/<dataset>/<model-id>-metrics.json

<document_id>: 
	...metrics for a stored summary, e.g. rouge-score, bert-score
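
The two file layouts above can be read and written with plain JSON. The sketch below is only an illustration under that assumption and is not sumtool's own storage API: the helper names (`storage_path`, `write_json`, `read_json`, `attach_metrics`), the model id `example-model`, the document ids, and the metric key `rouge1` are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def storage_path(data_dir, dataset, model_id, kind):
    """Build <data_dir>/<dataset>/<model-id>-<kind>.json.

    `kind` is expected to be "summaries" or "metrics".
    """
    return Path(data_dir) / dataset / f"{model_id}-{kind}.json"


def write_json(path, payload):
    # Create the dataset directory on first write.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, indent=2))


def read_json(path):
    return json.loads(path.read_text())


def attach_metrics(summaries, metrics):
    """Return a copy of summaries with each document's metrics (if any) attached."""
    return {
        doc_id: {**entry, "metrics": metrics.get(doc_id, {})}
        for doc_id, entry in summaries.items()
    }


# Illustrative values only: ids, summaries, and metric keys are made up.
summaries = {
    "doc-1": {"summary": "Council approves budget.", "metadata": {"entropy": 1.2}},
    "doc-2": {"summary": "Stadium to open next year.", "metadata": {}},
}
metrics = {"doc-1": {"rouge1": 0.41}}

# A temporary directory stands in for /data so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    s_path = storage_path(tmp, "xsum", "example-model", "summaries")
    m_path = storage_path(tmp, "xsum", "example-model", "metrics")
    write_json(s_path, summaries)
    write_json(m_path, metrics)
    joined = attach_metrics(read_json(s_path), read_json(m_path))

print(joined["doc-1"]["metrics"])  # {'rouge1': 0.41}
```

Keeping summaries and metrics in separate files keyed by the same document id means metrics can be recomputed without touching the stored summaries.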
