
petezh/OpenD5


OpenD5

Authors: Ruiqi Zhong, Peter Zhang, Steve Li, JinWoo Ahn, Dan Klein, Jacob Steinhardt

Paper link

This repository hosts OpenD5, a benchmark for discovering natural language facts from pairs of corpora. Our paper focuses on the setting of comparing two distributions of text and describing their difference in natural language.

The benchmark spans a wide array of disciplines and problem types. A sibling repository containing the code for running our system on these problems is available here.

To create the full benchmark, 1) download these folders and 2) run the build_benchmark.sh script from the main repo.
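As a rough illustration of step 2, the following Python sketch runs the build script from a local checkout of the main repo. Only the script name `build_benchmark.sh` comes from this README. The helper `build_benchmark`, its `repo_dir` and `dry_run` parameters, and the assumption that the script sits at the root of the checkout and takes no arguments are hypothetical. Where the downloaded folders must be placed is not specified here, so the sketch does not handle that step.

```python
import subprocess
from pathlib import Path


def build_benchmark(repo_dir, script_name="build_benchmark.sh", dry_run=False):
    """Run the build script from the root of a local checkout (hypothetical helper).

    Assumes the script lives at the top level of repo_dir and needs no
    arguments; adjust if the actual layout differs.
    """
    repo = Path(repo_dir).expanduser().resolve()
    script = repo / script_name
    if not script.is_file():
        raise FileNotFoundError(f"{script} not found; is {repo} the main repo checkout?")

    cmd = ["bash", str(script)]
    if dry_run:
        # Return the command without executing it, e.g. to inspect it first.
        return cmd

    # check=True raises CalledProcessError if the script exits with a non-zero status.
    subprocess.run(cmd, cwd=repo, check=True)
    return cmd
```

Calling `build_benchmark("path/to/checkout", dry_run=True)` returns the command that would be run, which is a cheap way to confirm the path before starting the build.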

For more details, please refer to the

Downloads

  • The 675 problems in the original paper are available here.
  • An extension with 37 additional problems is available here.
  • A reproduction package for the entire dataset is available here. It includes additional source data that is required to assemble the full dataset.

Contributing

If you'd like to contribute additional problems to the benchmark, please:

BibTeX

@article{zhong2023goal,
  title={Goal Driven Discovery of Distributional Differences via Language Descriptions},
  author={Zhong, Ruiqi and Zhang, Peter and Li, Steve and Ahn, Jinwoo and Klein, Dan and Steinhardt, Jacob},
  journal={arXiv preprint arXiv:2302.14233},
  year={2023}
}
