ProofNet

Warning

This repository hosts the original Lean 3 version of ProofNet. Since Lean 3 is no longer maintained, you should use one of the Lean 4 ports of ProofNet, such as the one contained in deepseek-ai/DeepSeek-Prover-V1.5.

Code for replicating the paper ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics.

This repo is intended for replicating the paper's experimental results and for accepting PRs to the dataset. To use ProofNet for your own experiments, use the Hugging Face dataset.

ProofNet is a benchmark for autoformalization and formal proving of undergraduate-level mathematics. The ProofNet benchmark consists of 371 examples, each comprising a formal theorem statement in Lean 3, a natural language theorem statement, and a natural language proof. The problems are primarily drawn from popular undergraduate pure mathematics textbooks and cover topics such as real and complex analysis, linear algebra, abstract algebra, and topology. We intend for ProofNet to be a challenging benchmark that will drive progress in autoformalization and automatic theorem proving.
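To make the shape of a benchmark entry concrete, here is a minimal Python sketch of the three components described above. The class name, field names, and helper function are assumptions chosen for illustration and may not match the column names of the published dataset; the sample entry is made up and is not taken from the benchmark.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ProofNetExample:
    """One benchmark entry; the field names are illustrative assumptions."""

    formal_statement: str  # theorem statement in Lean 3
    nl_statement: str      # natural language theorem statement
    nl_proof: str          # natural language proof


def autoformalization_pair(example: ProofNetExample) -> Tuple[str, str]:
    """Return (input, target) for the natural-language-to-Lean direction."""
    return example.nl_statement, example.formal_statement


# A made-up entry for illustration only; it is not part of ProofNet.
toy = ProofNetExample(
    formal_statement="theorem toy (a b : ℕ) : a + b = b + a :=",
    nl_statement="For all natural numbers a and b, a + b = b + a.",
    nl_proof="This is commutativity of addition on the natural numbers.",
)

source, target = autoformalization_pair(toy)
```

Swapping the two elements of the returned pair would give the informalization direction (Lean to natural language) instead.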

Citation

@misc{azerbayev2023proofnet,
      title={ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics}, 
      author={Zhangir Azerbayev and Bartosz Piotrowski and Hailey Schoelkopf and Edward W. Ayers and Dragomir Radev and Jeremy Avigad},
      year={2023},
      eprint={2302.12433},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Directory Structure

benchmark/ contains .lean and TeX source files for maintaining the dataset. If you wish to open a PR for the dataset, modify the Lean and TeX source in benchmark/benchmark_to_publish and run parse_files.py.
calc_perplexity contains scripts for calculating proof-pile and arXiv perplexity.
eval contains scripts for running the autoformalization and informalization experiments found in the paper.
train_backtranslation contains code for extracting mathlib declarations, informalizing mathlib using the OpenAI API, and fine-tuning distilled backtranslation models.
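
For readers unfamiliar with the metric computed in calc_perplexity, the following is a minimal sketch of the standard perplexity definition: the exponential of the negative mean per-token log-probability. It is not the repository's implementation; the function name and the sample values are assumptions for illustration, and the actual scripts may tokenize and aggregate differently.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity = exp of the negative mean per-token natural-log probability."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Made-up values: a model that assigns probability 0.25 to each of 4 tokens.
uniform = [math.log(0.25)] * 4
ppl = perplexity(uniform)
```

A model that is uniformly uncertain over four choices at every token has a perplexity of about 4, which is why the metric is often read as an effective branching factor.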
