Code for FormulaNet in NIPS 2017
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.


Code for reproducing the results in the following paper:

Premise Selection for Theorem Proving by Deep Graph Embedding
Mingzhe Wang*, Yihe Tang*, Jian Wang, Jia Deng (*equal contribution)
Neural Information Processing Systems (NIPS), 2017

Package dependency

  • Python3
  • PyTorch 0.1.12
  • Cuda

Downloading the dataset

Download and extract the HolStep dataset.

mkdir data/raw_data && cd data/raw_data
tar -xvzf holstep.tgz

Generating graph representations

After downloading the HolStep dataset, we generate and save graph representations for train, valid, and test in data/hol_data.

By default, we use 7% of the data in the training set as the validation set. The training and validation set do not share conjectures.

mkdir data/hol_data
python src/data_util/ data/hol_raw_data data/hol_data

Run python src/data_util/ -h for more options.

Pretrained models

  • models/FormulaNet-basic: FormulaNet-basic for conditional premise selection.
  • models/FormulaNet-basic-uc: FormulaNet-basic for unconditional premise selection.
  • models/FormulaNet: FormulaNet for conditional premise selection.
  • models/FormulaNet-uc: FormulaNet for unconditional premise selection.

Note: the above models should only be used with the default token dictionary data/dicts/hol_train_dict.

Training your own models

To train a FormulaNet-basic model, please run:

cd src
python  --log training.log --output model --record train_record

To train a FormulaNet model, please run:

cd src
python  --log training.log --output model --record train_record --binary

Option --binary turns on the order-preserving terms.

Run python -h and check scripts/ for more options.


cd src
python --model MODEL_FILE  # The file name, such as FormulaNet-basic-uc

Check scripts/ for commands to test the pretrained models.

Please contact us if you run into any issues or have any questions.


This work is partially supported by the National Science Foundation under Grant No. 1633157.