Skip to content
Holophrasm: a neural Automated Theorem Prover for higher-order logic
Objective-C++ Other
Branch: master
Clone or download

README

Holophrasm: a purely neural automated theorem prover for Metamath
https://arxiv.org/abs/1608.02644

To use the program, start by downloading the release, which includes the binary files for the language model and the trained model weights.  Then, from the root directory run

python run_script_ordered.py
python write_all_proofs.py

run_script_ordered.py will attempt to search for proofs for all the Metamath propositions in the test set, starting with the (expected) easiest proofs first.  Reasonable parameters for running the script are:
timeout: 5
num_passes: 10000
search beam width: 20
hyp bonus: 0
Proofs are saved to searcher/proofs as soon as they are found.  The proofs that I found during my search are in searcher/proofs_baseline.

write_all_proofs.py will generate a modified_set.mm module file, which consists of the proofs from set.mm, with the found proofs replaced.  This can then be fed into Metamath and verified using the commands

READ modified_set.mm
VERIFY PROOF *

There's a known bug involving `dtrucor` ignoring the disjointness conditions, which has been fixed with a temporary hack. In case the bug applies more generally, you should not assume any proof is valid until metamath verifies modified_set.mm.

#############################################

If you want to train from scratch, you can do the following.  I recommend having at least 64GB of RAM.

Step 0: (OS X only) deal with Accelerate and forking bug
#########
export VECLIB_MAXIMUM_THREADS=1
#########

Step 1: train gen and pred. Neither of the training scripts exit automatically, so you
will need to keyboard interrupt them at some point, preferably after the training
rate decays to a sufficiently low value. All of these scripts may be run in parallel.
#########
python script_gen.py # Train the generative network
python script_pred.py # Train the prediction network
python save_language_model.py # Save a copy of the language model somewhere
#########

Step 2: set up the files for the nn interface
#########
mkdir searcher
cp weights/gen/train.parameters searcher/gen.parameters
cp weights/gen/train.weights searcher/gen.weights
cp weights/pred/train.parameters searcher/pred.parameters
cp weights/pred/train.weights searcher/pred.weights
#########

Step 3: construct the payout data set and train the payout network
The first script takes days, but is highly parallelizable. Chunks may be run on
different machines, and multiprocessing can be enabled by changing the
"withpool.Pool(None)" to "withpool.Pool(n)", where n is the desired number of
processes.
WARNING: multiprocessing support may crash Windows machines.
WARNING: if generate_payout_data_script.py does not run correctly, it may leave
files payout_data_* of size 0. These need to be deleted before rerunning the script.
#########
python generate_payout_data_script.py
python script_payout.py
cp weights/pred/train.parameters searcher/payout.parameters
cp weights/pred/train.weights searcher/payout.weights
#########

Step 5: run the proof search (hyp bonus should be set to 0.0 for consistency with the paper)
The search will be much faster if you set "multi=True" in the searcher.run call, but this
may break on Windows.
#########
python run_script_ordered.py
#########

Step 6: verify the proofs
######### PYTHON CODE
import write_proof
write_proof.reset()
write_proof.write_all_known_proofs()
#########

Step 7: copy the modified_set.mm into the metamath folder and use the following commands to verify the proofs
######### METAMATH COMMANDS
ERASE
READ modified_set.mm
VERIFY PROOF *
#########
You can’t perform that action at this time.