Skip to content

Profiling Z3 queries

Catalin Hritcu edited this page Jan 29, 2021 · 14 revisions

Collecting the Z3 queries produced by F*

Shape of the F* invocation:

fstar --z3refresh --log_queries --print_z3_statistics \
    --query_stats [--record_hints | --use_hints] \
    --admit_except '(identifier, 1)' \
    Test.fst
  • --z3refresh: fstar calls a fresh Z3 instance for each query, instead of reusing the same for all queries. This is a bit slower, but good for robustness/reproducibility;
  • --log_queries: store the queries sent to z3 in .smt2 files (in the current directory). Will overwrite the .smt2 files generated by previous runs;
  • --print_z3_statistics: prints Z3's internal statistics for each query;
  • --query_stats: print general info to stdout about the queries run, the time taken by z3 to solve them, and whether the hints worked;
  • --record_hints: records hints;
  • --use_hints: after a previous invocation with --record_hints, try to use the recorded hints.
  • --admit_except <id>: admit all SMT queries except for the query <id>, which has the syntax (identifier, query_num). For example, --admit_except '(FStar.Fin.pigeonhole, 1)' (note that quotes are required for the shell). These are the query names you see in the output of --print_z3_statistics, for example. This does not currently work with --record_hints; only the hint for <id> will be recorded (see #1139 for the feature request)

The smt2 files produced by a run with --use_hints should be smaller, as F* can rely on the hints to prune the context.

Interpreting the results

  • Note the difference in verification time per query based on whether or not hint successfully replays.

    If there is a large discrepancy in time, then this suggests that there are assertions in the context that are causing Z3 to search and instantiate quantifiers needlessly.

    We can profile two runs of Z3, using the pruned and non-pruned smt2 files produced by two runs of F*, without hints and with hints. Then diff the two profiles to see what part of the context is causing on of the proofs to go slowly, using qprofdiff.

    z3 smt.qi.profile=true a.smt2 > a.prof 2>&1
    z3 smt.qi.profile=true b.smt2 > b.prof 2>&1
    qprofdiff -si a.prof b.prof
    

    A script that helps make this easier can be found here.

  • More thorough profiling can be achieved using the Z3 Axiom profiler.

    To produce an execution trace, run:

    z3 trace=true foo.smt2
    

    This will produce a z3.log file, which can be fed to the profiler. Use the trace_file_name option to set the name of the output file to something else.

    • Using mono, the profiler seems too slow to process any trace of nontrivial size (with a 15 Mo z3.log, it ran for an hour before I kill it);
    • The Help menu of the profiler gives useful information on how to use it.
  • If there are any queries whose hints fail to replay, then it is generally useful to understand exactly why this is the case.

    Suppose query q.smt2 has an unsat core u. And suppose q-u.smt2, the pruning of q to only include those assertions from u, cannot be proven unsat by Z3.

    We'd like to find out a minimal set of assertions from q, which when added to q-u allow it to be proven unsat. Call this set the "completions" of u.

    Using Z3 4.5.1, we can pass these options to F*:

    --z3cliopt smt.core.extend_patterns=true --z3cliopt smt.core.extend_patterns.max_distance=n
    

    to get it to try to compute a completion of u, for some metric n (larger n should produce larger completions; so we want to try with the smallest n first).

    Sometimes, the completion reported by Z3 will be too large to be of much use.
    In such cases, resorting to bisection search through q to help find a completion of u may help...

Clone this wiki locally