
Setting :random-seed has practically no effect, Renaming functions has huge effect #909

Closed
cliffordwolf opened this issue Feb 22, 2017 · 8 comments


@cliffordwolf

I am trying to evaluate the performance of different SMT2 encoding schemes. To this end, I create test cases using the encodings I'm trying to compare and run each test case many times with different :random-seed settings. However, it turns out that setting :random-seed has minimal to no effect. Randomly renaming the functions I declare/define, without changing anything else, on the other hand, changes the performance of my test cases by more than 2x!

See this SO question for more information. The test cases test1.smt2, test2.smt2, and test3.smt2 used in the analysis below can be found here.

On my workstation, and with latest z3 (c67cf16), the test cases take about 2 minutes to execute.

I ran each of them eight times with :random-seed set to 1 .. 8. The runtime for each of the test cases is for the most part not affected by the :random-seed setting:

[results: plot of run times for test1, test2, and test3 across :random-seed values 1 .. 8]

As mentioned above, the three test cases are practically identical! The only difference is the names I use for the identifiers I declare/define.

I am now considering writing a preprocessor that randomly renames the functions in an smt2 file so I can use it in addition to setting :random-seed.

There is an argument to be had that a solver should preserve the order in which statements are processed, because quite often related clauses show up in the SMT2 file close to each other. But I have never heard such an argument with regard to the identifiers declared/defined in the SMT2 input. I think whatever random changes I inject into the solver by renaming my functions, the same kind of random changes should be triggered by changing the value of :random-seed.

@cliffordwolf
Author

I've now written this program to run all identifiers in my SMT2 code through a randomly generated substitution alphabet. It is a hack, but good enough to demonstrate my point. I've re-run my tests and added a third category "shuffle": the test1.smt file but run through my smtshuffle program with different seed values. This gives me a span of over 3x between best and worst case run time for seed values 1 .. 8:

[image: plot of run times for test1 .. test3 plus the "shuffle" category across seed values 1 .. 8]

Using a substitution alphabet also has the advantage that identifiers that share a prefix still share a prefix after the transformation.
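A minimal Python sketch of that substitution-alphabet transformation (illustrative only, not the actual smtshuffle code; the keyword set here is a deliberately incomplete assumption, and a real tool would need the full SMT-LIB keyword and theory-symbol list):

```python
import random
import re
import string

# Illustrative only: a real tool needs the complete SMT-LIB keyword and
# theory-symbol list; this small set just covers the constructs below.
KEYWORDS = {"set-option", "set-logic", "declare-fun", "define-fun",
            "assert", "check-sat", "exit", "Bool", "BitVec", "true", "false"}
TOKEN = re.compile(r":?[A-Za-z][A-Za-z0-9_\-]*")

def shuffle_identifiers(smt2_text: str, seed: int) -> str:
    """Map every user identifier through a seed-derived substitution
    alphabet, so identifiers sharing a prefix still share one afterwards."""
    rng = random.Random(seed)
    shuffled = list(string.ascii_letters)
    rng.shuffle(shuffled)
    table = str.maketrans(dict(zip(string.ascii_letters, shuffled)))

    def rename(match: re.Match) -> str:
        word = match.group(0)
        # Keep :keywords (e.g. :random-seed) and reserved words intact.
        if word.startswith(":") or word in KEYWORDS:
            return word
        return word.translate(table)

    return TOKEN.sub(rename, smt2_text)
```

Because the substitution is per-character, two identifiers with a common prefix map to two identifiers with a common prefix, as noted above.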

I am not familiar with the z3 code base. But I would assume that there are places where containers (like std::map<>) that use identifiers as keys are iterated in sort order. Simply using a custom compare function for those containers that uses a substitution alphabet that is randomly generated from :random-seed should improve the situation.
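To make that suggestion concrete, here is a small Python sketch (not z3 code; `make_order_key` is a made-up name) of iterating identifier keys in a seed-derived order rather than plain lexicographic order:

```python
import random
import string

def make_order_key(seed: int):
    """Return a sort key that compares identifiers through a
    seed-derived substitution alphabet instead of plain ASCII order."""
    rng = random.Random(seed)
    shuffled = list(string.ascii_letters)
    rng.shuffle(shuffled)
    table = str.maketrans(dict(zip(string.ascii_letters, shuffled)))
    return lambda ident: ident.translate(table)

idents = ["mem_state", "clk", "reset_n", "addr_bus"]
for seed in (1, 2):
    # Same identifiers, but a (likely) different iteration order per seed.
    print(seed, sorted(idents, key=make_order_key(seed)))
```

In C++ the equivalent would be a custom compare functor passed to those std::map<> instances, seeded from :random-seed.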

(It looks like :random-seed literally does nothing, and all the variation I got before was because other load on the machine influenced the results. This time I let it run completely undisturbed and the curves for test1 .. test3 show even less variation than before.)

@wintersteiger
Contributor

Depending on what appears in your benchmarks, e.g., what theories/logics they use, and what options you chose, there will be more or less randomness. For instance, the default settings for smt.phase_selection and sat.phase are to use phase-caching (without randomness), but you can set them to 5 and random respectively to enable random phase selection in the SAT and SMT solvers. (See also smt.random_freq.) Note that auto_config may have to be disabled, otherwise the automatic configuration may overwrite some of your options. Most users don't want randomness, so by default all those options are set conservatively.
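For reference, those options would look like this at the top of an SMT2 file (option spellings taken from this comment; the smt.random_freq value is an illustrative guess, and exact option syntax may vary across z3 versions):

```smt2
(set-option :auto_config false)      ; keep auto configuration from overwriting the options below
(set-option :smt.phase_selection 5)  ; random phase selection in the SMT core
(set-option :sat.phase random)       ; random phase selection in the SAT solver
(set-option :smt.random_freq 0.05)   ; illustrative value
(set-option :random-seed 3)
```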

Regarding the renaming of identifiers: We should at the very least look at it briefly. Do you have a small, human-readable file that exhibits a large difference in performance upon renaming?

@cliffordwolf
Author

Do you have a small, human-readable file that exhibits a large difference in performance upon renaming?

I don't have any small, human-readable files that run for long enough to reasonably measure the runtime.

Do you have any small, human-readable files (QF_AUFBV / QF_ABV / or something similar) that run for at least a few seconds that I could use? I'd be happy to run some experiments with them.

(You can find my test[123].smt files that I used for my analysis at the link I posted above.)

Most users don't want randomness, so by default all those options are set conservatively.

I guess that most users do not run (set-option :random-seed <seed>) as part of their SMT input. I would suggest that as soon as a user sets this option, it is pretty safe to assume that they want randomness.

@delcypher
Contributor

@cliffordwolf Have you tried running with ASLR turned off? Some parts of Z3's code base are sensitive to the addresses of pointers, which will vary from run to run unless ASLR is switched off.

@cliffordwolf
Author

cliffordwolf commented Feb 23, 2017

@delcypher I did not (I don't even know how), but what you are describing would be the opposite of what I have observed: I see no variation in behavior from one run to the next, even when I'm modifying the (set-option :random-seed ...) setting. (I have not tried it, but I would not expect more variation when re-running z3 with exactly the same input file. :)

I want variation so I can take the median run time of several runs to get a performance metric that is not affected (very much) by random variation but instead reflects the intrinsic quality of the encoding used, so I can make a reasonable comparison of different encoding schemes for the same problem.

Edit: Ah! ASLR as in Address Space Layout Randomization. :) I thought you were speaking about a Z3 feature... ASLR is enabled on my machine, but that does not seem to have a significant effect on the run time I measure.

@NikolajBjorner
Contributor

There is a lot of useful data that I hope to look at when I get a chance.
One small comment, perhaps it is smt.random_seed or sat.random_seed.
Sorry, these names are not streamlined.

@cliffordwolf
Author

jfyi: I've now re-run my benchmarks. (10 hours on a 36 core machine on AWS.) With the increased variation in runtime I get from my smtshuffle approach I've now switched to the median of 9 runs instead of 5 runs:

(Hover with your mouse cursor over a table cell to see the raw data points before the median.)

As you can see, yices and boolector are consistently >10x faster than z3 on these benchmarks. For some cases it's even close to 100x. That's why I'm using test cases that are so long-running: I want 10-100 seconds run time on the fastest solver so I can still do reasonable comparisons. The same test cases happen to run for 400-4000 seconds on z3.

If there is interest in these test cases as a benchmark for z3, I can create versions that run in the neighborhood of 2 minutes, like the test[123].smt examples I posted above.

@NikolajBjorner
Contributor

If there is interest in these test cases as a benchmark for z3, I can create versions that run in the neighborhood of 2 minutes, like the test[123].smt examples I posted above.

test1/2/3 actually exposed a regression in 4.8.12.

The more interesting question to me seems to be what is in the benchmarks at http://scratch.clifford.at/compact_smt2_enc_r1108.html where z3 is 10-100x off.
It is typically something very blatant. For example, I found a long-running inefficiency in how we bit-blast not too long ago.
I am not sure I have access to the files mentioned on the web page.
