New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Setting :random-seed has practically no effect, Renaming functions has huge effect #909
Comments
I've now written this program to run all identifiers in my SMT2 code through a randomly generated substitution alphabet. It is a hack, but good enough to demonstrate my point. I've re-run my tests and added a third category "shuffle": the Using a substitution alphabet also has the advantage that identifiers that share a prefix still share a prefix after the transformation. I am not familiar with the z3 code base. But I would assume that there are places where containers (like (It looks like |
Depending on what appears in your benchmarks, e.g., what theories/logics they use, and what options you chose, there will be more or less randomness. For instance, the default settings for Regarding the renaming of identifiers: We should at the very least look at it briefly. Do you have a small, human-readable file that exhibits a large difference in performance upon renaming? |
I don't have any small, human-readable files that run for long enough to reasonably measure the runtime.. Do you have any small, human-readable files (QF_AUFBV / QF_ABV / or something similar) that run for at least a few seconds that I could use? I'd be happy to run some experiments with them. (You can find my
I guess that most users do not run |
@cliffordwolf Have your tried running with ASLR turned off? Some parts of Z3's code base are sensitive to the addresses of pointers which will vary from run to run unless ASLR is switched off. |
@delcypher I did not (I don't even know how), but what you are describing would the the opposite of what I have observed: I see no variation in behavior from one run to the next, even when I'm modifying the I want variation so I can take the median run time of several runs to get a performance metric that is not effected by random variation (very much) but instead reflects the intrinsic quality of the encoding used, so I can make a reasonable comparison of different encoding schemes for the same problem. Edit: Ah! ASLR as in Address Space Layout Randomization. :) I thought you were speaking about a Z3 feature... ASLR is enabled on my machine, but that does not seem to have a significant effect on the run time I measure. |
There is a lot of useful data that I hope to look at when I get a chance. |
jfyi: I've now re-run my benchmarks. (10 hours on a 36 core machine on AWS.) With the increased variation in runtime I get from my smtshuffle approach I've now switched to the median of 9 runs instead of 5 runs:
(Hover with your mouse cursor over a table cell to see the raw data points before the median.) As you can see, yices and boolector are consistently >10x faster than z3 on this benchmarks. For some cases its even close to 100x. That's why I'm using test cases that are so long-running: I want 10-100 seconds run time on the fastest solver so I can still do reasonable comparisons. That same test cases happen to run for 400-4000 seconds on z3.. If there is interest in this test cases as a benchmark for z3, I can create versions that run in the neighborhood of 2 minutes, like the |
test1/2/3 actually exposed a regression in 4.8.12 The more interesting question to me seems to be what is in these benchmarks http://scratch.clifford.at/compact_smt2_enc_r1108.html where z3 is 10-100x off. |
I am trying to evaluate the performance of different SMT2 encoding schemes. To this goal I create test cases using the encodings I'm trying to compare and run each test case many times with different
:random-seed
settings. However, it turns out that setting:random-seed
has minimal to no effect. Randomly renaming the functions I declare/define without changing anything else on the other hand changes the performance of my test cases by more than 2x!See this SO question for more information. The test cases
test1.smt2
,test2.smt2
, andtest3.smt2
used in the analysis below can be found here.On my workstation, and with latest z3 (c67cf16), the test cases take about 2 minutes to execute.
I ran each of them eight times with
:random-seed
set to 1 .. 8. The runtime for each of the test cases is for the most part not effected by the:random-seed
setting:As mentioned above: The three test cases are practically identical! The only difference is the names I use for the identifiers I declare/define.
I am now considering writing a preprocessor that randomly renames the functions in an smt2 file so I can use it in addition to setting
:random-seed
..There is an argument to be had that a solver should preserve the order in which statements are processed, because quite often related clauses show up in the SMT2 file close to each other. But I have never heard such an argument with regard to the identifiers declared/defined in the SMT2 input. I think whatever random changes I inject into the solver by renaming my functions, the same kind of random changes should be triggered by changing the value of
:random-seed
.The text was updated successfully, but these errors were encountered: