
Two *almost* identical files - differing only in unused rules - one verifies in 70s, the other in 25 minutes #539

Open
claudiacauli opened this issue Apr 20, 2023 · 4 comments

Comments

@claudiacauli

I have two almost identical, automatically generated files: file A and file B.
The two files differ only by two rules and one restriction that are not needed for the verification of the lemmas.
File A contains the three extra items; file B doesn't.
File A verifies in approximately 70 s; file B takes approximately 25 minutes.

The restriction is an "ExactlyOnce" restriction that forces the other two rules to appear exactly once in every trace (even though they are not needed in this specific model). Intuitively, I would have expected the opposite: file A should take longer and file B less. However, this is not what happens in practice.
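For reference, a minimal sketch of one common way such an "ExactlyOnce" setup is encoded in Tamarin (rule and fact names here are hypothetical, not the ones from the actual files): the rule is tagged with an action fact, and a restriction forces any two occurrences of that action to happen at the same timepoint, i.e. at most once per trace.

```spthy
// Hypothetical setup rule tagged with an action fact
rule Setup:
  [ Fr(~x) ] --[ ExactlyOnce() ]-> [ !Setup(~x) ]

// Any two ExactlyOnce actions must coincide, so the rule
// can fire at most once in every trace
restriction ExactlyOnce:
  "All #i #j. ExactlyOnce() @ #i & ExactlyOnce() @ #j ==> #i = #j"
```

Strictly, a restriction of this shape enforces "at most once"; the "exactly once" reading comes from traces of interest needing the rule to fire at least once.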

I'm using the heuristic=c and stop-on-trace=SEQDFS flags. The order of all rules and facts in the two files is the same; a diff shows the files differ only in the presence/absence of these three items. I'm running these on a machine with 16 cores and 64 GB of RAM, and I increase the size of the area allocated to garbage collection with the +RTS -A2048M -RTS flag (I found this to be optimal). All 16 cores seem to be used by default without runtime flags (did you link the binary with this?). I'm benchmarking with hyperfine, including warmup runs etc., so the numbers are not a one-off.
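For concreteness, the benchmarking setup described above might look roughly like this (file names and warmup count are hypothetical; the Tamarin and RTS flags are the ones mentioned in this thread):

```shell
# Compare proving times of the two theories; warmup runs smooth out
# cold-cache effects. +RTS -A2048M -RTS enlarges the GHC allocation
# area used by the garbage collector, as described above.
hyperfine --warmup 3 \
  'tamarin-prover --prove --heuristic=c --stop-on-trace=SEQDFS A.spthy +RTS -A2048M -RTS' \
  'tamarin-prover --prove --heuristic=c --stop-on-trace=SEQDFS B.spthy +RTS -A2048M -RTS'
```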

Unfortunately I cannot attach the files for reproducibility.

Can you suggest any reasons why this might be happening?

@claudiacauli
Author

claudiacauli commented Apr 20, 2023

Update: after digging further into the issue, I have discovered that a slightly different set of loop breakers is assigned to the two theories. As a test, I switched from heuristic=c to heuristic=C, and that solved the runtime difference! Though I'm not sure why not delaying loop breakers helps.

Say we have five rules that are shared across the two models (neither involving nor related, in a fact-flow sense, to the rules that are present in file A but not in file B):

             A.spthy      |       B.spthy
  rule1      loop b: [1]  |      
  rule2      loop b: [1]  |      loop b: [0,1]
  rule3      loop b: [1]  |      loop b: [1]
  rule4      loop b: [1]  |      loop b: [1]
  rule5                   |      loop b: [1]

I'm not sure why different loop breakers might cause a proof to take this long. What is the meaning of the numbers within square brackets?

Thanks for the support!

@cascremers
Member

cascremers commented Apr 20, 2023 via email

@claudiacauli
Author

Hi, sorry for the lack of clarity and thanks for getting back to me.
By "unused" rules I mean that these rules form a graph structure (where the nodes are ground rules and the edges are dependency relations between the premises of one rule and the consequences of others) that is a separate component with respect to the rest of the theory graph. In particular:

  • Most facts in the consequences of these rules have symbols that don't appear anywhere else;
  • For those consequence facts whose symbols do appear somewhere else in the theory, term matching is not possible: all of their terms are already ground, and those constants don't appear anywhere else in the theory.
rule X1:
  [ ] --> [ !Aaa('Label1'), !Bbb('Label2') ]          // the !Aaa and !Bbb symbols do appear elsewhere in the theory,
                                                      // but the two labels 'Label1' and 'Label2' certainly do not

rule X2:
  [ !RR(var), !Pii(var,pr) ] --> [ !CC('Label1') ]    // !RR's and !CC's fact symbols ("RR", "CC") do not appear anywhere else in the theory

rule X3:
  [ !PP(var1), !Piii(var,pr) ] --> [ !DD('Label2') ]  // !PP's and !DD's fact symbols ("PP", "DD") do not appear anywhere else in the theory

rule 1: ...

rule 2: ...

rule 3: ...

rule 4: ...

rule 5: ... 

With all three rules in place, or with some of X1, X2 commented out, the following loop breakers are computed:

rule1      loop b: [1]
rule2      loop b: [1]
rule3      loop b: [1]
rule4      loop b: [1]
rule5

If I comment rule X3 out, the pattern of loop breakers that gets computed changes and becomes:

rule1      
rule2      loop b: [0,1]
rule3      loop b: [1]
rule4      loop b: [1]
rule5      loop b: [1]

The rules are written in this order within the theory. It's unclear to me why commenting out X3 (specifically) changes the pattern of loop breakers that are computed, while this doesn't happen if I comment out X2 (for example).
I wonder if this has to do with the order in which these rules appear in the file, and whether swapping X2 and X3 would have some effect on that.

I understand that this might still not be enough for you to advise me properly. Sorry about that.

@rsasse
Member

rsasse commented Apr 25, 2023

The ordering of rules should not matter for this (famous last words, please do try it out). However, X3 uses three facts, !PP, !Piii, and !DD: do any of these appear in rules 1-5, independent of whether the term arguments would match? If they appear, that might already be enough.

Also note that while you may know that the given constant arguments don't appear elsewhere, the tool also has to be able to figure that out, which may be hard if there are variables involved. If it's all constants, it should be okay.

This is unfortunately very hard to help debug without a look at the theories. As Cas said, proving time depends on a lot of factors with a quite complex possible interplay.

Good luck with your verification attempt!
