`(expt 2 8)` induction? #448

kazarmy · 2023-05-18T12:40:27Z

I have the following:

(set-info :smt-lib-version 2.6)
(set-option :produce-proofs true)
(set-logic UFDTLIRA)

(define-fun zip ((i Int)) Bool
  (= i 0))
(define-fun-rec expt ((r Int) (i Int)) Int
  (ite (zip i) 1
  (ite (= r 0) 0
  (ite (> i 0) (* r (expt r (+ i (- 1))))
  (- 1)))))

(assert
 (not
  (= (expt 2 7) 128)))

(check-sat)
(get-proof)
(exit)

(yes it's a partial reimplementation of ACL2's expt)

Vampire 4.7 (807e37dd9) proves the above (with some difficulty) using:

vampire --mode portfolio -sched struct_induction -thi all

but when I change the assert to:

(assert
 (not
  (= (expt 2 8) 256)))

Vampire doesn't find the proof within the time limit anymore. Now I know that CVC5 1.0.5 proves the above assertion instantly, so are there are any other options that I need to set to encourage Vampire to go down the constant term instantiation path? -- I hope I'm using terms correctly here.

The text was updated successfully, but these errors were encountered:

MichaelRawson · 2023-05-18T13:21:22Z

@joe-hauns - when you get a minute, should the current theory machinery handle this "evaluation" use case somehow in your view? Otherwise I'll close as I don't think we can do much better here at present.

selig · 2023-05-18T14:10:01Z

thi all is going to fill up the search space a lot, it's unlikely it's the best option compared to other theory instantiation options; it also works well when combined with a uwa option.

What kind of induction are you hoping to do here (I haven't looked closely at the actual problem)? You have integers but you've picked the structural induction schedule, which is over datatypes and recursive functions (which you also have). I think just picking the induction schedule will try and detect which kinds of induction might apply. Although I don't think any of the induction schedules make broad use of all of the arithmetic reasoning options.

kazarmy · 2023-05-18T14:57:20Z

Thanks for the hints! All I can say right now about this is that the struct_induction schedule is significantly faster compared to the integer_induction and induction schedules (~33s vs ~54s on my laptop), and that there are some interesting lines that appear to be options lines e.g.

% lrs+10_1_drc=off:ind=struct:sos=theory:sstl=1:to=lpo_89 on expt.smt2 for (89ds/0Mi)

that might suggest that uwa should be off. And without thi all, Vampire doesn't succeed in proving in 60s "(= (expt 2 3) 8)" much less "(= (expt 2 7) 128)". All of this is very interesting tbh.

It's just that I'm not sure whether I can or should be poring over papers, logs and source code to see what every option does (and how small changes in the problem affect the proving process). Still, I might come back to this when I have the chance.

joe-hauns · 2023-05-18T16:50:46Z

Well i don't really know anything about how we handle define-fun-rec and how induction is being performed for such functions. I think @mezpusz knows best about that.
Recursive function mechanisms are definitely separate from the usual theory evaluation i've been working on.

selig · 2023-05-18T18:05:19Z

@joe-hauns I think handling define-fun-rec is orthogonal. If this needs the all then it's needing to do arithmetic reasoning involving constraint solving that seems relevant to your work.... however, these are integers.

@kazarmy ... I feel your pain. The current schedules are generated based on sample problems and it sounds like your use case doesn't fall nicely into any of those sets. The issue you're facing is that some combinations of options probably weren't explored when generating those schedules so we don't see them.

If I had your problem I'd define an experiment that searched the option space to find a nice set of complementary options for a representative set of problems but that's a lot of compute and requires extra knowledge in which strategies you should be trying. Over the last 10 years we keep discussing whether this is a service we could provide to users but it's a big effort that nobody's working on.

MichaelRawson · 2023-07-18T12:03:58Z

Sadly I think this is something we're not expecting to work well (yet). I looked vaguely at proof search and I'd expect this to rewrite "backwards" from the goal, but it doesn't because the symbol precedence forbids it, I think.Perhaps future work with arithmetic or recursive functions will improve matters. @mezpusz I think you're the expert here, if you've got anything to add please do!

mezpusz · 2023-07-18T12:28:06Z

Sorry for the late reply! As far as I see, induction won't help here since we cannot generalize (= (expt 2 8) 256) to any universal statement, and the arithemetic axioms + the above function definitions should be enough to solve this. I guess the proof search simply diverges due to superposition trying to expand the expt definition which results in larger and larger terms and more and more time is spent on useless superpositions between the axioms..

MichaelRawson closed this as completed Jul 18, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`(expt 2 8)` induction? #448

`(expt 2 8)` induction? #448

kazarmy commented May 18, 2023

MichaelRawson commented May 18, 2023

selig commented May 18, 2023

kazarmy commented May 18, 2023

joe-hauns commented May 18, 2023

selig commented May 18, 2023

MichaelRawson commented Jul 18, 2023

mezpusz commented Jul 18, 2023

(expt 2 8) induction? #448

(expt 2 8) induction? #448

Comments

kazarmy commented May 18, 2023

MichaelRawson commented May 18, 2023

selig commented May 18, 2023

kazarmy commented May 18, 2023

joe-hauns commented May 18, 2023

selig commented May 18, 2023

MichaelRawson commented Jul 18, 2023

mezpusz commented Jul 18, 2023

`(expt 2 8)` induction? #448

`(expt 2 8)` induction? #448