Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

smt2 input not satisfiable, but preprocessed & fixed up input is #403

Closed
bobismijnnaam opened this issue Jul 26, 2022 · 7 comments
Closed

Comments

@bobismijnnaam
Copy link

Hi, I have an smt2 snippet which I think should be satisfiable, but vampire cannot establish that. If I take the preprocessed tptp of that, and fix it up because it contains some problems, vampire can satisfy it somehow. So my questions are:

  • Is there a difference between my smt2 input and the "fixed up" tptp input? Or is the smt2 case supposed to time out, and the "fixed up" version not, and did my "fixing up" somehow got rid of the part that made the smt2 input time out?
  • Vampire does not immediately accept the preprocessed tptp of my smt2 input, I have to fix it up a bit first. Is this a bug, or am I mixing features (= smt2 frontend & preprocessing) in an unintended way?

The smt2 is as follows:

(declare-datatypes (($Snap 0)) ((($Snap.unit) ($Snap.combine ($Snap.first $Snap) ($Snap.second $Snap)))))
(define-sort $Perm() Real)
(define-fun $Perm.No () $Perm 0.0)
(define-fun $Perm.isValidVar ((p $Perm)) Bool (<= $Perm.No p))
(check-sat)

When running this with vampire 4.6.1-sledgehemmer-eval, I get a timeout after 60 seconds, using the following command:

$ vampire-461-sledge --mode casc_sat test.smt2

I tried to have a look at the preprocessed input as follows:

$ vampire --input_syntax smtlib2 --mode clausify test.smt2
tff(type_def_5, type, '$Snap()': $tType).
tff(func_def_0, type, '$Snap.unit': '$Snap()').
tff(func_def_1, type, '$Snap.combine': ('$Snap()' * '$Snap()') > '$Snap()').
tff(func_def_2, type, '$Snap.first': '$Snap()' > '$Snap()').
tff(func_def_3, type, '$Snap.second': '$Snap()' > '$Snap()').
tff(func_def_5, type, '$Perm.No': $real).
tff(pred_def_2, type, '$Perm.isValidVar': $real > $o).
cnf(u8,axiom,
    ($true = X0) | ($false = X0)).

cnf(u7,axiom,
    ($true != $false)).

cnf(u6,axiom,
    '$Snap.combine'(X0,X2) != '$Snap.combine'(X1,X3) | X2 = X3).

cnf(u5,axiom,
    '$Snap.combine'(X0,X2) != '$Snap.combine'(X1,X3) | X0 = X1).

cnf(u4,axiom,
    '$Snap.unit' != '$Snap.combine'(X0,X1)).

cnf(u3,axiom,
    '$Snap.unit' = X0 | '$Snap.combine'('$Snap.first'(X0),'$Snap.second'(X0)) = X0).

When giving this to vampire, I get the following message:

$ vampire --input_syntax smtlib2 --mode clausify test.smt2 | vampire
Parsing Error on line 9
Cannot create equality between terms of different types.
X0 is $i
$true is $o

There are a bunch of errors like this in the clausified output; using --mode preprocess has similar problems.

Fixing all the problems (in a naive way, I must add. I don't know tptp very well), I get the following:

tff(type_def_5, type, '$Snap()': $tType).
tff(func_def_0, type, '$Snap.unit': '$Snap()').
tff(func_def_1, type, '$Snap.combine': ('$Snap()' * '$Snap()') > '$Snap()').
tff(func_def_2, type, '$Snap.first': '$Snap()' > '$Snap()').
tff(func_def_3, type, '$Snap.second': '$Snap()' > '$Snap()').
tff(func_def_5, type, '$Perm.No': $real).
tff(pred_def_2, type, '$Perm.isValidVar': $real > $o).
tff(u8,axiom,
    ![X0: $o]:
    (($true = X0) | ($false = X0))).

tff(u7,axiom,
    ($true != $false)).

tff(u6,axiom,
    ![X0: '$Snap()', X1: '$Snap()', X2: '$Snap()', X3: '$Snap()']:
    ('$Snap.combine'(X0,X2) != '$Snap.combine'(X1,X3) | X2 = X3)).

tff(u5,axiom,
    ![X0: '$Snap()', X1: '$Snap()', X2: '$Snap()', X3: '$Snap()']:
    ('$Snap.combine'(X0,X2) != '$Snap.combine'(X1,X3) | X0 = X1)).

tff(u4,axiom,
    ![X0: '$Snap()', X1: '$Snap()']:
    ('$Snap.unit' != '$Snap.combine'(X0,X1))).

tff(u3,axiom,
    ![X0: '$Snap()']:
    ('$Snap.unit' = X0 | '$Snap.combine'('$Snap.first'(X0),'$Snap.second'(X0)) = X0)).

Vampire reports this is satisfiable:

$ vampire-461-sledge --mode casc_sat test_fixed.tptp
perf_event_open failed (instruction limiting will be disabled): Permission denied
% Running in auto input_syntax mode. Trying TPTP
% ott+11_3_aac=none:afr=on:afp=4000:afq=1.4:amm=off:anc=all:bs=unit_only:bsr=on:bce=on:fde=unused:irw=on:nm=64:newcnf=on:nwc=1:nicw=on:sac=on:sp=reverse_arity:uhcvi=on_31 on test_fixed
% SZS status Satisfiable for test_fixed
% # SZS output start Saturation.
cnf(u15,axiom,
    '$Snap.unit' != '$Snap.combine'(X0,X1)).

cnf(u19,axiom,
    '$Snap.combine'('$Snap.first'(X0),'$Snap.second'(X0)) = X0 | '$Snap.unit' = X0).

cnf(u23,axiom,
    '$Snap.combine'(X0,X2) != '$Snap.combine'(X1,X3) | X2 = X3).

cnf(u27,axiom,
    '$Snap.combine'(X0,X2) != '$Snap.combine'(X1,X3) | X0 = X1).

cnf(u38,axiom,
    '$Snap.combine'(X1,X2) != X0 | '$Snap.second'(X0) = X2 | '$Snap.unit' = X0).

cnf(u45,axiom,
    '$Snap.second'('$Snap.combine'(X0,X1)) = X1).

cnf(u51,axiom,
    '$Snap.combine'(X0,X1) = '$Snap.combine'('$Snap.first'('$Snap.combine'(X0,X1)),X1)).

cnf(u55,axiom,
    '$Snap.combine'(X1,X2) != X0 | '$Snap.first'(X0) = X1 | '$Snap.unit' = X0).

cnf(u72,axiom,
    '$Snap.first'('$Snap.combine'(X0,X1)) = X0).

cnf(u78,axiom,
    '$Snap.combine'(X10,X11) != '$Snap.combine'(X12,X13) | '$Snap.first'('$Snap.combine'(X10,X11)) = X12).

cnf(u85,axiom,
    '$Snap.second'(X0) = '$Snap.second'(X1) | X0 != X1 | '$Snap.unit' = X1 | '$Snap.unit' = X0).

cnf(u105,axiom,
    '$Snap.first'(X0) = '$Snap.first'(X1) | X0 != X1 | '$Snap.unit' = X1 | '$Snap.unit' = X0).

cnf(u125,axiom,
    '$Snap.combine'(X2,X3) != X4 | '$Snap.first'('$Snap.combine'(X2,X3)) = '$Snap.first'(X4) | '$Snap.unit' = X4).

cnf(u129,axiom,
    X0 != X1 | '$Snap.unit' = X0 | '$Snap.combine'('$Snap.first'(X0),'$Snap.second'(X1)) = X0 | '$Snap.unit' = X1).

cnf(u137,axiom,
    X0 != X1 | '$Snap.unit' = X0 | '$Snap.combine'('$Snap.first'(X1),'$Snap.second'(X0)) = X0 | '$Snap.unit' = X1).

cnf(u143,axiom,
    X3 != X4 | X3 != X5 | '$Snap.unit' = X5 | '$Snap.unit' = X3 | '$Snap.second'(X4) = '$Snap.second'(X5) | '$Snap.unit' = X4).

cnf(u147,axiom,
    X3 != X4 | X3 != X5 | '$Snap.unit' = X5 | '$Snap.unit' = X3 | '$Snap.first'(X4) = '$Snap.first'(X5) | '$Snap.unit' = X4).

% # SZS output end Saturation.
% ------------------------------
% Version: Vampire 4.6.1 (commit af1735c99 on 2021-12-01 14:43:47 +0100)
% Linked with Z3 4.8.13.0 f03d756e086f81f2596157241e0decfb1c982299 z3-4.8.4-5390-gf03d756e0
% Termination reason: Satisfiable

% Memory used [KB]: 5628
% Time elapsed: 0.003 s
% ------------------------------
% ------------------------------
% Success in time 0.029 s
@selig
Copy link
Contributor

selig commented Jul 26, 2022 via email

@quickbeam123
Copy link
Collaborator

@selig , I think you have a typo there, the mode is called tclausify.

(Still, the output clause set will be far from a complete axiomatization of real (or any other) arithmetic or datatypes. So it's satisfiability does not imply the same for the source input.)

@bobismijnnaam
Copy link
Author

bobismijnnaam commented Jul 27, 2022

Thanks for your quick replies! I have some more questions about this if that's okay.

Vampire assumes it is incomplete for data types.

In "Coming to Terms with Quantified Reasoning" it is mentioned that the datatype support implemented in Vampire is actually complete. Did the implementation change and is it now incomplete, or is it incomplete in a different way, say combined with reals or functions, as in my example?

Also, does this incompleteness mean vampire might say "unsatisfiable" when the input is actually satisfiable, meaning, can be refuted? Or am I now mixing up refutation prover terminology?

Regarding this issue: to me it feels like this is not something that's easily fixable, so if this is the case this issue can be closed.

EDIT:

I commented out some more parts of my smt2 input and am still wondering about the behavior I see. If I comment out the smt2 datatypes declaration and use --mode casc, vampire just spins, i.e., repeatedly restarts with "refutation not found, incomplete strategy". If I first preprocess it with tclausify, vampire determines the input to be satisfiable immediately. Finally, if I remove the define-fun, vampire can establish satisfiability without tclausify. Having or not having Real in there makes no difference.

The input:

(define-sort $Perm() Real)
(define-fun $Perm.No () $Perm 0.0)
(check-sat)

It seems that the behaviour I'm seeing has more to do with how vampire axiomates smt's define-fun, and this subtlety is not shown when processing the smt with tclausify? If you'd have time to look into this silly edge case or explain it to me I'd be very grateful. But if this is an unimportant edge case for vampire feel free to ignore this comment & close the issue!

@laurakovacs
Copy link
Contributor

The first-order theory of algebraic data types is decidable (or complete as you mention), as long as it is only about the term algebra (that is, only ADT constructors/destructors) . There are decision procedures for proving ADT formulas; but again, these are decision procedures when the formulas use ADT symbols only. In the " "Coming to Terms with Quantified Reasoning" paper we show decidability for a conservative extension of the ADT theory (restricted to ADT symbols and a subterm predicate); no uninterpreted function and other arbitrary sorts.

Once you start using additional symbols, such as uninterpreted functions or theory-symbols from different theories (e.g. reals, integers), the theory becomes undecidable. The undecidability proof is actually given in the "Coming to Terms with Quantified Reasoning" paper (Theorem 2).

The ADT reasoning in Vampire is based on the incomplete variation of superposition calculus + ADT (section 6 of the aforementioned paper). This variation comes with the advantage that we can work with ADT formulas combined with other theories, but it is incomplete. If a formula is provable/valid, we should be able to find a proof; but that is kind of aall we can guarantee. Satisfiability of ADT properties were not our focus yet, as as said, it cannot be made complete for arbitrary first-order formulas involving ADTs (but not just).

@bobismijnnaam
Copy link
Author

Thank you for the explanation, that clears it up for me.

Two more questions if that's okay.

First, if you axiomatize datatypes "manually" using quantifiers and types in tff, is it the case that you get similar expressive power compared to vampire's native ADT support, but performance suffers because new quantifiers are introduced for each datatype?

Second, is there a mailing list or forum that I should use if I want to ask more questions? Or should I use the github issues for that?

@quickbeam123
Copy link
Collaborator

If I remember correctly, you cannot finitely axiomatize the ADT theory, so "manual" will always be incomplete. Even if I am wrong, tailored inference rules which vampire employs are almost always better than axiomatizations, which they replace, because of how the inference rules interact with one another and with the rest of the calculus.

Issues seem to be a good way to reach us with easy to formulate usability questions.

@bobismijnnaam
Copy link
Author

Okay, thank you all very much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants