# Added Semantic Tableaux. Improved pl_true #2964

Open
wants to merge 6 commits into
from

None yet

### 3 participants

 Added satisfiability checking using a Semantic Tableaux. Semantic tableaux removes the overhead of converting a formula to CNF (or DNF) and is faster than a SAT solver. However, it can only return True or False to indicate satisfiability, and not the model. Modified pl_true to give the correct result under incomplete interpretations. Previously something like `pl_true(a >> (b >> a))` would return None. This function now accurately returns whether the value is obvious (True/False) or not (None).
 Soumya Dipta Biswas `Added Semantic Tableaux` `243bc05`
Owner
 Can you show some benchmarks? There are some benchmark dimacs files in sympy/logic/benchmarks, though you should also benchmark non-CNF formulas as well.
sympy/logic/inference.py
 else: - raise ValueError("Illegal operator in logic expression" + str(expr)) + raise ValueError("Illegal operator in expression" + str(expr)) + + +def semantic_tableaux(expr): + """ + Checks satisfiability of a formula using a Semantic Tableaux + This method is much faster than a traditional SAT solver however + it returns only True or False. To obtain a model use satisfiability
 asmeurer Owner "use `satisfiable`." And put a period at the end.
sympy/logic/inference.py
 + False + """ + + if isinstance(expr, bool): + return expr + + expr = eliminate_implications(sympify(expr)) + + if is_literal(expr): + return expr + + elif isinstance(expr, Not): + raise TypeError("Expression is not in Negation Normal Form") + + elif isinstance(expr, Or): + args = [semantic_tableaux(arg) for arg in expr.args]
 asmeurer Owner Can't you do some short circuiting here?
 This thing is still a work in progress. Should finish it in about 2 days. Will post the benchmarks then.
 There seems to be a interesting flaw in the implementation (not in the idea) I posted yesterday. It is complete but not sound i.e. if function return false then, the formula is definitely unsatisfiable, but if it returns satisfiable, it may or may not be so. I am obviously going to fix this flaw asap but it is interesting because of the following math. This algorithm takes about 15% the time of normal sat for formulas with 5-10 variables and 15-25 clauses. For bigger formulas this will perform even better. In the worst case, assume that the algorithm returns False in 20% cases. Old Time: 1 New Time: 0.2 * 0.15 + 0.8 * 1.15 = 0.95 Should we think about using similar heuristics to speed up the common case for SAT?
 Soumya Dipta Biswas `Fixed critical flaw in semantic_tableaux` `96fc7c8`
 I fixed the problem mentioned in the last post. Will test it thoroughly tonight and post the stats tomorrow (and subsequent changes, if any). I was probably very unclear about what I meant in the previous post (regarding usage of heuristics). I meant, when a model is required (and not just the satisfiability) should we use this algorithm to determine whether or not an actual SAT solver should be called? If this algorithm returns True, the solver can be called, else return False. Depending upon the percentage of unsatisfiable formulas, this method might speed up the common case.

Here are the test results with 5 - 10 variable and (1) 5 - 10, (2) 15 - 25, (3) 30 - 50 clauses
The selection of alternating Satisfiable and Unsatisfiable formulas is completely intentional. Still working on improving efficiency further. Will add tests to the test file and commit once done.

Sl No. Faster Diff Atoms Clauses Avg. Args SemTab Tseitin
Test 1 SemTab 15589 10 8 4.62 True True
Test 2 SemTab 17573 10 8 4.38 False False
Test 3 SemTab 18619 6 7 4.43 True True
Test 4 Tseitin 435 8 6 2.83 False False
Test 5 SemTab 27103 7 5 4.60 True True
Test 6 SemTab 11613 6 6 3.17 False False
Test 7 SemTab 8031 5 8 3.12 True True
Test 8 SemTab 8829 6 6 4.00 False False
Test 9 SemTab 7724 10 5 3.40 True True
Test 10 SemTab 11283 10 8 2.38 False False

Average: SemTab: 3187, Tseitin: 15780

Sl No. Faster Diff Atoms Clauses Avg. Args SemTab Tseitin
Test 1 SemTab 32332 7 18 4.78 True True
Test 2 SemTab 18163 5 18 2.61 False False
Test 3 SemTab 54329 9 23 3.57 True True
Test 4 SemTab 28019 8 21 3.24 False False
Test 5 SemTab 25769 10 15 5.07 True True
Test 6 SemTab 25648 9 16 4.62 False False
Test 7 SemTab 30802 7 23 2.70 True True
Test 8 SemTab 22512 5 15 4.73 False False
Test 9 SemTab 14889 8 15 3.67 True True
Test 10 SemTab 35666 10 17 3.82 False False

Average: SemTab: 4253, Tseitin: 33066

Sl No. Faster Diff Atoms Clauses Avg. Args SemTab Tseitin
Test 1 SemTab 82065 7 37 4.16 True True
Test 2 SemTab 81062 8 49 4.14 False False
Test 3 SemTab 52917 6 31 5.65 True True
Test 4 SemTab 60515 6 48 4.12 False False
Test 5 SemTab 77310 6 42 4.79 True True
Test 6 SemTab 64374 10 40 3.55 False False
Test 7 SemTab 42413 10 31 3.32 True True
Test 8 SemTab 54045 5 45 5.38 False False
Test 9 SemTab 57479 8 45 4.20 True True
Test 10 SemTab 86744 6 47 5.40 False False

Average: SemTab: 13970, Tseitin: 79862

Member
 If this thing IS true, then I think we should go with this as far as the assumptions system is concerned. A factor of 7-8 is too large to be ignored. I hope you have confirmed the validity of the test results.
and 1 other commented on an outdated diff Feb 27, 2014
sympy/logic/inference.py
 else: - raise ValueError("Illegal operator in logic expression" + str(expr)) + raise ValueError("Illegal operator in expression" + str(expr)) + + +def semantic_tableaux(expr):
 srjoglekar246 Member The name seems a little ambiguous. Looking at this name, a user may be tempted to think this returns a Tree structure. How about naming this 'check_semantic_tableau' or something on those lines? sdbiswas I was instead thinking of making it `__semantic_tableaux` and add a model or return_model parameter to satisfiable function. `satisfiable(expr, algorithm="dpll2", model=True)` The end-user is very less likely to know what this function does and how it can be used. Its major use to evaluate unsatisfiability (and validity), so integrating it with the corresponding function would probably be a good idea. srjoglekar246 Member `return_model` seems a more self-explanatory argument. But adding this functionality to satisfiable probably seems a good idea.
commented on the diff Feb 27, 2014
sympy/logic/inference.py
 + Checks satisfiability of a formula using a Semantic Tableaux + This method is much faster than a traditional SAT solver however + it returns only True or False. To obtain a model use 'satisfiable' + + Examples + ======== + + >>> from sympy.abc import A, B + >>> from sympy.logic.inference import semantic_tableaux + >>> semantic_tableaux(A | B) + True + >>> semantic_tableaux(A & ~A) + False + + References + ==========
 srjoglekar246 Member Put the references before the Examples. Looks better in the Sphinx docs. asmeurer Owner I thought it was the other way around. It makes most sense to me to have the references at the end, just like they are anywhere else.
sympy/logic/inference.py
 + return True + else: + e.remove(clause) + + if isinstance(clause, Or): + for arg in clause.args[1:]: + temp = set(e) + temp.add(arg) + s.append(temp) + e.add(clause.args[0]) + s.append(e) + + elif isinstance(clause, And): + for arg in clause.args: + e.add(arg) + s.append(e)
 srjoglekar246 Member Given the nature of this algorithm, it would be good to have in-code comments about the steps you are taking. Especially the way the tree develops/is pruned depending on the sub-formulae (And/Or/Literals etc) and boundary cases.
 Soumya Dipta Biswas `Added 'deep' to pl_true` `49dcaf3`
commented on the diff Feb 28, 2014
sympy/logic/inference.py
 @@ -63,12 +63,13 @@ def literal_symbol(literal): raise ValueError("Argument must be a boolean literal.") -def satisfiable(expr, algorithm="dpll2"): +def satisfiable(expr, return_model=True, algorithm="dpll2"):
 sdbiswas @asmeurer @srjoglekar246 In future, we might want to remove the algorithm parameter. Unless we have 2 competing algorithms such that algo1 is faster for some cases and algo2 for others, there is no use of having this parameter. Assuming that we have many SAT solvers, we could always call the most optimum one from satisfiable and deprecate the others. To accomplish the removal of the parameter without breaking user applications, we could set the parameter to None by default. If someone explicitly provides an algorithm, we issue a warning but continue with our standard SAT solver. asmeurer Owner I think it will be useful. One thing that we might want to do in the future is allow SymPy to interface with external SAT solvers written in C, like pycosat. But we wouldn't want to make them hard dependencies of SymPy, so if they are not installed, it would need to fallback to the one written in Python in SymPy. asmeurer Owner Add example of `return_model` to the docstrings.
Owner
commented Mar 1, 2014
 You need to benchmark way more clauses than that. In the (new) assumptions, the bare minimum number of clauses is 63 (the known_facts_cnf from sympy.assumptions.ask_generated). In my newassump branch (currently), this cnf is duplicated for every subexpression, and any additional facts generated are added to that as well. In my branch, there are 198 clauses, consisting of 114 atoms, used for `newask(Q.positive(x*y), Q.positive(x) & Q.positive(y))`, which has only three subexpressions. Here's how I got that information: ``````In [10]: from sympy.assumptions.newask import * In [11]: len(get_all_relevant_facts(Q.positive(x*y), Q.positive(x) & Q.positive(y)).args) Out[11]: 198 In [12]: from sympy.logic.boolalg import _find_predicates In [13]: a = get_all_relevant_facts(Q.positive(x*y), Q.positive(x) & Q.positive(y)) In [14]: len(_find_predicates(a)) Out[14]: 114 ``````
commented Mar 1, 2014

@asmeurer: Semantic Tableaux still works faster than Tseitin (and much faster than original CNF algorithm). But the magnitude of difference has reduced. From my tests, till the number of variables is around 400 and number of clauses around 750, SemTab will safely perform better than Tseitin.

Sl No. Faster Diff Atoms Clauses Avg. Args SemTab Tseitin
Test 1 SemTab 461999 142 242 3.31 True True
Test 2 SemTab 364501 144 224 3.41 False False
Test 3 SemTab 341407 134 199 3.16 True True
Test 4 SemTab 326397 149 222 3.19 False False
Test 5 SemTab 372458 150 236 3.02 True True
Test 6 SemTab 294582 135 237 3.18 False False
Test 7 SemTab 465008 129 221 3.59 True True
Test 8 SemTab 265901 143 203 3.28 False False
Test 9 SemTab 438548 149 205 3.18 True True
Test 10 SemTab 309060 105 238 3.45 False False

Average: Semtab: 145729, Tseitin: 509715

 Soumya Dipta Biswas `Added tests` `24ad822`
commented Mar 6, 2014
 @asmeurer @srjoglekar246 Everything functioning properly now. I have added the relevant tests. If there are no specific problems, can this be merged?
Member
 Seems to be working okay now. I am +1 to this going in.
sympy/logic/inference.py
 - from sympy.logic.algorithms.dpll import dpll_satisfiable - return dpll_satisfiable(expr) - elif algorithm == "dpll2": - from sympy.logic.algorithms.dpll2 import dpll_satisfiable - return dpll_satisfiable(expr) - raise NotImplementedError + + if return_model is True: + expr = to_cnf(expr) + if algorithm == "dpll": + from sympy.logic.algorithms.dpll import dpll_satisfiable + return dpll_satisfiable(expr) + elif algorithm == "dpll2": + from sympy.logic.algorithms.dpll2 import dpll_satisfiable + return dpll_satisfiable(expr) + raise NotImplementedError
 asmeurer Owner This should be ValueError. Like `ValueError("'algorithm' must be one of ['dpll', 'dpll2']")`.
sympy/logic/inference.py
 -def pl_true(expr, model={}): +def pl_true(expr, model={}, deep=False):
 asmeurer Owner Explain what the deep parameter does in the docstring.
sympy/logic/inference.py
 -def pl_true(expr, model={}): +def pl_true(expr, model={}, deep=False): """ Return True if the propositional logic expression is true in the model, and False if it is false. If the model does not specify the value for
 asmeurer Owner Put the first sentence on a separate line. See https://github.com/sympy/sympy/wiki/Writing-documentation
sympy/logic/inference.py
 """ Return True if the propositional logic expression is true in the model, and False if it is false. If the model does not specify the value for - every proposition, this may return None to indicate 'not obvious'; - this may happen even when the expression is tautological.
 asmeurer Owner Is this no longer true?
sympy/logic/inference.py
 else: - raise ValueError("Illegal operator in logic expression" + str(expr)) + args = [_pl_interpretation(arg, atoms, i) for arg in expr.args] + return expr.func(*args) + + +def semantic_tableaux(expr): + """ + Checks satisfiability of a formula using a Semantic Tableaux + This method is much faster than a traditional SAT solver however
 asmeurer Owner Put empty line before this line. asmeurer Owner commas before and after "however"
sympy/logic/inference.py
 else: - raise ValueError("Illegal operator in logic expression" + str(expr)) + args = [_pl_interpretation(arg, atoms, i) for arg in expr.args] + return expr.func(*args) + + +def semantic_tableaux(expr): + """ + Checks satisfiability of a formula using a Semantic Tableaux + This method is much faster than a traditional SAT solver however + it returns only True or False. To obtain a model use 'satisfiable'
 asmeurer Owner Put periods at the end of your sentences.
Owner
commented Mar 8, 2014
 Another way to test this would be to set return_model=False in the assumptions where satisfiable is used and see if the tests pass (and they are not slower). You can also try merging this with my newassump branch (in a separate branch of course) and doing the same in newask.py.
commented on the diff Mar 8, 2014
sympy/logic/tests/test_inference.py
 - raises(ValueError, lambda: pl_true(42 + pi + pi ** 2)) - raises(ValueError, lambda: pl_true(42)) + raises(TypeError, lambda: pl_true(42 + pi + pi ** 2)) + raises(TypeError, lambda: pl_true(42)) + + +def test_pl_true_deep(): + from sympy.abc import A, B, C + assert pl_true(A | B, {A: False}, deep=True) is None + assert pl_true(~A & ~B, {A: False}, deep=True) is None + assert pl_true(A | B, {A: False, B: False}, deep=True) is False + assert pl_true(A & B & (~A | ~B), {A: True}, deep=True) is False + assert pl_true((C >> A) >> (B >> A), {C: True}, deep=True) is True + + +def test_semantic_tableaux():
 asmeurer Owner I think you should test some larger expressions.
 Soumya Dipta Biswas `Updated documentation. Added more tests` `ad2fb70`
 @asmeurer: Made the required documentation changes. Added more tests in a different format. Hope thats okay. Please provide your comments. Will fix the stray '>>> ' (which is causing a trailing whitespace error) in a final patch with the other suggested changes (if any) Because of the addition of the deep parameter, the function will return True regardless of the model. Hence the statement "this may happen even when the expr is tautological" is no longer apt.
 @asmeurer: Could you please see if we can move towards merging this. I will make a final commit to fix the quality issue that is causing the Travis failure
Owner
 Still waiting on that fixup commit. I am fine to merge otherwise.
 @asmeurer: As per what I mentioned on the thread (last but one-th post) I really think we should wait until later to merge this request. In future (once faster solvers are implemented) the semantic tableaux could end up being redundant. So should I get the entire PR merged and later remove tableaux if required? Or should I open another pull request for improved pl_true and get that merged and wait for the benchmarks on Semantic Tableaux? Sorry for all the confusion.
 Soumya Dipta Biswas `Fixed quality issues. Added minor tests. Improved documentation` `e126775`
Owner
 That's fine. You should note such things on the pull requests themselves, as otherwise people just see the pull request and don't know.
 @asmeurer: I fixed all issues in the logic module. Travis seems to give some error in "sympy/mpmath/tests/test_elliptic.py[14]". Is this a problem all across?
Owner
 That one is unrelated.
commented on the diff May 23, 2014
sympy/logic/inference.py
 - Examples: + + Parameters + ========== + + return_model: boolean, optional, default: True + Returns a model if expression is satisfiable. If only (UN)SAT + is required, then set to False
 asmeurer Owner Is this faster?
commented on the diff May 23, 2014
sympy/logic/inference.py
 -def pl_true(expr, model={}): +def pl_true(expr, model={}, deep=False):
 asmeurer Owner What is the point of this function? Why can't you just do `expr.subs(model)`? asmeurer Owner Or even `expr.xreplace(model)`. I'm betting that is like 10 times faster too. asmeurer Owner Fortunately this function isn't exported to the public API, so if we decide it's unnecessary, we can just remove it. asmeurer Owner Do you have a planned use for `deep=True`?
commented on the diff May 23, 2014
sympy/logic/inference.py
 else: - raise ValueError("Illegal operator in logic expression" + str(expr)) + args = [_pl_interpretation(arg, atoms, i) for arg in expr.args] + return expr.func(*args) + + +def semantic_tableaux(expr): + """ + Checks satisfiability of a formula using a Semantic Tableaux. + + This method is much faster than a traditional SAT solver, however, + it returns only True or False. To obtain a model use 'satisfiable'.
 asmeurer Owner Is it actually faster, though? I would leave this statement out unless the benchmarks are actually showing this.
commented on the diff May 23, 2014
sympy/logic/inference.py
 True - + >>> pl_true(A & B, {A: False}) + False + >>> pl_true(A & B, {A: True}) + >>> pl_true(A >> (B >> A)) + True
 asmeurer Owner I guess here is the difference. I suspect a lot of the logic could be simplified by just using `expr.xreplace(model)`, though.