Added Semantic Tableaux. Improved pl_true #2964

Open
wants to merge 6 commits into
from

Projects

None yet

3 participants

  1. Added satisfiability checking using a Semantic Tableaux. Semantic tableaux removes the overhead of converting a formula to CNF (or DNF) and is faster than a SAT solver. However, it can only return True or False to indicate satisfiability, and not the model.
  2. Modified pl_true to give the correct result under incomplete interpretations. Previously something like pl_true(a >> (b >> a)) would return None. This function now accurately returns whether the value is obvious (True/False) or not (None).
Soumya Dipta Biswas Added Semantic Tableaux 243bc05
Owner

Can you show some benchmarks? There are some benchmark dimacs files in sympy/logic/benchmarks, though you should also benchmark non-CNF formulas as well.

@asmeurer asmeurer commented on an outdated diff Feb 25, 2014
sympy/logic/inference.py
else:
- raise ValueError("Illegal operator in logic expression" + str(expr))
+ raise ValueError("Illegal operator in expression" + str(expr))
+
+
+def semantic_tableaux(expr):
+ """
+ Checks satisfiability of a formula using a Semantic Tableaux
+ This method is much faster than a traditional SAT solver however
+ it returns only True or False. To obtain a model use satisfiability
asmeurer
asmeurer Feb 25, 2014 Owner

"use satisfiable." And put a period at the end.

@asmeurer asmeurer commented on an outdated diff Feb 25, 2014
sympy/logic/inference.py
+ False
+ """
+
+ if isinstance(expr, bool):
+ return expr
+
+ expr = eliminate_implications(sympify(expr))
+
+ if is_literal(expr):
+ return expr
+
+ elif isinstance(expr, Not):
+ raise TypeError("Expression is not in Negation Normal Form")
+
+ elif isinstance(expr, Or):
+ args = [semantic_tableaux(arg) for arg in expr.args]
asmeurer
asmeurer Feb 25, 2014 Owner

Can't you do some short circuiting here?

This thing is still a work in progress. Should finish it in about 2 days. Will post the benchmarks then.

There seems to be a interesting flaw in the implementation (not in the idea) I posted yesterday. It is complete but not sound i.e. if function return false then, the formula is definitely unsatisfiable, but if it returns satisfiable, it may or may not be so. I am obviously going to fix this flaw asap but it is interesting because of the following math. This algorithm takes about 15% the time of normal sat for formulas with 5-10 variables and 15-25 clauses. For bigger formulas this will perform even better. In the worst case, assume that the algorithm returns False in 20% cases.
Old Time: 1
New Time: 0.2 * 0.15 + 0.8 * 1.15 = 0.95
Should we think about using similar heuristics to speed up the common case for SAT?

I fixed the problem mentioned in the last post. Will test it thoroughly tonight and post the stats tomorrow (and subsequent changes, if any).

I was probably very unclear about what I meant in the previous post (regarding usage of heuristics). I meant, when a model is required (and not just the satisfiability) should we use this algorithm to determine whether or not an actual SAT solver should be called? If this algorithm returns True, the solver can be called, else return False. Depending upon the percentage of unsatisfiable formulas, this method might speed up the common case.

Here are the test results with 5 - 10 variable and (1) 5 - 10, (2) 15 - 25, (3) 30 - 50 clauses
The selection of alternating Satisfiable and Unsatisfiable formulas is completely intentional. Still working on improving efficiency further. Will add tests to the test file and commit once done.

@asmeurer @srjoglekar246 any comments?

Sl No. Faster Diff Atoms Clauses Avg. Args SemTab Tseitin
Test 1 SemTab 15589 10 8 4.62 True True
Test 2 SemTab 17573 10 8 4.38 False False
Test 3 SemTab 18619 6 7 4.43 True True
Test 4 Tseitin 435 8 6 2.83 False False
Test 5 SemTab 27103 7 5 4.60 True True
Test 6 SemTab 11613 6 6 3.17 False False
Test 7 SemTab 8031 5 8 3.12 True True
Test 8 SemTab 8829 6 6 4.00 False False
Test 9 SemTab 7724 10 5 3.40 True True
Test 10 SemTab 11283 10 8 2.38 False False

Average: SemTab: 3187, Tseitin: 15780

Sl No. Faster Diff Atoms Clauses Avg. Args SemTab Tseitin
Test 1 SemTab 32332 7 18 4.78 True True
Test 2 SemTab 18163 5 18 2.61 False False
Test 3 SemTab 54329 9 23 3.57 True True
Test 4 SemTab 28019 8 21 3.24 False False
Test 5 SemTab 25769 10 15 5.07 True True
Test 6 SemTab 25648 9 16 4.62 False False
Test 7 SemTab 30802 7 23 2.70 True True
Test 8 SemTab 22512 5 15 4.73 False False
Test 9 SemTab 14889 8 15 3.67 True True
Test 10 SemTab 35666 10 17 3.82 False False

Average: SemTab: 4253, Tseitin: 33066

Sl No. Faster Diff Atoms Clauses Avg. Args SemTab Tseitin
Test 1 SemTab 82065 7 37 4.16 True True
Test 2 SemTab 81062 8 49 4.14 False False
Test 3 SemTab 52917 6 31 5.65 True True
Test 4 SemTab 60515 6 48 4.12 False False
Test 5 SemTab 77310 6 42 4.79 True True
Test 6 SemTab 64374 10 40 3.55 False False
Test 7 SemTab 42413 10 31 3.32 True True
Test 8 SemTab 54045 5 45 5.38 False False
Test 9 SemTab 57479 8 45 4.20 True True
Test 10 SemTab 86744 6 47 5.40 False False

Average: SemTab: 13970, Tseitin: 79862

Member

If this thing IS true, then I think we should go with this as far as the assumptions system is concerned. A factor of 7-8 is too large to be ignored. I hope you have confirmed the validity of the test results.

@srjoglekar246 srjoglekar246 and 1 other commented on an outdated diff Feb 27, 2014
sympy/logic/inference.py
else:
- raise ValueError("Illegal operator in logic expression" + str(expr))
+ raise ValueError("Illegal operator in expression" + str(expr))
+
+
+def semantic_tableaux(expr):
srjoglekar246
srjoglekar246 Feb 27, 2014 Member

The name seems a little ambiguous. Looking at this name, a user may be tempted to think this returns a Tree structure. How about naming this 'check_semantic_tableau' or something on those lines?

sdbiswas
sdbiswas Feb 27, 2014

I was instead thinking of making it __semantic_tableaux and add a model or return_model parameter to satisfiable function. satisfiable(expr, algorithm="dpll2", model=True) The end-user is very less likely to know what this function does and how it can be used. Its major use to evaluate unsatisfiability (and validity), so integrating it with the corresponding function would probably be a good idea.

srjoglekar246
srjoglekar246 Feb 27, 2014 Member

return_model seems a more self-explanatory argument. But adding this functionality to satisfiable probably seems a good idea.

@srjoglekar246 srjoglekar246 commented on the diff Feb 27, 2014
sympy/logic/inference.py
+ Checks satisfiability of a formula using a Semantic Tableaux
+ This method is much faster than a traditional SAT solver however
+ it returns only True or False. To obtain a model use 'satisfiable'
+
+ Examples
+ ========
+
+ >>> from sympy.abc import A, B
+ >>> from sympy.logic.inference import semantic_tableaux
+ >>> semantic_tableaux(A | B)
+ True
+ >>> semantic_tableaux(A & ~A)
+ False
+
+ References
+ ==========
srjoglekar246
srjoglekar246 Feb 27, 2014 Member

Put the references before the Examples. Looks better in the Sphinx docs.

asmeurer
asmeurer Mar 8, 2014 Owner

I thought it was the other way around. It makes most sense to me to have the references at the end, just like they are anywhere else.

@srjoglekar246 srjoglekar246 commented on an outdated diff Feb 27, 2014
sympy/logic/inference.py
+ return True
+ else:
+ e.remove(clause)
+
+ if isinstance(clause, Or):
+ for arg in clause.args[1:]:
+ temp = set(e)
+ temp.add(arg)
+ s.append(temp)
+ e.add(clause.args[0])
+ s.append(e)
+
+ elif isinstance(clause, And):
+ for arg in clause.args:
+ e.add(arg)
+ s.append(e)
srjoglekar246
srjoglekar246 Feb 27, 2014 Member

Given the nature of this algorithm, it would be good to have in-code comments about the steps you are taking. Especially the way the tree develops/is pruned depending on the sub-formulae (And/Or/Literals etc) and boundary cases.

Soumya Dipta Biswas Added 'deep' to pl_true 49dcaf3
@sdbiswas sdbiswas commented on the diff Feb 28, 2014
sympy/logic/inference.py
@@ -63,12 +63,13 @@ def literal_symbol(literal):
raise ValueError("Argument must be a boolean literal.")
-def satisfiable(expr, algorithm="dpll2"):
+def satisfiable(expr, return_model=True, algorithm="dpll2"):
sdbiswas
sdbiswas Feb 28, 2014

@asmeurer @srjoglekar246
In future, we might want to remove the algorithm parameter. Unless we have 2 competing algorithms such that algo1 is faster for some cases and algo2 for others, there is no use of having this parameter. Assuming that we have many SAT solvers, we could always call the most optimum one from satisfiable and deprecate the others. To accomplish the removal of the parameter without breaking user applications, we could set the parameter to None by default. If someone explicitly provides an algorithm, we issue a warning but continue with our standard SAT solver.

asmeurer
asmeurer Mar 8, 2014 Owner

I think it will be useful. One thing that we might want to do in the future is allow SymPy to interface with external SAT solvers written in C, like pycosat. But we wouldn't want to make them hard dependencies of SymPy, so if they are not installed, it would need to fallback to the one written in Python in SymPy.

asmeurer
asmeurer Mar 8, 2014 Owner

Add example of return_model to the docstrings.

Owner
asmeurer commented Mar 1, 2014

You need to benchmark way more clauses than that. In the (new) assumptions, the bare minimum number of clauses is 63 (the known_facts_cnf from sympy.assumptions.ask_generated). In my newassump branch (currently), this cnf is duplicated for every subexpression, and any additional facts generated are added to that as well. In my branch, there are 198 clauses, consisting of 114 atoms, used for newask(Q.positive(x*y), Q.positive(x) & Q.positive(y)), which has only three subexpressions. Here's how I got that information:

In [10]: from sympy.assumptions.newask import *

In [11]: len(get_all_relevant_facts(Q.positive(x*y), Q.positive(x) & Q.positive(y)).args)
Out[11]: 198

In [12]: from sympy.logic.boolalg import _find_predicates

In [13]: a = get_all_relevant_facts(Q.positive(x*y), Q.positive(x) & Q.positive(y))

In [14]: len(_find_predicates(a))
Out[14]: 114
sdbiswas commented Mar 1, 2014

@asmeurer: Semantic Tableaux still works faster than Tseitin (and much faster than original CNF algorithm). But the magnitude of difference has reduced. From my tests, till the number of variables is around 400 and number of clauses around 750, SemTab will safely perform better than Tseitin.

Sl No. Faster Diff Atoms Clauses Avg. Args SemTab Tseitin
Test 1 SemTab 461999 142 242 3.31 True True
Test 2 SemTab 364501 144 224 3.41 False False
Test 3 SemTab 341407 134 199 3.16 True True
Test 4 SemTab 326397 149 222 3.19 False False
Test 5 SemTab 372458 150 236 3.02 True True
Test 6 SemTab 294582 135 237 3.18 False False
Test 7 SemTab 465008 129 221 3.59 True True
Test 8 SemTab 265901 143 203 3.28 False False
Test 9 SemTab 438548 149 205 3.18 True True
Test 10 SemTab 309060 105 238 3.45 False False

Average: Semtab: 145729, Tseitin: 509715

Soumya Dipta Biswas Added tests 24ad822
sdbiswas commented Mar 6, 2014

@asmeurer @srjoglekar246
Everything functioning properly now. I have added the relevant tests. If there are no specific problems, can this be merged?

Member

Seems to be working okay now. I am +1 to this going in.

@asmeurer asmeurer commented on an outdated diff Mar 8, 2014
sympy/logic/inference.py
- from sympy.logic.algorithms.dpll import dpll_satisfiable
- return dpll_satisfiable(expr)
- elif algorithm == "dpll2":
- from sympy.logic.algorithms.dpll2 import dpll_satisfiable
- return dpll_satisfiable(expr)
- raise NotImplementedError
+
+ if return_model is True:
+ expr = to_cnf(expr)
+ if algorithm == "dpll":
+ from sympy.logic.algorithms.dpll import dpll_satisfiable
+ return dpll_satisfiable(expr)
+ elif algorithm == "dpll2":
+ from sympy.logic.algorithms.dpll2 import dpll_satisfiable
+ return dpll_satisfiable(expr)
+ raise NotImplementedError
asmeurer
asmeurer Mar 8, 2014 Owner

This should be ValueError. Like ValueError("'algorithm' must be one of ['dpll', 'dpll2']").

@asmeurer asmeurer commented on an outdated diff Mar 8, 2014
sympy/logic/inference.py
-def pl_true(expr, model={}):
+def pl_true(expr, model={}, deep=False):
asmeurer
asmeurer Mar 8, 2014 Owner

Explain what the deep parameter does in the docstring.

@asmeurer asmeurer commented on an outdated diff Mar 8, 2014
sympy/logic/inference.py
-def pl_true(expr, model={}):
+def pl_true(expr, model={}, deep=False):
"""
Return True if the propositional logic expression is true in the model,
and False if it is false. If the model does not specify the value for
asmeurer
asmeurer Mar 8, 2014 Owner

Put the first sentence on a separate line. See https://github.com/sympy/sympy/wiki/Writing-documentation

@asmeurer asmeurer commented on an outdated diff Mar 8, 2014
sympy/logic/inference.py
"""
Return True if the propositional logic expression is true in the model,
and False if it is false. If the model does not specify the value for
- every proposition, this may return None to indicate 'not obvious';
- this may happen even when the expression is tautological.
asmeurer
asmeurer Mar 8, 2014 Owner

Is this no longer true?

@asmeurer asmeurer commented on an outdated diff Mar 8, 2014
sympy/logic/inference.py
else:
- raise ValueError("Illegal operator in logic expression" + str(expr))
+ args = [_pl_interpretation(arg, atoms, i) for arg in expr.args]
+ return expr.func(*args)
+
+
+def semantic_tableaux(expr):
+ """
+ Checks satisfiability of a formula using a Semantic Tableaux
+ This method is much faster than a traditional SAT solver however
asmeurer
asmeurer Mar 8, 2014 Owner

Put empty line before this line.

asmeurer
asmeurer Mar 8, 2014 Owner

commas before and after "however"

@asmeurer asmeurer commented on an outdated diff Mar 8, 2014
sympy/logic/inference.py
else:
- raise ValueError("Illegal operator in logic expression" + str(expr))
+ args = [_pl_interpretation(arg, atoms, i) for arg in expr.args]
+ return expr.func(*args)
+
+
+def semantic_tableaux(expr):
+ """
+ Checks satisfiability of a formula using a Semantic Tableaux
+ This method is much faster than a traditional SAT solver however
+ it returns only True or False. To obtain a model use 'satisfiable'
asmeurer
asmeurer Mar 8, 2014 Owner

Put periods at the end of your sentences.

Owner
asmeurer commented Mar 8, 2014

Another way to test this would be to set return_model=False in the assumptions where satisfiable is used and see if the tests pass (and they are not slower). You can also try merging this with my newassump branch (in a separate branch of course) and doing the same in newask.py.

@asmeurer asmeurer commented on the diff Mar 8, 2014
sympy/logic/tests/test_inference.py
- raises(ValueError, lambda: pl_true(42 + pi + pi ** 2))
- raises(ValueError, lambda: pl_true(42))
+ raises(TypeError, lambda: pl_true(42 + pi + pi ** 2))
+ raises(TypeError, lambda: pl_true(42))
+
+
+def test_pl_true_deep():
+ from sympy.abc import A, B, C
+ assert pl_true(A | B, {A: False}, deep=True) is None
+ assert pl_true(~A & ~B, {A: False}, deep=True) is None
+ assert pl_true(A | B, {A: False, B: False}, deep=True) is False
+ assert pl_true(A & B & (~A | ~B), {A: True}, deep=True) is False
+ assert pl_true((C >> A) >> (B >> A), {C: True}, deep=True) is True
+
+
+def test_semantic_tableaux():
asmeurer
asmeurer Mar 8, 2014 Owner

I think you should test some larger expressions.

@asmeurer: Made the required documentation changes. Added more tests in a different format. Hope thats okay. Please provide your comments. Will fix the stray '>>> ' (which is causing a trailing whitespace error) in a final patch with the other suggested changes (if any)

Because of the addition of the deep parameter, the function will return True regardless of the model. Hence the statement "this may happen even when the expr is tautological" is no longer apt.

@asmeurer: Could you please see if we can move towards merging this. I will make a final commit to fix the quality issue that is causing the Travis failure

Owner

Still waiting on that fixup commit. I am fine to merge otherwise.

@asmeurer: As per what I mentioned on the thread (last but one-th post) I really think we should wait until later to merge this request. In future (once faster solvers are implemented) the semantic tableaux could end up being redundant. So should I get the entire PR merged and later remove tableaux if required? Or should I open another pull request for improved pl_true and get that merged and wait for the benchmarks on Semantic Tableaux? Sorry for all the confusion.

Owner

That's fine. You should note such things on the pull requests themselves, as otherwise people just see the pull request and don't know.

@asmeurer: I fixed all issues in the logic module. Travis seems to give some error in "sympy/mpmath/tests/test_elliptic.py[14]". Is this a problem all across?

Owner

That one is unrelated.

@asmeurer asmeurer commented on the diff May 23, 2014
sympy/logic/inference.py
- Examples:
+
+ Parameters
+ ==========
+
+ return_model: boolean, optional, default: True
+ Returns a model if expression is satisfiable. If only (UN)SAT
+ is required, then set to False
asmeurer
asmeurer May 23, 2014 Owner

Is this faster?

@asmeurer asmeurer commented on the diff May 23, 2014
sympy/logic/inference.py
-def pl_true(expr, model={}):
+def pl_true(expr, model={}, deep=False):
asmeurer
asmeurer May 23, 2014 Owner

What is the point of this function? Why can't you just do expr.subs(model)?

asmeurer
asmeurer May 23, 2014 Owner

Or even expr.xreplace(model). I'm betting that is like 10 times faster too.

asmeurer
asmeurer May 23, 2014 Owner

Fortunately this function isn't exported to the public API, so if we decide it's unnecessary, we can just remove it.

asmeurer
asmeurer May 23, 2014 Owner

Do you have a planned use for deep=True?

@asmeurer asmeurer commented on the diff May 23, 2014
sympy/logic/inference.py
else:
- raise ValueError("Illegal operator in logic expression" + str(expr))
+ args = [_pl_interpretation(arg, atoms, i) for arg in expr.args]
+ return expr.func(*args)
+
+
+def semantic_tableaux(expr):
+ """
+ Checks satisfiability of a formula using a Semantic Tableaux.
+
+ This method is much faster than a traditional SAT solver, however,
+ it returns only True or False. To obtain a model use 'satisfiable'.
asmeurer
asmeurer May 23, 2014 Owner

Is it actually faster, though? I would leave this statement out unless the benchmarks are actually showing this.

@asmeurer asmeurer commented on the diff May 23, 2014
sympy/logic/inference.py
True
-
+ >>> pl_true(A & B, {A: False})
+ False
+ >>> pl_true(A & B, {A: True})
+ >>> pl_true(A >> (B >> A))
+ True
asmeurer
asmeurer May 23, 2014 Owner

I guess here is the difference. I suspect a lot of the logic could be simplified by just using expr.xreplace(model), though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment