Skip to content

GSoC 2023 Report Tilo Reneau‐Cardoso: Improving Relational Assumptions in SymPy’s New Assumptions

Tilo Reneau-Cardoso edited this page Oct 1, 2023 · 16 revisions

About Me

At the time of writing this, I am beginning my third year as a math and computer science undergraduate at Pomona College.

Project Overview

The original goals of the project were to implement relational assumptions for SymPy's new assumptions (e.i. add handling for inequalities). Though my project proposal included some examples of the sorts of queries that I would implement, what exactly would be implemented was left somewhat vague. What I was able to implement was handling for real linear inequalities. For example, SymPy's new assumptions are able to understand facts such as x > 2 implies x > 0.

A big limitation of what I implemented is that all the variables must be specified to be real in a somewhat awkward way. The reason for this is that variables are assumed to be complex valued by default. As a result, the negation of an inequality doesn't mean what it usually means; ~(x > 0) implies (x <= 0) if and only if x is a real variable.

Main Contributions

The bulk of my project can be found in PR 25544 where I implement the paper "A Fast Linear-Arithmetic Solver for DPLL(T)". The same PR also includes code that uses the implementation to handle queries to the new assumptions involving inequalities with explicitly real variables. Here are some examples of queries that are now handled and were not previously implemented:

from sympy import symbols, ask
x, y, z = symbols("x y z", real=True)

ask(x + y > 2, (x < 0) & (y <0)) # returns False instead of None
ask(x > z, (x > y) & (y > z)) # returns True instead of None
ask(x > z, ((x > y) & (y > z)) | ((x > w) & (w > y) & (y > z))) # returns True instead of None

My second biggest PR was #25436. I also integrated added some code to use Z3 as an optional dependency to provide an alternative way of handling inequalities that PR #25544 could be tested against. Furthermore, this code with slight modification and further testing for correctness could be used to handle non-linear inequalities.

Additional Contributions

I revived an old linear programming PR, and contributed a lot to it but I had to leave finishing it to other people so I could prioritize work more directly related to my GSoC project. However, as the simplex algorithm was relevant to both pull requests, gaining familiarity with it was helpful for understanding "A Fast Linear-Arithmetic Solver for DPLL(T)".

I improved the SMT-LIB printer by fixing a bug and enabling it to print predicate from the new assumptions:

I found and fixed a bug with ask

I also found and did not fix quite a few bugs with ask that remain unfixed

Future Work

I created some issues related to topics tangentially related to improving the new assumptions but not directly related to my work:

  • Issue #25533 suggests that the matrix and non-matrix facts be separated which would average halve the size of the SAT queries made by satask.
  • Issue #25485 suggests a redesign of the new assumptions that could handle the substitution property of equality and would also be better for several other reasons including performance.

In regards to my work, there is room for improvement in both handling more inputs and performance.

Speed

The paper explains how to speed up the algorithm by implementing backtracking and using theory propagation. It also says some preprocessing can be done with gaussian elimination to reduce the size of the matrices, but I couldn't figure out what that meant.

Additionally, handling equalities by by splitting them into two inequalities could possibly be faster than the method currently used.

Handling more than just rational numbers

I have to explain a bit about the LRA Solver algorithm to be able to explain a bit about how non-rational inputs could be implemented. Suppose you want to check the satisfiability of $$x + y &gt; 2 \land x + 2y &lt; 2$$ The theory solver reframes this as a different equisatifiable problem: $$s_1 &gt; 2 \land s_2 &lt; 2 \land s_1 = x + y \land s_2 = x + 2y$$ The equalities are represented as a matrix.

I bring this up because handling non-rational inputs in the matrix is a different and much harder problem than handling non-rational inputs in the inequalities. Handling for such inputs for the matrix should come later and for some inputs possibly not at all.

Handling Irrational Numbers

In the context of a PR related to a linear programing function (which like my LRA solver also uses the simplex method) @oscarbenjamin wrote this comment which is relevant to handling irrational numbers present in the matrix. I'm sure @oscarbenjamin would have thoughts about how irrational numbers should or shouldn't be implemented in the context of an LRA solver so he should probably consulted regarding improvements to the matrix part of the algorithm.

Handling Positive and Negative Infinity

Infinity will never appear as the coefficient to a variable, so it will never appear in the matrix. Already, the way strict inequalities are handled involves handling arbitrarily small values. It's quite possible that an approach to handling infinity could be developed by taking inspiration from the way strict inequalities are handled.

Handling Complex Variables

Currently only inputs where all the variables are specified to be real with the old assumptions are handled. This is a big limitation as variables in the new assumptions are assumed to be complex valued by default. Queries like the following should be handled:

from sympy.abc import x
from sympy import ask, Q

ask(Q.gt(x, 0), Q.gt(x 2)) # should return True instead of None
ask(Q.imaginary(x), Q.gt(x 2)) # should return False instead of None

I don't think implementing this should be too complicated. This would just involve adding an attribute to each variable denoting whether the variable was currently real or not. If the variable was asserted to be real, the predicate responsible should be kept track of so that a conflict clause involving that predicate and any predicate causing the variable to be imaginary could be generated if necessary.

Conclusion

I enjoyed working on this project a lot. Thank you Aaron and Francesco for this opportunity. I got a lot out of working on this project and am grateful for the time you took to meet with me.

References

.. [1] Dutertre, B., de Moura, L.: A Fast Linear-Arithmetic Solver for DPLL(T) https://link.springer.com/chapter/10.1007/11817963_11

Clone this wiki locally