New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Matrix cleanup stage 5 #12521
Matrix cleanup stage 5 #12521
Conversation
Added a `MatrixReductions` class which houses row/column operations and row reduction. The new default algorithm waits to normalize pivot rows until the end of the reduction. This gives a tremendous speed improvement.
Benchmarks on matrices with integer entries. First is the old row reduction algorithm (which normalized first) and second is the new one.
Benchmarks on matrices with symbols. First is the old row reduction algorithm (which normalized first) and second is the new one.
The code that was run was
The |
This may not have much to do with this PR, but would it be possible to make the matrix module so general that the entries could belong to any ring, or at least to rings that are SymPy Domains? |
@jksuom A lot of that should work already. Sympy calls the element's multiply and add functions. The only thing where care would need to be taken is with inverting and such. You can pass a custom |
It also seems to me that most things should work. Only some methods that use division may have to check its availability and raise an exception if necessary. But one issue with the old implementation is that matrix elements are routinely sympified. Therefore elements of polys domains cannot be used in matrices. Perhaps sympification should be made optional. |
@jksuom every method that simplifies elements should allow you to pass in a custom |
MatrixBase currently has this line:
where |
I think we would also want to structure it like the polys where the computations happen at a low level on basic types, with minimal type checking, and the higher level provides the abstractions and type checking. That way we get the best performance possible. And worth noting that domains would break the ability to do this:
since the elements would have to be unified to a single domain (either all floats or all rationals). It's not just an issue of avoiding sympify, the Matrix itself needs to know that its elements are from a given domain. |
For immutable matrices, the |
FloatMatrix would be a very special case. Aside from separating floats from rationals, the domains let you do things like assure that every element of a matrix is an element of some rational function field, with guaranteed zero equivalence testing (no need to call The polys also get performance by letting low-level operations take place on raw Python or gmpy integers or rationals, rather than SymPy Integer/Rational, which are much slower in comparison (especially to gmpy). |
For everything I've rewritten so far, @asmeurer Are you suggesting that the class be a |
Here is a suggested design: There are three levels, the top level looks and acts like it does now. If you access the elements you get SymPy objects. Below that you have the DomainMatrix, which knows that the elements are from a polys domain. That would be a wrapper for the lowest level, which is just the pure representation (list of lists or dict or whatever). By the way, I don't think we should worry about non-field domains, for now at least. It's much simpler to always assume the domain is a field (so that you are dealing with a vector space, rather than an R-module). |
Interesting. That seems to be upsidedown from how I've been structuring it. I've been making all methods work in as much generality as possible and leaving it to subclasses to override those methods to meet their requirements. It sounds to me like you're suggesting having a |
The idea is to optimize for performance, especially for those operations that are used in algorithms that need to be fast (mainly basic addition and multiplication, and row reduction; other stuff is less important). The |
I noticed that in the tests. huerisch seems to be quite fragile as well. Row reducing and then normalizing every row left things in the wrong form (even though it is at least 2x as fast) when compared with naive row reduction where the rows are normalized along the way. Hence the introduction of the |
What do you mean the wrong form? Different answers from heurisch are acceptable so long as they are still correct. |
From the tests, the example was
which when row reduced in the naive way
but when the rows are normalized last, you get
Applying |
If |
@asmeurer not necessarily. Even so,
but the same entry for the other matrix
even though
|
Well that's a bug right there. |
Also I think the expression should be -1/4, not 1/4. Was that just a typo on your part, or is it another thing that should be looked into? |
I opened #12531 for this. |
It was a typeo. Should be -1/4. |
That's good. A wrong result would be far more worrying than a improperly simplified result. |
sympy/matrices/matrices.py
Outdated
def _eval_col_del(self, col): | ||
def entry(i, j): | ||
i = i - 1 if i > col else i | ||
return self[i, j] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nitpick, but how about return self[i-1, j] if i > col else self[i, j]
instead of possibly setting setting i
equal to itself when i <= col
is True?
sympy/matrices/matrices.py
Outdated
|
||
def _row_reduce(self, iszerofunc, simpfunc, normalize_last=True, | ||
normalize=True, zero_above=True): | ||
"""Row reduce `self` and return a typle (rref_matrix, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "typle"->"tuple"
sympy/matrices/matrices.py
Outdated
zero if `iszerofunc` returns `None` | ||
normalize_last : indicates where all row reduction should | ||
happen in a fraction-free manner and then the rows are | ||
normalized (so that the pivots are 1), or wheither |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: "wheither"->"whether"
# in the process. Let's not let them go to waste | ||
for (offset, val) in newly_determined: | ||
offset += piv_row | ||
mat[offset*cols + piv_col] = val |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this line of code can be omitted. This newly simplified entry is under the pivot position, so it will get zeroed when a multiple of the pivot row is added to the row containing this entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mathematically, that is true, but when the cross multiplication happens, it is better to do it with a simplified expression than with an unsimplified expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Durr, you do use the simplified values I thought were just getting overwritten. I'm fired. Sorry about that.
if iszerofunc(val): | ||
continue | ||
|
||
cross_cancel(pivot_val, row, val, piv_row) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you can get the same result with fewer calculations. When forming the linear combination of the pivot row with a row beneath it, it looks like you're explicitly computing the entry in the pivot column of the result, but this entry is zero by construction. When the pivot column is not zero, it looks like you're computing the entries to the left of the pivot in the linear combination of rows, but these are also zero by construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that. It adds algorithmic complexity to only work with values on the right side of the matrix. I assumed multiplying by zero and adding zero were fast operations in sympy and so there would be little overhead. I didn't actually benchmark this though...
Eventually, integer/float-only matrices will be row reduced with mpmath, which is way faster, so if multiplying by zero really is fast, I don't think the speedup gained from avoiding a few multiplications would be worth it. Or, did I misunderstand your comment entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't appreciate that in this variant of the elimination algorithm, one of the coefficients need not be one in the linear combination of rows that produces a zero in the desired position. Comment withdrawn.
@moorepants @jksuom would one of you mind reviewing this PR? |
sympy/matrices/matrices.py
Outdated
@@ -111,6 +114,11 @@ def __len__(self): | |||
class MatrixShaping(MatrixRequired): | |||
"""Provides basic matrix shaping and extracting of submatrices""" | |||
|
|||
def _eval_col_del(self, col): | |||
def entry(i, j): | |||
return self[i, j] if i <= col else self[i - 1, j] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does i
denote the column index here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. And my test suite also let this one slide :-(.
sympy/matrices/matrices.py
Outdated
if op == "n->kn": | ||
col = col if col is not None else col1 | ||
if col is None or k is None: | ||
raise ValueError("For a {0} operation 'n->kn' you must proved the " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
proved -> provide?
sympy/matrices/matrices.py
Outdated
algorithms start on the left, having complexity right-shifted | ||
speeds things up. | ||
|
||
returns a tuple (mat, perm) where perm is a permutation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
returns -> Returns
sympy/matrices/matrices.py
Outdated
speeds things up. | ||
|
||
returns a tuple (mat, perm) where perm is a permutation | ||
of the columns to perform to shift the complex columns righ, and mat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
righ -> right
sympy/matrices/matrices.py
Outdated
return mat | ||
|
||
def elementary_col_op(self, op="n->kn", col=None, k=None, col1=None, col2=None): | ||
"""Perfoms the elementary row operation `op`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name refers to columns. Should there be two methods, one for columns and another for rows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are indeed. Copy and paste error in the doc string.
The matrix file is so large that I find it hard to form an overall view of its contents. Especially the MatrixBase class is very big. Am I correct in assuming that it should eventually be replaced by the new classes and reduced, or possibly totally removed? If not, then maybe it should be moved to another file. It seems that my review will now primarily be restricted to trivial matters like typos. (Was the name |
@jksuom The
The new tests in |
This looks good to me. Thanks. |
Seeing some performance regressions from this, see #14673. |
This PR cause a slowdown in $ git checkout 683e2c1b57e13e2971dd074e9578989082bbcd53
HEAD is now at 683e2c1b57 Matrix Cleanup Stage 5
$ time python -c 'from sympy.polys.benchmarks.bench_solvers import time_solve_lin_sys_165x165; time_solve_lin_sys_165x165()'
^CTraceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/enojb/current/sympy/sympy/sympy/polys/benchmarks/bench_solvers.py", line 250, in time_solve_lin_sys_165x165
sol = solve_lin_sys(eqs, R_165)
File "/Users/enojb/current/sympy/sympy/sympy/polys/solvers.py", line 38, in solve_lin_sys
echelon, pivots = matrix.rref(iszerofunc=lambda x: not x, simplify=lambda x: x)
File "/Users/enojb/current/sympy/sympy/sympy/matrices/matrices.py", line 2609, in rref
normalize_last=normalize_last)
File "/Users/enojb/current/sympy/sympy/sympy/matrices/matrices.py", line 2266, in _eval_rref
zero_above=True)
File "/Users/enojb/current/sympy/sympy/sympy/matrices/matrices.py", line 2422, in _row_reduce
cross_cancel(pivot_val, row, val, piv_row)
File "/Users/enojb/current/sympy/sympy/sympy/matrices/matrices.py", line 2373, in cross_cancel
mat[p] = a*mat[p] - b*mat[p + q]
KeyboardInterrupt
real 7m33.707s
user 7m21.181s
sys 0m10.454s
$ git checkout HEAD^
Previous HEAD position was 683e2c1b57 Matrix Cleanup Stage 5
HEAD is now at 31957814e9 Merge remote-tracking branch 'upstream/master' into commonmatrix5
$ time python -c 'from sympy.polys.benchmarks.bench_solvers import time_solve_lin_sys_165x165; time_solve_lin_sys_165x165()'
real 0m11.345s
user 0m10.422s
sys 0m0.227s The benchmark is made much faster under the PR #18844: $ git checkout pr_domain_matrix
...
$ time python -c 'from sympy.polys.benchmarks.bench_solvers import time_solve_lin_sys_165x165; time_solve_lin_sys_165x165()'
real 0m4.593s
user 0m4.007s
sys 0m0.370s The PR introduces a |
Added a
MatrixReductions
class which houses row/columnoperations and row reduction. The new default algorithm
waits to normalize pivot rows until the end of the reduction.
This gives a tremendous speed improvement, especially for
matrices with symbols.
Also added are new methods:
echelon_form
which returns echelon form of a matrix (often all that is needed to check a matrix rank or solve an equation).is_echelon
property to tell if a matrix is in echelon form.elementary_row_op
a nice interface to all the elementary row operations in one placeelementary_col_op
a nice interface to all the elementary column operations in one placerow_del
is now part ofMatrixShaping
col_del
is now part ofMatrixShaping
permuteBkwd
andpermuteFwd
are deprecated.