Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DomainMatrix for faster rref in solve #18844

Merged
merged 41 commits into from Jun 28, 2020

Conversation

oscarbenjamin
Copy link
Contributor

References to other Issues or PRs

This is built on top of #18814 and adds the DomainMatrix class from #18445

Brief description of what is fixed or changed

  • Add DomainMatrix class
  • Use DomainMatrix in solve_linear_system where possible with linsolve as the fallback.

Other comments

Release Notes

  • solvers
    • Solving linear systems particularly involving polynomial coefficients is much faster

@sympy-bot
Copy link

sympy-bot commented Mar 12, 2020

Hi, I am the SymPy bot (v160). I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

Your release notes are in good order.

Here is what the release notes will look like:

  • solvers
    • Solving linear systems particularly involving polynomial coefficients is much faster (#18844 by @oscarbenjamin)

This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.7.

Note: This comment will be updated with the latest check if you edit the pull request. You need to reload the page to see it.

Click here to see the pull request description that was parsed.

<!-- Your title above should be a short description of what
was changed. Do not include the issue number in the title. -->

#### References to other Issues or PRs
<!-- If this pull request fixes an issue, write "Fixes #NNNN" in that exact
format, e.g. "Fixes #1234" (see
https://tinyurl.com/auto-closing for more information). Also, please
write a comment on that issue linking back to this pull request once it is
open. -->

This is built on top of #18814 and adds the DomainMatrix class from #18445

#### Brief description of what is fixed or changed

- Add DomainMatrix class
- Use DomainMatrix in solve_linear_system where possible with linsolve as the fallback.

#### Other comments


#### Release Notes

<!-- Write the release notes for this release below. See
https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more information
on how to write release notes. The bot will check your release notes
automatically to see if they are formatted correctly. -->

<!-- BEGIN RELEASE NOTES -->
* solvers
    * Solving linear systems particularly involving polynomial coefficients is much faster
<!-- END RELEASE NOTES -->

Update

The release notes on the wiki have been updated.

@oscarbenjamin
Copy link
Contributor Author

This massively increases the speed of solve when solving linear equations that have symbols in e.g.:

def dosolve(n):
    y = Symbol('y')
    syms = symbols('x:%s' % n)
    eqs = list(y*randMatrix(n, n+1) * Matrix([syms+(1,)]).T)
    sol = solve(eqs, syms)
    return eqs, sol, syms

That makes a system like

In [3]: eqs, sol, syms = dosolve(5)                                                                                                                           

In [4]: eqs                                                                                                                                                   
Out[4]: 
[57x₀y + 14x₁y + 30x₂y + 40x₃y + 50x₄y + 80y, 60x₀y + 50x₁y + 49x₂y + 5x₃y + 5x₄y + 71y, 38x₀y + 59x₁y + 31x₂y + 43x₃y + 58x₄
y + 98y, 92x₀y + 5x₁y + 50x₂y + 64x₃y + 47x₄y + 89y, 83x₀y + 49x₁y + 35x₂y + 50x₃y + 96x₄y + 47y]

In [5]: sol                                                                                                                                                   
Out[5]:120850754      57073183      -239313614       25717805      -72558796 ⎫
⎨x₀: ─────────, x₁: ────────, x₂: ───────────, x₃: ────────, x₄: ──────────⎬
⎩     26138981      26138981        26138981       26138981       26138981 ⎭

In [6]: syms                                                                                                                                                  
Out[6]: (x₀, x₁, x₂, x₃, x₄)

On master we have:

In [7]: for n in range(10): 
   ...:     print() 
   ...:     print(n) 
   ...:     %time ok=dosolve(n) 
   ...:                                                                                                                                                       

0
CPU times: user 384 µs, sys: 406 µs, total: 790 µs
Wall time: 442 µs

1
CPU times: user 9.63 ms, sys: 1.64 ms, total: 11.3 ms
Wall time: 10.1 ms

2
CPU times: user 35.1 ms, sys: 388 µs, total: 35.5 ms
Wall time: 36.3 ms

3
CPU times: user 85.5 ms, sys: 833 µs, total: 86.4 ms
Wall time: 86.8 ms

4
CPU times: user 164 ms, sys: 1.1 ms, total: 165 ms
Wall time: 166 ms

5
CPU times: user 453 ms, sys: 3.71 ms, total: 456 ms
Wall time: 460 ms

6
CPU times: user 2.25 s, sys: 19.3 ms, total: 2.27 s
Wall time: 2.28 s

7
CPU times: user 6.63 s, sys: 34.5 ms, total: 6.66 s
Wall time: 6.72 s

8
CPU times: user 57.9 s, sys: 426 ms, total: 58.4 s
Wall time: 1min 1s

9
^C

(I interrupted because it was too slow)

Repeating that with this PR we get

In [2]: for n in range(10): 
   ...:     print() 
   ...:     print(n) 
   ...:     %time ok=dosolve(n) 
   ...:                                                                                                                                                       

0
CPU times: user 760 µs, sys: 425 µs, total: 1.19 ms
Wall time: 1.11 ms

1
CPU times: user 12.6 ms, sys: 1.87 ms, total: 14.4 ms
Wall time: 13.9 ms

2
CPU times: user 26.4 ms, sys: 917 µs, total: 27.4 ms
Wall time: 27.2 ms

3
CPU times: user 40.5 ms, sys: 440 µs, total: 40.9 ms
Wall time: 41.1 ms

4
CPU times: user 57.8 ms, sys: 422 µs, total: 58.2 ms
Wall time: 58.4 ms

5
CPU times: user 76.5 ms, sys: 652 µs, total: 77.2 ms
Wall time: 77.7 ms

6
CPU times: user 104 ms, sys: 635 µs, total: 105 ms
Wall time: 105 ms

7
CPU times: user 130 ms, sys: 696 µs, total: 130 ms
Wall time: 131 ms

8
CPU times: user 156 ms, sys: 669 µs, total: 157 ms
Wall time: 157 ms

9
CPU times: user 201 ms, sys: 1.04 ms, total: 202 ms
Wall time: 202 ms

@sylee957
Copy link
Member

I think that this approach may start to add a lot of duplicate methods only to operate on a more crude objects.
But it can be worth merging if the speed is improved.

@oscarbenjamin
Copy link
Contributor Author

I don't think that there will need to be many duplicate methods. From discussions elsewhere I think it was concluded that a high-level matrix class could be built from a low-level one that only implemented a few key methods like rref, charpoly and det. My intention here is that we can build something like that and ultimately use it as the basis for all matrix calculations.

The speed improvements are enormous and it also solves the same problems that dotprodsimp doesdoes but in a more natural way. Try the example from #18814 (comment) to see the speed difference. With the PR we have

In [12]: eqs, p, y = _mk_eqs(2)                                                                                                   

In [13]: %time ok=solve(eqs, *p.c)                                                                                                
CPU times: user 88.2 ms, sys: 697 µs, total: 88.9 ms
Wall time: 90.2 ms

In [14]: eqs, p, y = _mk_eqs(3)                                                                                                   

In [15]: %time ok=solve(eqs, *p.c)                                                                                                
CPU times: user 400 ms, sys: 1.55 ms, total: 402 ms
Wall time: 404 ms

In [16]: eqs, p, y = _mk_eqs(4)                                                                                                   

In [17]: %time ok=solve(eqs, *p.c)                                                                                                
CPU times: user 1.02 s, sys: 4.96 ms, total: 1.02 s
Wall time: 1.03 s

With master we have

In [2]: eqs, p, y = _mk_eqs(2)                                                                                                    

In [3]: %time ok=solve(eqs, *p.c)                                                                                                 
CPU times: user 962 ms, sys: 2.88 ms, total: 965 ms
Wall time: 965 ms

In [4]: eqs, p, y = _mk_eqs(3)                                                                                                    

In [5]: %time ok=solve(eqs, *p.c)                                                                                                 
CPU times: user 5.42 s, sys: 12 ms, total: 5.44 s
Wall time: 5.45 s

In [6]: eqs, p, y = _mk_eqs(4)                                                                                                    

In [7]: %time ok=solve(eqs, *p.c)                                                                                                 
CPU times: user 34.5 s, sys: 147 ms, total: 34.6 s
Wall time: 35 s

Next steps for this PR are to try and use this in place of PolyMatrix in the Risch algorithm.

@sylee957
Copy link
Member

Okay, if it is only a few key methods and not intended to be extended heavily from the mass

@codecov
Copy link

codecov bot commented Mar 28, 2020

Codecov Report

Merging #18844 into master will increase coverage by 0.010%.
The diff coverage is 82.926%.

@@              Coverage Diff              @@
##            master    #18844       +/-   ##
=============================================
+ Coverage   75.662%   75.673%   +0.010%     
=============================================
  Files          658       658               
  Lines       171003    171211      +208     
  Branches     40353     40412       +59     
=============================================
+ Hits        129385    129561      +176     
- Misses       35952     35983       +31     
- Partials      5666      5667        +1     

@oscarbenjamin
Copy link
Contributor Author

This PR is almost ready but I would like to discuss what exactly to do with the new code that is added. Specifically what should be done with the DomainMatrix class?

The PR adds a DomainMatrix class in sympy/polys/polymatrix.py along with the PolyMatrix class. The DomainMatrix class is based on the polys domains and uses DomainElements as elements. It gives a much faster rref in a number of cases solving many examples that would otherwise hang. However this PR does not put this as a replacement for Matrix.rref. Instead the solve_lin_sys function of sympy/polys/solvers.py which uses RawMatrix is adapted to use DomainMatrix instead.

The solve_lin_sys function was previously only used by heurisch but is here also used by solve_linear_system (and therefore solve) and linsolve so this unifies solving linear systems of equations apart from the Matrix solving function like rref which are themselves no longer used by any of the solvers. A final unification with e.g. Matrix.rref seems reasonable but we need to define what the relationship between Matrix and DomainMatrix should be. Lots of issues involving Matrix.rref being slow can be fixed simply by converting to DomainMatrix internally but DomainMatrix can also speed up many other things such as matrix multiplication, charpoly, det etc.

Since solve_lin_sys no longer uses RawMatrix that can be removed which also removes one of the obstacles to ensuring that Matrix can sympify all of its entries. There is also another matrix class that I haven't replaced yet but perhaps should which is the PolyMatrix class which is used used in risch (prde.py). On the face of it this has a similar purpose to the DomainMatrix class except is less efficient. The PolyMatrix class is a subclass of Matrix but having entries that are Poly instances i.e. that are Basic but not Expr (except sometimes they are Poly and sometimes Expr) and like DomainMatrix each matrix has an associated domain. It's not really clear to me what the purpose of the PolyMatrix class is though. Most likely it could be replaced by DomainMatrix except that more of the Matrix API might need to be implemented for DomainMatrix because of the way that PolyMatrix is used.

The other change here is that the solve_lin_sys function splits systems according to connected components using the connected_components function added in #16299. This gives a significant speed up for large sparse systems of the kind regularly seen in heurisch. This same speedup also carries through to linsolve and solve as well. Splitting on connected components is also add in solve so that it can be used with polynomial systems as well leading to the kinds of speed up discussed here #16550 (comment)

In #18445 there was discussion of adding DomainMatrix to be used as an internal object by Matrix. The idea discussed there was to make a new immutable Basic matrix class that would use a lower-level object like DomainMatrix internally. This would make Matrix somewhat analogous to Poly which is a wrapper around lower-level non-Basic poly instances. It also means that both Poly and Matrix are based on the same domains so improvements to the domains like adding Gaussian integers can benefit both matrix and polynomial calculations.

The question is if DomainMatrix is to be the basis for a new matrix class then it should have various methods added to make it suitable for that purpose. Otherwise it is really just used here for rref in which case it does not really need to have many of the methods it currently has like multiplication. On the other hand maybe DomainMatrix can be separate from normal Matrix and be exposed for end-users who want to do things like matrix calculations over finite fields etc.

Because of the confusion about the relationship between DomainMatrix and Matrix I am not sure where exactly the code added here belongs. Either way this PR brings enormous speed improvements in many situations so it could be merged as is with the issue of what to do with DomainMatrix deferred until later.

@sylee957
Copy link
Member

sylee957 commented Apr 2, 2020

In my experiments, I don't think that sympy objects can wrap around non-sympy objects easily. Because there needs to be an another sympy object that wrap around that object, and the problem just recurses down.

So in my opinion, I think that everything should be done in the basis assuming DomainMatrix is some external dependency of Matrix.
As mpmath is considered an external dependency and we are using mpmath functions independently when it is needed, the matrix can internally convert to DomainMatrix for appropriate circumstances every time and do computation and convert back to sympy.

I see this approach is also used in Poly class, but I think that we can avoid the caveat of having sympify(Poly(domain=A)) becoming Poly(domain=B) if we don't tie domain to matrix class itself, but as a keyword argument in function calls.

So we can have the matrix functions with domain as optional arguments, like Matrix.det(domain=ZZ[I]), Matrix.rref(domain=FF(2)).
So the argument can given as a user-provided hint, to skip the decision process of finding the right extension domain where every elements fit in, or doing the computation in other rings.

I also think that this idea is futureproof because even if it can be possible to tie the domain together with Matrix objects, the keyword argument domain across each methods can just be made into an redundant argument that forces the original domain matrix have.

The only cons of this is approach is that

  1. Basic arithmetics like multiplication can be done inconveniently if someone wants to use domain, e.g. A.mul(B, domain=A).mul(C, domain=A).
  2. I haven't measured the time converting some nested square roots to domain object, but it can be slow.
  3. There can be some domain objects unrepresentable in sympy and it can be problem if there is, But I don't think that this is a problem as far as I have observed:
    • Poly <=> Expr
    • FiniteField, ModularInteger <=> Integer
    • AlgebraicElement <=> Sqrt, Pow, ...

@sylee957
Copy link
Member

sylee957 commented Apr 2, 2020

The things I find inconvenient in the current implementation of DomainMatrix is that it lacks a lot of convenience methods like multidimensional slicing, submatrix indexing, which often gets handy
I expect things like A[1, :] = B[1, :]*C to be possible also for DomainMatrix and I think that it can ease the translation of algorithms in matrix api to DomainMatrix.
But I think that this can be done carefully as I see indexing done too much complicated in Matrix. and I suspect some slowdowns from it.

@oscarbenjamin
Copy link
Contributor Author

At this stage it is deliberate that DomainMatrix is not a full-featured matrix class. I certainly wouldn't want it to have as wide an API as Matrix does but its role needs to be clarified before adding any other API to it. For the purposes here the class itself is not needed and a list of lists/dicts would be perfectly manageable. The question is where we would want to take this in future.

@sylee957
Copy link
Member

sylee957 commented Apr 3, 2020

I think that it can be provided as an optional API under Polys after it have lots of self-contained features for operating in a lower level of sympy.
However, I don't want finite field computations to only be possible through DomainMatrix, but not through Matrix. So I think that it should be designed to subset the Matrix, and should fill in any disjoint stuff in ways like I have suggested above.

eqs = system
try:
eqs, ring = sympy_eqs_to_ring(eqs, symbols)
except PolynomialError as exc:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to recognize ValueError?

File "/home/travis/virtualenv/python3.8.0/lib/python3.8/site-packages/sympy-1.7.dev0-py3.8.egg/sympy/solvers/solveset.py", line 2525, in sympy.solvers.solveset.linsolve
Failed example:
    linsolve([x**2 - 1], x)
Expected:
    Traceback (most recent call last):
    ...
    NonlinearError:
    nonlinear term encountered: x**2
Got:
    Traceback (most recent call last):
      File "/opt/python/3.8.0/lib/python3.8/doctest.py", line 1328, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest sympy.solvers.solveset.linsolve[23]>", line 1, in <module>
        linsolve([x**2 - 1], x)
      File "/home/travis/virtualenv/python3.8.0/lib/python3.8/site-packages/sympy-1.7.dev0-py3.8.egg/sympy/solvers/solveset.py", line 2564, in linsolve
        sol = solve_lin_sys(eqs, ring, _raw=False)
      File "/home/travis/virtualenv/python3.8.0/lib/python3.8/site-packages/sympy-1.7.dev0-py3.8.egg/sympy/polys/solvers.py", line 73, in solve_lin_sys
        eq_coeffs = {ring.gens[monom.index(1)]:coeff for monom, coeff in eq_dict.items()}
      File "/home/travis/virtualenv/python3.8.0/lib/python3.8/site-packages/sympy-1.7.dev0-py3.8.egg/sympy/polys/solvers.py", line 73, in <dictcomp>
        eq_coeffs = {ring.gens[monom.index(1)]:coeff for monom, coeff in eq_dict.items()}
    ValueError: tuple.index(x): x not in tuple

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. We have NonlinearError now which should probably be raised in this case.

Add docs for the solve_lin_sys and related functions
Don't recompute the split after recursively calling solve. Since solve
computes the connected components of the system of equations and the
pass each back to solve recursively the connected components get
recomputed in the second call. This makes sure that the connected
components are not recomputed in the recursive call.
Make sure that linsolve raises NonlinearError rather than
PolyNonlinearError
@oscarbenjamin oscarbenjamin changed the title WIP - Add DomainMatrix for faster rref in solve Add DomainMatrix for faster rref in solve Jun 27, 2020
@oscarbenjamin
Copy link
Contributor Author

This PR is now ready for review or merge

@sylee957 sylee957 merged commit 9009a4b into sympy:master Jun 28, 2020
@oscarbenjamin
Copy link
Contributor Author

Thanks @sylee957 and thanks to everyone else for reviewing and helping

@oscarbenjamin oscarbenjamin deleted the pr_domain_matrix branch June 28, 2020 09:03
@moorepants
Copy link
Member

Hi, this PR causes output in matrix operations to result in over complicated results and has an effect on the physics mechanics codes.

See this issue for an explanation: #23130

This commit causes the regression: f0b7996

The results are mathematically equivalent but the expressions are longer and introduce significantly longer equations in complex multi-body systems (i.e. after passed through KanesMethod/LagrangesMethod). This is an undesired effect. It is a case were the output, even though mathematically the same, shouldn't have changed.

@moorepants
Copy link
Member

Is there an option to not use this DomainMatrix underlying implementation? If so, I can disable it for the physics vector/mechanics codes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants