
matrix multiplication expression blowup control #18049

Merged (36 commits) on Dec 24, 2019

Conversation

@tom-pytel (Contributor) commented Dec 15, 2019

References to other Issues or PRs

Fixes #17246

Brief description of what is fixed or changed

Matrices tend to blow up during certain operations; this PR introduces an optional intermediate simplification step during multiplication which can keep these blowups in check and permit certain matrix operations to complete successfully on larger matrices where they currently do not.

Other comments

This PR is a restart of a larger previous PR, #17247, since that thread got too long for the reviewer to digest quickly.

Release Notes

  • matrices
    • Controls expression blowup by optimized intermediate simplification during multiplication.

@sympy-bot commented Dec 15, 2019

Hi, I am the SymPy bot (v149). I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

Your release notes are in good order.

Here is what the release notes will look like:

  • matrices
    • Controls expression blowup by optimized intermediate simplification during multiplication. (#18049 by @Pristine-Cat)

This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.6.

Note: This comment will be updated with the latest check if you edit the pull request. You need to reload the page to see it.


Update

The release notes on the wiki have been updated.

@tom-pytel (Contributor Author)

Normally a matrix * multiplication or ** power does not simplify intermediate terms as they are calculated, which gives the best general performance: for the majority of small matrices it is much faster. Unfortunately this can lead to an explosion in the size and complexity of the symbolic expressions in the matrix elements (including complex number expressions). For larger matrices this can be a problem: the multiplication may never finish, or the result may be impossible to simplify afterwards.

For this reason two separate functions - mul and pow for multiplication and exponentiation - have been added with a flag which specifies that an optimized intermediate simplification step is to be performed at each step of the calculation. The matrix exp function has also been modified to take this flag. By default these functions have the flag mulsimp set to None, which means that intermediate simplification is not done; set it to True to enable the simplification. This will leave the matrix in a relatively simplified state after the operation and permit calculations on some large matrices which were previously impossible or extremely slow.

Trivial examples (since this PR is really meant for large complicated matrices):

>>> x = Symbol('x')
>>> M = Matrix([[1+x, 1-x], [1-x, 1+x]])

>>> M**4
Matrix([
[4*(1 - x)**2*(x + 1)**2 + ((1 - x)**2 + (x + 1)**2)**2,            4*(1 - x)*(x + 1)*((1 - x)**2 + (x + 1)**2)],
[           4*(1 - x)*(x + 1)*((1 - x)**2 + (x + 1)**2), 4*(1 - x)**2*(x + 1)**2 + ((1 - x)**2 + (x + 1)**2)**2]])

>>> M.pow(4, mulsimp=True)
Matrix([
[8*x**4 + 8, 8 - 8*x**4],
[8 - 8*x**4, 8*x**4 + 8]])

>>> M.exp()
Matrix([
[-(1 - x)*exp(2)/(2*(x - 1)) + exp(2*x)/2, -(1 - x)*((1 - x)/(2*(x - 1)) + 1)*exp(2)/(x - 1) + (1 - x)*exp(2*x)/(2*(x - 1))],
[                  -exp(2*x)/2 + exp(2)/2,                 -(1 - x)*exp(2*x)/(2*(x - 1)) + ((1 - x)/(2*(x - 1)) + 1)*exp(2)]])

>>> M.exp(mulsimp=True)
Matrix([
[ (exp(2*x) + exp(2))/2, (-exp(2*x) + exp(2))/2],
[(-exp(2*x) + exp(2))/2,  (exp(2*x) + exp(2))/2]])

>>> P, J = M.jordan_form()
>>> P*J*P.inv()
Matrix([
[x - (1 - x)/(x - 1), x*(1 - x)/(x - 1) - 2*(1 - x)*((1 - x)/(2*(x - 1)) + 1)/(x - 1)],
[              1 - x,                        -x*(1 - x)/(x - 1) + (1 - x)/(x - 1) + 2]])

>>> P.mul(J, mulsimp=True).mul(P.inv(), mulsimp=True)
Matrix([
[x + 1, 1 - x],
[1 - x, x + 1]])

For an example of a larger matrix which does not currently terminate see matrices/test_matrices.py function test_issue_17247_expression_blowup_4().

@tom-pytel (Contributor Author) commented Dec 15, 2019

Some benchmarks using sympy_benchmarks. Four different matrices are exponentiated in three ways: "master", which is just the current master branch M**p; "PR", which is this PR's M.pow(p, mulsimp=True); and "master + simplify", which is the first method followed by a simplify(), since the matrix is very exploded after the exponentiation. Note that the PR is in general slower than the current unsimplified matrix multiplication, but it returns the matrix in simplified form and does not need a post-simplification step, so the apples-to-apples comparison is with the "master + simplify" test. Also note that the PR is able to complete all tests, whereas the master branch is not. The benchmark numbers come first and the actual matrices used follow (times are in ms):

          master         PR   master +
                              simplify

A**4        2.09      15.40     301.00
A**7        4.29      19.20     363.00
A**13       5.30      34.70     466.00
A**16       4.20      46.20     517.00
B**4        3.26      15.20     565.00
B**7        8.39      20.80    2330.00
B**13      10.60      28.70    4050.00
B**16       8.73      31.60    1680.00
C**4      247.00    1290.00   20900.00
C**7      492.00    8580.00     failed
C**13     738.00   13500.00     failed
C**16     735.00   12400.00     failed
D**4      340.00     244.00     failed
D**7     2210.00    4860.00     failed
D**13     failed    5600.00     failed
D**16     failed    1620.00     failed

Matrices used:

>>> A
⎡x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1⎤
⎢                                                      ⎥
⎢x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1⎥
⎢                                                      ⎥
⎢x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1⎥
⎢                                                      ⎥
⎢x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1⎥
⎢                                                      ⎥
⎢x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1⎥
⎢                                                      ⎥
⎢x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1⎥
⎢                                                      ⎥
⎢x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1⎥
⎢                                                      ⎥
⎣x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1  x + 1⎦
>>> B
⎡x + 1  1 - x  x + 1  1 - x  x + 1  1 - x  x + 1  1 - x⎤
⎢                                                      ⎥
⎢1 - x  x + 1  1 - x  x + 1  1 - x  x + 1  1 - x  x + 1⎥
⎢                                                      ⎥
⎢x + 1  1 - x  x + 1  1 - x  x + 1  1 - x  x + 1  1 - x⎥
⎢                                                      ⎥
⎢1 - x  x + 1  1 - x  x + 1  1 - x  x + 1  1 - x  x + 1⎥
⎢                                                      ⎥
⎢x + 1  1 - x  x + 1  1 - x  x + 1  1 - x  x + 1  1 - x⎥
⎢                                                      ⎥
⎢1 - x  x + 1  1 - x  x + 1  1 - x  x + 1  1 - x  x + 1⎥
⎢                                                      ⎥
⎢x + 1  1 - x  x + 1  1 - x  x + 1  1 - x  x + 1  1 - x⎥
⎢                                                      ⎥
⎣1 - x  x + 1  1 - x  x + 1  1 - x  x + 1  1 - x  x + 1⎦
>>> C
⎡  x     x + 1   x + 2   x + 3   x + 4   x + 5   x + 6   x + 7 ⎤
⎢                                                              ⎥
⎢x + 8   x + 9   x + 10  x + 11  x + 12  x + 13  x + 14  x + 15⎥
⎢                                                              ⎥
⎢x + 16  x + 17  x + 18  x + 19  x + 20  x + 21  x + 22  x + 23⎥
⎢                                                              ⎥
⎢x + 24  x + 25  x + 26  x + 27  x + 28  x + 29  x + 30  x + 31⎥
⎢                                                              ⎥
⎢x + 32  x + 33  x + 34  x + 35  x + 36  x + 37  x + 38  x + 39⎥
⎢                                                              ⎥
⎢x + 40  x + 41  x + 42  x + 43  x + 44  x + 45  x + 46  x + 47⎥
⎢                                                              ⎥
⎢x + 48  x + 49  x + 50  x + 51  x + 52  x + 53  x + 54  x + 55⎥
⎢                                                              ⎥
⎣x + 56  x + 57  x + 58  x + 59  x + 60  x + 61  x + 62  x + 63⎦
>>> D
⎡                45   37⋅ⅈ        1   ⅈ         129   9⋅ⅈ      1   5⋅ⅈ        65   87⋅ⅈ        9    ⅈ       183   97⋅ⅈ  ⎤
⎢    -3/4        ── - ────        ─ + ─       - ─── - ───      ─ - ───       ─── + ────      - ── - ──      ─── - ────  ⎥
⎢                32    16         4   2          64    64      4    16       128    64         32   16      256   128   ⎥
⎢                                                                                                                       ⎥
⎢  149   49⋅ⅈ    177   1369⋅ⅈ   125   87⋅ⅈ     2063   541⋅ⅈ    85   33⋅ⅈ    805   2415⋅ⅈ     219   115⋅ⅈ  6301   6609⋅ⅈ ⎥
⎢- ─── + ────  - ─── - ──────   ─── + ────   - ──── + ─────   ─── - ────    ─── + ──────   - ─── + ─────  ──── - ────── ⎥
⎢   64    32     128    128      64    64      256     128    256    16     128    512       128    256   4096    1024  ⎥
⎢                                                                                                                       ⎥
⎢                 9   55⋅ⅈ                     45   37⋅ⅈ        1   ⅈ         129   9⋅ⅈ       1   5⋅ⅈ        65   87⋅ⅈ  ⎥
⎢  1/2 - ⅈ        ─ + ────         -3/4        ── - ────        ─ + ─       - ─── - ───       ─ - ───       ─── + ────  ⎥
⎢                 4    16                      32    16         4   2          64    64       4    16       128    64   ⎥
⎢                                                                                                                       ⎥
⎢   5   39⋅ⅈ    2473   137⋅ⅈ     149   49⋅ⅈ    177   1369⋅ⅈ   125   87⋅ⅈ     2063   541⋅ⅈ    85   33⋅ⅈ     805   2415⋅ⅈ ⎥
⎢ - ─ - ────    ──── + ─────   - ─── + ────  - ─── - ──────   ─── + ────   - ──── + ─────   ─── - ────     ─── + ────── ⎥
⎢   8    16     256      64       64    32     128    128      64    64      256     128    256    16      128    512   ⎥
⎢                                                                                                                       ⎥
⎢                  19   5⋅ⅈ                     9   55⋅ⅈ                     45   37⋅ⅈ         1   ⅈ         129   9⋅ⅈ  ⎥
⎢   1 + ⅈ        - ── + ───      1/2 - ⅈ        ─ + ────         -3/4        ── - ────         ─ + ─       - ─── - ───  ⎥
⎢                  4     4                      4    16                      32    16          4   2          64    64  ⎥
⎢                                                                                                                       ⎥
⎢                537   143⋅ⅈ      5   39⋅ⅈ    2473   137⋅ⅈ     149   49⋅ⅈ    177   1369⋅ⅈ   125   87⋅ⅈ      2063   541⋅ⅈ⎥
⎢  21/8 + ⅈ    - ─── + ─────    - ─ - ────    ──── + ─────   - ─── + ────  - ─── - ──────   ─── + ────    - ──── + ─────⎥
⎢                 64     16       8    16     256      64       64    32     128    128      64    64       256     128 ⎥
⎢                                                                                                                       ⎥
⎢                17   13⋅ⅈ                       19   5⋅ⅈ                     9   55⋅ⅈ                      45   37⋅ⅈ   ⎥
⎢     -2         ── - ────        1 + ⅈ        - ── + ───      1/2 - ⅈ        ─ + ────         -3/4         ── - ────   ⎥
⎢                4     2                         4     4                      4    16                       32    16    ⎥
⎢                                                                                                                       ⎥
⎢  1   13⋅ⅈ      825   147⋅ⅈ                   537   143⋅ⅈ      5   39⋅ⅈ    2473   137⋅ⅈ     149   49⋅ⅈ     177   1369⋅ⅈ⎥
⎢  ─ + ────    - ─── - ─────     21/8 + ⅈ    - ─── + ─────    - ─ - ────    ──── + ─────   - ─── + ────   - ─── - ──────⎥
⎣  4    4         64     32                     64     16       8    16     256      64       64    32      128    128  ⎦
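For anyone wanting rough numbers outside the asv harness, here is a minimal timing sketch along the lines of the comparison above (matrix A only; mulsimp is the keyword name used at this point in the thread, and applyfunc(simplify) stands in for the post-simplification step):

from timeit import timeit
from sympy import Matrix, Symbol, simplify

x = Symbol('x')
A = Matrix(8, 8, [x + 1]*64)   # matrix A from the table above

for p in (4, 7, 13, 16):
    t_master = timeit(lambda: A**p, number=1)                        # current master behaviour
    t_pr     = timeit(lambda: A.pow(p, mulsimp=True), number=1)      # this PR's simplifying power
    t_post   = timeit(lambda: (A**p).applyfunc(simplify), number=1)  # master + simplify
    print(p, t_master, t_pr, t_post)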

@codecov bot commented Dec 15, 2019

Codecov Report

Merging #18049 into master will increase coverage by 0.091%.
The diff coverage is 92.016%.

@@              Coverage Diff              @@
##            master    #18049       +/-   ##
=============================================
+ Coverage   74.855%   74.946%   +0.091%     
=============================================
  Files          640       642        +2     
  Lines       166551    166930      +379     
  Branches     39176     39275       +99     
=============================================
+ Hits        124672    125109      +437     
+ Misses       36361     36299       -62     
- Partials      5518      5522        +4


return self.mul(other)

def mul(self, other, mulsimp=None):

@tom-pytel (Contributor Author) Dec 17, 2019

The MatrixBase.mul() function essentially took the body of MatrixBase.__mul__() but allowed the extra mulsimp parameter, which is why I left it in common.py. I don't like the idea of transposing code from common into another module (like moving this code into Matrix.multiply), and I also didn't want to add a mulsimp keyword to the __mul__() function since that would be non-standard, so I split the function.

As for the regressions, there are two extra calls inserted into the multiplication chain, __mul__() -> mul() and then a call into dotprodsimp(), where originally the operation was carried out directly in the individual _eval_matrix_mul() functions. I did this following the (correct, in my view) suggestion of @oscarbenjamin that I centralize the simplifying inner loop across all matrix type multiplies.

Another source could be the use of iterables instead of sequences in dotprodsimp() to make it more general. I will look at speeding this up in the centralized function, but another orthogonal option is to bring back the individual _eval_matrix_mul() function bodies for the non-simplified case and call out to dotprodsimp() only if simplification is requested. Will experiment...

Contributor

Maybe it would be better to have a separate function for the mulsimp multiply; mul can then just dispatch to one or the other function, leaving the existing _eval_matrix_mul routines unmodified.

Contributor Author

The slowdown is due entirely to calling out to the dotprodsimp() function - even if I modify the function to do the simplest, fastest dot product it is still slow, I guess due to the overhead of calling the function and passing iterators. So yes, what you suggest is the route I am taking now: only calling out to dotprodsimp() if needed. Otherwise the _eval_matrix_mul functions will essentially do what they did before.
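Roughly, the dispatch being described, as a standalone sketch (mat_mul is a hypothetical name and expand_mul stands in for the simplification step; this is not the PR code):

from sympy import Matrix, expand_mul, symbols

def mat_mul(A, B, mulsimp=False):
    # Plain summation is the default inner loop; the simplification step is
    # only applied per dot product when explicitly requested, so the fast
    # path keeps the original behaviour and avoids the extra call overhead.
    rows, cols, K = A.rows, B.cols, A.cols
    entries = []
    for i in range(rows):
        for j in range(cols):
            dot = sum(A[i, k]*B[k, j] for k in range(K))
            entries.append(expand_mul(dot) if mulsimp else dot)
    return Matrix(rows, cols, entries)

x = symbols('x')
M = Matrix([[1 + x, 1 - x], [1 - x, 1 + x]])
mat_mul(M, M)                # entries built by plain summation, as on master
mat_mul(M, M, mulsimp=True)  # each dot product passed through expand_mul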

@sylee957 (Member) Dec 19, 2019

I actually meant moving MatrixBase.multiply to common.py; it should already have been moved there. If we add another mul, we may end up with a duplicate function.

Contributor Author

Yeah, weird, multiply_elementwise is in MatrixArithmetic and multiply is in MatrixBase, so they probably should both be in common. As for naming, multiply as a function only appears in matrices.py and mathml.py and does not seem to actually be called from anywhere in SymPy, whereas mul seems to be the common shorthand across many modules.

So how about I move multiply into common.py alongside multiply_elementwise and leave mul since it is so common? We do wind up with a duplicate function, but it doesn't actually harm anything.

Member

No, the duplicates will be regretted and may have to be deprecated in the future.
Even if you don't like the naming, it's better to respect the old name that came before.
Or you should get agreement on deprecating Matrix.multiply and introducing some shorthand.

Contributor Author

Ok, will go with MatrixArithmetic.multiply and nuke mul and MatrixBase.multiply.

@sylee957 (Member)

There are some benchmark regressions

+     1.57±0.01ms      2.58±0.02ms     1.64  matrices.TimeMatrixPower.time_Case1 [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py2.7-fastcache-mpmath-numpy]
+     2.19±0.02ms         3.68±0ms     1.68  matrices.TimeMatrixPower.time_Case1 [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py3.6-fastcache-mpmath-numpy]
          183±2ms          259±2ms     1.41  matrices.TimeMatrixPower.time_Case2 [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py2.7-fastcache-mpmath-numpy]
        291±0.6ms          386±5ms     1.33  matrices.TimeMatrixPower.time_Case2 [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py3.6-fastcache-mpmath-numpy]
          314±4ms          406±3ms     1.29  matrices.TimeMatrixPower.time_Case3 [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py2.7-fastcache-mpmath-numpy]
          429±2ms          528±3ms     1.23  matrices.TimeMatrixPower.time_Case3 [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py3.6-fastcache-mpmath-numpy]
...
+     1.24±0.02ms         1.97±0ms     1.59  solve.TimeMatrixArithmetic.time_dense_multiply(10, 0) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py2.7-fastcache-mpmath-numpy]
+     1.54±0.01ms      2.63±0.01ms     1.71  solve.TimeMatrixArithmetic.time_dense_multiply(10, 0) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py3.6-fastcache-mpmath-numpy]
+     1.47±0.03ms      2.59±0.01ms     1.76  solve.TimeMatrixArithmetic.time_dense_multiply(10, 5) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py2.7-fastcache-mpmath-numpy]
+     1.81±0.01ms      3.35±0.02ms     1.85  solve.TimeMatrixArithmetic.time_dense_multiply(10, 5) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py3.6-fastcache-mpmath-numpy]
         48.4±1μs       65.7±0.8μs     1.36  solve.TimeMatrixArithmetic.time_dense_multiply(3, 0) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py2.7-fastcache-mpmath-numpy]
       60.2±0.2μs       86.0±0.5μs     1.43  solve.TimeMatrixArithmetic.time_dense_multiply(3, 0) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py3.6-fastcache-mpmath-numpy]
       72.8±0.7μs        103±0.6μs     1.42  solve.TimeMatrixArithmetic.time_dense_multiply(3, 5) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py2.7-fastcache-mpmath-numpy]
       90.3±0.2μs        133±0.3μs     1.47  solve.TimeMatrixArithmetic.time_dense_multiply(3, 5) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py3.6-fastcache-mpmath-numpy]
        101±0.9μs        148±0.6μs     1.46  solve.TimeMatrixArithmetic.time_dense_multiply(4, 0) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py2.7-fastcache-mpmath-numpy]
+       126±0.5μs        192±0.6μs     1.53  solve.TimeMatrixArithmetic.time_dense_multiply(4, 0) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py3.6-fastcache-mpmath-numpy]
          168±1μs          245±1μs     1.46  solve.TimeMatrixArithmetic.time_dense_multiply(4, 5) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py2.7-fastcache-mpmath-numpy]
+       204±0.5μs          307±2μs     1.50  solve.TimeMatrixArithmetic.time_dense_multiply(4, 5) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py3.6-fastcache-mpmath-numpy]
+         292±2μs          461±4μs     1.58  solve.TimeMatrixArithmetic.time_dense_multiply(6, 0) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py2.7-fastcache-mpmath-numpy]
+         371±3μs          595±2μs     1.61  solve.TimeMatrixArithmetic.time_dense_multiply(6, 0) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py3.6-fastcache-mpmath-numpy]
+         449±4μs          715±3μs     1.59  solve.TimeMatrixArithmetic.time_dense_multiply(6, 5) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py2.7-fastcache-mpmath-numpy]
+         548±2μs          908±3μs     1.66  solve.TimeMatrixArithmetic.time_dense_multiply(6, 5) [travis-job-518f74a6-a700-43d1-9a69-09f2f29142de/virtualenv-py3.6-fastcache-mpmath-numpy]

And if the default flag mulsimp=None disables the simplification, the regression can be from some unintended changes.

@tom-pytel (Contributor Author) commented Dec 17, 2019

Good catch on the slowdown. The mistake I made was assuming that if I did not apply the simplification step the performance would be equal to the original master, and I tested accordingly. It's surprising that simply calling out to a separate dot-product function caused it, so I restored the summation to the inner loops of _eval_matrix_mul() for common, dense and sparse for when there is no simplification, and performance seems to be back to master level. I did not create separate _eval_matrix_mulsimp() functions since they would share a lot of code, and putting a simple check for the flag in the loop of the current functions seems to have the same performance.

Also I moved the core dotprodsimp() function from matrices/common.py to simplify/simplify.py since it is a general simplified dot product function which may be useful in other places.

Revised benchmarks: the standard "master" performance improves a good bit, but "master + simplify" stays huge since the hit there is 99.99% from the simplify() function.

          master         PR   master +
                              simplify

A**4        1.23      15.20     305.00
A**7        2.51      18.40     364.00
A**13       3.13      33.40     484.00
A**16       2.53      47.20     538.00
B**4        2.29      15.30     575.00
B**7        6.43      19.70    2390.00
B**13       8.25      28.70    4180.00
B**16       6.66      31.30    1740.00
C**4      185.00    1320.00   20900.00
C**7      372.00    8380.00     failed
C**13     553.00   13700.00     failed
C**16     568.00   12500.00     failed
D**4      277.00     251.00     failed
D**7     2060.00    4940.00     failed
D**13     failed    5560.00     failed
D**16     failed    1650.00     failed

@sylee957 (Member)

I think that the current approach will complicate the code a lot, like
A.mul(B.mul(C, mulsimp), mulsimp)

We should introduce a context manager similar to with evaluate(False) so that this can be more straightforward:

with matmulsimp(True):
    A * B * C

@tom-pytel (Contributor Author) commented Dec 18, 2019

I think that the current approach will complicate the code a lot, like
A.mul(B.mul(C, mulsimp), mulsimp)

We should introduce a context manager similar to with evaluate(False) so that this can be more straightforward:

with matmulsimp(True):
    A * B * C

That would be easy enough to do, but it touches on something I had a pretty long back and forth about in #17247 with @oscarbenjamin: the idea of setting a global operating mode for matrix multiplication - simplified or not. TL;DR he doesn't like the idea of globals or even thread-local storage for setting the operating mode, which is what a context manager would effectively do even if only in its own isolated scope. The other caveat is that on each context switch the cache would need to be flushed, since a function with the same signature could return a different value. Is there a way to flush only certain functions from the cache, not the whole thing?

I could do this, even leaving the optional mulsimp to allow a value of False to override the context manager where needed within a True context, but I don't know if you guys are all on board with this - can you check on this before I do the work?

On the other hand, if all you want to do is avoid the ugliness of A.mul(B.mul(C, mulsimp), mulsimp), the other simple solution is A.mulsimp(B.mulsimp(C)), adding a powsimp() and expsimp() as well. I do like the idea of a context manager though, even with its flaws, so if it is guaranteed that you guys want this I will happily add it.
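For reference, a minimal sketch of what such a context manager could look like using thread-local storage (the name matmulsimp and this mechanism are hypothetical, and the cache-flushing caveat above is not addressed here):

import threading
from contextlib import contextmanager

_local = threading.local()

@contextmanager
def matmulsimp(flag):
    # Set a thread-local mode for the duration of the with-block and restore
    # the previous value afterwards, even if an exception is raised.
    previous = getattr(_local, 'matmulsimp', None)
    _local.matmulsimp = flag
    try:
        yield
    finally:
        _local.matmulsimp = previous

# __mul__ would then consult getattr(_local, 'matmulsimp', None) to decide
# whether to take the simplifying multiplication path.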

@oscarbenjamin (Contributor)

There are many bug reports for evaluate(False) because it is silently ignored in so many places. It also has the potential to break functions that are not designed with it in mind:

with evaluate(False):
    # The author of some_function had no idea that you would do this
    x = some_function(a * b)

Maybe in the end some kind of context manager might be a useful way to go but for now I think it is better to get the different options implemented so that they can be used explicitly in places where blowup is known to happen. For the particular case of multiplication that mostly happens when multiplying in a loop (i.e. matrix powers). Some other cases like the Jordan-power method are improved but those only use a couple of multiplies so post-simplification is typically viable. There are other places like rref, det etc that also suffer from blowup but don't use multiplication (those would most likely be the problem for Jordan-pow and matrix exponential).

If we can get implementations of those things working then we can discuss what the default behaviour of things like A*B or A.jordan_form() should be. The ideal for end users is that:

  1. The default behaviour does sensible things (e.g. usually no avoidable blowup)
  2. There are ways to explicitly control what happens (like explicit, documented methods and options)

If you are worried about complicating code do you mean code internal to sympy or code written by end users? I think that introducing global state will ultimately complicate the internals of sympy much more than having explicit method calls.

@tom-pytel (Contributor Author)

other places like rref, det etc that also suffer from blowup but don't use multiplication (those would most likely be the problem for Jordan-pow and matrix exponential).

Can you give me some concrete examples of expressions which suffer from this?

@sylee957 (Member)

The elimination algorithms in rref, LUdecomposition and QRdecomposition are equivalent to multiplying by some elimination matrix, but are written without the explicit matrix multiplication.

@tom-pytel (Contributor Author)

The elimination algorithms in rref, LUdecomposition and QRdecomposition are equivalent to multiplying by some elimination matrix, but are written without the explicit matrix multiplication.

Depending on how those multiplications are done it might be feasible to do a similar simplification, though that would be for another PR; this one deals with multiplication. So I would say: if it is good enough, and nothing else like the context manager is to be added, then it is good enough?

@oscarbenjamin (Contributor)

Can you give me some concrete examples of expressions which suffer from this?

The example in #16823 is for rref. It's not hard to make others: you just need a Matrix of Adds (especially where cross multiplying the Adds has the potential to simplify):

Matrix(6, 6, lambda i, j: 1 + (-1)**(i+j)*I).charpoly()

The time spent in charpoly there is in post-simplification. If you call it as charpoly(simplify=lambda e: e) then it finishes faster and you can see the blown-up expression that is being simplified.

@tom-pytel (Contributor Author)

Ok, just a preliminary check...

Matrix(6, 6, lambda i, j: 1 + (-1)**(i+j)*I).charpoly(simplify=lambda e: e)

Made two small changes to class MatrixDeterminant. First, _eval_berkowitz_toeplitz_matrix():

        for i in range(self.rows - 2):
            diags.append(A * diags[i])
        diags = [(-R*d)[0, 0] for d in diags]

to:

        for i in range(self.rows - 2):
            diags.append(A.mul (diags[i], mulsimp = True))
        diags = [(-R).mul (d, mulsimp = True)[0, 0] for d in diags]

And _eval_berkowitz_vector():

        return toeplitz * submat._eval_berkowitz_vector()

to:

        return toeplitz.mul (submat._eval_berkowitz_vector(), mulsimp = True)

These make charpoly() return nearly instantly with a clean poly.

rref() for this matrix can also be simplified trivially by changing _row_reduce():

        def cross_cancel(a, i, b, j):
            """Does the row op row[i] = a*row[i] - b*row[j]"""
            q = (j - i)*cols
            for p in range(i*cols, (i + 1)*cols):
                mat[p] = a*mat[p] - b*mat[p + q]

to:

        def cross_cancel(a, i, b, j):
            """Does the row op row[i] = a*row[i] - b*row[j]"""
            q = (j - i)*cols
            for p in range(i*cols, (i + 1)*cols):
                mat[p] = (a*mat[p] - b*mat[p + q]).expand (power_base=False,
                        power_exp=False, log=False, multinomial=False, basic=False)

Which returns a cleaner rref since expanding the sum of two scaled matrices forces the like terms to cancel.

But again, this is just a quick prelim of one case; it seems doable.

@oscarbenjamin (Contributor)

I think that the fundamental source of expression blowup in the examples we have considered is the fact that sympy doesn't expand (a+b)*(c+d) which is why the core of dotprodsimp is the same call to expand that you've put there in rref.

There are probably many places that can be used to control blowup but that doesn't all need doing in one PR. The only thing to consider here is what are the most useful functions to define for use elsewhere. A matrix-multiply is clearly useful. The dotprodsimp function seems less likely to be useful in all situations (e.g. you haven't used it in rref there because it no longer makes sense in that context even though essentially the same simplification is needed).

Btw I've just noticed that there is a function expand_mul which does the same as your expand call. It's probably better to use that.
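(A quick check of that equivalence, with placeholder symbols:)

>>> from sympy import symbols, expand, expand_mul
>>> a, b, c, d = symbols('a b c d')
>>> expand_mul((a + b)*(c + d))
a*c + a*d + b*c + b*d
>>> expand((a + b)*(c + d), power_base=False, power_exp=False,
...        log=False, multinomial=False, basic=False)
a*c + a*d + b*c + b*d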

@tom-pytel (Contributor Author) commented Dec 18, 2019

I think that the fundamental source of expression blowup in the examples we have considered is the fact that sympy doesn't expand (a+b)*(c+d) which is why the core of dotprodsimp is the same call to expand that you've put there in rref.

Yes.

There are probably many places that can be used to control blowup but that doesn't all need doing in one PR.

Also yes, separate PR please.

The dotprodsimp function seems less likely to be useful in all situations (e.g. you haven't used it in rref there because it no longer makes sense in that context even though essentially the same
simplification is needed).

Umm, I wouldn't expect dotprodsimp to be useful in an operation which is not actually a dot product, such as rref, which is a simple sum of scaled matrices; that is not exactly the case you delineate at the top, it is more like a*(x+y) + b*(s+t). EDIT: which kind of is a dot product I guess, but it doesn't need the full treatment - or maybe it will once I look at other cases.

Btw I've just noticed that there is a function expand_mul which does the same as your expand call. It's probably better to use that.

Sure. My question is how do you want to proceed...

@oscarbenjamin (Contributor)

I guess there are two parts to dotprodsimp used in matrix multiply:

  1. Call expand_mul in the innermost loop of the matrix multiply algorithm.
  2. Post-simplification at the end.

Is it possible (or desirable) to separate those?

I think that the inner expand_mul is essential and can't be replaced by simplifying later so that requires a special mul method/option as added here.

The post-simplification could be done as a separate step at the whole matrix level and is presumably beneficial in the case of computing a matrix power by repeated multiply. In the context of the matrix power operation as a whole it isn't really a post-simplification because it happens in between each inner multiply.

What I wonder is if the post-simplification part of dotprodsimp is also useful in the other contexts like the matrix exponential or might they benefit more from something else? If so then it might be better for that not to be part of the mul method itself since those functions could choose what simplification to apply in each context.

My question is how do you want to proceed...

I'm not sure about the API or general applicability of dotprodsimp yet but otherwise the whole PR seems an improvement. Can we make dotprodsimp internal (add a leading underscore) and then merge this?

For the future the things that I think can (possibly) be improved are:

  1. Maybe dotprodsimp can be split into two functions: expand_mul and the other part for post-simplification (I guess the second part is highly tuned to matrix power so it might be better placed in matrices)
  2. There are many other functions eig, det etc and also different options of those functions that possibly can benefit from expand_mul or dotprodsimp (or similar).
  3. If this kind of simplification ends up being something that happens in many different matrix operations then maybe mulsimp isn't the best name for users to understand... (perhaps something to do with "expand" is better if that's the key operation)
  4. I wonder if there is a way to detect when this might be beneficial so that we can do it automatically for users.

I think that the answers to some of these questions might become clearer when looking at more cases of blowup in other operations.

I'd like to know whether @sylee957 thinks this is mergeable as well though.

@tom-pytel (Contributor Author)

I moved dotprodsimp back into matrices/common.py if you don't think it is useful generally. I also added the call to expand_mul, though I don't necessarily like it since it's just a call to a call and expand_mul does an unnecessary sympify (all elements in a matrix are already sympified), but it doesn't seem to affect the benchmarks so it's a moot point. Also cleaned up the _eval_matrix_mul inner loops a little.

I guess there are two parts to dotprodsimp used in matrix multiply:

  1. Call expand_mul in the innermost loop of the matrix multiply algorithm.
  2. Post-simplification at the end.

Is it possible (or desirable) to separate those?

Maybe, though I've left it as a single function for now until the use-case for separating appears when I look at other functions.

For the future the things that I think can (possibly) be improved are:

  1. Maybe dotprodsimp can be split into two functions: expand_mul and the other part for post-simplification (I guess the second part is highly tuned to matrix power so it might be better placed in matrices)

It is not really tuned exclusively to matrices but to the kind of blowup that occurs in any generic sum of products.

  2. There are many other functions eig, det etc and also different options of those functions that possibly can benefit from expand_mul or dotprodsimp (or similar).

The quick prelim I did seems to point to rref and det being controllable; give me a list of other problematic functions and test expressions or issue #s and I will check.

  3. If this kind of simplification ends up being something that happens in many different matrix operations then maybe mulsimp isn't the best name for users to understand... (perhaps something to do with "expand" is better if that's the key operation)

For now mulsimp describes precisely what it does; for other functions, if there is a different type of intermediate simplification, obviously it would take a different name. Which brings me to how to specify any future blowup control for other functions: I take it you would prefer the interface to be explicit flags passed to the functions if there is a trade-off involved, and if the simplification always works better then it would be the default?

  4. I wonder if there is a way to detect when this might be beneficial so that we can do it automatically for users.

Might be slower to try to detect than the operation itself. The difference in execution time for Matrix(8, 8, [x+1]*64).pow(4, mulsimp=True) vs. Matrix(8, 8, [x+i for i in range(64)]).pow(4, mulsimp=True) is 15.2ms vs. 1320ms even though the matrices look very similar.

@sylee957 (Member)

In my opinion, it would be better to have the keyword symmetrical with simplify as used in Matrix.log or other methods.
simplify can be a functional argument, but it is restricted to a single-valued function that accepts a single argument; if your changes can be achieved that way, then it's better to keep the convention that other functions are using.

So I would expect
ret = simpfunc(P * simpfunc(eJ * P.inv())) rather than
ret = P.mul(eJ, mulsimp=mulsimp).mul(P.inv(), mulsimp=mulsimp)
and building a simpfunc that can accept a single matrix argument.

dotprodsimp could equivalently be made into something that applies expand_mul and count_ops elementwise, to decide which elements of the matrix to simplify, and it could take the whole unsimplified matrix product as input.
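A rough sketch of that elementwise idea (matsimp is a hypothetical name; this illustrates the suggestion and is not code from the PR):

from sympy import Matrix, expand_mul, count_ops

def matsimp(M):
    # Expand each entry of an (unsimplified) matrix product and keep the
    # expanded form only when count_ops says it is no more complex than
    # the original entry.
    def simp(e):
        expanded = expand_mul(e)
        return expanded if count_ops(expanded) <= count_ops(e) else e
    return M.applyfunc(simp)

# which would allow the pattern suggested above, e.g.
# ret = matsimp(P * matsimp(eJ * P.inv()))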

@tom-pytel (Contributor Author)

Also, I just remembered I read an article a couple of weeks ago that says a new method has been discovered to derive eigenvectors from just eigenvalues; here is the article:

https://www.quantamagazine.org/neutrinos-lead-to-unexpected-discovery-in-basic-math-20191113/

Ignore the neutrino stuff, it's the eigen stuff which is potentially revolutionary. Has either of you heard of this, or do you have access to smart-people publications which may mention it?

@sylee957 (Member)

Okay, I wouldn't say anything more about the keyword naming.
Probably, at this point, it may be better to throw out matmulsimp if you are not using it in the code and if you want to bind dotprodsimp for every process of division.

@tom-pytel (Contributor Author)

Okay, I wouldn't say anything more about the keyword naming.
Probably, at this point, it may be better to throw out matmulsimp if you are not using it in the code and if you want to bind dotprodsimp for every process of division.

Ok, dotprodsimp it is...

@sylee957 (Member) commented Dec 23, 2019

Also, I just remembered I read an article a couple of weeks ago that says a new method has been discovered to derive eigenvectors from just eigenvalues; here is the article:

https://www.quantamagazine.org/neutrinos-lead-to-unexpected-discovery-in-basic-math-20191113/

Ignore the neutrino stuff, it's the eigen stuff which is potentially revolutionary. Has either of you heard of this, or do you have access to smart-people publications which may mention it?

The actual paper is https://arxiv.org/pdf/1908.03795.pdf, where you can find the formula.

I have made a quick demonstration below:

from sympy import *
from IPython.display import display

M = Matrix([[1, 1, -1], [1, 3, 1], [-1, 1, 3]])
M_evs = sorted(list(M.eigenvals().keys()))

M_minors = [M.minor_submatrix(i, i) for i in range(M.rows)]
M_minors_evs = [sorted(list(m.eigenvals().keys())) for m in M_minors]
        
def entry_LHS(i, j):
    return Abs(Symbol("v_{%s, %s}" % (i, j)))**2

def entry_RHS(i, j):
    numer = S.One
    for k in range(M.rows-1):
        numer *= M_evs[i] - M_minors_evs[j][k]

    denom = S.One
    for k in range(M.rows):
        if k == i:
            continue
        denom *= M_evs[i] - M_evs[k]

    return simplify(numer/denom)

display_mat_LHS = Matrix(M.rows, M.cols, entry_LHS)
display_mat_RHS = Matrix(M.rows, M.cols, entry_RHS)
print('Result computed from minors:')
display(Eq(display_mat_LHS.T, display_mat_RHS.T))

l = list()
for _, _, vects in M.eigenvects():
    for vect in vects:
        vect = vect.normalized()
        vect = vect.applyfunc(lambda x: simplify(abs(x)**2))
        l.append(vect)

display_mat_RHS = Matrix.hstack(*l)
print('The actual result:')
display(Eq(display_mat_LHS.T, display_mat_RHS))

But there are some limitations:

  1. It applies only to Hermitian matrices.
  2. It only computes the squared absolute values of the eigenvector components.

But if this can be implemented, it could be added as a separately named function, eigenvectors_abs_squared or such.

@tom-pytel (Contributor Author)

Ok, well it is interesting then but of no real practical use here :(

@oscarbenjamin (Contributor)

Ok, well it is interesting then but of no real practical use here :(

The restriction to Hermitian matrices is not a big problem as symmetric/Hermitian is an important use case. However only finding the mod-square of the components of the eigenvector is more of a problem. If the eigenvectors were known to be real you would only need to figure out the sign of each component but although Hermitian matrices have real eigenvalues the eigenvectors can still be complex. However real, symmetric matrices always have real eigenvectors so in that case you can just figure out the sign of each component. It could be worth investigating as a method for finding eigenvectors of a real symmetric matrix.

The big problem with analytically finding eigenvectors is that in general the eigenvalues/eigenvectors may not have simple forms. You mentioned the blowup of the roots earlier but you only have to look at the general solutions for cubics/quartics to see where that blowup comes from. Of course in some situations the matrix can be large but still have simple eigenvalues and eigenvectors. Then there is the possibility to eliminate avoidable blowup.

@oscarbenjamin (Contributor)

From @sylee957's example:

In [15]: M = Matrix([[ 1, 1, -1],[ 1, 3,  1],[-1, 1,  3]])

In [16]: M
Out[16]:
⎡1   1  -1⎤
⎢         ⎥
⎢1   3  1 ⎥
⎢         ⎥
⎣-1  1  3 ⎦

In [17]: s1, s2, s3 = symbols('s1, s2, s3', real=True)

In [18]: vect = Matrix([s1*2, s2*1, s3*1])

In [19]: M*vect
Out[19]:
⎡2⋅s₁ + s₂ - s₃   ⎤
⎢                 ⎥
⎢2⋅s₁ + 3⋅s₂ + s₃ ⎥
⎢                 ⎥
⎣-2⋅s₁ + s₂ + 3⋅s₃⎦

In [20]: solve([M*vect, s1**2-1, s2**2-1, s3**2-1], [s1, s2, s3], dict=True)
Out[20]: [{s₁: -1, s₂: 1, s₃: -1}, {s₁: 1, s₂: -1, s₃: 1}]

In [21]: srep = solve([M*vect, s1**2-1, s2**2-1, s3**2-1], [s1, s2, s3], dict=True)[0]

In [22]: vect.subs(srep)
Out[22]:
⎡-2⎤
⎢  ⎥
⎢1 ⎥
⎢  ⎥
⎣-1⎦

(Note that this involves solving a linear system similar to the one that would be solved to find the eigenvectors in the normal way though...)

@tom-pytel (Contributor Author)

However real, symmetric matrices always have real eigenvectors so in that case you can just figure out the sign of each component.

Which could be done with a simple dot product, comparing the sign with the sign of the eigenvalue times the sqrt of the v(i,j), if I understand this correctly. That might allow finding eigenvects for much larger matrices than is currently possible. I might check into it if you think it is worth it for real symmetric matrices, but I must finish this PR first :)

@tom-pytel (Contributor Author)

Btw, I am obviously on track to do all the blowup control in one push, not splitting it up as we had said - hope that is ok...

@tom-pytel (Contributor Author) commented Dec 24, 2019

I take it back, you are right, it is a system. I wonder if there would be a way to hack it though, given the unique circumstances... EDIT: I mean the fact that each element of the solution can only take one of two values instead of an infinite range. EDIT: excluding zeros...

@sylee957 (Member)

Btw, I am obviously on track to do all the blowup control in one push, not splitting it up as we had said - hope that is ok...

Often, having a lot of stuff in one PR makes it difficult to deal with conflicts.
This doesn't necessarily have to be perfect, and if you have to work on simplifying individual methods, you don't have to implement everything in a single PR, so I would merge this if you confirm that you are ready.

@tom-pytel (Contributor Author)

Ok, well I just pushed what I have so far; I will leave it there and continue with the remaining methods in a different PR once this is merged. Also, do you want me to push the slow but illustrative matrices_dotprodsimp to sympy_benchmarks?

Two minor changes here unrelated to dotprodsimp - I removed a redundant cancel in _eval_det_bareiss and changed diagonalize to not call eigenvects a second time after calling is_diagonalizable.

@sylee957 (Member) commented Dec 24, 2019

Currently, the property is_diagonalizable is cached for immutable matrices, so there are some other things to consider; likely:

  1. The cache may not work as intended: M.is_diagonalizable(witheigen=False) can be considered a cache miss even if M.is_diagonalizable(witheigen=True) was called before, and vice versa.
  2. It may cache the whole boolean-and-eigenvectors pair, which can be inefficient memory-wise because holding all the eigenvectors can take as much space as the matrix itself.

And probably that was the reason why I didn't touch that part in my previous attempt in #15887

@tom-pytel (Contributor Author)

Currently, the property is_diagonalizable is cached for immutable matrices, so there are some other things to consider; likely:

  1. The cache may not work as intended: M.is_diagonalizable(witheigen=False) can be considered a cache miss even if M.is_diagonalizable(witheigen=True) was called before, and vice versa.
  2. It may cache the whole boolean-and-eigenvectors pair, which can be inefficient memory-wise because holding all the eigenvectors can take as much space as the matrix itself.

And probably that was the reason why I didn't touch that part in my previous attempt in #15887

What about making the base _is_diagonalizable, which can be called from diagonalize, always return a pair, and then having a wrapper is_diagonalizable return only the boolean of the pair from _is_diagonalizable?
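Something along these lines, as a standalone sketch (function names hypothetical, keyword handling omitted):

from sympy import Matrix

def is_diagonalizable_with_eigen(M):
    # Compute the eigenvectors once and return them alongside the verdict so
    # that diagonalize() can reuse them instead of calling eigenvects() again.
    eigenvecs = M.eigenvects()
    ok = sum(len(basis) for _, _, basis in eigenvecs) == M.rows
    return ok, eigenvecs

def is_diagonalizable(M):
    # Boolean-only wrapper: nothing large for the cache to hold on to.
    return is_diagonalizable_with_eigen(M)[0]

is_diagonalizable(Matrix([[1, 2], [2, 1]]))  # True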

@tom-pytel (Contributor Author)

Oops on the docs, this should be ready to merge if all tests pass.

feature='clear_subproducts',
deprecated_since_version=1.4,
issue=15887
).warn()
Member

What about moving the deprecation warnings to is_diagonalizable, and also deleting the arbitrary kwargs here?

Contributor Author

Done...

@sylee957 (Member)

Thank you for contributing.
I've migrated the eigenvector discussion to #18121.

@tom-pytel (Contributor Author)

There are still some functions left to add dotprodsimp to, so I think I will finish that first.
