Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid Maxima on creation of symbolic matrices #18979

Closed
rwst opened this issue Aug 1, 2015 · 32 comments
Closed

avoid Maxima on creation of symbolic matrices #18979

rwst opened this issue Aug 1, 2015 · 32 comments

Comments

@rwst
Copy link

rwst commented Aug 1, 2015

People have no idea what they ask for when they write if ex != 0: with ex possibly(!) a symbolic expression. E.g. this happens to trigger in matrix/* the full blown procedure of starting Maxima with its limited deductive armament trying to prove that ex is nonzero. This when all was wanted was that it not be a "numeric" zero.

This ticket replaces the two instances in matrix/ where such behaviour was triggered in doctests.

Compare on fresh Sage:
Before:

sage: %timeit M = matrix(SR, 2, 2, x^2 - 2*x + 1)
The slowest run took 468.13 times longer than the fastest. This could mean that an intermediate result is being cached 
1 loops, best of 3: 2.08 ms per loop

After:

sage: %timeit M = matrix(SR, 2, 2, x^2 - 2*x + 1)
The slowest run took 815.87 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 82.8 µs per loop

Component: linear algebra

Author: Nils Bruin

Branch/Commit: e42f285

Reviewer: Vincent Delecroix

Issue created by migration from https://trac.sagemath.org/ticket/18979

@rwst rwst added this to the sage-6.9 milestone Aug 1, 2015
@rwst
Copy link
Author

rwst commented Aug 1, 2015

@rwst
Copy link
Author

rwst commented Aug 1, 2015

New commits:

75e679c18979: avoid Maxima on creation of symbolic matrices

@rwst
Copy link
Author

rwst commented Aug 1, 2015

Commit: 75e679c

@rwst
Copy link
Author

rwst commented Aug 1, 2015

Author: Ralf Stephan

@rwst

This comment has been minimized.

@rwst
Copy link
Author

rwst commented Aug 2, 2015

comment:4

This is also a commit of #18980 because without #18979 the improved decision logic in Pynac would lead to doctest fails. So if #18980 is merged first this ticket will be duplicate.

@videlec
Copy link
Contributor

videlec commented Aug 15, 2015

comment:5

Hi,

I really do not like your solution. You are invading a generic class with some specialized code. Moreover, the issue you found might be present in a lot of other places. You should find a better way to do this test without having to care whether the input is symbolic. If each ring needs a specialized treatment it will become a complete nightmare.

Secondly, the operations import XYZ and isinstance(x, my_type) are not at all negligeable. Though, in this case it is not very noticeable since the cost is elsewhere. But with your strategy, you are potentially slowing down everthing for a corner case speed up.

In this very particular case of matrices, there is a simpler way to fix the issue. You can just avoid testing anything. In case x is zero a useless filling will be done but of no harm (it is very quick compared to the rest of the initialization). If you like better my solution, write at least a comment that we should not test wheter the input is zero.

Vincent

@rwst
Copy link
Author

rwst commented Aug 16, 2015

comment:6

Replying to @videlec:

You should find a better way to do this test without having to care whether the input is symbolic. If each ring needs a specialized treatment it will become a complete nightmare.

Agreed. The easiest way would be bool(x1!=x2) to mean not (x1-x2).is_trivial_zero() for symbolic x and to provide a different interface for cases requiring simplification/proof. This would have some rerpercussions in symbolic and will take long to review so...

In this very particular case of matrices, there is a simpler way to fix the issue. You can just avoid testing anything.

Will do that.

EDIT: not zero

@rwst
Copy link
Author

rwst commented Aug 16, 2015

comment:7

Omitting the checks however makes hundreds of doctests fail with eg...

Failed example:
    matrix(RR, 0, 2, []).columns()
Exception raised:
    Traceback (most recent call last):
      File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 496, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 858, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.matrix.matrix1.Matrix.columns[2]>", line 1, in <module>
        matrix(RR, Integer(0), Integer(2), []).columns()
      File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/matrix/constructor.py", line 729, in _matrix_constructor
        return matrix_space.MatrixSpace(ring, nrows, ncols, sparse=sparse)(entries)
      File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/matrix/matrix_space.py", line 482, in __call__
        return self.matrix(entries, coerce, copy)
      File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/matrix/matrix_space.py", line 1334, in matrix
        return self.zero_matrix().__copy__()
      File "sage/misc/cachefunc.pyx", line 2215, in sage.misc.cachefunc.CachedMethodCallerNoArgs.__call__ (build/cythonized/sage/misc/cachefunc.c:13792)
        self.cache = f(self._instance)
      File "/home/ralf/sage/local/lib/python2.7/site-packages/sage/matrix/matrix_space.py", line 1185, in zero_matrix
        res = self.__matrix_class(self, 0, coerce=False, copy=False)
      File "sage/matrix/matrix_generic_dense.pyx", line 118, in sage.matrix.matrix_generic_dense.Matrix_generic_dense.__init__ (build/cythonized/sage/matrix/matrix_generic_dense.c:3057)
        raise TypeError, "nonzero scalar matrix must be square"
    TypeError: nonzero scalar matrix must be square

@rwst
Copy link
Author

rwst commented Aug 16, 2015

comment:8

For the symbolic solution see #19040. The question is if that makes this ticket a duplicate, or if your idea of simply removing checks can be implemented.

@videlec
Copy link
Contributor

videlec commented Aug 16, 2015

comment:9

Replying to @rwst:

For the symbolic solution see #19040. The question is if that makes this ticket a duplicate, or if your idea of simply removing checks can be implemented.

I was wrong. Removing the check would work only for square matrices. You want to allow initialization from zero for any matrix (because the matrix space form a vector space). But for other scalars the matrices need to be square in order to form a ring.

@rwst
Copy link
Author

rwst commented Sep 7, 2015

comment:10

Secondly, the operations import XYZ and isinstance(x, my_type) are not at all negligeable. Though, in this case it is not very noticeable since the cost is elsewhere. But with your strategy, you are potentially slowing down everthing for a corner case speed up.

This is nonsense. Look at how many instance checks you have in Matrix_generic_dense.__init__. Lots of special cases. I have no idea why you would try to avoid this one here.

@rwst
Copy link
Author

rwst commented Sep 7, 2015

@rwst
Copy link
Author

rwst commented Sep 7, 2015

Changed commit from 75e679c to 2c4fa61

@rwst
Copy link
Author

rwst commented Sep 7, 2015

comment:12

This patch has the following footprint:

           SR   Laurent poly
develop   2 ms     45 us
patched  78 us     52 us

I don't believe you would stall this again because of, say, ten microseconds?


New commits:

2c4fa6118979: avoid Maxima on creation of symbolic matrices

@nbruin
Copy link
Contributor

nbruin commented Sep 7, 2015

comment:13

I think invading a fundamental piece of infrastructure with specific tests for a specific ring should not be taken lightly. I think you can avoid most of your trouble by delaying the is_zero test to the last possible moment:

        elif self._nrows == self._ncols:
            self._entries = [entries if i==j else zero for i in range(self.nrows) for j in range(self.nrows)]
        elif entries == zero:
            self._entries = [zero]*(self._nrows*self._ncols)
        else:
            raise TypeError("nonzero scalar matrix must be square")

The first list comprehension can be optimized further, if required (I suspect that testing whether "entries" is zero to more quickly create a square zero matrix is a false optimization: if you need that thing quickly often, you should reuse an immutable zero matrix. That will be much quicker (and constant in size of the matrix!), so I don't think it's necessary).

If the zero check is expensive, it will almost certainly lead to an error.

With this code there is no big need to single out SR anymore, and you fix a more fundamental issue: in order to create a matrix (even a diagonal square one) it shouldn't be necessary that your ring has a decidable (or efficient) zero test. For a non-square zero matrix, we can't avoid a zero check, but at least we can delay that until we really need to know (because otherwise we raise an error).

@nbruin
Copy link
Contributor

nbruin commented Sep 7, 2015

Changed branch from u/rws/18979 to u/nbruin/18979

@nbruin
Copy link
Contributor

nbruin commented Sep 7, 2015

Changed author from Ralf Stephan to Nils Bruin

@nbruin
Copy link
Contributor

nbruin commented Sep 7, 2015

comment:16

OK, I'll put my code where my mouth is.

Previous branch u/rws/18979 should still be present if it's preferred to resolve this ticket on the solution proposed there.


New commits:

257c0dftrac 18979: avoid zero-testing in matrix initialization until we really have to.

@nbruin
Copy link
Contributor

nbruin commented Sep 7, 2015

Changed commit from 2c4fa61 to 257c0df

@videlec
Copy link
Contributor

videlec commented Sep 7, 2015

comment:17

As I said in comment:5, the solution should be non intrusive and your solution is what I suggested at the very end of this comment.

Though, using a list extension is twice slower. With

def m1():
    cdef Py_ssize_t i
    cdef list l = [0] * 100
    for i in range(10): l[10*i + i] = 12
    return l
def m2():
    cdef Py_ssize_t i,j
    return [12 if i == j else 0 for i in range(10) for j in range(10)]

I got

sage: %runfile test.pyx
sage: %timeit l = m1()
1000000 loops, best of 3: 456 ns per loop
sage: %timeit l = m2()
1000000 loops, best of 3: 1.09 µs per loop

@nbruin
Copy link
Contributor

nbruin commented Sep 7, 2015

comment:18

Replying to @videlec:

As I said in comment:5, the solution should be non intrusive and your solution is what I suggested at the very end of this comment.

Though, using a list extension is twice slower. With

def m1():
    cdef Py_ssize_t i
    cdef list l = [0] * 100
    for i in range(10): l[10*i + i] = 12
    return l
def m2():
    cdef Py_ssize_t i,j
    return [12 if i == j else 0 for i in range(10) for j in range(10)]

I got

sage: %runfile test.pyx
sage: %timeit l = m1()
1000000 loops, best of 3: 456 ns per loop
sage: %timeit l = m2()
1000000 loops, best of 3: 1.09 µs per loop

OK, go ahead and replace the loop by whatever works better. I hoped cython would be able to optimize this loop, but apparently it cannot. As your example shows, we're likely better off writing the diagonal twice. If this absolutely needs to be fast, we can kick down to a PyList_New and PyList_SET_ITEM inside a c-style loop. Do we need that? Probably if you need a diagonal matrix fast you should some something sparse.

The only thing that testing for zero got us in the square case is that for a zero matrix, we can avoid reinitializing the diagonal. I think we can do without that optimization, so then at least the problem for SR and other rings without (easily) testable equality are mostly fine except for situations that will most likely be an error anyway.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Sep 7, 2015

Changed commit from 257c0df to e42f285

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Sep 7, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

e42f285trac 18979: apparently cython's list comprehension optimization isn't so good for lists of predictable length.

@videlec
Copy link
Contributor

videlec commented Sep 7, 2015

comment:21

Thanks for the changes. Note that your solution does not change anything to the problem mentioned in the description of this ticket!

Vincent

@nbruin
Copy link
Contributor

nbruin commented Sep 7, 2015

comment:22

Replying to @videlec:

Note that your solution does not change anything to the problem mentioned in the description of this ticket!

Well, it solves the problem for the example mentioned: for square matrices and a single ring element as initialization it doesn't compare it to 0 anymore, so for square matrices, maxima is now avoided.

I think comparison in SR is smart enough that an explicit 0 compares equal to 0 without punting to maxima. So maxima should only be hit if we have a non-square matrix and the argument is not obviously 0. In that case it probably isn't zero, and in any case the user deserves what they have coming to them then.

In general, I think indeed that the generic matrix code shouldn't assume equality- or zero-testing is particularly cheap, and hence avoid it unless it's really necessary. The change here is a step in the right direction and basically has the same effect of what rws was trying to do on his branch.

@videlec
Copy link
Contributor

videlec commented Sep 7, 2015

Reviewer: Vincent Delecroix

@videlec
Copy link
Contributor

videlec commented Sep 7, 2015

comment:23

Replying to @nbruin:

Replying to @videlec:

Note that your solution does not change anything to the problem mentioned in the description of this ticket!

Well, it solves the problem for the example mentioned: for square matrices and a single ring element as initialization it doesn't compare it to 0 anymore, so for square matrices, maxima is now avoided.

Right. Sorry, I misunderstood the branching of the code!

@videlec
Copy link
Contributor

videlec commented Sep 7, 2015

comment:24

Replying to @nbruin:

In general, I think indeed that the generic matrix code shouldn't assume equality- or zero-testing is particularly cheap, and hence avoid it unless it's really necessary. The change here is a step in the right direction and basically has the same effect of what rws was trying to do on his branch.

Right, but for sparse matrices it is explicitely assumed that only nonzero entries are stored... what should we do in that case?

@nbruin
Copy link
Contributor

nbruin commented Sep 7, 2015

comment:25

Replying to @videlec:

Right, but for sparse matrices it is explicitely assumed that only nonzero entries are stored... what should we do in that case?

Different problem, different ticket. Of course, one should only avoid comparisons if one can.

Since matrices rarely "accidentally" remain sparse, you probably get decent performance by just assuming any entry that is actually given, is nonzero (so no explicit checks necessary). Of course that doesn't work when converting a dense matrix to a sparse one.

If this is an issue, perhaps we should have a special "comparison avoiding" sparse matrix class.

@rwst
Copy link
Author

rwst commented Sep 8, 2015

comment:26

In any case #19040 will make this all obsolete.

@vbraun
Copy link
Member

vbraun commented Sep 8, 2015

Changed branch from u/nbruin/18979 to e42f285

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants