Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serious regression caused by #9138 #11900

Closed
simon-king-jena opened this issue Oct 5, 2011 · 299 comments
Closed

Serious regression caused by #9138 #11900

simon-king-jena opened this issue Oct 5, 2011 · 299 comments

Comments

@simon-king-jena
Copy link
Member

At sage-devel, Jeroen reported a massive regression in elliptic curve computations. The regression was introduced in the transition from sage-4.7.2.alpha2 to sage-4.7.2.alpha3.

It seems that #9138 is responsible, at least for a big part of the regression. With unpatched sage-4.7.2.alpha2, we find

sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 5.54 s, sys: 0.15 s, total: 5.69 s
Wall time: 5.81 s

Adding #9138 and its dependency, we obtain

sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 8.72 s, sys: 0.18 s, total: 8.89 s
Wall time: 8.92 s

It turns out that much time is wasted for calls to sage.categories.Category.join and to sage.categories.Category.hom_category.

When caching these two methods, one can reduce the speed difference to something like that (sage-4.7.2.alpha3 plus #11115 plus an experimental patch for the caching):

sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 6.82 s, sys: 0.16 s, total: 6.98 s
Wall time: 7.40 s

However, that's still far from good. After caching join and hom_category, there is still too much time spent (according to %prun) for the initialisation of matrix spaces.

Apply

Depends on #11319
Depends on #9138
Depends on #11911
Depends on #9562

CC: @jdemeyer @nthiery @malb

Component: performance

Keywords: categories regression

Author: Simon King

Reviewer: Jeroen Demeyer, Nicolas M. Thiéry, Simon King, Jason Grout

Merged: sage-5.0.beta0

Issue created by migration from https://trac.sagemath.org/ticket/11900

@simon-king-jena
Copy link
Member Author

comment:2

It seems that some time is spent for creating a homset, when creating a matrix space.

In fact, during initialisation, a special map from the basering is created for coercion. That has been introduced in #9138 because otherwise coercion from the base ring to the matrix space is too slow.

However, it might be better to not create the coerce map during initialisation. After all, it is a waste of time to create it when eventually it is not used. Perhaps that is what is happening here?

@simon-king-jena
Copy link
Member Author

comment:3

When avoiding initialisation of coercion from the base ring into a matrix space,(in addition to caching hom_category and join), I get

sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 6.83 s, sys: 0.22 s, total: 7.05 s
Wall time: 7.07 s

@simon-king-jena
Copy link
Member Author

comment:4

The next problematic call seems to be __get__ of some lazy attribute. But I don't know yet which it is.

@simon-king-jena
Copy link
Member Author

comment:5

Next finding: It could be that the cache for matrix spaces is broken. I inserted a print statement into the initialisation of MatrixSpace_generic that shows base_ring,nrows,ncols and the sparsity.

By the cache implemented in the MatrixSpace constructor, no quadrupel should occur twice, but in fact the space of dense 8 by 8 matrices over GF(46337) and other instances occurs several times. Could it be that the matrix space is directly constructed, i.e., breaking the cache?

@simon-king-jena
Copy link
Member Author

comment:6

Replying to @simon-king-jena:

By the cache implemented in the MatrixSpace constructor, no quadrupel should occur twice, but in fact the space of dense 8 by 8 matrices over GF(46337) and other instances occurs several times. Could it be that the matrix space is directly constructed, i.e., breaking the cache?

No, it is differently: The cache uses weak references. In fact, garbage collection is to blame for the double construction of matrix spaces.

@simon-king-jena
Copy link
Member Author

comment:7

When I use strong references for caching matrix spaces (in addition to the previous changes), I get

sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 6.15 s, sys: 0.16 s, total: 6.31 s
Wall time: 6.33 s

Of course, one wouldn't like to have all the matrix spaces hanging around. But perhaps it is an option to temporarily turn garbage collection off.

However, the computation of E.gens() actually requires the construction of many different matrix spaces. So, making matrix space creation faster certainly is the best option.

@simon-king-jena
Copy link
Member Author

comment:8

Another time seems to be uselessly spent on a generic method of free module elements: list().

Such a small method ought to be as fast as possible and should thus be overridden in sub-classes. But sage.modules.vector_integer_dense.Vector_integer_dense does not override the list() method.

@simon-king-jena
Copy link
Member Author

comment:9

Replying to @simon-king-jena:

Such a small method ought to be as fast as possible and should thus be overridden in sub-classes. But sage.modules.vector_integer_dense.Vector_integer_dense does not override the list() method.

Neither do sage.modules.vector_rational_dense.Vector_rational_dense.

@simon-king-jena
Copy link
Member Author

comment:10

If the initialisation of matrix spaces is changed such that Parent.__init__ (and thus the full category framework) is not called, then I'm down to

sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens()
CPU times: user 5.65 s, sys: 0.16 s, total: 5.82 s
Wall time: 5.83 s

As a consequence, the TestSuite of matrix spaces would not fully work, but perhaps it would be worth it. I'll ask sage-devel.

@simon-king-jena
Copy link
Member Author

comment:11

The next surprise is that (even if one avoids category initialisation of matrix spaces) many covariant constructions (quotients and subquotients) are called - that is another reason why Category.join() is called so often.

Part of the problem may be that sage.categories.CovariantConstructionCategory.category_of has been defined as @class_method in #8881, not as @cached_function (one can see in the code that cached_function has been considered, but was commented out). So, perhaps it is worth while to try that change?

But first I need to find out where the quotients and subquotients actually arise.

@simon-king-jena
Copy link
Member Author

comment:12

Yes!!

When I use a cache for category_of, we have

sage: E = J0(46).endomorphism_ring()
sage: %time g = E.gens() 
CPU times: user 5.54 s, sys: 0.19 s, total: 5.74 s
Wall time: 5.75 s

and at least this instance of a regression has gone! I don't know how much of it is due to the fact that I also have #11115 applied, but let me first finish the patch and then we'll see.

@simon-king-jena
Copy link
Member Author

comment:13

I have attached a preliminary patch, if someone already wants to have a look. I need to do more tests (and run doctests), but up to now it seems that it fixes the regression.

Changes for Matrix Spaces

Matrix spaces will not make use of the category framework by default. But I introduced a new method that makes them use the category framework after initialisation:

            sage: MS = MatrixSpace(QQ,8)
            sage: TestSuite(MS).run()
            Failure in _test_category:
            Traceback (most recent call last):
            ...
            AssertionError: category of self improperly initialized
            ------------------------------------------------------------
            The following tests failed: _test_category
            sage: type(MS)
            <class 'sage.matrix.matrix_space.MatrixSpace_generic'>
            sage: MS.full_category_initialisation()
            sage: TestSuite(MS).run()
            sage: type(MS)
            <class 'sage.matrix.matrix_space.MatrixSpace_generic_with_category'>

Minor change: I made __matrix_class a lazy attribute.

Changes for categories

I turn hom_category in a cached method - the result is cached anyway, and thus it makes sense to cache as early as possible, in order to reduce the overhead, and in order to benefit from #11115.

I also cache the class method category_of in sage.categories.covariant_functorial_construction. According to a commented-out line, caching has been attempted anyway, but since it needs to be a class method, caching has not been possible by simply using a decorator.

Vectors

I added custom list() methods for both integral and rational dense vectors. Here is the speed difference (the timing is with patch attached):

sage: v = vector(ZZ,range(100))
sage: %timeit L = v.list()
625 loops, best of 3: 11 µs per loop
# was 17 µs per loop in sage-4.7.2.alpha2
sage: v = vector(QQ,range(100))
sage: %timeit L = v.list()
625 loops, best of 3: 24 µs per loop
# was 27.7 µs per loop in sage-4.7.2.alpha2

The difference is small, but one should benefit from it.

Doc formatting

I corrected some wrong formatting in the documentation of dense integral and rational vectors.

@simon-king-jena
Copy link
Member Author

Dependencies: #9138

@simon-king-jena
Copy link
Member Author

Author: Simon King

@simon-king-jena
Copy link
Member Author

comment:14

The crucial question: Does it fix the speed regression?

Here are Jeroen's examples from sage-devel.

sage-4.7.2.alpha2

sage: %time D = J0(46).endomorphism_ring().discriminant()
CPU times: user 5.60 s, sys: 0.21 s, total: 5.81 s
Wall time: 5.83 s
sage: %time TestSuite(CrystalOfTableaux(['B',4],shape=[2,1,1,0])).run()
CPU times: user 7.08 s, sys: 0.03 s, total: 7.11 s
Wall time: 7.75 s
sage: W.<z> = CyclotomicField(13)
sage: %time M = Matrix(W, 2, 3, [10^30*(1-z)^13, 1, 2, 3, 4, z]).echelon_form()
CPU times: user 3.29 s, sys: 0.03 s, total: 3.32 s
Wall time: 3.38 s
sage: %time L = EllipticCurve('960d1').prove_BSD()
CPU times: user 3.73 s, sys: 0.04 s, total: 3.78 s
Wall time: 4.01 s
sage: def test(E):
....:     for p in prime_range(10000):
....:         if p != 389:
....:             G = E.change_ring(GF(p)).abelian_group()
....:             
sage: E = EllipticCurve('389a')
sage: %time test(E)
CPU times: user 16.47 s, sys: 0.04 s, total: 16.50 s
Wall time: 16.73 s
sage: %time for E in cremona_curves([11..100]): S = E.integral_points(both_signs=False)
CPU times: user 13.88 s, sys: 0.06 s, total: 13.94 s
Wall time: 14.01 s

sage-4.7.2.alpha2 plus #9138 and its dependency

sage: %time D = J0(46).endomorphism_ring().discriminant()
CPU times: user 8.80 s, sys: 0.15 s, total: 8.95 s
Wall time: 8.98 s
sage: %time TestSuite(CrystalOfTableaux(['B',4],shape=[2,1,1,0])).run()
CPU times: user 7.11 s, sys: 0.03 s, total: 7.14 s
Wall time: 7.48 s
sage: W.<z> = CyclotomicField(13)
sage: %time M = Matrix(W, 2, 3, [10^30*(1-z)^13, 1, 2, 3, 4, z]).echelon_form()
CPU times: user 6.07 s, sys: 0.02 s, total: 6.09 s
Wall time: 6.13 s
sage: %time L = EllipticCurve('960d1').prove_BSD()
CPU times: user 10.10 s, sys: 0.07 s, total: 10.17 s
Wall time: 10.55 s
sage: def test(E):
....:     for p in prime_range(10000):
....:         if p != 389:
....:             G = E.change_ring(GF(p)).abelian_group()
....:             
sage: E = EllipticCurve('389a')
sage: %time test(E)
CPU times: user 23.03 s, sys: 0.05 s, total: 23.08 s
Wall time: 23.18 s
sage: %time for E in cremona_curves([11..100]): S = E.integral_points(both_signs=False)
CPU times: user 16.29 s, sys: 0.05 s, total: 16.34 s
Wall time: 16.48 s

sage-4.7.2.alpha2 plus #9138 and its dependency plus the patch from here

sage: %time D = J0(46).endomorphism_ring().discriminant()
CPU times: user 6.02 s, sys: 0.20 s, total: 6.22 s
Wall time: 6.23 s
sage: %time TestSuite(CrystalOfTableaux(['B',4],shape=[2,1,1,0])).run()
CPU times: user 7.11 s, sys: 0.01 s, total: 7.12 s
Wall time: 7.39 s
sage: W.<z> = CyclotomicField(13)
sage: %time M = Matrix(W, 2, 3, [10^30*(1-z)^13, 1, 2, 3, 4, z]).echelon_form()
CPU times: user 3.81 s, sys: 0.04 s, total: 3.84 s
Wall time: 3.85 s
sage: %time L = EllipticCurve('960d1').prove_BSD()
CPU times: user 6.11 s, sys: 0.07 s, total: 6.18 s
Wall time: 6.20 s
sage: def test(E):
....:     for p in prime_range(10000):
....:         if p != 389:
....:             G = E.change_ring(GF(p)).abelian_group()
....:             
sage: E = EllipticCurve('389a')
sage: %time test(E)
CPU times: user 18.11 s, sys: 0.07 s, total: 18.17 s
Wall time: 18.23 s
sage: %time for E in cremona_curves([11..100]): S = E.integral_points(both_signs=False)
CPU times: user 13.13 s, sys: 0.05 s, total: 13.18 s
Wall time: 13.22 s

Conclusion

#9138 is indeed responsible for all but one regression. The only exception is the test suite for CrytalOfTableaux; it has already been argued on sage-devel that it is actually not a regression, since the test suite became much longer by #11183.

The patch from here fixes most regressions. Problematic remain the EllipticCurve('960d1').prove_BSD() and the E.change_ring(GF(p)).abelian_group() tests.

@simon-king-jena
Copy link
Member Author

comment:15

The patch is updated.

I did not run the doc tests yet, and also I think one should try to fix the remaining regression, but I turn it into "needs review".

@jdemeyer
Copy link

jdemeyer commented Oct 6, 2011

comment:16

Doing some timing...

I am afraid I will not be able to review your patch because I am too unfamiliar with the matter that the patch deals with.

Note that your patch applies with "fuzz 2" against a vanilla sage-4.7.2.alpha3, you probably should make a clean version.

@simon-king-jena
Copy link
Member Author

comment:17

With sage-4.7.2.alpha2 plus #9138 plus #11734 (which is a blocker for 4.7.2 merged in alpha3) and the latest patch, I get

sage: %time D = J0(46).endomorphism_ring().discriminant()
CPU times: user 5.52 s, sys: 0.19 s, total: 5.72 s
Wall time: 5.74 s
sage: %time TestSuite(CrystalOfTableaux(['B',4],shape=[2,1,1,0])).run()
CPU times: user 7.05 s, sys: 0.02 s, total: 7.06 s
Wall time: 7.28 s
sage: W.<z> = CyclotomicField(13)
sage: %time M = Matrix(W, 2, 3, [10^30*(1-z)^13, 1, 2, 3, 4, z]).echelon_form()
CPU times: user 3.70 s, sys: 0.01 s, total: 3.71 s
Wall time: 3.73 s
sage: %time L = EllipticCurve('960d1').prove_BSD()
CPU times: user 6.26 s, sys: 0.04 s, total: 6.30 s
Wall time: 6.31 s
sage: def test(E):
....:     for p in prime_range(10000):
....:         if p != 389:
....:             G = E.change_ring(GF(p)).abelian_group()
....:             
sage: E = EllipticCurve('389a')
sage: %time test(E)
CPU times: user 18.45 s, sys: 0.06 s, total: 18.51 s
Wall time: 18.56 s
sage: %time for E in cremona_curves([11..100]): S = E.integral_points(both_signs=False)
CPU times: user 12.54 s, sys: 0.02 s, total: 12.56 s
Wall time: 12.63 s

@simon-king-jena
Copy link
Member Author

comment:18

Replying to @jdemeyer:

Note that your patch applies with "fuzz 2" against a vanilla sage-4.7.2.alpha3, you probably should make a clean version.

Really? Did you test the latest patch?

Anyway. For me, the priority is to first get it working.

@jdemeyer
Copy link

jdemeyer commented Oct 6, 2011

comment:19

Replying to @simon-king-jena:

Really? Did you test the latest patch?

I think so.

Anyway. For me, the priority is to first get it working.

I agree, but I thougt you should know...

@simon-king-jena
Copy link
Member Author

comment:20

Indeed, there is some fuzz. I'll take care of that in the next patch version.

@simon-king-jena
Copy link
Member Author

comment:21

So far, I did not cache join (only hom_category). That should be done next.

@simon-king-jena
Copy link
Member Author

comment:248

Hi Jeroen,

Is it really a problem to have cython(...) in doctests? I have used cython(...) in the past, for example when testing introspection (namely, cython creates a temporary file, so that introspection is easy).

Aha, is it because some people do not have gcc but install a Sage-binary?

@williamstein
Copy link
Contributor

comment:249

Replying to @simon-king-jena:

Aha, is it because some people do not have gcc but install a Sage-binary?

Standard (non-optional) tests are allowed to depend on gcc. I don't know if this is written in stone anywhere, but there have been many such standard tests in the past. They aren't a problem for buildbots, since they always have GCC installed.

If nothing else, it is very important to test that calling the compiler from Sage does work.

@jdemeyer
Copy link

comment:250

Replying to @williamstein:

Replying to @simon-king-jena:

Aha, is it because some people do not have gcc but install a Sage-binary?

Standard (non-optional) tests are allowed to depend on gcc.

Really? I thought the opposite is true.

I know there are some doctests marked

# optional -- gcc

For example in sage/misc/cython.py

@jdemeyer
Copy link

Merged: sage-5.0.beta0

@zimmermann6
Copy link

comment:252

thank you very much Simon for your work on this ticket (and on #715). Only a few people
have the necessary knowledge to work on speed regressions and memory leaks, but solving
such issues is crucial so that Sage is really usable in research.

Paul Zimmermann

@jdemeyer
Copy link

comment:253

Small question: is there are particular reason for writing

self.__dict__.__delitem__('element_class')

instead of

del self.__dict__['element_class']

The latter is faster because it bypasses the lookup of the __delitem__ attribute. I'm changing this in #24270.

@simon-king-jena
Copy link
Member Author

comment:254

Replying to @jdemeyer:

Small question: is there are particular reason for writing

self.__dict__.__delitem__('element_class')

instead of

del self.__dict__['element_class']

The latter is faster because it bypasses the lookup of the __delitem__ attribute. I'm changing this in #24270.

When I wrote that line, I was ignorant of the difference in performance. Actually I even thought that del self.__dict__['element_class'] was slower, since I expected that the Python interpreter first needs to translate that statement into an attribute lookup that it subsequently has to do anyway.

Meanwhile I know better and wouldn't write it again in that way.

Once again: I HATE THE TRAC WEB INTERFACE!!! While I write these lines, every few seconds the browser window becomes blank and (when I start scrolling and continue writing) reappears in a totally different place. It is a VERY SERIOUS pain in the neck!

@burcin burcin mentioned this issue Jan 21, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants