New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sage crashes when inverting/dividing large number field elements #20693
Comments
This comment has been minimized.
This comment has been minimized.
comment:2
Let me add something that might be helpful in finding the bug:
And crash...
And let me also add that I tried a binary distribution of sage 7.1 on yet another machine as well with the Newforms(...) function and got the same crash. |
comment:4
I would like to add a remark: something really extremely bad happened with the implementation of relative Number Fields. |
comment:5
I confirm the crash and I'm getting the same traceback. For regressions in relative number field performance: it would be nice to have a smaller example where both 6.1 and 7.2 run in reasonable time. We can then just profile the code. There's a good chance that something will be sticking out there, leading to the place of the regression. |
comment:6
@sagetrac-ehlen: are you writing code or intend to write code to fix this? For me, seeing an Author filled in is a good reason not to investigate the bug. |
comment:7
Replying to @sagetrac-ehlen:
For me, this gives
and no crash... |
Replying to @sagetrac-ehlen:
I let this run for a few minutes and didn't get a crash. Do you have a simpler crashing example? |
comment:9
To investigate the crash, it may help to add For the slowdown, I tried the following with increasing weights:
Indeed, the method
Perhaps #18727, #18740 and/or #252 could be relevant for this problem? |
comment:10
I'm sorry, I guess I misinterpreted the "Author" field. I probably won't write code for this as I think the bugs and performance problems come from relative extensions of number fields and I don't know much about the code (and most of it is pari in some way, I guess). My example in #20693 comment:2 was missing lines, sorry again.
To get a crash you have to let it run quite some time (I don't remember how long exactly it was, maybe 15 minutes, I can restart it and let you know). I can come up with more examples for sure. Indeed you have to wait quite a bit until you see a crash sometimes, I had a computation running for Gamma0(19) and weight 14 for 2 days or so until I saw the crash (but I didn't run out of memory or anything else trivial and it also used to work on the same machine with the same space (and sage should really be able to do it!). @nbruin I'll try to find some time to check out sage 6.1 again and compute some spaces that also run in sage 7.2 and then get back to you with some profiling results. I did similar profiling tests as you did, @pjbruin and discovered that the main reason for this being slow is that elements in the base ring (a cyclotomic field) get coerced into a relative extension (obtained by adjoining the roots of a hecke polynomial). This is damn slow for some reason (note that it is stated in the documentation of relative number fields that doing arithmetic in number field towers is very slow but it seems that it got worse than it used to be). I discovered that such operations can be accelerated by very stupid means. For instance, instead of coercing an element e of K into an extension L directly, write it in terms of a power basis (e.list()), coerce the generator of K into L and create the element corresponding to e in L directly. I can document some examples later. |
Changed author from Stephan Ehlen to none |
comment:12
@jdemeyer, OK, so the corrected example I gave in #20693 comment:10 crashes after at most 43 minutes for me. I can probably give examples that crash faster, but not sure. The runtime is most certainly related to the degree of the relative extension (the dimension of the modular symbols space) of the cyclotomic field and it seems that we need to have this large enough to really cause a crash. |
comment:13
Replying to @sagetrac-ehlen:
It turns out that |
comment:14
Replying to @pjbruin:
After #20749, |
comment:15
My old examples still crash with #20749 applied. Now, as for the crash, first of all there is no crash with #20749 when running
However, I now get the more informative
After doing so:
I get a crash again. For some reason this crash is not informative at all on OSX, I will have to run it on Linux to see the backtrace, I guess. Also, your level 17 example is quite good because by increasing the weight only by one to 9, I the same behaviour (PARI stack overflow, crash when increasing it). Try:
|
comment:16
From the traceback it appears that NTL runs out of primes for its FFT-based |
comment:17
Replying to @pjbruin:
Indeed, after finally managing to get sage run in gdb on OSX, I get
Is this a bug in NTL? Is it documented anywhere that this might fail? Can we know in advance that this will happen so that we don't have to run into _invert_c_c()? |
comment:18
After applying #20749, #20759 and #20791, the examples in the ticket description and in comment:10 crash already after about 1-2 minutes; on this ticket we can now focus on the crash instead of the slowness. |
comment:19
Replying to @sagetrac-ehlen:
With the same branches applied as in comment:18, I tried
and got a similar NTL crash as in comment:17 after about 1.5 hours. |
comment:20
I tried to debug this and catch the NTL exception but I failed. |
comment:21
Replying to @sagetrac-ehlen:
The method --- a/src/sage/libs/ntl/ZZX.pxd
+++ b/src/sage/libs/ntl/ZZX.pxd
@@ -35,7 +35,7 @@ cdef extern from "sage/libs/ntl/ntlwrap.cpp":
void ZZX_div_ZZ "div"( ZZX_c x, ZZX_c a, ZZ_c b)
long ZZX_deg "deg"( ZZX_c x )
void ZZX_rem "rem"(ZZX_c r, ZZX_c a, ZZX_c b)
- void ZZX_XGCD "XGCD"(ZZ_c r, ZZX_c s, ZZX_c t, ZZX_c a, ZZX_c b, long deterministic)
+ void ZZX_XGCD "XGCD"(ZZ_c r, ZZX_c s, ZZX_c t, ZZX_c a, ZZX_c b, long deterministic) except +
void ZZX_content "content"(ZZ_c d, ZZX_c f)
void ZZX_factor "factor"(ZZ_c c, vec_pair_ZZX_long_c factors, ZZX_c f, long verbose, long bnd)
--- a/src/sage/rings/number_field/number_field_element.pxd
+++ b/src/sage/rings/number_field/number_field_element.pxd
@@ -26,7 +26,7 @@ cdef class NumberFieldElement(FieldElement):
cdef void _ntl_coeff_as_mpz(self, mpz_t z, long i)
cdef void _ntl_denom_as_mpz(self, mpz_t z)
- cdef void _invert_c_(self, ZZX_c *num, ZZ_c *den)
+ cdef void _invert_c_(self, ZZX_c *num, ZZ_c *den) except *
cdef void _reduce_c_(self)
cpdef ModuleElement _add_(self, ModuleElement right)
cpdef ModuleElement _sub_(self, ModuleElement right)
--- a/src/sage/rings/number_field/number_field_element.pyx
+++ b/src/sage/rings/number_field/number_field_element.pyx
@@ -2258,7 +2258,7 @@ cdef class NumberFieldElement(FieldElement):
"""
return long(self.polynomial())
- cdef void _invert_c_(self, ZZX_c *num, ZZ_c *den):
+ cdef void _invert_c_(self, ZZX_c *num, ZZ_c *den) except *:
"""
Computes the numerator and denominator of the multiplicative
inverse of this element.
@@ -2276,11 +2276,13 @@ cdef class NumberFieldElement(FieldElement):
"""
cdef ZZX_c t # unneeded except to be there
cdef ZZX_c a, b
+ sig_on()
ZZX_mul_ZZ( a, self.__numerator, self.__fld_denominator.x )
ZZX_mul_ZZ( b, self.__fld_numerator.x, self.__denominator )
ZZX_XGCD( den[0], num[0], t, a, b, 1 )
ZZX_mul_ZZ( num[0], num[0], self.__fld_denominator.x )
ZZX_mul_ZZ( num[0], num[0], self.__denominator )
+ sig_off()
def __invert__(self):
""" Then I get
|
comment:22
Instead of |
comment:23
Another solution, using PARI as a fallback: --- a/src/sage/rings/number_field/number_field_element.pyx
+++ b/src/sage/rings/number_field/number_field_element.pyx
@@ -40,6 +40,8 @@ from sage.libs.gmp.mpz cimport *
from sage.libs.gmp.mpq cimport *
from sage.libs.mpfi cimport mpfi_t, mpfi_init, mpfi_set, mpfi_clear, mpfi_div_z, mpfi_init2, mpfi_get_prec, mpfi_set_prec
from sage.libs.mpfr cimport mpfr_less_p, mpfr_greater_p, mpfr_greaterequal_p
+from sage.libs.ntl.error import NTLError
+
from cpython.object cimport Py_EQ, Py_NE, Py_LT, Py_GT, Py_LE, Py_GE
from sage.structure.sage_object cimport rich_to_bool
@@ -2297,9 +2299,14 @@ cdef class NumberFieldElement(FieldElement):
if IsZero_ZZX(self.__numerator):
raise ZeroDivisionError
cdef NumberFieldElement x
- x = self._new()
- self._invert_c_(&x.__numerator, &x.__denominator)
- x._reduce_c_()
+ try:
+ x = self._new()
+ sig_on()
+ self._invert_c_(&x.__numerator, &x.__denominator)
+ x._reduce_c_()
+ sig_off()
+ except NTLError:
+ x = self._parent(~self._pari_())
return x
def _integer_(self, Z=None): It no longer crashes and returns an answer after about 20 minutes... |
comment:24
@pjbruin: Great! Thanks a lot for your work! However, the pari alternative seems to be kind of slow (which is much better than crashing, of course).
I don't like the last line of the code and I still have to make sure I understood everything correctly and it works in all cases but then I could submit a patch for the number field element to use a variant of this code as an alternative. What do you think? |
comment:25
Replying to @sagetrac-ehlen:
I would have hoped for PARI to be quite fast, but it seems this is not particularly optimised in PARI. In any case I didn't try hard to make it fast; it was just the easiest solution that didn't crash.
This is definitely a good start. The |
comment:39
Replying to @pjbruin:
Please also update the ticket description. |
Changed branch from u/ehlen/sage_crashes_when_computing_newforms to u/pbruin/20693-sage_crashes_when_computing_newforms |
Changed author from Stephan Ehlen to Stephan Ehlen, Peter Bruin |
comment:41
Pushing my commit and changing the branch does seem to work. |
comment:42
I did some cleaning up of |
comment:43
@pjbruin Very good, I didn't look at
in |
This comment has been minimized.
This comment has been minimized.
comment:45
@pjbruin I can remove the doctest. If you agree that simplifying Also, did you check very carefully that removing those additional multiplications that used to be done in |
comment:46
PS: my branch |
Changed author from Stephan Ehlen, Peter Bruin to Stephan Ehlen |
comment:48
Replying to @sagetrac-ehlen:
Yes, it might make sense to have a place for tests that are too long to write down or take too much time to run. I agree that this is not the place to discuss this further.
I included your commit in my branch and added a reviewer commit (typographical fixes and using
Here is a proof that the simplified code is correct. Suppose our number field K, is Q[X]/(f) where we may assume without loss of generality that f is in Z[X]. We want to invert x = (g mod f)/d in K, where g is in Z[X] and is coprime to f, and where d is in Z and non-zero. We call
Yes, this could very well be; it is surprising how often one runs into code that is much more complicated than necessary... It now seems to makes more sense to me to be the reviewer instead of an author. I am now running doctests. Maybe it is still good if someone else takes a look at this; these number field operations are after all essential and heavily used... |
Reviewer: Peter Bruin |
comment:49
@pjbruin: I completely agree with the proof and this is essentially what I tried to write in comment:30. And yes, it would be good if someone else could have a look as well although the code is pretty solid now, I think. |
comment:50
Replying to @sagetrac-ehlen:
OK, good to see that we agree on this. I don't think there are any hidden assumptions that could be violated...
I agree. By the way, doctests still pass. |
comment:51
One thing I don't yet understand (and maybe we don't have to) is why it was possible to compute some of the spaces of newforms that caused NTL to run out of FFT primes in older versions of sage. One particular example that I have is level 17, weight 9, character=generator of the dirichlet group. I have a file lying around that says that this was computed with sage 5.12beta4, release date: 2013-08-30. Why did inversion work back then if the cython version number_field_element.pyx exists since 2007 and the inversion has not been changed since then? Somehow I can only imagine that NTL used to be able to handle these large polynomials that the code used to produce. Any other ideas? |
comment:52
I have gone through the proposed patch and everything seems to work as stated.
|
Changed reviewer from Peter Bruin to Peter Bruin, Fredrik Stromberg |
Changed branch from u/pbruin/20693-sage_crashes_when_computing_newforms to |
This ticket used to be about a crash that occurred when computing newforms for a certain character of modulus 23 in sage 7.2.
Here's how to reproduce it (you have to wait 10 minutes or so until the crash happens):
It turned out that this was due to NTL running out of FFT primes when inverting number field elements with humongous denominators. Moreover, it turned out that we only ran into this problem in the example (and other examples in the comments) because the function
_invert_c_()
of a number field element was doing unnecessary work.Component: number fields
Author: Stephan Ehlen
Branch/Commit:
eb3da68
Reviewer: Peter Bruin, Fredrik Stromberg
Issue created by migration from https://trac.sagemath.org/ticket/20693
The text was updated successfully, but these errors were encountered: