Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Floating point exception in 2.11.1 tests #194

Open
infinity0 opened this issue Oct 19, 2017 · 21 comments
Open

Floating point exception in 2.11.1 tests #194

infinity0 opened this issue Oct 19, 2017 · 21 comments

Comments

@infinity0
Copy link

We're trying to upgrade flint-arb in Debian and running into this test failure:

[..]
gcc -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fdebug-prefix-map=$BUILD_PATH=. -fstack-protector-strong -Wformat -Werror=format-security -I$BUILD_PATH -I/usr/local/include -I/usr/local/include -I/usr/include test/t-evaluate_acb.c -o ../build/arb_fmpz_poly/test/t-evaluate_acb -L$BUILD_PATH -L/usr/local/lib -L/usr/local/lib -L/usr/lib -lflint-arb -lflint -lmpfr -lgmp -lm -lpthread  -MMD -MP -MF ../build/arb_fmpz_poly/test/t-evaluate_acb.d -MT "../build/arb_fmpz_poly/test/t-evaluate_acb" -MT "../build/arb_fmpz_poly/test/t-evaluate_acb.d"
gcc -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fdebug-prefix-map=$BUILD_PATH=. -fstack-protector-strong -Wformat -Werror=format-security -I$BUILD_PATH -I/usr/local/include -I/usr/local/include -I/usr/include test/t-gauss_period_minpoly.c -o ../build/arb_fmpz_poly/test/t-gauss_period_minpoly -L$BUILD_PATH -L/usr/local/lib -L/usr/local/lib -L/usr/lib -lflint-arb -lflint -lmpfr -lgmp -lm -lpthread  -MMD -MP -MF ../build/arb_fmpz_poly/test/t-gauss_period_minpoly.d -MT "../build/arb_fmpz_poly/test/t-gauss_period_minpoly" -MT "../build/arb_fmpz_poly/test/t-gauss_period_minpoly.d"
gcc -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fdebug-prefix-map=$BUILD_PATH=. -fstack-protector-strong -Wformat -Werror=format-security -I$BUILD_PATH -I/usr/local/include -I/usr/local/include -I/usr/include test/t-complex_roots.c -o ../build/arb_fmpz_poly/test/t-complex_roots -L$BUILD_PATH -L/usr/local/lib -L/usr/local/lib -L/usr/lib -lflint-arb -lflint -lmpfr -lgmp -lm -lpthread  -MMD -MP -MF ../build/arb_fmpz_poly/test/t-complex_roots.d -MT "../build/arb_fmpz_poly/test/t-complex_roots" -MT "../build/arb_fmpz_poly/test/t-complex_roots.d"
gcc -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fdebug-prefix-map=$BUILD_PATH=. -fstack-protector-strong -Wformat -Werror=format-security -I$BUILD_PATH -I/usr/local/include -I/usr/local/include -I/usr/include test/t-evaluate_arb.c -o ../build/arb_fmpz_poly/test/t-evaluate_arb -L$BUILD_PATH -L/usr/local/lib -L/usr/local/lib -L/usr/lib -lflint-arb -lflint -lmpfr -lgmp -lm -lpthread  -MMD -MP -MF ../build/arb_fmpz_poly/test/t-evaluate_arb.d -MT "../build/arb_fmpz_poly/test/t-evaluate_arb" -MT "../build/arb_fmpz_poly/test/t-evaluate_arb.d"
gauss_period_minpoly....../Makefile.subdirs:84: recipe for target '../build/arb_fmpz_poly/test/t-gauss_period_minpoly_RUN' failed
make[3]: *** [../build/arb_fmpz_poly/test/t-gauss_period_minpoly_RUN] Floating point exception
make[3]: *** Waiting for unfinished jobs....
make[3]: Leaving directory '$BUILD_PATH/arb_fmpz_poly'
Makefile:179: recipe for target 'check' failed
make[2]: *** [check] Error 2
make[2]: Leaving directory '$BUILD_PATH'
dh_auto_test: make -j4 check VERBOSE=1 AT= QUIET_CXX= QUIET_CC= QUIET_AR= "ABI_FLAG=-Wl,-z,relro -Wl,-z,now -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fdebug-prefix-map=$BUILD_PATH=. -fstack-protector-strong -Wformat -Werror=format-security" returned exit code 2
debian/rules:22: recipe for target 'override_dh_auto_test' failed
make[1]: *** [override_dh_auto_test] Error 2
make[1]: Leaving directory '$BUILD_PATH'
debian/rules:8: recipe for target 'build' failed
make: *** [build] Error 2

I can paste more logs on request. Versions of dependencies are:

libflint-dev 2.5.2-15
libgmp-dev 2:6.1.2+dfsg-1.1
libmpfr-dev 3.1.6-1

Our config is:

MAKE_OVERRIDE = AT= QUIET_CXX= QUIET_CC= QUIET_AR= ABI_FLAG='$(LDFLAGS) $(CPPFLAGS) $(CFLAGS)'
./configure --prefix=/usr --disable-static --with-flint=/usr CFLAGS='$(CPPFLAGS) $(CFLAGS)'
@fredrik-johansson
Copy link
Collaborator

I don't know what would be causing this. Could you try to track down the point of failure in the code, and the values of the variables in the t-gauss_period_minpoly program (q, n, etc.) when it happens?

@infinity0
Copy link
Author

Unfortunately k, g, e, n are optimised out and printf() did not work for some reason, but the other variables are there, see further below. The test passes when you compile the test without -O2, suggesting that maybe the main code is actually OK.

diff --git a/arb_fmpz_poly/test/t-gauss_period_minpoly.c b/arb_fmpz_poly/test/t-gauss_period_minpoly.c
index c5e7bec..21948fb 100644
--- a/arb_fmpz_poly/test/t-gauss_period_minpoly.c
+++ b/arb_fmpz_poly/test/t-gauss_period_minpoly.c
@@ -30,6 +30,10 @@ int main()
 
         for (q = 0; q < 1000; q++)
         {
+#ifdef FLINT_TEST_DEBUG
+            flint_printf("q = %ld\n", q);
+            fflush(stdout);
+#endif
             acb_dirichlet_roots_t zeta;
             prec = 100 + n_randint(state, 500);
 
@@ -38,6 +42,10 @@ int main()
 
             for (n = 0; n < 1000; n++)
             {
+#ifdef FLINT_TEST_DEBUG
+                flint_printf("q = %ld, n = %ld\n", q, n);
+                fflush(stdout);
+#endif
                 arb_fmpz_poly_gauss_period_minpoly(pol, q, n);
 
                 if (!fmpz_poly_is_zero(pol))
@@ -52,6 +60,10 @@ int main()
 
                     for (iter = 0; iter < 3; iter++)
                     {
+#ifdef FLINT_TEST_DEBUG
+                        flint_printf("q = %ld, n = %ld, iter = %ld\n", q, n, iter);
+                        fflush(stdout);
+#endif
                         k = n_randint(state, n);
                         gk = n_powmod2(g, k, 2 * q);
                         acb_zero(u);
$ touch arb_fmpz_poly/test/t-gauss_period_minpoly.c; make check MOD=arb_fmpz_poly QUIET_CC= CFLAGS=""
make[1]: Entering directory '$PWD'
make[1]: Nothing to be done for 'shared'.
make[1]: Leaving directory '$PWD'
make[1]: Entering directory '$PWD/arb_fmpz_poly'
gcc  -I$PWD -I/usr/include -I/usr/include -I/usr/include test/t-gauss_period_minpoly.c -o ../build/arb_fmpz_poly/test/t-gauss_period_minpoly -L$PWD -L/usr/lib -L/usr/lib -L/usr/lib -lflint-arb -lflint -lmpfr -lgmp -lm -lpthread  -MMD -MP -MF ../build/arb_fmpz_poly/test/t-gauss_period_minpoly.d -MT "../build/arb_fmpz_poly/test/t-gauss_period_minpoly" -MT "../build/arb_fmpz_poly/test/t-gauss_period_minpoly.d"
evaluate_acb....PASS
gauss_period_minpoly....PASS
complex_roots....PASS
evaluate_arb....PASS
make[1]: Leaving directory '$PWD/arb_fmpz_poly'

$ touch arb_fmpz_poly/test/t-gauss_period_minpoly.c; make check MOD=arb_fmpz_poly QUIET_CC= CFLAGS="-O2 -DFLINT_TEST_DEBUG"
make[1]: Entering directory '$PWD'
make[1]: Nothing to be done for 'shared'.
make[1]: Leaving directory '$PWD'
make[1]: Entering directory '$PWD/arb_fmpz_poly'
gcc -O2 -DFLINT_TEST_DEBUG -I$PWD -I/usr/include -I/usr/include -I/usr/include test/t-gauss_period_minpoly.c -o ../build/arb_fmpz_poly/test/t-gauss_period_minpoly -L$PWD -L/usr/lib -L/usr/lib -L/usr/lib -lflint-arb -lflint -lmpfr -lgmp -lm -lpthread  -MMD -MP -MF ../build/arb_fmpz_poly/test/t-gauss_period_minpoly.d -MT "../build/arb_fmpz_poly/test/t-gauss_period_minpoly" -MT "../build/arb_fmpz_poly/test/t-gauss_period_minpoly.d"
evaluate_acb....PASS
gauss_period_minpoly....q = 0
../Makefile.subdirs:84: recipe for target '../build/arb_fmpz_poly/test/t-gauss_period_minpoly_RUN' failed
make[1]: *** [../build/arb_fmpz_poly/test/t-gauss_period_minpoly_RUN] Floating point exception
make[1]: Leaving directory '$PWD/arb_fmpz_poly'
Makefile:189: recipe for target 'check' failed
make: *** [check] Error 2
exit code 2

$ touch arb_fmpz_poly/test/t-gauss_period_minpoly.c; make check MOD=arb_fmpz_poly QUIET_CC= CFLAGS="-O2 -DFLINT_TEST_DEBUG -g"
make[1]: Entering directory '$PWD'
make[1]: Nothing to be done for 'shared'.
make[1]: Leaving directory '$PWD'
make[1]: Entering directory '$PWD/arb_fmpz_poly'
gcc -O2 -DFLINT_TEST_DEBUG -g -I$PWD -I/usr/include -I/usr/include -I/usr/include test/t-gauss_period_minpoly.c -o ../build/arb_fmpz_poly/test/t-gauss_period_minpoly -L$PWD -L/usr/lib -L/usr/lib -L/usr/lib -lflint-arb -lflint -lmpfr -lgmp -lm -lpthread  -MMD -MP -MF ../build/arb_fmpz_poly/test/t-gauss_period_minpoly.d -MT "../build/arb_fmpz_poly/test/t-gauss_period_minpoly" -MT "../build/arb_fmpz_poly/test/t-gauss_period_minpoly.d"
evaluate_acb....PASS
gauss_period_minpoly....q = 0
../Makefile.subdirs:84: recipe for target '../build/arb_fmpz_poly/test/t-gauss_period_minpoly_RUN' failed
make[1]: *** [../build/arb_fmpz_poly/test/t-gauss_period_minpoly_RUN] Floating point exception
make[1]: Leaving directory '$PWD/arb_fmpz_poly'
Makefile:189: recipe for target 'check' failed
make: *** [check] Error 2
exit codu 2

$ LD_LIBRARY_PATH=$PWD gdb -q build/arb_fmpz_poly/test/t-gauss_period_minpoly 
Reading symbols from build/arb_fmpz_poly/test/t-gauss_period_minpoly...done.
(gdb) run
Starting program: $PWD/build/arb_fmpz_poly/test/t-gauss_period_minpoly 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
gauss_period_minpoly....q = 0

Program received signal SIGFPE, Arithmetic exception.
0x0000555555554ed1 in n_preinvert_limb (n=<optimized out>) at /usr/include/flint/ulong_extras.h:176
176	   invert_limb(ninv, n<<norm);
(gdb) bt
#0  0x0000555555554ed1 in n_preinvert_limb (n=<optimized out>) at /usr/include/flint/ulong_extras.h:176
#1  n_powmod2 (n=<optimized out>, exp=<optimized out>, a=<optimized out>) at /usr/include/flint/ulong_extras.h:230
#2  main () at test/t-gauss_period_minpoly.c:68
(gdb) list
171	mp_limb_t n_preinvert_limb(mp_limb_t n)
172	{
173	   mp_limb_t norm, ninv;
174	
175	   count_leading_zeros(norm, n);
176	   invert_limb(ninv, n<<norm);
177	
178	   return ninv;
179	}
180	
(gdb) up
#1  n_powmod2 (n=<optimized out>, exp=<optimized out>, a=<optimized out>) at /usr/include/flint/ulong_extras.h:230
230	   mp_limb_t ninv = n_preinvert_limb(n);
(gdb) list
225	                                             mp_limb_t ninv, ulong norm);
226	
227	ULONG_EXTRAS_INLINE
228	mp_limb_t n_powmod2(mp_limb_t a, mp_limb_signed_t exp, mp_limb_t n)
229	{
230	   mp_limb_t ninv = n_preinvert_limb(n);
231	
232	   return n_powmod2_preinv(a, exp, n, ninv);
233	}
234	
(gdb) up
#2  main () at test/t-gauss_period_minpoly.c:68
68	                        gk = n_powmod2(g, k, 2 * q);
(gdb) list
63	#ifdef FLINT_TEST_DEBUG
64	                        flint_printf("q = %ld, n = %ld, iter = %ld\n", q, n, iter);
65	                        fflush(stdout);
66	#endif
67	                        k = n_randint(state, n);
68	                        gk = n_powmod2(g, k, 2 * q);
69	                        acb_zero(u);
70	
71	                        for (e = 0; e < d; e++)
72	                        {
(gdb) info locals
k = <optimized out>
g = <optimized out>
e = <optimized out>
u = {{real = {mid = {exp = 1, size = 0, d = {noptr = {d = {140535355409744, 140535355471200}}, ptr = {alloc = 140535355409744, d = 0x7fd0eff0f560}}}, rad = {exp = 140737488347456, man = 140535379646231}}, imag = {mid = {exp = 0, 
        size = 0, d = {noptr = {d = {0, 0}}, ptr = {alloc = 0, d = 0x0}}}, rad = {exp = 0, man = 0}}}}
d = <optimized out>
t = {{real = {mid = {exp = 140535351216552, size = 140535379614868, d = {noptr = {d = {1, 0}}, ptr = {alloc = 1, d = 0x0}}}, rad = {exp = 140535351214816, man = 140535359201384}}, imag = {mid = {exp = 140535355431768, 
        size = 140737488347688, d = {noptr = {d = {140535355409744, 0}}, ptr = {alloc = 140535355409744, d = 0x0}}}, rad = {exp = 140535359208480, man = 140535379614868}}}}
zeta = {{order = 0, reduced_order = 0, z = {{real = {mid = {exp = 280375465082880, size = 140535355933036, d = {noptr = {d = {0, 64}}, ptr = {alloc = 0, d = 0x40}}}, rad = {exp = 1040, man = 1088}}, imag = {mid = {exp = 16, 
            size = 274877907010, d = {noptr = {d = {2, 0}}, ptr = {alloc = 2, d = 0x0}}}, rad = {exp = 472446402653, man = 0}}}}, size = 0, depth = 532575944823, Z = 0x1, use_pow = 1431655485}}
prec = 199
n = <optimized out>
q = 0
pol = {{coeffs = 0x0, alloc = 0, length = 0}}
iter = <optimized out>
state = {{gmp_state = {{_mp_seed = {{_mp_alloc = -269421216, _mp_size = 32720, _mp_d = 0x7fd0f17ed4e8}}, _mp_alg = (unknown: 4021393795), _mp_algdata = {_mp_lc = 0x2b43e9ed}}}, gmp_init = 0, __randval = 14886521159149723162, 
    __randval2 = 15737102702505497715}}

@infinity0
Copy link
Author

For completeness, -O1 also makes the test fail.

@fredrik-johansson
Copy link
Collaborator

If I'm reading this correctly, the the problem occurs when q = 0. In this case, arb_fmpz_poly_gauss_period_minpoly should set pol to the zero polynomial, which means the if (!fmpz_poly_is_zero(pol)) branch should not be entered. But the division by zero seems to occur inside that branch even though pol = {{coeffs = 0x0, alloc = 0, length = 0}}. I have no idea what's going on!

@infinity0
Copy link
Author

The assembly is very heavily optimised and doesn't match the source code very well, things are run out-of-order. Here it is running in gdb's layout asm:

0x555555554da0 <main>           push   %r15
0x555555554da2 <main+2>         push   %r14
0x555555554da4 <main+4>         lea    0x649(%rip),%rdi        # 0x5555555553f4
0x555555554dab <main+11>        push   %r13
0x555555554dad <main+13>        push   %r12
0x555555554daf <main+15>        xor    %eax,%eax
0x555555554db1 <main+17>        push   %rbp
0x555555554db2 <main+18>        push   %rbx
0x555555554db3 <main+19>        sub    $0x238,%rsp
0x555555554dba <main+26>        callq  0x555555554c30 <flint_printf@plt>
0x555555554dbf <main+31>        mov    0x20131a(%rip),%rdi        # 0x5555557560e0 <stdout@@GLIBC_2.2.5>
0x555555554dc6 <main+38>        lea    0x140(%rsp),%r14
0x555555554dce <main+46>        lea    0xe0(%rsp),%r15
0x555555554dd6 <main+54>        callq  0x555555554ce0 <fflush@plt>
0x555555554ddb <main+59>        movabs $0xc0259e16f63f0001,%rax
0x555555554de5 <main+69>        lea    0x80(%rsp),%rdi
0x555555554ded <main+77>        movl   $0x0,0xc0(%rsp)
0x555555554df8 <main+88>        mov    %rax,0xc8(%rsp)
0x555555554e00 <main+96>        movabs $0xb66313f44c4c47b6,%rax
0x555555554e0a <main+106>       mov    %rax,0xd0(%rsp)
0x555555554e12 <main+114>       mov    %rdi,0x40(%rsp)
0x555555554e17 <main+119>       callq  0x555555554d60 <fmpz_poly_init@plt>
0x555555554e1c <main+124>       lea    0xa0(%rsp),%rax
0x555555554e24 <main+132>       movq   $0xffffffffffffffff,0x70(%rsp)
0x555555554e2d <main+141>       movq   $0x0,0x48(%rsp)
0x555555554e36 <main+150>       mov    %rax,0x60(%rsp)
0x555555554e3b <main+155>       lea    0x1a0(%rsp),%rax
0x555555554e43 <main+163>       mov    %rax,0x28(%rsp)
0x555555554e48 <main+168>       mov    0x60(%rsp),%rdi
0x555555554e4d <main+173>       mov    $0x1f4,%esi
0x555555554e52 <main+178>       callq  0x555555554d20 <n_randint@plt>
0x555555554e57 <main+183>       mov    0x48(%rsp),%rdi
0x555555554e5c <main+188>       lea    0x64(%rax),%rbx
0x555555554e60 <main+192>       callq  0x555555554d70 <n_is_prime@plt>
0x555555554e65 <main+197>       test   %eax,%eax
0x555555554e67 <main+199>       je     0x555555554e80 <main+224>
0x555555554e69 <main+201>       mov    0x48(%rsp),%rsi
0x555555554e6e <main+206>       mov    0x28(%rsp),%rdi
0x555555554e73 <main+211>       mov    %rbx,%rcx
0x555555554e76 <main+214>       mov    $0x1e,%edx
0x555555554e7b <main+219>       callq  0x555555554c90 <acb_dirichlet_roots_init@plt>
0x555555554e80 <main+224>       mov    0x48(%rsp),%rax
0x555555554e85 <main+229>       mov    $0xffffffffffffffff,%rdi
0x555555554e8c <main+236>       xor    %r13d,%r13d
0x555555554e8f <main+239>       add    %rax,%rax
0x555555554e92 <main+242>       bsr    %rax,%rcx
0x555555554e96 <main+246>       xor    $0x3f,%rcx
0x555555554e9a <main+250>       mov    %rax,0x78(%rsp)
0x555555554e9f <main+255>       shl    %cl,%rax
0x555555554ea2 <main+258>       mov    %rax,%rcx
0x555555554ea5 <main+261>       mov    %rax,%rsi
0x555555554ea8 <main+264>       mov    %rdi,%rax
0x555555554eab <main+267>       not    %rcx
0x555555554eae <main+270>       mov    %rcx,%rdx
0x555555554eb1 <main+273>       div    %rsi

(gdb) s
flint_printf (str=0x5555555553f4 "gauss_period_minpoly....") at printf.c:79
(gdb) fin
Run till exit from #0  flint_printf (str=0x5555555553f4 "gauss_period_minpoly....") at printf.c:79
main () at test/t-gauss_period_minpoly.c:21
Value returned is $1 = 24
(gdb) s
__GI__IO_fflush (fp=0x7f86e9a16600 <_IO_2_1_stdout_>) at iofflush.c:33
(gdb) fin
Run till exit from #0  __GI__IO_fflush (fp=0x7f86e9a16600 <_IO_2_1_stdout_>) at iofflush.c:33
main () at test/t-gauss_period_minpoly.c:23
Value returned is $2 = 0
(gdb) s
flint_randinit (state=0x7fffffffdba0) at /usr/include/flint/flint.h:167
main () at test/t-gauss_period_minpoly.c:29
flint_randinit (state=0x7fffffffdba0) at /usr/include/flint/flint.h:165
main () at test/t-gauss_period_minpoly.c:29
fmpz_poly_init (poly=0x7fffffffdb80) at init.c:35
(gdb) fin
Run till exit from #0  fmpz_poly_init (poly=0x7fffffffdb80) at init.c:35
0x0000555555554e1c in main () at test/t-gauss_period_minpoly.c:29
(gdb) s
n_randint (state=0x7fffffffdba0, limit=500) at randint.c:32
(gdb) fin
Run till exit from #0  n_randint (state=0x7fffffffdba0, limit=500) at randint.c:32
main () at test/t-gauss_period_minpoly.c:40
Value returned is $3 = 99
(gdb) s
n_is_prime (n=0) at is_prime.c:39
(gdb) fin
Run till exit from #0  n_is_prime (n=0) at is_prime.c:39
0x0000555555554e65 in main () at test/t-gauss_period_minpoly.c:40
Value returned is $4 = 0
(gdb) s
n_powmod2 (n=<optimized out>, exp=<optimized out>, a=<optimized out>) at test/t-gauss_period_minpoly.c:68
n_preinvert_limb (n=<optimized out>) at /usr/include/flint/ulong_extras.h:176
main () at test/t-gauss_period_minpoly.c:31
n_powmod2 (n=<optimized out>, exp=<optimized out>, a=<optimized out>) at test/t-gauss_period_minpoly.c:68
n_preinvert_limb (n=<optimized out>) at /usr/include/flint/ulong_extras.h:175

Program received signal SIGFPE, Arithmetic exception.
0x0000555555554eb1 in n_preinvert_limb (n=<optimized out>) at /usr/include/flint/ulong_extras.h:176

layout asm collapses multiple step commands into a single (gdb) s line, in the second half of the output above, in case you're confused by that.

Hopefully you can reproduce this yourself with -O2 and GCC-7? If not, I can send over the failing test binary if you want.

@fredrik-johansson
Copy link
Collaborator

I tried compiling the test with GCC-7; no error.

@bennybp
Copy link

bennybp commented Oct 20, 2017

I'm also getting this failure with -O1. I'll echo the sentiment that it doesn't make sense that this path is being executed. If my interpretation of gdb during my run is correct, the execution jumps from the n_is_prime check into the n_powmod2 function, bypassing everything else (even print statements)

This kind of smells like a compiler bug (I'm using gcc-7.2), but I will dig some more in the next few days

@fredrik-johansson
Copy link
Collaborator

Thanks. I agree that a compiler bug is plausible. It's also possible that the code contains inadvertent undefined behavior that causes problems only with the optimizations in recent GCC:s. It should be possible to produce a minimal failing example.

@infinity0
Copy link
Author

infinity0 commented Oct 20, 2017

The test passes with -O2 if you replace the platform-specific count_leading_zeros with the fallback definition in flint's longlong.h (outside of arb). i.e.:

--- /usr/include/flint/longlong.h.orig	2017-10-21 01:38:05.824715526 +0200
+++ /usr/include/flint/longlong.h	2017-10-21 01:40:40.130642497 +0200
@@ -74,13 +74,35 @@
        : "=a" (q), "=d" (r)                                                     \
        : "0" ((mp_limb_t)(n0)), "1" ((mp_limb_t)(n1)), "rm" ((mp_limb_t)(dx)))
 
-/* bsrq destination must be a 64-bit register, hence mp_limb_t for __cbtmp. */
-#define count_leading_zeros(count, x)                                 \
-  do {                                                                \
-    mp_limb_t __cbtmp;                                                \
-    FLINT_ASSERT ((x) != 0);                                          \
-    __asm__ ("bsrq %1,%0" : "=r" (__cbtmp) : "rm" ((mp_limb_t)(x)));  \
-    (count) = __cbtmp ^ (mp_limb_t) 63;                               \
+const unsigned char __flint_clz_tab[128] =
+{
+  1,2,3,3,4,4,4,4,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,
+  7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,
+  8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
+  8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8
+};
+#define __BITS4 (GMP_LIMB_BITS/4)
+#define count_leading_zeros(count, x)                        \
+  do {									                            \
+    mp_limb_t __xr = (x);							                \
+    mp_limb_t __a;								                   \
+									                                  \
+    if (GMP_LIMB_BITS == 32)						       \
+      {									                            \
+	__a = __xr < ((mp_limb_t) 1 << 2*__BITS4)				       \
+	  ? (__xr < ((mp_limb_t) 1 << __BITS4) ? 1 : __BITS4 + 1) \
+	  : (__xr < ((mp_limb_t) 1 << 3*__BITS4) ? 2*__BITS4 + 1	 \
+	  : 3*__BITS4 + 1);						                      \
+      }									                            \
+    else								                               \
+      {									                            \
+	for (__a = GMP_LIMB_BITS - 8; __a > 0; __a -= 8) \
+	  if (((__xr >> __a) & 0xff) != 0)				             \
+	    break;							                            \
+	++__a;								                            \
+      }									                            \
+									                                  \
+    (count) = GMP_LIMB_BITS + 1 - __a - __flint_clz_tab[__xr >> __a]; \
   } while (0)
 
 /* bsfq destination must be a 64-bit register, "%q0" forces this in case

The test inlines it via n_powmod2 and n_preinvert_limb.

I'm not sure what this means however, whether it's a compiler bug or a flint bug.

@infinity0
Copy link
Author

infinity0 commented Oct 21, 2017

I'm starting to suspect it's an optimizer bug. I couldn't see anything obviously wrong with the code, though I'm not super-familiar with all the details of what is and is not undefined behaviour in C. However, I found some other seemingly-random ways to make the failure go away.

diff --git a/arb_fmpz_poly/test/t-gauss_period_minpoly.c b/arb_fmpz_poly/test/t-gauss_period_minpoly.c
index c5e7bec..a626818 100644
--- a/arb_fmpz_poly/test/t-gauss_period_minpoly.c
+++ b/arb_fmpz_poly/test/t-gauss_period_minpoly.c
@@ -12,6 +12,29 @@
 #include "arb_fmpz_poly.h"
 #include "acb_dirichlet.h"
 
+#define PREINV(CONSTRAINT) \
+   mp_limb_t norm, ninv; \
+   do { mp_limb_t __cbtmp; ; __asm__ ("bsrq %1,%0" : "=r" (__cbtmp) : CONSTRAINT ((mp_limb_t)(n))); (norm) = __cbtmp ^ (mp_limb_t) 63; } while (0); \
+   do { mp_limb_t dummy; __asm__ ("divq %4" : "=a" (ninv), "=d" (dummy) : "0" ((mp_limb_t)(~((0L)))), "1" ((mp_limb_t)(~(n<<norm))), "rm" ((mp_limb_t)(n<<norm))); } while (0); \
+   return ninv
+
+static __inline__
+mp_limb_t n_preinvert_limb_bad(mp_limb_t n) { PREINV("rm"); }
+
+static __attribute__ ((noinline))
+mp_limb_t n_preinvert_limb_good1(mp_limb_t n) { PREINV("rm"); }
+
+static __inline__
+mp_limb_t n_preinvert_limb_good2(mp_limb_t n) { PREINV("m"); }
+
+#ifndef FTEST_VAR
+#define FTEST_VAR bad
+#endif
+// C sucks
+#define CONCAT_REALLY(x,y) x ## y
+#define CONCAT(x,y)  CONCAT_REALLY(x,y)
+#define n_preinvert_limb_var CONCAT(n_preinvert_limb_, FTEST_VAR)
+
 int main()
 {
     slong iter;
@@ -27,9 +50,11 @@ int main()
         ulong n, q;
         fmpz_poly_t pol;
         fmpz_poly_init(pol);
+        mp_limb_t qx2, qx2inv;
 
         for (q = 0; q < 1000; q++)
         {
+            qx2 = 2 * q;
             acb_dirichlet_roots_t zeta;
             prec = 100 + n_randint(state, 500);
 
@@ -38,6 +63,10 @@ int main()
 
             for (n = 0; n < 1000; n++)
             {
+// these all SIGFPE
+#ifdef FTEST_OUTER
+                qx2inv = n_preinvert_limb_var(qx2);
+#endif
                 arb_fmpz_poly_gauss_period_minpoly(pol, q, n);
 
                 if (!fmpz_poly_is_zero(pol))
@@ -52,14 +81,18 @@ int main()
 
                     for (iter = 0; iter < 3; iter++)
                     {
+// these succeed when FTEST_VAR is good1 or good2
+#ifndef FTEST_OUTER
+                        qx2inv = n_preinvert_limb_var(qx2);
+#endif
                         k = n_randint(state, n);
-                        gk = n_powmod2(g, k, 2 * q);
+                        gk = n_powmod2_preinv(g, k, qx2, qx2inv);
                         acb_zero(u);
 
                         for (e = 0; e < d; e++)
                         {
                             acb_dirichlet_root(t, zeta, n_mulmod2_preinv(gk,
-                                n_powmod2(g, n * e, 2 * q), 2 * q, n_preinvert_limb(2 * q)), prec);
+                                n_powmod2_preinv(g, n * e, qx2, qx2inv), qx2, qx2inv), prec);
                             acb_add(u, u, t, prec);
                         }
 

In the above, I refactor the code slightly so it only has one call to n_preinvert_limb by converting calls from n_powmod2 to using n_powmod2_preinv. The resulting code only calls n_preinvert_limb once - you can confirm this by running gcc -E, delete the original definition whilst keeping our newer versions.

Now we copy the post-processed n_preinvert_limb back into the file (as n_preinvert_limb_bad) so it's clearer what's going on without the multiple levels of macros. Then we figured out 2 ways of making the test failure go away: good1 forces no inlining, and good2 changes the assembly constraint from "rm" (register or memory) to "m" (memory only). However this only works if you call n_preinvert_limb_$var in the inner loop, it doesn't work if you call it in the outer loop - all the variations I tried still have the sigfpe.

Because this is seemingly random, and I couldn't see any obvious UB in the file, I suspect the optimiser is to blame. However I couldn't narrow it down to one particular optimisation flag. It turns out the GCC docs are very inaccurate on which flags are enabled, but there is gcc -Q <optimizer flags> --help=optimizers. From this, I could see that (at least on my version, 7.2), these sets of flags are supposedly identical:

$ diff <(gcc -c -Q -O2 -finline-functions -fno-optimize-strlen --help=optimizers) \
       <(gcc -c -Q -Os --help=optimizers) && echo identical
identical

i.e. -O2 -finline-functions -fno-optimize-strlen in theory should be the same as -Os, according to the actual code of the compiler itself, which should be the ultimate authority on which flags are enabled - unlike the docs which have to be hand-updated.

However the former set of flags exhibits the sigfpe but the latter doesn't!

$ touch arb_fmpz_poly/test/t-gauss_period_minpoly.c; make check MOD=arb_fmpz_poly QUIET_CC= CFLAGS="-Os"
[all pass]

$ touch arb_fmpz_poly/test/t-gauss_period_minpoly.c; make check MOD=arb_fmpz_poly QUIET_CC= CFLAGS="-O2 -finline-functions -fno-optimize-strlen"
[..]
make[1]: *** [../build/arb_fmpz_poly/test/t-gauss_period_minpoly_RUN] Floating point exception
exit code 2

I also tried specifically passing the full set of -f optimiser flags from the output of gcc -Q, rather than a -Ox flag, and was unable to reproduce the failure: the test passes with all combinations of -f flags.

I'm really stumped here, I don't know how to proceed. I was originally thinking that my diff above could lead to a minimal test case for a GCC bug report, but it turns out that seemingly random stuff is significant for the test failure so we can't remove them from the example - or at least, I can't see how to. Maybe someone else could have better luck.

In the worst case, would you recommend us shipping the code (compiled with -O2) despite this failure, and simply run the tests with -Os instead of -O2 so that they pass and don't fail the build?

@fredrik-johansson
Copy link
Collaborator

fredrik-johansson commented Oct 21, 2017

There is probably a less invasive way to make the test pass. Some possibilities:

  1. Insert some non-inline function calls to prevent the compiler from simplifying. For example q = n_pow(q,1) somewhere before the first n_powmod2 call might do the job.

  2. Start the main loop from q = 1 and add a separate test for q = 0, e.g.:

     for (n = 0; n < 100; n++)
     {
         fmpz_poly_one(pol);
         arb_fmpz_poly_gauss_period_minpoly(pol, 0, n);
         if (!fmpz_poly_is_zero(pol))
         {
             flint_printf("FAIL (q = 0)\n");
             flint_abort();
         }
     }
    
  3. Instead of calling n_powmod2, use a simple wrapper around fmpz_powm_ui.

Even if some workaround like this avoids the test failure, I wouldn't consider the problem solved until the cause of the bug is understood (and reported to the GCC devs if it is indeed a compiler bug) since it's likely to cause problems in other code that uses ulong_extras arithmetic.

@infinity0
Copy link
Author

Right, I wasn't suggesting that you adopt the patch I added, it was just to demonstrate different test failures modes.

With advice from #gcc I found a minimal optimiser setting that causes this - and a key point which unfortunately was not made explicit enough in the docs is that one cannot replace -O levels with only combinations of -f flags.

  • -O0 -fira-region=(one|all|mixed) pass
  • -O1 -fira-region=one pass
  • -O1 -fira-region=(all|mixed) sigfpe
  • -O1 sigfpe (ira-region defaults to mixed)

@infinity0
Copy link
Author

Using C-reduce I managed to get a minimal example that doesn't include any flint or arb code. I'm talking to various people to see if the C/asm code has any issues, if not I'll file a bug to GCC tomorrow.

@infinity0
Copy link
Author

I think flint code does have to be changed actually, but so does many other projects, so I've filed a GCC bug even though it's not (I believe) strictly their bug.

@fredrik-johansson
Copy link
Collaborator

Thanks for getting to the bottom of this. Great work!

@wbhart
Copy link
Contributor

wbhart commented Oct 23, 2017

Thanks very much! This was a hard bug to track down, and would have taken us ages. We probably would have just worked around this, since it isn't easy to figure out at all. We really appreciate the help!

wbhart added a commit to flintlib/flint that referenced this issue Nov 10, 2017
@fredrik-johansson
Copy link
Collaborator

With a fix now in flint trunk, I guess we can just add a workaround to the test code so that it passes with flint-2.5?

@fredrik-johansson
Copy link
Collaborator

I have added such a workaround.

@SnarkBoojum
Copy link
Contributor

I tried to update flint-arb in Debian to 2.12.0 (not touching flint):

convol....dft....../Makefile.subdirs:84: recipe for target '../build/acb_dft/test/t-dft_RUN' failed
make[3]: *** [../build/acb_dft/test/t-dft_RUN] Floating point exception

so either it's the same error message with another error, or there's still something fishy somewhere :-/

@infinity0
Copy link
Author

I think it's a different bug and a bug in the C code, I also get the same error with clang-4.0. It should probably be in a separate issue. I won't be able to look in more detail for a few days, though.

@fredrik-johansson
Copy link
Collaborator

I don't know what that is about; the valgrind output is clean for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants