interpolate.splder() failure on Fedora #2911
Comments
@opoplawski what's different about Rawhide? Compiler versions? |
Failure reported against 0.13.0b1 |
One possibility is that the spline used in the test already contains a duplicate knot --- it's from FITPACK fitting, which may be sensitive to rounding error. So it would be useful to
in the test. |
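A minimal sketch of how one might check a knot vector for repeated interior knots, assuming the knots are available as a plain sequence (for example the first element of the `(t, c, k)` tuple returned by `scipy.interpolate.splrep`). The helper name `repeated_interior_knots` is hypothetical and not part of scipy; it only flags exact duplicates, so knots that differ by rounding error would need a tolerance instead.

```python
def repeated_interior_knots(t, k):
    """Return interior knots of `t` that occur more than once.

    `t` is a non-decreasing knot vector and `k` the spline degree.
    The first and last k+1 entries are treated as boundary knots
    and are skipped.
    """
    interior = list(t[k + 1:len(t) - k - 1])
    repeated = []
    for prev, cur in zip(interior, interior[1:]):
        # Record each repeated value once, even if it occurs 3+ times.
        if cur == prev and (not repeated or repeated[-1] != cur):
            repeated.append(cur)
    return repeated


# Cubic spline (k=3) whose interior knot 2.0 appears twice.
knots = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 2.0, 3.0, 4.0, 4.0, 4.0, 4.0]
print(repeated_interior_knots(knots, 3))  # [2.0]
```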
Actually, looks to be more of a 32-bit issue. I see it on Fedora 19+ in 32-bit. |
I don't see it with any Python version on 32-bit Ubuntu 13.04. |
Thanks, this can be reproduced in a Fedora 19 32-bit VM ... except that it's stochastic and doesn't occur every time. Memory alignment affecting rounding error in FITPACK, maybe? EDIT: this was mistaken; I'm not able to reproduce this issue. |
@opoplawski: can you try to apply http://gist.github.com/anonymous/6720219 on top of maintenance/0.13.x branch f4d8447 and post the produced npz file somewhere. I don't seem to be able to reproduce this on i386 Fedora 19 after all now. |
I still see it with current maintenance/0.13.x and that patch applied. The npz files are at http://www.cora.nwra.com/~orion/npz.tar.gz |
@opoplawski: sorry, I meant the |
I'm not able to reproduce this on Fedora rawhide/i386:
and in
Running However, if using Otherwise, I'd need a more detailed description of the steps and environment in which this bug can be reproduced. |
Not reproducible via |
Besides Debian unstable, I can reproduce it on Ubuntu 13.10 i386, but not on 13.04. |
Yes, compiled with g++-4.7 it also works on 13.10. |
And on those platforms you can reproduce it in a VM? I'll take a spin with ubuntu, but I still don't understand why I can't reproduce it on Fedora images. |
I can reproduce it in standard pbuilder i386 chroots running on an Ubuntu 13.04 amd64 kernel. |
A possible reason it does not happen on Fedora is that Debian/Ubuntu somehow messes with the default CFLAGS, so you end up using -O2 instead of the scipy default -O3. |
I've shifted to using serial atlas, but still see this: http://koji.fedoraproject.org/koji/getfile?taskID=6064130&name=build.log This is with numpy 1.8.0rc2 and scipy 0.13.0rc1. |
Also, it appears on x86_64 and armv7hl as well. |
Can either one of you apply the patch I linked above, and post the file |
Can't reproduce in i386 pbuild on Ubuntu 13.10 amd64 either: https://dl.dropboxusercontent.com/u/5453551/last_operation.log What is different in your setups? The build environment is probably almost identical, so I'm a bit at a loss on where to look... |
I mailed you the file. The only difference I see is the hardware (intel vs amd) or the kernel (3.11 vs 3.8). |
Thanks. I tried it before on two machines and failed to reproduce: Intel(R) Xeon(R) CPU E5430 on Linux 2.6.32; Intel(R) Core(TM) i7-3770K on Linux 3.11.0. Different gcc versions, too (4.7.2 and 4.8.1). This seems to point towards some Intel vs. AMD difference. Looking at the file you sent, it looks like a bug in the FITPACK
whereas in the good case we have
The input spline
doesn't reproduce the strange results here, so it's really probably some issue with the insert routine. The Strangely enough, the code in question does not have anything special on our side. It's some ye olde Fortran code wrapped with C. |
I wouldn't rule out bugs in the Fortran code; it's patched FITPACK code, and some of the patches may be buggy. |
Note that further debugging of this issue is in practice impossible without gdb-enabled access to a machine on which the issue can be reproduced. I do not have access to hardware/VMs where this can be reproduced. |
I might be able to help. This is the only test that fails in my build of scipy 0.13.3:
Python 2.7.6, built with GCC 4.1.2 on CentOS 5.8 Some machine information: I would need some instructions on how to proceed debugging this issue, if this is of interest. |
Write file
Run
Here is a "good" trace for comparison: https://gist.github.com/pv/9080048 Examine differences between that and the trace you get, and determine the reason why the result is different on your platform (the 1e-310 floating point numbers can be ignored, as the gdb script prints also uninitialized variables). Are the inputs the same? Are the outputs the same? Is there a bug in the Fortran code? The It's also possible that the Fortran compiler miscompiles the file. Check by compiling it with different optimization levels (use the Try to reduce the problem to a pure-fortran test case, so that it is easier to debug. |
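The trace comparison described above could be sketched roughly as follows. This is an illustrative helper, not the gdb script from the thread: the trace line format (`name = value`), the function names, and the `1e-300` threshold are all assumptions. It masks tiny nonzero values (such as the 1e-310 numbers that come from uninitialized variables) before diffing, so that only meaningful differences show up.

```python
import difflib
import re

# Matches plain decimal / exponent notation, e.g. 1.5, -2e-310, 3.0E+00.
FLOAT_RE = re.compile(r"[-+]?\d+\.?\d*(?:[eE][-+]?\d+)?")


def mask_denormals(line, threshold=1e-300):
    """Replace nonzero numbers smaller than `threshold` with a placeholder."""
    def repl(match):
        value = float(match.group(0))
        if value != 0.0 and abs(value) < threshold:
            return "<uninit>"
        return match.group(0)
    return FLOAT_RE.sub(repl, line)


def diff_traces(good_lines, bad_lines):
    """Unified diff of two traces after masking uninitialized-looking values."""
    good = [mask_denormals(s) for s in good_lines]
    bad = [mask_denormals(s) for s in bad_lines]
    return list(difflib.unified_diff(good, bad, "good", "bad", lineterm=""))


# Hypothetical trace excerpts; real traces would be read from files.
good = ["tt(1) = 0.5", "work = 3.2e-310"]
bad = ["tt(1) = 0.75", "work = 1.1e-310"]
for line in diff_traces(good, bad):
    print(line)
```

With these inputs only the `tt(1)` line is reported as changed; the `work` line appears as unchanged context because both values are masked.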
I just noticed that I can reproduce this issue on my new laptop with an Intel Core i7, with gcc 4.8.2-19ubuntu1. It does not occur on gfortran optimization level The miscompiled file is I don't have time to look into this in depth right now, but the optimized trees are here: good O2, bad O3. |
This seems to be an argument aliasing issue: gfortran at -O3 seems to assume that the function arguments do not alias each other, and that assumption is violated here: Using different buffers for different args makes the issue go away. |
This is not a compiler bug; the Fortran language (at least < 95) does not allow aliasing. |
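To illustrate why aliased arguments matter, here is a pure-Python analogy (not the FITPACK code, and not what gfortran actually emits; the function names are hypothetical). A routine that shifts elements one slot to the right gives the same result in either loop order when input and output are separate buffers. When the same buffer is passed for both, only the backward order is correct. A compiler that is allowed to assume the arguments never overlap may freely pick or mix orders, which is harmless for separate buffers and wrong for aliased ones.

```python
def shift_right_backward(src, dst, start, stop):
    # Copy src[start:stop] into dst[start+1:stop+1], last element first.
    for i in range(stop - 1, start - 1, -1):
        dst[i + 1] = src[i]


def shift_right_forward(src, dst, start, stop):
    # Same copy, first element first; valid only if src and dst are distinct.
    for i in range(start, stop):
        dst[i + 1] = src[i]


def run(shift, aliased):
    src = [0.0, 1.0, 2.0, 3.0, 0.0]  # last slot is spare room
    dst = src if aliased else list(src)
    shift(src, dst, 1, 4)
    return dst


print(run(shift_right_backward, aliased=False))  # [0.0, 1.0, 1.0, 2.0, 3.0]
print(run(shift_right_forward, aliased=False))   # [0.0, 1.0, 1.0, 2.0, 3.0]
print(run(shift_right_backward, aliased=True))   # [0.0, 1.0, 1.0, 2.0, 3.0]
print(run(shift_right_forward, aliased=True))    # [0.0, 1.0, 1.0, 1.0, 1.0]
```

Passing separate buffers, as in the first two calls, removes the dependence on evaluation order altogether.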
@juliantaylor: I agree, I found the aliasing issue only later. The fix should be relatively simple. |
Fix in gh-3673 |
@Dapid: please double-check by removing existing scipy installations, |
Also double-check that you are on the current master, and not at an |
@pv it seems it was caused by some leftovers, it is now correct. Thanks! |
@rgommers |
From @opoplawski: With Fedora Rawhide (but not Fedora 19) I'm seeing:
and the same with python3.