Some tests failing in astropy-0.3.1 #2168

Closed
sergiopasra opened this issue Mar 5, 2014 · 33 comments

Comments

@sergiopasra
Contributor

Hi, prior to updating the Fedora package, I have tested astropy-0.3.1 in the development version of Fedora and I'm seeing some failures in the tests.

I have python 2.7.6 and numpy 1.8.0. I have tried both the prepackaged version of numpy and via pip in a virtualenv and I get the same errors.

Strangely enough I'm not getting errors in Fedora 20 with python 2.7.5 and numpy 1.8.0

The errors are in
astropy.io.fits.tests.test_image.TestImageFunctions.test_io_manipulation
astropy.io.fits.tests.test_structured.TestStructured.test_structured
astropy.io.fits.tests.test_table.TestTableFunctions.test_copy_vla

@sergiopasra
Contributor Author

Is there a way to attach a file?

@taldcroft
Member

@sergiopasra - you can create a gist with your file. AFAIK there are no file attachments.

@sergiopasra
Contributor Author

This is the output of astropy.test() with numpy from RPM
http://guaix.fis.ucm.es/~spr/github/astropy_native-python276-numpy180.txt

And this is the output with numpy from virtualenv.

http://guaix.fis.ucm.es/~spr/github/astropy_venv-python276-numpy180.txt

No difference between them.
In both cases I'm using the pristine tarball. No patching, tests are done with bundled pytest

@astrofrog astrofrog added this to the v0.3.2 milestone Mar 5, 2014
@embray
Member

embray commented Mar 5, 2014

Odd. I've never seen any of those tests fail in any of those ways. Just created a fresh virtualenv with fresh installs of Numpy 1.8 and Astropy 0.3.1 from pip and did not encounter any problems:

============================= test session starts ==============================
platform linux2 -- Python 2.7.3 -- pytest-2.4.0

Running tests with Astropy version 0.3.1.
Running tests in /internal/1/root/home/embray/.virtualenvs/test/lib/python2.7/site-packages/astropy/io/fits.

Platform: Linux-2.6.32-431.5.1.el6.x86_64-x86_64-with-redhat-6.5-Santiago

Executable: /internal/1/root/home/embray/.virtualenvs/test/bin/python

Full Python Version: 
2.7.3 (default, Feb 20 2013, 10:25:44) 
[GCC 4.4.6 20120305 (Red Hat 4.4.6-4)]

encodings: sys: ascii, locale: UTF-8, filesystem: UTF-8, unicode bits: 15
byteorder: little
float info: dig: 15, mant_dig: 15

Numpy: 1.8.0
Scipy: not available
Matplotlib: not available
h5py: not available

Are you running python setup.py test in a source checkout, or is it with python -c 'import astropy; astropy.test()'? Or both?

As you can see I am running Python 2.7.3. I will try to make a 2.7.6 build and see if that reproduces the issue. If so this would have to be an odd Python bug :/

@embray
Member

embray commented Mar 5, 2014

Nope, don't get any failures with Python 2.7.6 :/

platform linux2 -- Python 2.7.6 -- pytest-2.4.0

Running tests with Astropy version 0.3.1.
Running tests in astropy /internal/1/root/src/astropy/astropy/docs.

Platform: Linux-2.6.32-431.5.1.el6.x86_64-x86_64-with-redhat-6.5-Santiago

Executable: /internal/1/root/home/embray/.virtualenvs/test/bin/python

Full Python Version: 
2.7.6 (default, Mar  5 2014, 11:54:26) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)]

encodings: sys: ascii, locale: UTF-8, filesystem: UTF-8, unicode bits: 15
byteorder: little
float info: dig: 15, mant_dig: 15

Numpy: 1.8.0
Scipy: not available
Matplotlib: not available
h5py: not available

@embray
Member

embray commented Mar 5, 2014

Two differences I see between our test runs are that I have unicode bits: 15 while you have unicode bits: 20, and that for some reason the doctests aren't running for you. (The fact that the number is 15 for me is really a display error; it should be "16".)
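For readers wondering how a 16-bit build could be reported as 15, the arithmetic is easy to check. The sketch below assumes the header value is produced by truncating log2 of `sys.maxunicode`; that formula is a guess at the mechanism, not something confirmed in this thread, and the helper names are made up for illustration.

```python
import math

UCS2_MAX = 0xFFFF      # sys.maxunicode on a narrow (UCS-2) Python 2 build
UCS4_MAX = 0x10FFFF    # sys.maxunicode on a wide (UCS-4) build


def truncated_log2_bits(max_code_point):
    """Assumed formula: truncate log2 of the largest code point."""
    return int(math.log(max_code_point, 2))


def exact_bits(max_code_point):
    """Number of bits actually needed to hold the largest code point."""
    return max_code_point.bit_length()


for name, value in (("UCS-2", UCS2_MAX), ("UCS-4", UCS4_MAX)):
    print(name, truncated_log2_bits(value), exact_bits(value))
```

Under that assumption a narrow build shows 15 (log2 of 65535 is just under 16) and a wide build shows 20, which matches the two values seen in the headers above.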

@embray
Member

embray commented Mar 5, 2014

I'm going to try building with UCS-4 and see if that makes a difference.

@embray
Member

embray commented Mar 5, 2014

Nope, that didn't make any difference either. Nor should it have, for any reason I could think of. But I'm grasping at straws here....

platform linux2 -- Python 2.7.6 -- pytest-2.4.0

Running tests with Astropy version 0.3.1.
Running tests in astropy /internal/1/root/src/astropy/astropy/docs.

Platform: Linux-2.6.32-431.5.1.el6.x86_64-x86_64-with-redhat-6.5-Santiago

Executable: /internal/1/root/home/embray/.virtualenvs/test/bin/python

Full Python Version: 
2.7.6 (default, Mar  5 2014, 13:17:54) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)]

encodings: sys: ascii, locale: UTF-8, filesystem: UTF-8, unicode bits: 20
byteorder: little
float info: dig: 15, mant_dig: 15

Numpy: 1.8.0
Scipy: not available
Matplotlib: not available
h5py: not available

collected 5206 items / 3 skipped 

....

============ 4947 passed, 252 skipped, 10 xfailed in 154.94 seconds ============

@sergiopasra
Contributor Author

I ran python -c 'import astropy; astropy.test()' under 'build'. python setup.py test also gives me errors.

I have compiled python 2.7.6 from source and I'm not seeing any errors. This has to be related to Fedora's patches to python or perhaps the handling of temporary files.

@embray
Member

embray commented Mar 5, 2014

Could you point me to these patches? I can see if that makes any difference.

@sergiopasra
Contributor Author

So now I have compiled python 2.7.6 in my virtual machine and the tests fail again!

The situation is:
Fedora 20, python 2.7.6 compiled from source, numpy and astropy from pip: tests pass.
Fedora Rawhide, python 2.7.6 compiled from source, numpy and astropy from pip: tests do not pass.

So it is not related to Fedora's patches to python; it is something else.

In Fedora 21
============================= test session starts ==============================
platform linux2 -- Python 2.7.6 -- py-1.4.20 -- pytest-2.5.2

Running tests with Astropy version 0.3.1.
Running tests in astropy/io/fits/tests/test_table.py.

Platform: Linux-3.14.0-0.rc5.git0.1.fc21.1.x86_64-x86_64-with-fedora-21-Rawhide

Executable: /home/spr/ap2/bin/python

Full Python Version:
2.7.6 (default, Mar 5 2014, 19:08:18)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-14)]

encodings: sys: ascii, locale: UTF-8, filesystem: UTF-8, unicode bits: 15
byteorder: little
float info: dig: 15, mant_dig: 15

Numpy: 1.8.0
Scipy: not available
Matplotlib: not available
h5py: not available

collected 58 items

astropy/io/fits/tests/test_table.py ........................................................F.

=================================== FAILURES ===================================
_______________________ TestTableFunctions.test_copy_vla

In Fedora 20

============================= test session starts ==============================
platform linux2 -- Python 2.7.6 -- py-1.4.20 -- pytest-2.5.2

Running tests with Astropy version 0.3.1.
Running tests in astropy/io/fits/tests/test_table.py.

Platform: Linux-3.13.5-200.fc20.x86_64-x86_64-with-fedora-20-Heisenbug

Executable: /home/spr/devel/ap2/bin/python

Full Python Version:
2.7.6 (default, Mar 5 2014, 18:30:03)
[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]

encodings: sys: ascii, locale: UTF-8, filesystem: UTF-8, unicode bits: 15
byteorder: little
float info: dig: 15, mant_dig: 15

Numpy: 1.8.0
Scipy: not available
Matplotlib: not available
h5py: not available

collected 58 items

astropy/io/fits/tests/test_table.py ..........................................................

========================== 58 passed in 1.63 seconds ===========================

@embray
Member

embray commented Mar 5, 2014

I find it amusing, particularly in this case, that Fedora 20 is code-named "Heisenbug"

@embray
Member

embray commented Mar 5, 2014

So you just now used a Python compiled from source that did not use any of Fedora's patches?

@sergiopasra
Contributor Author

That's correct. I have compiled the python 2.7.6 tarball from python.org in my virtual machine and on my workstation. On my workstation with F20 the tests pass, and in my virtual machine with F21 they do not. So I'm using the same unpatched Python on both systems.

@sergiopasra
Contributor Author

Part of the Fedora community considered naming useless, so there will be no more names for releases :(

@embray
Member

embray commented Mar 5, 2014

Is there anyone else with a Fedora 21 installation who can confirm this?

@embray
Member

embray commented Mar 5, 2014

I'm installing in a VM to see for myself. Too bad about the naming thing--some people hate fun...even more than I do apparently ;)

@sergiopasra
Contributor Author

I have a somewhat reduced test case that fails in F21 and passes in F20. It writes the FITS files to disk.

http://guaix.fis.ucm.es/~spr/github/astropy031-f21.py

@sergiopasra
Contributor Author

Some of the test FITS files produced by the previous test case differ between the two systems.

@sergiopasra
Contributor Author

Opening both files with fv, the one in F20 has 3 extensions, that in F21 has only 1

@embray
Member

embray commented Mar 5, 2014

Thanks for whittling it down. Should be helpful. I'm still waiting for this VM to finish installing. It is still truly mysterious.

@sergiopasra
Contributor Author

@embray Any progress with this?

I have found a possible source of error. If I run this code

from astropy.io import fits

new_hdul = fits.HDUList([fits.PrimaryHDU()])
new_hdul.writeto('mtest1.fits', clobber=True)
assert len(new_hdul) == 1
new_hdul.close()

with fits.open('mtest1.fits') as new_hdul:
    print('append check', new_hdul.filename(), len(new_hdul))

with fits.open('mtest1.fits', mode='append') as new_hdul:
    print('append check', new_hdul.filename(), len(new_hdul))

with fits.open('mtest1.fits') as new_hdul:
    print('append check', new_hdul.filename(), len(new_hdul))

I get

('append check', 'mtest1.fits', 1)
('append check', 'mtest1.fits', 1)
('append check', 'mtest1.fits', 1)

in Fedora 20

and

('append check', 'mtest1.fits', 1)
('append check', 'mtest1.fits', 0)
('append check', 'mtest1.fits', 1)

in Fedora 21, the 'append' mode does not work correctly

@embray
Member

embray commented Mar 17, 2014

I haven't had a chance yet to get back to this but I'm hoping to soon. I do have a Fedora 21 VM set up so I can test this, and your test code will help.

This definitely sounds like an issue with mode='append', which opens the underlying file in append mode. Append mode should have well-defined behavior, and yet many platforms do wonky things with it. I have to wonder if there's a bug somewhere in this version of Fedora, though that would seem pretty weird.

@embray
Member

embray commented Mar 17, 2014

In particular, when you open a FITS file with mode='append', pyfits uses the 'ab+' mode in Python (which uses O_APPEND when opening the file), but it does force a seek to the beginning of the file in order to read any existing HDUs. O_APPEND should only cause the file to seek to the end before writing, but I could see this behavior occurring if there is also a bug (or "feature"?) causing a seek to the end before the first read of an O_APPEND file too.
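The expected O_APPEND semantics described here can be checked at the file-descriptor level, without going through C stdio. This is a minimal sketch for a POSIX system and a current Python 3; the function name and the payload are made up for illustration, and the 2880-byte size is only chosen to match a single FITS block.

```python
import os
import tempfile


def o_append_roundtrip(payload=b"x" * 2880, extra=b"tail"):
    """Return (bytes read after seeking to 0, final file contents)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.close(fd)

        # Reopen read/write with O_APPEND, bypassing the C stdio layer.
        fd = os.open(path, os.O_RDWR | os.O_APPEND)
        try:
            os.lseek(fd, 0, os.SEEK_SET)   # reads honour this seek
            head = os.read(fd, len(payload))
            os.lseek(fd, 0, os.SEEK_SET)   # ignored by the next write
            os.write(fd, extra)            # O_APPEND: lands at the end
        finally:
            os.close(fd)

        with open(path, "rb") as f:
            return head, f.read()
    finally:
        os.remove(path)


head, contents = o_append_roundtrip()
print(len(head), len(contents))
```

If the read after the seek returns the original payload and the write still ends up at the end of the file, the descriptor-level behavior is as described, and any misbehavior would have to come from a layer above it.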

@embray embray self-assigned this Mar 19, 2014
@embray
Member

embray commented Mar 19, 2014

Okay, I was able to reproduce this, and it's more or less as I expected--seek isn't working on files opened in ab+ mode. According to the Python 2.7.6 docs:

"Note that if the file is opened for appending (mode 'a' or 'a+'), any seek() operations will be undone at the next write. If the file is only opened for writing in append mode (mode 'a'), this method is essentially a no-op, but it remains useful for files opened in append mode with reading enabled (mode 'a+')."

But in this case, even though the file is opened 'ab+' (it should be the same for a+ or ab+, and I confirmed this) seeking isn't working and is essentially being treated as a no-op. The question remains, is this a problem in Python on Fedora, or is it something deeper?
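The documented behavior quoted above can be illustrated with a plain file object. This is a sketch only, written for a current Python 3 with a hypothetical helper name; Python 3 file objects are not built on C stdio, so it shows the semantics one should expect rather than reproducing the failure seen on Fedora 21.

```python
import os
import tempfile


def append_plus_behaviour(existing=b"HEADER", extra=b"MORE"):
    """Return (position after seek(0), bytes read back, final contents)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(existing)

        with open(path, "ab+") as f:
            f.seek(0)          # should move the read position to the start
            pos = f.tell()
            data = f.read()    # existing content, if the seek took effect
            f.seek(0)          # undone by the write below
            f.write(extra)     # appended at the end regardless of the seek

        with open(path, "rb") as f:
            return pos, data, f.read()
    finally:
        os.remove(path)


print(append_plus_behaviour())
```

On a correctly behaving platform the position after the seek is 0, the read returns the existing bytes, and the write is appended rather than overwriting the start of the file.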

@embray
Member

embray commented Mar 20, 2014

Uh oh, this is what I was afraid of:

$ cat test_2168.c 
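#include <stdio.h>

int main(int argc, char** argv) {
  /* Open for reading and appending (creates the file if it is missing). */
  FILE* f = fopen("mtest1.fits", "ab+");
  if (f == NULL) {
    perror("fopen");
    return 1;
  }
  printf("Before seek: %ld\n", ftell(f));
  /* Seek to the start of the file; ftell() should then report 0. */
  fseek(f, 0, SEEK_SET);
  printf("After seek: %ld\n", ftell(f));
  fclose(f);
  return 0;
}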
$ gcc test_2168.c -o test_2168
$ ./test_2168 
Before seek: 2880
After seek: 2880

This is not the behavior I see on any other platform I just tested this on. On most platforms (except Windows, which I knew, and I believe possibly FreeBSD but I haven't tried) the file pointer starts at 0 upon opening the file, and doesn't jump to the end until a write is performed. That difference is okay because PyFITS already accounts for it. The difference is that on those other platforms the explicit seek to the beginning of the file works, and on here it doesn't. Could there have been a change in the libc between Fedora versions? Could this be intentional? It sounds like a bug to me but I'm not sure.

@embray
Member

embray commented Mar 20, 2014

LOL, right here reported 2 days ago: https://sourceware.org/bugzilla/show_bug.cgi?id=16724

@embray
Member

embray commented Mar 20, 2014

I'm not sure there's much more I can do with this, given that it's a bug in glibc. Since Fedora 21 final isn't released yet I will probably leave this as is and not try to find a workaround. The reporter of the issue mentioned that a patch is coming so presumably it will be fixed before then.
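Since the culprit turned out to be the C library rather than Python or astropy, it can help to include the libc version when reporting this kind of failure. A minimal sketch using the standard library's `platform.libc_ver()`; the wrapper name is made up, and the call may return empty strings on systems where the library cannot be detected.

```python
import platform


def describe_libc():
    """Return a short 'name version' string for the C library, if detectable."""
    name, version = platform.libc_ver()
    if not name:
        return "unknown C library"
    return "{0} {1}".format(name, version).strip()


print(describe_libc())
```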

@mdboom
Contributor

mdboom commented Mar 20, 2014

That's a good one! I'm surprised there's so little chatter in that bug report -- I would have assumed this affected a lot of things.

@embray
Member

embray commented Mar 20, 2014

I don't know that the "a+" mode is really used by a lot of applications.

@sergiopasra
Contributor Author

This is supposed to be fixed in Fedora Rawhide https://bugzilla.redhat.com/show_bug.cgi?id=1078355

The relevant build of glibc is this: http://koji.fedoraproject.org/koji/buildinfo?buildID=505664
It will appear as an update perhaps tomorrow, I will test it then.

@embray Thank you for the research! I couldn't imagine this would be a bug in glibc

@sergiopasra
Contributor Author

I have checked that with the new glibc the problem disappears. I'm closing the issue

@embray
Member

embray commented Mar 24, 2014

Excellent, thanks!
