fixing gzip import issue 2774 #2783

abostroem · 2014-07-23T03:57:32Z

Changed import from:
import gzip
to:
from ...utils.compat import gzip

ran setup.py test and all io/fits tests passed. I don't have Python 2.6 to test that this really fixed the problem, but it should based on the comments in issue #2774

embray · 2014-07-23T14:30:41Z

Thanks for the fix @abostroem--unfortunately I don't think this quite did it. This is tricky because really there are two gzip modules in use and both need to be supported. This fix is in the right direction--astropy.io.fits.file should import gzip from the astropy utils. However, there are several places in that module that do a check like isinstance(fileobj, gzip.GzipFile). This gets ugly because that needs to return True for both the system gzip.GzipFile, and the copy from Astropy.

So really what this needs is something at the top like:

from ..utils.compat import gzip
import gzip as _system_gzip

_GZIP_FILE_TYPES = (gzip.GzipFile, _system_gzip.GzipFile)

and then replace all instances of isinstance(fileobj, gzip.GzipFile) with isinstance(fileobj, _GZIP_FILE_TYPES).

It's ugly but it might have to do.

For clarity the astropy gzip is called _astropy_gzip and the system gzip is called _system_gzip. All gzip.x statements have been changed to use the _astropy_gzip. All isinstances statements check both

abostroem · 2014-07-23T17:58:48Z

Thanks to Matt Craig I ran this in python 2.6, all the io/fits tests passed

embray · 2014-07-24T17:47:42Z

Thanks, looks good. As one last thing would you mind implementing my suggestion of putting the two GzipFile types as a tuple in some global variable _GZIP_FILE_TYPES. This isn't merely cosmetic--it makes clear that this should be used when type checking this sort of thing and makes it harder to forget to implement this workaround.

Then please also add a changelog entry under the 0.4.1 heading mentioning that this was fixed, along with the issue number (following the example of other entries).

embray · 2014-07-24T17:50:56Z

Oh, sorry, and if you don't mind could you add a regression test? Basically recreating the example from #2774 would suffice (just do the compression using the gzip module rather than calling an external program, but other than that it would just ensure the file an be read once zipped). If you don't have time I can add it.

abostroem · 2014-07-30T01:23:39Z

@embray I'm a little unclear on what regression test you want me to add. Should I zip one of the files in data in place, try to read it and then delete it at the end of the test?

abostroem · 2014-07-30T18:14:09Z

@embray I tried the above test using os.system call to gzip. Instead do you want me to do something like this?
import gzip
f_in = open('file.txt', 'rb')
f_out = gzip.open('file.txt.gz', 'wb')
f_out.writelines(f_in)
f_out.close()
f_in.close()

…er and passes for 0.4.1 and higher

embray · 2014-07-30T20:04:33Z

@abostroem Take a look at the tests starting around

https://github.com/astropy/astropy/blob/master/astropy/io/fits/tests/test_core.py#L588

which test other cases of reading and writing gzip files. This would be just one other such test but specifically implementing the case of #2774. In that issue @cdeil demonstrated the issue using a test data file that comes in the test suite--you can access it from within a test function using self.data('table.fits').

The trick would be to write the test first (without your fixes)--see that it fails--and then add your fixes on top and confirm that the test now passes.

embray · 2014-07-30T20:05:44Z

This test is a particularly good model since it's also a regression test for a bug:

https://github.com/astropy/astropy/blob/master/astropy/io/fits/tests/test_core.py#L598

abostroem · 2014-08-05T03:35:39Z

@embray I'm still a little confused, I think mostly because I'm not entirely clear on exactly what broke originally (what exactly is Table calling that is breaking), what _make_gzip does (how close to a real gzip object is this, is it a table?), what should be the length of the fits.open statement, and what does ignore_warnings do?

This was the original test I wrote before you responded. It passes and fails in the correct places:

from astropy.io import fits
import os
import shutil
import astropy
from astropy.table import Table

def test_gzip():
    data_dir = os.path.join(os.path.dirname(__file__), 'data')
    shutil.copy(os.path.join(data_dir, 'table.fits'), os.path.join(data_dir, 'table2.fits'))
    os.system('gzip '+os.path.join(data_dir, 'table2.fits'))
    try:
        Table.read(os.path.join(data_dir, 'table2.fits.gz'))
        passed = True
    except ValueError as e:
        passed = False
    os.remove(os.path.join(data_dir, 'table2.fits.gz'))
    if not passed:
        raise(e)

This is the function I wrote after your email, it passes in both 2.6 and 2.7. Suggestions?

    def test_gzip(self):
        gf = gzip.GzipFile(self._make_gzip_file())
        with ignore_warnings():
            assert len(Table.read(gf)) == 5

I sound say my original code was a stand alone file while the second one is included in test_core.py

embray · 2014-08-05T14:55:31Z

The _make_gzip_file() helper that's used in some of the tests isn't applicable here since it doesn't gzip a file containing a table.

The original test you wrote is mostly fine. It should just make a few changes when integrating in with the the other tests:

To get at the test files for the io.fits package there's a helper method self.data. Just do self.data('table.fits') to get the path to the data file (like the other tests do). Also there's no reason to copy the file since it's not going to be modified.
Instead of using os.system to gzip the file use the gzip module in Python. In the tests you can use self.temp('filename.fits') to get a temporary filename to write to.
No need for the try/except. If an error occurs it can just bubble up and the test runner will catch it and perform the necessary cleanup.

abostroem · 2014-08-06T06:17:46Z

Is this what you had in mind?

def test_gzip_open_w_table_mod():
    temp_table = self.temp('table.fits.gz')
    with open(self.data('table.fits'), 'rb') as f:
        gz = gzip.open(temp_table, 'wb')
        gz.write(f.read())
        gz.close()
    Table.read(os.path.join('table.fits.gz'))

So now I have another problem. I've put the following on line 647 in test_core.py. I'm trying to run my test using astropy.test('io') but it doesn't seem to be picking it up my new test (when I delete the function is still runs the same number of tests and when I put something blatantly wrong in there it doesn't break (like assert 1 == 2). I did accidentally break another function so I know its looking at the right file. Do I have to do something else?

embray · 2014-08-06T15:56:16Z

Try just running,

python setup.py test --package=io.fits

No need to run all the io tests. Trying to run astropy.test() might not be importing your development version of astropy.

embray · 2014-08-26T22:11:10Z

The following should suffice:

    def test_read_open_astropy_gzip_file(self):
        """  
        Regression test for https://github.com/astropy/astropy/issues/2774

        This tests reading from a ``GzipFile`` object from Astropy's
        compatibility copy of the ``gzip`` module.
        """

        from ....utils.compat import gzip 

        gf = gzip.GzipFile(self._make_gzip_file())
        try: 
            assert len(fits.open(gf)) == 5 
        finally:
            gf.close()

This doesn't need to invoke Table.read directly since the real problem is supporting both the system gzip and astropy.utils.compat.gzip (there's no guarantee in the future as to how Table.read is implemented, so better to go directly to the issue).

@abostroem Just lemme know when you have time and I can walk you through rebasing and stuff if need be :)

embray · 2014-08-26T22:12:27Z

I should add, earlier I wrote:

The _make_gzip_file() helper that's used in some of the tests isn't applicable here since it doesn't gzip a file containing a table.

My bad--there's nothing about this that is exclusive to tables. It's just that this code path was being invoked originally by way of Table.read. After giving this a closer look I realized that the test for this probably shouldn't involve Table.read at all.

This tests the utils.compat.gzip function's ability to open files

abostroem · 2014-08-28T05:14:13Z

@embray I removed a few trailing spaces from my test, let's see if it passes this time...

embray · 2014-08-28T15:01:35Z

@abostroem What editor are you using? I would recommend finding out how to configure it to do this automatically; especially on .py files.

abostroem · 2014-08-28T16:29:07Z

@embray I use TextWrangler, I've just enabled it to strip trailing whitespace.

Having done that, when I do a git status, neither the test_core.py nor file.py show up as modified.

Is there something that will tell me where the trailing whitespace is?

Additionally, astropy_helpers does show up as modified (even though I haven't touched it) and a diff shows this:

Kyras-MacBook-Pro:astropy abostroem$ git diff
diff --git a/astropy_helpers b/astropy_helpers
index c64c7b7..1bca250 160000
--- a/astropy_helpers
+++ b/astropy_helpers
@@ -1 +1 @@
-Subproject commit c64c7b7f0490caca2355b013d3ae76170a339da9
+Subproject commit 1bca250

Thoughts?

embray · 2014-08-28T16:59:34Z

@abostroem You can look at the Travis build that failed here. It shows that there is trailing whitespace at test_core.py on line 658. Indeed if you look it's there after the try: statement.

Sorry about this--I know it's pretty annoying (and for once not even my idea!). But I think ultimately a good thing in the long term.

Don't worry about astropy_helpers--it's a submodule. When you switch between branches, if that submodule is at different commits on those branches it won't automatically move the submodule for you (hence showing it as "modified" which is really probably not the best description of what's actually going on). You can update the submodule to the correct version for the branch you're on by running git submodule update, but really in most cases it doesn't matter.

abostroem · 2014-08-28T20:12:20Z

@embray Done!

embray · 2014-08-28T20:31:51Z

CHANGES.rst

@@ -204,6 +204,8 @@ Bug Fixes

 - ``astropy.io.fits``

+  - Fix ability to read gzipped fits files using Python 2.6  [#2783]


Sorry, one last thing--could you fix this to read something more specific to how the original problem manifested? Something like:

Fixed crash when reading gzip-compressed FITS tables through the Astropy ``Table`` interface. [#2783]

Then also rebase your branch on upstream/master and delete the merge commits from the history. See http://docs.astropy.org/en/stable/development/workflow/additional_git_topics.html?highlight=rebasing#rebasing-on-trunk (it talks about "trunk" for people coming from SVN, but it should probably say "master" instead of "trunk"--I'll see about rewriting that at some point).

…is called

…er and passes for 0.4.1 and higher

This tests the utils.compat.gzip function's ability to open files

…is called

abostroem · 2014-08-29T02:24:12Z

@embray Let me know if this looks right. I followed the directions:
I made my changes on the master branch (learning experience).

git fetch upstream
git rebase -i upstream/master
git push origin master

Looking at the commits I didn't see any merge commits, so I didn't modify the commit history.

Since you said the real issue was not the call from Tables, do you want the commit message to reference the real issue. For now I copied what you wrote.

embray · 2014-08-29T21:08:36Z

I have no idea what's going on anymore because now you're including commits from other issues as well. I'm just going to merge this manually.

embray · 2014-08-29T21:17:12Z

Merged manually via 5d7acd6 Thanks again @abostroem , and indeed next time you'll probably have an easier time if you start a separate branch to work on the issue.

Conflicts: CHANGES.rst

embray · 2014-08-29T21:19:33Z

I'm not sure if it's something I did, or if it was a result of refreshing the page or what, but now the commit history in this PR doesn't look as messed up as it did a few minutes ago! Nevertheless, when I did the merge I also squashed a few commits down (typo fixes and things like that). There was also a tab character in the changelog :(

fixing gzip import issue 2774

ba6b7cd

embray added Affects-release and removed Bug labels Jul 23, 2014

embray added this to the v0.4.1 milestone Jul 23, 2014

Code now import both astropy and system gzip

8a4d1f7

For clarity the astropy gzip is called _astropy_gzip and the system gzip is called _system_gzip. All gzip.x statements have been changed to use the _astropy_gzip. All isinstances statements check both

Azalee Bostroem added 2 commits July 29, 2014 18:19

creating global variable with the two GzipFile types

dde05ca

adding issue 2783 to the changelog for 0.4.1

b6141bd

fixing indentation of global variable

26738ab

adding test for gzip. This test fails in astropy versions 0.4 and low…

b542232

…er and passes for 0.4.1 and higher

embray modified the milestones: v0.4.1, v0.4.2 Aug 5, 2014

abostroem added 3 commits August 26, 2014 18:55

Added test for gzip file suggested by Erik Bray.

8ad1697

This tests the utils.compat.gzip function's ability to open files

this test has been incorporated into test_core.py

5ff38ed

Merge branch 'master' of github.com:astropy/astropy

f187a67

Merge branch 'master' of github.com:astropy/astropy

8faca58

Merge branch 'master' of github.com:astropy/astropy

a779d16

removing last trailing whitespace

f092ee3

embray reviewed Aug 28, 2014
View reviewed changes

abostroem and others added 11 commits August 28, 2014 19:06

modified changelog to be more specific about fixing gzip when Tables …

5011382

…is called

creating global variable with the two GzipFile types

4a23c29

adding issue 2783 to the changelog for 0.4.1

ab5ee97

fixing indentation of global variable

28a61f4

adding test for gzip. This test fails in astropy versions 0.4 and low…

9de8803

…er and passes for 0.4.1 and higher

Added test for gzip file suggested by Erik Bray.

5aa0258

This tests the utils.compat.gzip function's ability to open files

this test has been incorporated into test_core.py

b47bfc5

removing trailing whitespace from the new test

0dbc2a5

removing last trailing whitespace

06e3f23

modified changelog to be more specific about fixing gzip when Tables …

bb36c2a

…is called

Merge branch 'gzip_fix'

6107068

embray added a commit that referenced this pull request Aug 29, 2014

Merge pull request #2783 abostroem/gzip_fix

cd9dfdf

embray force-pushed the master branch from cd9dfdf to 5d7acd6 Compare August 29, 2014 21:15

embray added a commit that referenced this pull request Aug 29, 2014

Merge pull request #2783 abostroem/gzip_fix

5d7acd6

embray closed this Aug 29, 2014

embray added a commit that referenced this pull request Aug 29, 2014

Merge pull request #2783 abostroem/gzip_fix

34daed3

Conflicts: CHANGES.rst

embray mentioned this pull request Sep 12, 2014

Reading Table from gzipped FITS file fails on Python 2.6 #2774

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fixing gzip import issue 2774 #2783

fixing gzip import issue 2774 #2783

abostroem commented Jul 23, 2014

embray commented Jul 23, 2014

abostroem commented Jul 23, 2014

embray commented Jul 24, 2014

embray commented Jul 24, 2014

abostroem commented Jul 30, 2014

abostroem commented Jul 30, 2014

embray commented Jul 30, 2014

embray commented Jul 30, 2014

abostroem commented Aug 5, 2014

embray commented Aug 5, 2014

abostroem commented Aug 6, 2014

embray commented Aug 6, 2014

embray commented Aug 26, 2014

embray commented Aug 26, 2014

abostroem commented Aug 28, 2014

embray commented Aug 28, 2014

abostroem commented Aug 28, 2014

embray commented Aug 28, 2014

abostroem commented Aug 28, 2014

embray Aug 28, 2014

abostroem commented Aug 29, 2014

embray commented Aug 29, 2014

embray commented Aug 29, 2014

embray commented Aug 29, 2014

		@@ -204,6 +204,8 @@ Bug Fixes

		- ``astropy.io.fits``

		- Fix ability to read gzipped fits files using Python 2.6 [#2783]

fixing gzip import issue 2774 #2783

fixing gzip import issue 2774 #2783

Conversation

abostroem commented Jul 23, 2014

embray commented Jul 23, 2014

abostroem commented Jul 23, 2014

embray commented Jul 24, 2014

embray commented Jul 24, 2014

abostroem commented Jul 30, 2014

abostroem commented Jul 30, 2014

embray commented Jul 30, 2014

embray commented Jul 30, 2014

abostroem commented Aug 5, 2014

embray commented Aug 5, 2014

abostroem commented Aug 6, 2014

embray commented Aug 6, 2014

embray commented Aug 26, 2014

embray commented Aug 26, 2014

abostroem commented Aug 28, 2014

embray commented Aug 28, 2014

abostroem commented Aug 28, 2014

embray commented Aug 28, 2014

abostroem commented Aug 28, 2014

embray Aug 28, 2014

Choose a reason for hiding this comment

abostroem commented Aug 29, 2014

embray commented Aug 29, 2014

embray commented Aug 29, 2014

embray commented Aug 29, 2014