New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in map .fits.gz write (change map data transpose) #1346

Merged
merged 6 commits into from Mar 26, 2018

Conversation

Projects
None yet
2 participants
@cdeil
Member

cdeil commented Mar 22, 2018

I think I've found a bug in WCSMapND write.

The image orientation is incorrect in this example:
https://nbviewer.jupyter.org/gist/cdeil/2feb9e784d128d43867a707c0ca23092

At first I thought it's a bug in reproject, but maybe it's just in write and I was always looking at incorrectly images from DS9.

@woodmd - You're busy with new things, right?
If so, I'll have a look at this tomorrow and try to debug / fix it.

@cdeil cdeil added the bug label Mar 21, 2018

@cdeil cdeil added this to the 0.8 milestone Mar 21, 2018

@cdeil

This comment has been minimized.

Member

cdeil commented Mar 21, 2018

I tried to read a bit through the code and understand this.

The write code seems simple and correct, just calling make_hdu

hdu_out = fits.PrimaryHDU(data, header=header)

which does basically this:

PrimaryHDU(data=self.data, header=self.geom.wcs.to_header())

I think the problem originates from here, where self.data is initialised in WcsNDMap. _init_data with a view into a transposed Numpy array:

data = np.zeros(shape, dtype=dtype).T

and then sometimes in the gammapy.map code (e.g. in read) code paths are taken where first the map object is created with data=None (and thus this call to WcsNDMap. _init_data triggered from WcsNDMap. init), and then later assignments like map.data = hdu.data are made, which effectively transpose the data.

@woodmd - If you have time for a quick look, what would you advise? Is a quite extensive change in the way map objects are created needed that avoids this data transpose issue?

@woodmd

This comment has been minimized.

Member

woodmd commented Mar 22, 2018

I don't think this is a problem with the write method. The transpose operation in _init_data arises because we use row-major ordering (LON,LAT, ... ) for indexing and shape manipulation and column-major ordering (..., LAT, LON) for the actual data array. FITS expects all data arrays in column-major order. I think I raised the point once before that this design is probably somewhat confusing for people trying to understand the code and it might be better to switch to a consistent internal representation (e.g. row-major everywhere) -- at a minimum it would be good to at least make a note of the two ordering conventions somewhere in the class docstring. Note that if there were a transposition error either in __init__ or the I/O methods we should have been seen it in the unit tests since we have tests that roundtrip the I/O methods.

I had a brief look at your example but what I found is truly bizarre. If I run your example but change the output file name to 'diffuse_model_gc.fits' then I no longer see the orientation flip. Peeking directly at the file it looks like the data array orientation written by astropy.io seems to change depending on whether a .fits or .fits.gz file suffix is used.

@cdeil

This comment has been minimized.

Member

cdeil commented Mar 22, 2018

So the .T transpose calls in _init_data are there by design?
Does this mean that when passing data directly to init it should always be a transposed Numpy array as well?
Or do some maps have "normal" data arrays, and some a transposed array (i.e. a view with axes in the different order)? That mix would be super-hard to treat consistently everywhere within Gammapy and user code, no?

I'm not sure what the solution is here, but I'm +1 to not have transposed arrays as data member always, and instead to transpose or treat axes order as needed on access.

@woodmd - can that be achieved, e.g. by removing the .T in _init_data and instead doing shape[::-1] there?

I'll start working on this soon, but I'm not sure yet which changes to make (or only docstring improvements as you suggest).

I'll try to produce a minimal test case for the different behaviour when writing .fits and .fits.gz and report it to Astropy. Probably they don't ever expect to get a transposed data passed in, and for some weird reason treat it incorrectly only in the case of writing to .fits.gz.

@woodmd

This comment has been minimized.

Member

woodmd commented Mar 22, 2018

So the .T transpose calls in _init_data are there by design?
Does this mean that when passing data directly to init it should always be a transposed Numpy array as well?
Or do some maps have "normal" data arrays, and some a transposed array (i.e. a view with axes in the different order)? That mix would be super-hard to treat consistently everywhere within Gammapy and user code, no?

The data array is column-major (i.e. transposed with respect to row-major) for all map objects and all map init methods expect a data array with this ordering. This was the same convention used in SkyImage and SkyCube and in the fermipy skymap code. Note that what you consider to be a "transposed" array is just a matter of convention. What I personally prefer about the row-major order is that the position of the axes doesn't change when you add dimensions to an object. This makes it simpler to write some indexing logic (for instance you know LON and LAT are the first two axes).

@woodmd - can that be achieved, e.g. by removing the .T in _init_data and instead doing shape[::-1] there?

This would be cleaner but I don't think it would change the behavior of the code in any way. A transposed array and a transposed view should be totally equivalent (but perhaps there are exceptions to this).

I'll start working on this soon, but I'm not sure yet which changes to make (or only docstring improvements as you suggest).

Switching the axis order internally would be a major architectural change and probably deserves some dedicated discussion in a separate issue thread if you want to go this route. Maybe you can move that to #1180 where we discussed the same issue.

I'll try to produce a minimal test case for the different behaviour when writing .fits and .fits.gz and report it to Astropy. Probably they don't ever expect to get a transposed data passed in, and for some weird reason treat it incorrectly only in the case of writing to .fits.gz.

It may be that astropy does something weird when getting a transposed view. In that case changing to shape[::-1] in _init_data would potentially resolve this issue.

@cdeil

This comment has been minimized.

Member

cdeil commented Mar 22, 2018

Thanks for the advice!

In that case changing to shape[::-1] in _init_data would potentially resolve this issue.

I will try that now.

@cdeil

This comment has been minimized.

Member

cdeil commented Mar 22, 2018

The fact that FITS write fails on transposed Numpy data arrays, only (or at least?) for .gz FITS files is a long-standing open issue. I've left a comment and minimal test case to reproduce here:
astropy/astropy#2150 (comment)

I tried this patch in Gammapy maps

$ git diff
diff --git a/gammapy/maps/wcsnd.py b/gammapy/maps/wcsnd.py
index 88f88e56..ebe526b1 100644
--- a/gammapy/maps/wcsnd.py
+++ b/gammapy/maps/wcsnd.py
@@ -39,21 +39,23 @@ class WcsNDMap(WcsMap):
     def __init__(self, geom, data=None, dtype='float32', meta=None):
         # TODO: Figure out how to mask pixels for integer data types
 
+        # Shape in WCS or FITS order is `shape`, in Numpy axis order is `shape_np`
         shape = tuple([np.max(geom.npix[0]), np.max(geom.npix[1])] +
                       [ax.nbin for ax in geom.axes])
+        shape_np = shape[::-1]
+
         if data is None:
-            data = self._init_data(geom, shape, dtype)
-        elif data.shape != shape[::-1]:
+            data = self._init_data(geom, shape_np, dtype)
+        elif data.shape != shape_np:
             raise ValueError('Wrong shape for input data array. Expected {} '
                              'but got {}'.format(shape, data.shape))
 
         super(WcsNDMap, self).__init__(geom, data, meta)
 
-    def _init_data(self, geom, shape, dtype):
+    def _init_data(self, geom, shape_np, dtype):
         # Check whether corners of each image plane are valid
         coords = []
         if not geom.is_regular:
-
             for idx in np.ndindex(geom.shape):
                 pix = (np.array([0.0, float(geom.npix[0][idx] - 1)]),
                        np.array([0.0, float(geom.npix[1][idx] - 1)]))
@@ -61,24 +63,22 @@ class WcsNDMap(WcsMap):
                 coords += geom.pix_to_coord(pix)
 
         else:
-
             pix = (np.array([0.0, float(geom.npix[0] - 1)]),
                    np.array([0.0, float(geom.npix[1] - 1)]))
             pix += tuple([np.array(2 * [0.0]) for i in range(geom.ndim - 2)])
             coords += geom.pix_to_coord(pix)
 
         if np.all(np.isfinite(np.vstack(coords))):
-
             if geom.is_regular:
-                data = np.zeros(shape, dtype=dtype).T
+                data = np.zeros(shape_np, dtype=dtype)
             else:
-                data = np.nan * np.ones(shape, dtype=dtype).T
+                data = np.nan * np.ones(shape_np, dtype=dtype)
                 for idx in np.ndindex(geom.shape):
                     data[idx,
                          slice(geom.npix[0][idx]),
                          slice(geom.npix[1][idx])] = 0.0
         else:
-            data = np.full(shape, np.nan, dtype=dtype).T
+            data = np.full(shape_np, np.nan, dtype=dtype)
             idx = geom.get_idx()
             m = np.all(np.stack([t != -1 for t in idx]), axis=0)
             data[m] = 0.0

But this is making these I/O tests fail:

gammapy/maps/tests/test_wcsnd.py::test_wcsndmap_read_write[None-10.0-GAL-AIT-skydir0-None] FAILED                                                  [ 70%]
gammapy/maps/tests/test_wcsnd.py::test_wcsndmap_read_write[None-10.0-GAL-AIT-skydir1-axes1] FAILED                                                 [ 71%]
gammapy/maps/tests/test_wcsnd.py::test_wcsndmap_read_write[None-binsz2-GAL-AIT-skydir2-axes2] FAILED                                               [ 71%]
gammapy/maps/tests/test_wcsnd.py::test_wcsndmap_read_write[None-10.0-GAL-AIT-skydir3-axes3] FAILED                                                 [ 71%]
gammapy/maps/tests/test_wcsnd.py::test_wcsndmap_read_write[None-binsz4-GAL-AIT-skydir4-axes4] FAILED                                               [ 71%]

here:

>       assert_allclose(m0.data, m1.data)

gammapy/maps/tests/test_wcsnd.py:77: 

I guess I should go hunt for another place where we currently transpose data?

@cdeil cdeil requested a review from woodmd Mar 22, 2018

@cdeil

This comment has been minimized.

Member

cdeil commented Mar 22, 2018

I attached a commit with a regression test (all of the asserts are failing before the fix), and then the attempted fix to avoid having transposed Numpy arrays as map data objects.

It fixes the regression tests, and most things still work. However sparse map I/O is failing for some existing tests (but not the new one). I haven't figured out yet how to trigger the sparse map I/O issue with a simple test case or how to fix it.

I'm offline now, but I'll try to fully resolve this tomorrow. @woodmd - If you could have a look and comment if you're OK with the changes so far, that would be very helpful.

@cdeil

This comment has been minimized.

Member

cdeil commented Mar 22, 2018

@woodmd - might the issue be here?

data_flat[~np.isfinite(data_flat)] = 0

Looks like in WcsMap.make_hdu you are modifying self.data.
That shouldn't be the case, make_hdu should leave self.data alone, no?

@cdeil

This comment has been minimized.

Member

cdeil commented Mar 22, 2018

I can confirm that changing

data = self.data

at the top of WcsMap.make_hdu to

data = self.data.copy()

resolves the remaining failing tests:

gammapy/maps/tests/test_wcsnd.py::test_wcsndmap_read_write[None-10.0-GAL-AIT-skydir0-None] FAILED                                                  [ 70%]
gammapy/maps/tests/test_wcsnd.py::test_wcsndmap_read_write[None-10.0-GAL-AIT-skydir1-axes1] FAILED                                                 [ 71%]
gammapy/maps/tests/test_wcsnd.py::test_wcsndmap_read_write[None-binsz2-GAL-AIT-skydir2-axes2] FAILED                                               [ 71%]
gammapy/maps/tests/test_wcsnd.py::test_wcsndmap_read_write[None-10.0-GAL-AIT-skydir3-axes3] FAILED                                                 [ 71%]
gammapy/maps/tests/test_wcsnd.py::test_wcsndmap_read_write[None-binsz4-GAL-AIT-skydir4-axes4] FAILED                                               [ 71%]

Is that the proper fix or is there a better way?

@woodmd

This comment has been minimized.

Member

woodmd commented Mar 23, 2018

Looks like in WcsMap.make_hdu you are modifying self.data.
That shouldn't be the case, make_hdu should leave self.data alone, no?

I think I had assumed np.ravel makes a copy but looking at the documentation now I see that's not the case. You're correct that make_hdu should not be modifying self.data.

Is that the proper fix or is there a better way?

Yes that would be fine although somewhat inefficient for the non-sparse case. A copy isn't needed when sparse=False so you could instead move this inside the if/else statement:

if sparse:
    data = self.data.copy()

    ...
elif hdu == 'PRIMARY':
    hdu_out = fits.PrimaryHDU(self.data, header=header)
else:
    hdu_out = fits.ImageHDU(self.data, header=header, name=hdu)

I noticed that we also initialize with a transposed array in HpxNDMap.__init__:

def __init__(self, geom, data=None, dtype='float32', meta=None):
shape = tuple([np.max(geom.npix)] + [ax.nbin for ax in geom.axes])
if data is None:
if geom.npix.size > 1:
data = np.nan * np.ones(shape, dtype=dtype).T
idx = geom.get_idx(local=True)
data[idx[::-1]] = 0.0
else:
data = np.zeros(shape, dtype=dtype).T
elif data.shape != shape[::-1]:
raise ValueError('Wrong shape for input data array. Expected {} '
'but got {}'.format(shape, data.shape))

so maybe you should switch to initializing with shape[::-1] here as well for consistency?

@cdeil

This comment has been minimized.

Member

cdeil commented Mar 23, 2018

@woodmd - Thanks for the tips.

Fix in e203822 - one question I have is why you always go to 64-bit float? Isn't the main application for the sparse format count maps? Do you want to use 64 bit float for those?
(I'd like to simplify the code there a bit more and resolve the one TODO I added for now)

@cdeil

This comment has been minimized.

Member

cdeil commented Mar 23, 2018

I've done some more small fixes in d3cf7ce and e1fd764 and then removed the data transpose in the default HPX map data init in 8fb4b4d as you suggested.

The only remaining cases where we transpose map data arrays are those:

$ ack 'data\.T' gammapy/maps
gammapy/maps/hpxnd.py
324:        return self.data.T[idx]
373:            return np.sum(self.data.T[pix] * wts, axis=0)
399:            val += np.nansum(wts * wt * self.data.T[pix[:1] + pix_i], axis=0)
416:        idx_local = np.ravel_multi_index(idx_local, self.data.T.shape)
419:        self.data.T.flat[idx_local] += weights
424:        self.data.T[idx_local] = vals

gammapy/maps/wcsnd.py
127:        return self.data.T[idx]
167:            data = self.data.T
177:        return map_coordinates(self.data.T, pix, order=order, mode='nearest')
241:        idx = np.ravel_multi_index(idx, self.data.T.shape)
244:        self.data.T.flat[idx] += weights
248:        self.data.T[idx] = vals
439:        data = map_coordinates(self.data.T, pix, order=order, mode='nearest')

@woodmd - I think all of those are OK, and while probably they could be rewritten to avoid the transpose, there's no need? Is this PR ready to merge?

@cdeil cdeil changed the title from Bug in WCSMapND write to Fix bug in map write to .fits.gz (related to map data array transpose) Mar 23, 2018

@cdeil cdeil changed the title from Fix bug in map write to .fits.gz (related to map data array transpose) to Fix bug in map write to .fits.gz (related to map data transpose) Mar 23, 2018

@cdeil cdeil changed the title from Fix bug in map write to .fits.gz (related to map data transpose) to Fix bug in map .fits.gz write (change map data transpose) Mar 23, 2018

@woodmd

This comment has been minimized.

Member

woodmd commented Mar 24, 2018

Fix in e203822 - one question I have is why you always go to 64-bit float? Isn't the main application for the sparse format count maps? Do you want to use 64 bit float for those?

Yes this was an oversight. There should be some logic here to generate the appropriate FITS data type keyword from the numpy array dtype. This somewhat goes to the question of whether we want to support integer data types at all. At present the code doesn't really support them because we use np.nan as a guard value for undefined pixels (undefined pixels arise for multi-resolution geometries and some all-sky WCS projections). To fully support integer types we would need to update the logic for handling guard values in a way that depends on the data type or perhaps rewrite the handling of undefined pixels entirely with a mask array. I've begun to think that having a mask array would be a better design than the current guard value approach but that's probably better discussed in a separate thread. Maybe you want to open a new issue on the topic of integer support?

I think all of those are OK, and while probably they could be rewritten to avoid the transpose, there's no need? Is this PR ready to merge?

Yes I agree that these don't need to be rewritten. Looks ready to merge.

@woodmd

woodmd approved these changes Mar 24, 2018

@cdeil cdeil merged commit 427878a into gammapy:master Mar 26, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@cdeil

This comment has been minimized.

Member

cdeil commented Mar 26, 2018

I've begun to think that having a mask array would be a better design than the current guard value approach but that's probably better discussed in a separate thread. Maybe you want to open a new issue on the topic of integer support?

@woodmd - I've opened #1350 . It's not clear to me if introducing mask arrays is necessary or a better solution, so I've put it as an open question.

@cdeil

This comment has been minimized.

Member

cdeil commented Mar 26, 2018

@woodmd - Thank you for the advice and review here!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment