Variables became numpy masked arrays #849

Closed
minxu74 opened this issue Oct 16, 2018 · 6 comments

@minxu74

minxu74 commented Oct 16, 2018

I have noticed that variables read from an nc file are now returned as numpy masked arrays, even though the variables contain no masked values. For example:

$ ncdump -v lon lon_bnds.nc |more

netcdf lon_bnds {
dimensions:
        lon = 288 ;
        nbnd = 2 ;
variables:
        double lon(lon) ;
                lon:long_name = "Longitude" ;
                lon:standard_name = "longitude" ;
                lon:units = "degrees_east" ;
                lon:axis = "X" ;
                lon:bounds = "lon_bnds" ;
        double lon_bnds(lon, nbnd) ;
                lon_bnds:long_name = "Gridcell longitude interfaces" ;

$ python

Python 2.7.15 |Anaconda, Inc.| (default, Oct 10 2018, 21:32:13) 
[GCC 7.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import netCDF4 as nc4
>>> nc4.__version__
'1.4.1'
>>> f=nc4.Dataset("lon_bnds.nc", "r")
>>> type(f.variables['lon'][...])
<class 'numpy.ma.core.MaskedArray'>
>>> f.variables['lon'][...].mask
False
>>> f.variables['lon'][...].mask.size
1

The mask of the masked array has size 1, not the same size as the array itself. Earlier versions of netCDF4 returned a normal numpy array in this case.

@akrherz

akrherz commented Oct 16, 2018

See the discussion in #785. You can restore the old behaviour by calling set_always_mask(False).

@minxu74
Author

minxu74 commented Oct 16, 2018

Hi @akrherz, thanks a lot for pointing out the discussion and a solution. I am fine with getting a masked array by default, but would it be reasonable to make the mask the same size as the array, instead of returning a scalar mask when there are no masked values?

@dopplershift
Member

The use of a scalar value of False for the mask is an optimization (for size) done by MaskedArray for the case where a MaskedArray has no masked values. Is there a particular use case where this causes a problem for you?
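The scalar mask can be reproduced with numpy alone, without netCDF4. This is a small sketch; the 288 evenly spaced values are an assumption chosen to match the size of the `lon` dimension in the report:

```python
import numpy as np

# Stand-in for the 288 longitudes in the report (values are illustrative).
data = np.linspace(0.0, 358.75, 288)

# No mask is supplied, so numpy stores the shared scalar np.ma.nomask
# instead of allocating a 288-element boolean array.
a = np.ma.masked_array(data)

print(a.mask is np.ma.nomask)
print(np.ma.is_masked(a))
```

Comparing against `np.ma.nomask` is the usual way to test for "no mask at all" before touching `a.mask` element-wise.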

@minxu74
Author

minxu74 commented Oct 16, 2018

Hi @dopplershift, thanks a lot. Yes. In our code, if an array A is a masked array, we do some calculations based on the values of A.mask and assume that A.mask has the same size as A. A scalar "False" mask breaks that code.

@jswhit
Collaborator

jswhit commented Oct 17, 2018

I don't think you can assume that A.mask has the same shape as A for masked arrays (see the comment by @dopplershift above). If A.mask is a scalar boolean, you can create a full mask that has the same shape as A (mask = np.ones(A.shape, dtype='bool') if A.mask == True, or mask = np.zeros(A.shape, dtype='bool') if A.mask == False). Having A.mask be a scalar boolean can save a lot of memory when A is large.
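As an alternative to building the mask by hand, numpy provides `np.ma.getmaskarray`, which always returns a boolean array with the same shape as its input. The wrapper name `full_mask` and the sample values below are hypothetical, used only to illustrate the idea:

```python
import numpy as np


def full_mask(a):
    """Hypothetical helper: return a boolean mask with the same shape as `a`.

    np.ma.getmaskarray expands a scalar (nomask) mask into an all-False
    array and also accepts plain ndarrays.
    """
    return np.ma.getmaskarray(a)


unmasked = np.ma.masked_array([0.0, 1.25, 2.5])
partly = np.ma.masked_array([0.0, 1.25, 2.5], mask=[False, True, False])

m1 = full_mask(unmasked)  # expanded from the scalar nomask
m2 = full_mask(partly)    # existing element-wise mask is preserved

print(m1.shape, m2.shape)
```

Code that indexes `A.mask` element-wise could call such a helper first, at the cost of allocating the full boolean array that the scalar mask was avoiding.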

@minxu74
Author

minxu74 commented Nov 1, 2018

Thanks. I see your point. I think I should close this issue now.

This issue was closed.