Accessing fields for a masked structured array fails with ValueError #2972

Closed
gerritholl opened this Issue Feb 8, 2013 · 3 comments

@gerritholl

I'm trying to create a mask for a structured array. When I try to access a field of the masked array, or simply print it, numpy raises ValueError: field named A not found, as illustrated below.

>>> import numpy
>>> R = numpy.empty(10, dtype=[("A", "<f2"), ("B", "<f4")])
>>> Rm = numpy.ma.masked_where(R["A"]<0.1, R)
>>> Rm["A"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/local/gerrit/python3.2-bleed/lib/python3.2/site-packages/numpy/ma/core.py", line 3014, in __getitem__
    dout._mask = _mask[indx]
ValueError: field named A not found.
>>> print(Rm)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/local/gerrit/python3.2-bleed/lib/python3.2/site-packages/numpy/ma/core.py", line 3583, in __str__
    _recursive_printoption(res, m, f)
  File "/local/gerrit/python3.2-bleed/lib/python3.2/site-packages/numpy/ma/core.py", line 2294, in _recursive_printoption
    (curdata, curmask) = (result[name], mask[name])
ValueError: field named A not found.
>>> print(numpy.version.version)
1.8.0.dev-b8bfcd0
>>> print(numpy.version.git_revision)
b8bfcd02a2f246a9c23675e1650c3d316d733306

This fails both in the stable version 1.6.2 and in the bleeding-edge version obtained directly from git.

Note: if I instead create the masked array directly with numpy.ma.empty, I get a different error:

>>> R2 = numpy.ma.empty(10, dtype=[("A", "<f2"), ("B", "<f4")])
>>> Rm2 = numpy.ma.masked_where(R2["A"]<0.1, R2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/local/gerrit/python3.2-bleed/lib/python3.2/site-packages/numpy/ma/core.py", line 1810, in masked_where
    cond = mask_or(cond, a._mask)
  File "/local/gerrit/python3.2-bleed/lib/python3.2/site-packages/numpy/ma/core.py", line 1627, in mask_or
    raise ValueError("Incompatible dtypes '%s'<>'%s'" % (dtype1, dtype2))
ValueError: Incompatible dtypes 'bool'<>'[('A', '?'), ('B', '?')]
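
One way to sidestep the dtype mismatch reported by mask_or is to build the structured mask by hand and pass it to the MaskedArray constructor. The following is a minimal sketch of that workaround, not the fix adopted for this issue. It assumes numpy.zeros and made-up field values instead of numpy.empty so that the result is deterministic, and the threshold 0.45 is chosen only for illustration.

```python
import numpy

R = numpy.zeros(10, dtype=[("A", "<f2"), ("B", "<f4")])
R["A"] = numpy.linspace(0.0, 0.9, 10)  # illustrative values, not from the issue

# Boolean dtype with the same field names as the data dtype.
mask_dtype = numpy.ma.make_mask_descr(R.dtype)
mask = numpy.zeros(R.shape, dtype=mask_dtype)

# Apply the same per-record condition to every field of the mask.
cond = R["A"] < 0.45
for name in mask.dtype.names:
    mask[name] = cond

Rm = numpy.ma.MaskedArray(R, mask=mask)
print(Rm.mask.dtype)
print(Rm["A"])
```

Because the mask is created with the field layout of the data, field access on the result does not depend on how masked_where combines masks.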
@gerritholl

I don't know if this is the same issue, but with newer numpy versions, the error message has changed:

>>> import sys, os
>>> print(numpy.version.version, numpy.version.git_revision, sys.version, os.uname())
1.9.0.dev-a0cf183 a0cf18394d5ce33514fdc37093bd2f65ad4b0dde 3.4.0 (default, Apr  3 2014, 09:15:09) 
[GCC 4.8.1] posix.uname_result(sysname='Linux', nodename='ostrovnoy', release='3.11.0-19-generic', version='#33-Ubuntu SMP Tue Mar 11 18:48:34 UTC 2014', machine='x86_64')
>>> A = numpy.ma.masked_all(shape=5, dtype=[("A", "f4", 3)])
>>> b = A[0]
>>> b["A"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/gerrit/venv/python-3.4-bleed/lib/python3.4/site-packages/numpy/ma/core.py", line 5619, in __getitem__
    if m is not nomask and m[indx]:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
@gerritholl

Still present in version 1.10.0.dev+00ee332, but with yet another error message:

In [131]: R = numpy.empty(10, dtype=[("A", "<f2"), ("B", "<f4")])

In [132]: Rm = numpy.ma.masked_where(R["A"]<0.1, R)

In [133]: Rm["A"]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-133-7f4318176b14> in <module>()
----> 1 Rm["A"]

/export/data/home/gholl/venv/bleeding/lib/python3.4/site-packages/numpy/ma/core.py in __getitem__(self, indx)
   3071             # Update the mask if needed
   3072             if _mask is not nomask:
-> 3073                 dout._mask = _mask[indx]
   3074                 dout._sharedmask = True
   3075 #               Note: Don't try to check for m.any(), that'll take too long...

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
@gerritholl

I started to fix this, and realised that the problem is caused by Rm.mask.dtype being unstructured (plain bool) even though Rm.dtype is structured. Is this supposed to be possible? I asked at Stack Overflow: http://stackoverflow.com/q/28182408/974555
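
The mismatch described here can be checked directly. This is a minimal sketch, not code from the issue: mask_is_consistent is a hypothetical helper name, and numpy.zeros replaces numpy.empty so that the condition is deterministic. The helper compares the mask dtype with the boolean dtype that numpy.ma.make_mask_descr derives from the data dtype.

```python
import numpy


def mask_is_consistent(marr):
    """Return True if the mask has the same field layout as the data.

    Hypothetical helper for illustration; not part of numpy.
    """
    mask = numpy.ma.getmask(marr)
    if mask is numpy.ma.nomask:
        # No mask at all, so there is nothing that could disagree.
        return True
    expected = numpy.ma.make_mask_descr(marr.dtype)
    return mask.dtype == expected


R = numpy.zeros(10, dtype=[("A", "<f2"), ("B", "<f4")])
Rm = numpy.ma.masked_where(R["A"] < 0.1, R)

print(Rm.dtype)
print(Rm.mask.dtype)
print(mask_is_consistent(Rm))
```

On a numpy version that still has this bug, the check would presumably report False for Rm, since the mask there is a plain bool array.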

@gerritholl gerritholl pushed a commit to gerritholl/numpy that referenced this issue Jan 28, 2015
Gerrit Holl BUG: fix for issue #2972
This commit fixes issue #2972 by considering the possibility that, for a
masked structured array, the mask is not necessarily structured.  This
affects item getting and string representation.
83ddd6f
@gerritholl gerritholl pushed a commit to gerritholl/numpy that referenced this issue Jan 28, 2015
Gerrit Holl TST: Test for fix for issue #2972
This commit adds a test for the bugfix in commit
83ddd6f, which fixes issue #2972.
f862109
@gerritholl gerritholl pushed a commit to gerritholl/numpy that referenced this issue Jan 29, 2015
Gerrit Holl BUG: Fix for #2972
This commit fixes issue #2972 by a small change in masked_where.
masked_where was inadvertently setting an unstructured mask on a
structured array.  This causes problems for many methods (.tolist(),
.__repr__(), .__getitem__(), perhaps more).  Setting .mask instead of
._mask fixes this problem.
b1e4b73
@gerritholl gerritholl pushed a commit to gerritholl/numpy that referenced this issue Jan 29, 2015
Gerrit Holl TST: Added test case for commit fixing #2972
This commit adds a test case for commit
b1e4b73, to make sure that masked_where
respects that a structured masked array should have a structured mask.
48f3716
@gerritholl gerritholl pushed a commit to gerritholl/numpy that referenced this issue Jan 29, 2015
Gerrit Holl BUG: Fix for issue #2972
This commit fixes a bug in numpy.ma.masked_where when it is passed a
structured array.  masked_where was inadvertently setting the _mask field
of a structured array to a non-structured array, so we would end up with
a masked structured array where x.data was structured, but x.mask was not.
This would lead to troubles for methods such as __getitem__, __repr__, and
tolist.  By writing to .mask instead of ._mask, this problem is
mitigated.
b58fc30
@gerritholl gerritholl pushed a commit to gerritholl/numpy that referenced this issue Jan 29, 2015
Gerrit Holl TST: Add test for fix to issue #2972
Add a test to TestMaskedArrayFunctions to verify that a masked array
created with masked_where has a structured mask.  This was previously
unstructured, leading to bug #2972.  Commit for original bugfix was
b58fc30.
e084749
@gerritholl gerritholl pushed a commit to gerritholl/numpy that referenced this issue Jan 29, 2015
Gerrit Holl BUG: Fix for issue #2972
This commit fixes a bug in numpy.ma.masked_where when it is passed a
structured array.  masked_where was inadvertently setting the _mask field
of a structured array to a non-structured array, so we would end up with
a masked structured array where x.data was structured, but x.mask was not.
This would lead to troubles for methods such as __getitem__, __repr__, and
tolist.  By writing to .mask instead of ._mask, this problem is
mitigated.
ca4e201
@gerritholl gerritholl pushed a commit to gerritholl/numpy that referenced this issue Jan 29, 2015
Gerrit Holl TST: Add test for fix to issue #2972
Add a test to TestMaskedArrayFunctions to verify that a masked array
created with masked_where has a structured mask.  This was previously
unstructured, leading to bug #2972.  Commit for original bugfix was
ca4e201.
d764f02
@gerritholl gerritholl pushed a commit to gerritholl/numpy that referenced this issue Mar 12, 2015
Gerrit Holl TST: Fix test for fix to issue #2972
Expand test for fix for issue #2972 to be more thorough.  Commit for
original bugfix was ca4e201 and commit
for original test was d764f02.
f6c790b
@charris charris added a commit to charris/numpy that referenced this issue May 2, 2015
Gerrit Holl TST: Add test for fix to issue #2972
Add a test to TestMaskedArrayFunctions to verify that a masked
structured array created with masked_where has a structured mask. The
mask was previously unstructured, leading to bug #2972.
d7ffaea
@charris charris added a commit that closed this issue May 2, 2015
Gerrit Holl BUG: Fix mask assignment in masked_where to use .mask property.
This commit fixes a bug in numpy.ma.masked_where when it is passed a
structured array.  masked_where was inadvertently setting the _mask
field of a structured array to a non-structured array, so we would end
up with a masked structured array where x.data was structured, but
x.mask was not.  This led to troubles for methods such as __getitem__,
__repr__, and tolist.  By writing to the property .mask instead of
._mask, this problem is fixed.

Closes #2972.
a7d663f
@charris charris closed this in a7d663f May 2, 2015
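
The difference between the two assignments named in the commit message can be illustrated as follows. This is a sketch under assumptions: it writes to the private _mask attribute only to reproduce the inconsistent state, which user code should not normally do, and the variable names broken and fixed are illustrative.

```python
import numpy

R = numpy.zeros(10, dtype=[("A", "<f2"), ("B", "<f4")])
cond = R["A"] < 0.1  # plain boolean array, one entry per record

# Direct write to the private attribute: the bool array is stored as-is,
# so the mask no longer has the fields of the data.
broken = R.view(numpy.ma.MaskedArray)
broken._mask = cond
print(broken._mask.dtype.names)

# Write through the public property: the setter expands the condition
# to every field of the structured mask.
fixed = R.view(numpy.ma.MaskedArray)
fixed.mask = cond
print(fixed.mask.dtype.names)
print(fixed["A"].mask.all())
```

Only the second array supports field access such as fixed["A"], because its mask can be indexed by the same field names as its data.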
@daniel-perry daniel-perry added a commit to daniel-perry/numpy that referenced this issue Jun 2, 2016
Gerrit Holl BUG: Fix mask assignment in masked_where to use .mask property.
This commit fixes a bug in numpy.ma.masked_where when it is passed a
structured array.  masked_where was inadvertently setting the _mask
field of a structured array to a non-structured array, so we would end
up with a masked structured array where x.data was structured, but
x.mask was not.  This led to troubles for methods such as __getitem__,
__repr__, and tolist.  By writing to the property .mask instead of
._mask, this problem is fixed.

Closes #2972.
59bee5c
@daniel-perry daniel-perry added a commit to daniel-perry/numpy that referenced this issue Jun 2, 2016
Gerrit Holl TST: Add test for fix to issue #2972
Add a test to TestMaskedArrayFunctions to verify that a masked
structured array created with masked_where has a structured mask. The
mask was previously unstructured, leading to bug #2972.
e4a3012