Skip to content

Commit

Permalink
Merge pull request #10537 from ahaldane/revert_multifield_view_1.14
Browse files Browse the repository at this point in the history
BUG: multifield-indexing adds padding bytes: revert for 1.14.1
  • Loading branch information
charris committed Feb 7, 2018
2 parents 0cb9e62 + 08469bb commit 7e980b5
Show file tree
Hide file tree
Showing 9 changed files with 294 additions and 35 deletions.
22 changes: 20 additions & 2 deletions doc/release/1.14.1-notes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,15 +2,33 @@
NumPy 1.14.1 Release Notes
==========================

This is a bugfix release for some problems found since 1.14.0. This release
includes fixes to the spacing in the str and repr of complex values.
This is a bugfix release for some problems found since 1.14.0. The most
important fixes are the reversion of the multifield index "view" behavior (see
compatibility notes), and fixes to the spacing in the str and repr of complex
values.

The Python versions supported are 2.7 and 3.4 - 3.6. The Python 3.6 wheels
available from PIP are built with Python 3.6.2 and should be compatible with
all previous versions of Python 3.6. It was cythonized with Cython 0.26.1,
which should be free of the bugs found in 0.27 while also being compatible with
Python 3.7-dev.


Compatibility notes
===================

Multifield Indexing of Structured Arrays will still return a copy
-----------------------------------------------------------------
We have reverted the change in 1.14.0 that multi-field indexing of structured
arrays returns a view instead of a copy, and pushed it back to 1.15. This will
give us time to implement additional related bugfixes to ease the transition.
Affected users should read the the 1.14.1 Numpy User Guide for advice on how to
manage this transition, under "basics/structured arrays/accessing multiple
fields". A new method `numpy.lib.recfunctions.repack_fields` has been
introduced to help with this. All other changes to structured arrays described
in the 1.14.0 release notes still hold in 1.14.1, and have not been reverted.


Contributors
============

Expand Down
2 changes: 1 addition & 1 deletion numpy/core/src/multiarray/convert.c
Original file line number Diff line number Diff line change
Expand Up @@ -617,7 +617,7 @@ PyArray_View(PyArrayObject *self, PyArray_Descr *type, PyTypeObject *pytype)
const char *msg =
"Numpy has detected that you may be viewing or writing to an array "
"returned by selecting multiple fields in a structured array. \n\n"
"This code may break in numpy 1.13 because this will return a view "
"This code may break in numpy 1.15 because this will return a view "
"instead of a copy -- see release notes for details.";
/* 2016-09-19, 1.12 */
if (DEPRECATE_FUTUREWARNING(msg) < 0) {
Expand Down
59 changes: 52 additions & 7 deletions numpy/core/src/multiarray/mapping.c
Original file line number Diff line number Diff line change
Expand Up @@ -1383,15 +1383,56 @@ array_subscript_asarray(PyArrayObject *self, PyObject *op)
return PyArray_EnsureAnyArray(array_subscript(self, op));
}

/*
* Helper function for _get_field_view which turns a multifield
* view into a "packed" copy, as done in numpy 1.14 and before.
* In numpy 1.15 this function is removed.
*/
NPY_NO_EXPORT int
_multifield_view_to_copy(PyArrayObject **view) {
static PyObject *copyfunc = NULL;
PyObject *viewcopy;

/* return a repacked copy of the view */
npy_cache_import("numpy.lib.recfunctions", "repack_fields", &copyfunc);
if (copyfunc == NULL) {
goto view_fail;
}

PyArray_CLEARFLAGS(*view, NPY_ARRAY_WARN_ON_WRITE);
viewcopy = PyObject_CallFunction(copyfunc, "O", *view);
if (viewcopy == NULL) {
goto view_fail;
}
Py_DECREF(*view);
*view = (PyArrayObject*)viewcopy;

/* warn when writing to the copy */
PyArray_ENABLEFLAGS(*view, NPY_ARRAY_WARN_ON_WRITE);
return 0;

view_fail:
Py_DECREF(*view);
*view = NULL;
return 0;
}


/*
* Attempts to subscript an array using a field name or list of field names.
*
* If an error occurred, return 0 and set view to NULL. If the subscript is not
* a string or list of strings, return -1 and set view to NULL. Otherwise
* return 0 and set view to point to a new view into arr for the given fields.
*
* In numpy 1.14 and before, in the case of a list of field names the returned
* view will actually be a copy by default, with fields packed together.
* The `force_view` argument causes a view to be returned. This argument can be
* removed in 1.15 when we plan to return a view always.
*/
NPY_NO_EXPORT int
_get_field_view(PyArrayObject *arr, PyObject *ind, PyArrayObject **view)
_get_field_view(PyArrayObject *arr, PyObject *ind, PyArrayObject **view,
int force_view)
{
*view = NULL;

Expand Down Expand Up @@ -1490,12 +1531,12 @@ _get_field_view(PyArrayObject *arr, PyObject *ind, PyArrayObject **view)
Py_DECREF(names);
return 0;
}
// disallow use of titles as index
/* disallow use of titles as index */
if (PyTuple_Size(tup) == 3) {
PyObject *title = PyTuple_GET_ITEM(tup, 2);
int titlecmp = PyObject_RichCompareBool(title, name, Py_EQ);
if (titlecmp == 1) {
// if title == name, we were given a title, not a field name
/* if title == name, we got a title, not a field name */
PyErr_SetString(PyExc_KeyError,
"cannot use field titles in multi-field index");
}
Expand All @@ -1508,7 +1549,7 @@ _get_field_view(PyArrayObject *arr, PyObject *ind, PyArrayObject **view)
}
Py_DECREF(title);
}
// disallow duplicate field indices
/* disallow duplicate field indices */
if (PyDict_Contains(fields, name)) {
PyObject *errmsg = PyUString_FromString(
"duplicate field of name ");
Expand Down Expand Up @@ -1562,7 +1603,11 @@ _get_field_view(PyArrayObject *arr, PyObject *ind, PyArrayObject **view)
return 0;
}

return 0;
if (force_view) {
return 0;
}

return _multifield_view_to_copy(view);
}
return -1;
}
Expand Down Expand Up @@ -1590,7 +1635,7 @@ array_subscript(PyArrayObject *self, PyObject *op)
/* return fields if op is a string index */
if (PyDataType_HASFIELDS(PyArray_DESCR(self))) {
PyArrayObject *view;
int ret = _get_field_view(self, op, &view);
int ret = _get_field_view(self, op, &view, 0);
if (ret == 0){
if (view == NULL) {
return NULL;
Expand Down Expand Up @@ -1879,7 +1924,7 @@ array_assign_subscript(PyArrayObject *self, PyObject *ind, PyObject *op)
/* field access */
if (PyDataType_HASFIELDS(PyArray_DESCR(self))){
PyArrayObject *view;
int ret = _get_field_view(self, ind, &view);
int ret = _get_field_view(self, ind, &view, 1);
if (ret == 0){
if (view == NULL) {
return -1;
Expand Down
14 changes: 14 additions & 0 deletions numpy/core/src/private/npy_import.h
Original file line number Diff line number Diff line change
Expand Up @@ -29,4 +29,18 @@ npy_cache_import(const char *module, const char *attr, PyObject **cache)
}
}

NPY_INLINE static PyObject *
npy_import(const char *module, const char *attr)
{
PyObject *mod = PyImport_ImportModule(module);
PyObject *ret = NULL;

if (mod != NULL) {
ret = PyObject_GetAttrString(mod, attr);
}
Py_XDECREF(mod);

return ret;
}

#endif
63 changes: 59 additions & 4 deletions numpy/core/tests/test_multiarray.py
Original file line number Diff line number Diff line change
Expand Up @@ -4705,10 +4705,21 @@ def test_field_names(self):
# multiple subfields
fn2 = func('f2')
b[fn2] = 3

assert_equal(b[['f1', 'f2']][0].tolist(), (2, 3))
assert_equal(b[['f2', 'f1']][0].tolist(), (3, 2))
assert_equal(b[['f1', 'f3']][0].tolist(), (2, (1,)))
with suppress_warnings() as sup:
sup.filter(FutureWarning,
"Numpy has detected that you .*")

assert_equal(b[['f1', 'f2']][0].tolist(), (2, 3))
assert_equal(b[['f2', 'f1']][0].tolist(), (3, 2))
assert_equal(b[['f1', 'f3']][0].tolist(), (2, (1,)))
# view of subfield view/copy
assert_equal(b[['f1', 'f2']][0].view(('i4', 2)).tolist(),
(2, 3))
assert_equal(b[['f2', 'f1']][0].view(('i4', 2)).tolist(),
(3, 2))
view_dtype = [('f1', 'i4'), ('f3', [('', 'i4')])]
assert_equal(b[['f1', 'f3']][0].view(view_dtype).tolist(),
(2, (1,)))

# non-ascii unicode field indexing is well behaved
if not is_py3:
Expand All @@ -4718,6 +4729,50 @@ def test_field_names(self):
assert_raises(ValueError, a.__setitem__, u'\u03e0', 1)
assert_raises(ValueError, a.__getitem__, u'\u03e0')

def test_field_names_deprecation(self):

def collect_warnings(f, *args, **kwargs):
with warnings.catch_warnings(record=True) as log:
warnings.simplefilter("always")
f(*args, **kwargs)
return [w.category for w in log]

a = np.zeros((1,), dtype=[('f1', 'i4'),
('f2', 'i4'),
('f3', [('sf1', 'i4')])])
a['f1'][0] = 1
a['f2'][0] = 2
a['f3'][0] = (3,)
b = np.zeros((1,), dtype=[('f1', 'i4'),
('f2', 'i4'),
('f3', [('sf1', 'i4')])])
b['f1'][0] = 1
b['f2'][0] = 2
b['f3'][0] = (3,)

# All the different functions raise a warning, but not an error
assert_equal(collect_warnings(a[['f1', 'f2']].__setitem__, 0, (10, 20)),
[FutureWarning])
# For <=1.12 a is not modified, but it will be in 1.13
assert_equal(a, b)

# Views also warn
subset = a[['f1', 'f2']]
subset_view = subset.view()
assert_equal(collect_warnings(subset_view['f1'].__setitem__, 0, 10),
[FutureWarning])
# But the write goes through:
assert_equal(subset['f1'][0], 10)
# Only one warning per multiple field indexing, though (even if there
# are multiple views involved):
assert_equal(collect_warnings(subset['f1'].__setitem__, 0, 10), [])

# make sure views of a multi-field index warn too
c = np.zeros(3, dtype='i8,i8,i8')
assert_equal(collect_warnings(c[['f0', 'f2']].view, 'i8,i8'),
[FutureWarning])


def test_record_hash(self):
a = np.array([(1, 2), (1, 2)], dtype='i1,i2')
a.flags.writeable = False
Expand Down
3 changes: 2 additions & 1 deletion numpy/core/tests/test_records.py
Original file line number Diff line number Diff line change
Expand Up @@ -384,7 +384,8 @@ def test_objview_record(self):

# https://github.com/numpy/numpy/issues/3256
ra = np.recarray((2,), dtype=[('x', object), ('y', float), ('z', int)])
ra[['x','y']] # TypeError?
with assert_warns(FutureWarning):
ra[['x','y']] # TypeError?

def test_record_scalar_setitem(self):
# https://github.com/numpy/numpy/issues/3561
Expand Down
74 changes: 56 additions & 18 deletions numpy/doc/structured_arrays.py
Original file line number Diff line number Diff line change
Expand Up @@ -133,10 +133,9 @@
Offsets may be chosen such that the fields overlap, though this will mean
that assigning to one field may clobber any overlapping field's data. As
an exception, fields of :class:`numpy.object` type .. (see
:ref:`object arrays <arrays.object>`) cannot overlap with other fields,
because of the risk of clobbering the internal object pointer and then
dereferencing it.
an exception, fields of :class:`numpy.object` type cannot overlap with
other fields, because of the risk of clobbering the internal object
pointer and then dereferencing it.
The optional 'aligned' value can be set to ``True`` to make the automatic
offset computation use aligned offsets (see :ref:`offsets-and-alignment`),
Expand Down Expand Up @@ -235,6 +234,11 @@
alignment conditions, the array will have the ``ALIGNED`` :ref:`flag
<numpy.ndarray.flags>` set.
A convenience function :func:`numpy.lib.recfunctions.repack_fields` converts an
aligned dtype or array to a packed one and vice versa. It takes either a dtype
or structured ndarray as an argument, and returns a copy with fields re-packed,
with or without padding bytes.
.. _titles:
Field Titles
Expand Down Expand Up @@ -396,27 +400,61 @@
Accessing Multiple Fields
```````````````````````````
One can index a structured array with a multi-field index, where the index is a
list of field names::
One can index and assign to a structured array with a multi-field index, where
the index is a list of field names.
.. warning::
The behavior of multi-field indexes will change from Numpy 1.14 to Numpy
1.15.
>>> a = np.zeros(3, dtype=[('a', 'i8'), ('b', 'i4'), ('c', 'f8')])
In Numpy 1.15, the result of indexing with a multi-field index will be a view
into the original array, as follows::
>>> a = np.zeros(3, dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'f4')])
>>> a[['a', 'c']]
array([(0, 0.0), (0, 0.0), (0, 0.0)],
dtype={'names':['a','c'], 'formats':['<i8','<f8'], 'offsets':[0,11], 'itemsize':19})
array([(0, 0.), (0, 0.), (0, 0.)],
dtype={'names':['a','c'], 'formats':['<i4','<f4'], 'offsets':[0,8], 'itemsize':12})
Assignment to the view modifies the original array. The view's fields will be
in the order they were indexed. Note that unlike for single-field indexing, the
view's dtype has the same itemsize as the original array, and has fields at the
same offsets as in the original array, and unindexed fields are merely missing.
In Numpy 1.14, indexing an array with a multi-field index returns a copy of
the result above for 1.15, but with fields packed together in memory as if
passed through :func:`numpy.lib.recfunctions.repack_fields`. This is the
behavior of Numpy 1.7 to 1.13.
.. warning::
The new behavior in Numpy 1.15 leads to extra "padding" bytes at the
location of unindexed fields. You will need to update any code which depends
on the data having a "packed" layout. For instance code such as::
>>> a[['a','c']].view('i8') # will fail in Numpy 1.15
ValueError: When changing to a smaller dtype, its size must be a divisor of the size of original dtype
will need to be changed. This code has raised a ``FutureWarning`` since
Numpy 1.12.
The following is a recommended fix, which will behave identically in Numpy
1.14 and Numpy 1.15::
>>> from numpy.lib.recfunctions import repack_fields
>>> repack_fields(a[['a','c']]).view('i8') # supported 1.14 and 1.15
array([0, 0, 0])
Assigning to an array with a multi-field index will behave the same in Numpy
1.14 and Numpy 1.15. In both versions the assignment will modify the original
array::
>>> a[['a', 'c']] = (2, 3)
>>> a
array([(2, 0, 3.0), (2, 0, 3.0), (2, 0, 3.0)],
dtype=[('a', '<i8'), ('b', '<i4'), ('c', '<f8')])
The resulting array is a view into the original array, such that assignment to
the view modifies the original array. The view's fields will be in the order
they were indexed. Note that unlike for single-field indexing, the view's dtype
has the same itemsize as the original array, and has fields at the same offsets
as in the original array, and unindexed fields are merely missing.
Since the view is a structured array itself, it obeys the assignment rules
described above. For example, this means that one can swap the values of two
fields using appropriate multi-field indexes::
This obeys the structured array assignment rules described above. For example,
this means that one can swap the values of two fields using appropriate
multi-field indexes::
>>> a[['a', 'c']] = a[['c', 'a']]
Expand Down
Loading

0 comments on commit 7e980b5

Please sign in to comment.