ENH: Nditer as context manager #9998

Merged Apr 21, 2018 (3 commits)
26 changes: 26 additions & 0 deletions doc/release/1.15.0-notes.rst
@@ -52,6 +52,17 @@ Deprecations
In the future, it might return a different result. Use ``np.sum(np.fromiter(generator))``
or the built-in Python ``sum`` instead.

* Users of the C-API should call ``PyArray_ResolveWritebackIfCopy`` or
``PyArray_DiscardWritebackIfCopy`` on any array with the ``WRITEBACKIFCOPY``
flag set, before the array is deallocated. A deprecation warning will be
emitted if those calls are not used when needed.

* Users of ``nditer`` should use the nditer object as a context manager
whenever one of the iterator operands is writeable, so that numpy can
manage writeback semantics, or should call ``it.close()``. Otherwise a
``RuntimeWarning`` will be emitted in these cases. Users of the C-API
should call ``NpyIter_Close`` before ``NpyIter_Deallocate``.
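The Python-level pattern this deprecation asks for can be sketched as follows (a minimal sketch; the array contents are only illustrative):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Recommended pattern: the context manager guarantees that any pending
# writeback (temporary copy -> original array) happens on exit.
with np.nditer(a, op_flags=['readwrite']) as it:
    for x in it:
        x[...] = 2 * x

print(a)  # every element doubled in place
```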


Future Changes
==============
@@ -60,6 +71,19 @@ Future Changes
Compatibility notes
===================

Under certain conditions, nditer must be used in a context manager
------------------------------------------------------------------
When using an nditer with the ``"writeonly"`` or ``"readwrite"`` flags, there
are some circumstances where nditer doesn't actually give you a view onto the
writable array. Instead, it gives you a copy, and if you make changes to the
copy, nditer later writes those changes back into your actual array. Currently,
this writeback occurs when the array objects are garbage collected, which makes
this API error-prone on CPython and entirely broken on PyPy. Therefore,
``nditer`` should now be used as a context manager whenever using ``nditer``
with writeable arrays (``with np.nditer(...) as it: ...``). You may also
explicitly call ``it.close()`` for cases where a context manager is unusable,
for instance in generator expressions.
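For the cases where a ``with`` block is impractical, the explicit ``close()`` alternative mentioned above can be sketched like this (the ``try``/``finally`` wrapper is an illustrative choice, not a requirement of the API):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
it = np.nditer(a, op_flags=['readwrite'])
try:
    for x in it:
        x[...] = x + 1
finally:
    it.close()  # performs any pending writeback without a `with` block
```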

Numpy has switched to using pytest instead of nose for testing
--------------------------------------------------------------
The last nose release was 1.3.7 in June, 2015, and development of that tool has
@@ -93,6 +117,8 @@ using the old API.
C API changes
=============

``NpyIter_Close`` has been added and should be called before
``NpyIter_Deallocate`` to resolve possible writeback-enabled arrays.
Member: Should this be NpyIter_Deallocate?

Member Author: fixed (in doc-nditer PR)
New Features
============
87 changes: 51 additions & 36 deletions doc/source/reference/arrays.nditer.rst
@@ -83,7 +83,14 @@ Modifying Array Values

By default, the :class:`nditer` treats the input array as a read-only
object. To modify the array elements, you must specify either read-write
or write-only mode. This is controlled with per-operand flags. When the
operands cannot be modified in place, they are created as temporary
copies of the original data, with the ``WRITEBACKIFCOPY`` flag set. In this
case the iterator must either

- be used as a context manager, so that the temporary data is written back
  to the original array when its ``__exit__`` method is called, or
- have the iterator's ``close`` method called explicitly to ensure the
  modified data is written back to the original array.

Regular assignment in Python simply changes a reference in the local or
global variable dictionary instead of modifying an existing variable in
@@ -99,8 +106,9 @@ the ellipsis.
>>> a
array([[0, 1, 2],
[3, 4, 5]])
>>> with np.nditer(a, op_flags=['readwrite']) as it:
...     for x in it:
...         x[...] = 2 * x
...
>>> a
array([[ 0, 2, 4],
@@ -178,9 +186,10 @@ construct in order to be more readable.
0 <(0, 0)> 1 <(0, 1)> 2 <(0, 2)> 3 <(1, 0)> 4 <(1, 1)> 5 <(1, 2)>

>>> it = np.nditer(a, flags=['multi_index'], op_flags=['writeonly'])
>>> with it:
...     while not it.finished:
...         it[0] = it.multi_index[1] - it.multi_index[0]
...         it.iternext()
...
>>> a
array([[ 0, 1, 2],
@@ -426,9 +435,10 @@ reasons.
...         flags = ['external_loop', 'buffered'],
...         op_flags = [['readonly'],
...                     ['writeonly', 'allocate', 'no_broadcast']])
...     with it:
...         for x, y in it:
...             y[...] = x*x
...         return it.operands[1]
...

>>> square([1,2,3])
@@ -505,9 +515,10 @@ For a simple example, consider taking the sum of all elements in an array.

>>> a = np.arange(24).reshape(2,3,4)
>>> b = np.array(0)
>>> with np.nditer([a, b], flags=['reduce_ok', 'external_loop'],
...                op_flags=[['readonly'], ['readwrite']]) as it:
...     for x, y in it:
...         y[...] += x
...
>>> b
array(276)
@@ -525,11 +536,12 @@ sums along the last axis of `a`.
>>> it = np.nditer([a, None], flags=['reduce_ok', 'external_loop'],
... op_flags=[['readonly'], ['readwrite', 'allocate']],
... op_axes=[None, [0,1,-1]])
>>> with it:
...     it.operands[1][...] = 0
...     for x, y in it:
...         y[...] += x
...     it.operands[1]
...
array([[ 6, 22, 38],
       [54, 70, 86]])
>>> np.sum(a, axis=2)
@@ -558,12 +570,13 @@ buffering.
... 'buffered', 'delay_bufalloc'],
... op_flags=[['readonly'], ['readwrite', 'allocate']],
... op_axes=[None, [0,1,-1]])
>>> with it:
...     it.operands[1][...] = 0
...     it.reset()
...     for x, y in it:
...         y[...] += x
...     it.operands[1]
...
array([[ 6, 22, 38],
       [54, 70, 86]])

@@ -609,11 +622,12 @@ Here's how this looks.
... op_flags=[['readonly'], ['readwrite', 'allocate']],
... op_axes=[None, axeslist],
... op_dtypes=['float64', 'float64'])
...     with it:
...         it.operands[1][...] = 0
...         it.reset()
...         for x, y in it:
...             y[...] += x*x
...         return it.operands[1]
...
>>> a = np.arange(6).reshape(2,3)
>>> sum_squares_py(a)
@@ -661,16 +675,17 @@ Here's the listing of sum_squares.pyx::
op_flags=[['readonly'], ['readwrite', 'allocate']],
op_axes=[None, axeslist],
op_dtypes=['float64', 'float64'])
with it:
    it.operands[1][...] = 0
    it.reset()
    for xarr, yarr in it:
        x = xarr
        y = yarr
        size = x.shape[0]
        for i in range(size):
            value = x[i]
            y[i] = y[i] + value * value
    return it.operands[1]

On this machine, building the .pyx file into a module looked like the
following, but you may have to find some Cython tutorials to tell you
28 changes: 22 additions & 6 deletions doc/source/reference/c-api.iterator.rst
@@ -110,6 +110,7 @@ number of non-zero elements in an array.
/* Increment the iterator to the next inner loop */
} while(iternext(iter));

NpyIter_Close(iter); /* best practice, not strictly required in this case */
NpyIter_Deallocate(iter);

return nonzero_count;
@@ -194,6 +195,7 @@ is used to control the memory layout of the allocated result, typically
ret = NpyIter_GetOperandArray(iter)[1];
Py_INCREF(ret);

NpyIter_Close(iter);
if (NpyIter_Deallocate(iter) != NPY_SUCCEED) {
Py_DECREF(ret);
return NULL;
@@ -490,7 +492,10 @@ Construction and Destruction

Indicate how the user of the iterator will read or write
to ``op[i]``. Exactly one of these flags must be specified
per operand.
per operand. Using ``NPY_ITER_READWRITE`` or ``NPY_ITER_WRITEONLY``
for a user-provided operand may trigger ``WRITEBACKIFCOPY``
semantics. The data will be written back to the original array
when ``NpyIter_Close`` is called.

.. c:var:: NPY_ITER_COPY

@@ -502,12 +507,12 @@ Construction and Destruction

Triggers :c:data:`NPY_ITER_COPY`, and when an array operand
is flagged for writing and is copied, causes the data
in a copy to be copied back to ``op[i]`` when ``NpyIter_Close`` is
called.

If the operand is flagged as write-only and a copy is needed,
an uninitialized temporary array will be created and then copied
back to ``op[i]`` on calling ``NpyIter_Close``, instead of doing
the unnecessary copy operation.

.. c:var:: NPY_ITER_NBO
@@ -754,10 +759,21 @@ Construction and Destruction

Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.

.. c:function:: int NpyIter_Close(NpyIter* iter)

Performs any required writeback. Must be called before
``NpyIter_Deallocate``. After this call it is not safe to use the operands.

Returns ``0`` on success or ``-1`` on failure.

.. c:function:: int NpyIter_Deallocate(NpyIter* iter)

Deallocates the iterator object. This additionally frees any
copies made, triggering UPDATEIFCOPY behavior where necessary.
Deallocates the iterator object.

``NpyIter_Close`` should be called before this. If not, and if writeback is
needed, it will be performed at this point in order to maintain
backward compatibility with older code, and a deprecation warning will be
emitted. Old code should be updated to call ``NpyIter_Close`` beforehand.

Returns ``NPY_SUCCEED`` or ``NPY_FAIL``.

49 changes: 41 additions & 8 deletions numpy/add_newdocs.py
@@ -319,8 +319,9 @@ def iter_add_py(x, y, out=None):
addop = np.add
it = np.nditer([x, y, out], [],
[['readonly'], ['readonly'], ['writeonly','allocate']])
with it:
    for (a, b, c) in it:
        addop(a, b, out=c)
    return it.operands[2]

Here is the same function, but following the C-style pattern::
@@ -344,13 +345,12 @@ def outer_it(x, y, out=None):

it = np.nditer([x, y, out], ['external_loop'],
               [['readonly'], ['readonly'], ['writeonly', 'allocate']],
               op_axes=[list(range(x.ndim)) + [-1] * y.ndim,
                        [-1] * x.ndim + list(range(y.ndim)),
                        None])

with it:
    for (a, b, c) in it:
        mulop(a, b, out=c)
    return it.operands[2]

>>> a = np.arange(2)+1
@@ -381,6 +381,32 @@ def luf(lamdaexpr, *args, **kwargs):
>>> luf(lambda i,j:i*i + j/2, a, b)
array([ 0.5, 1.5, 4.5, 9.5, 16.5])

If operand flags `"writeonly"` or `"readwrite"` are used, the operands may
be temporary copies of the original data, with the WRITEBACKIFCOPY flag set.
In this case nditer must be used as a context manager. The temporary
data will be written back to the original data when the ``__exit__``
method is called but not before::

>>> a = np.arange(6, dtype='i4')[::-2]
>>> with np.nditer(a, [],
...        [['writeonly', 'updateifcopy']],
...        casting='unsafe',
...        op_dtypes=[np.dtype('f4')]) as i:
...     x = i.operands[0]
...     x[:] = [-1, -2, -3]
...     # a still unchanged here
>>> a, x
(array([-1, -2, -3], dtype=int32), array([-1., -2., -3.], dtype=float32))

It is important to note that once the iterator is exited, dangling
references (like `x` in the example) may or may not share data with
the original data `a`. If writeback semantics were active, i.e. if
`x.base.flags.writebackifcopy` is `True`, then exiting the iterator
severs the connection between `x` and `a`, and writing to `x` will
no longer write to `a`. If writeback semantics are not active, then
`x.data` still points into some part of `a.data`, and writing to
one will affect the other.
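A minimal sketch of the severing behavior described above (assuming NumPy >= 1.15; the dtypes and values mirror the docstring example):

```python
import numpy as np

a = np.arange(6, dtype='i4')[::-2]   # non-contiguous int32 view: [5, 3, 1]
with np.nditer(a, [], [['writeonly', 'updateifcopy']],
               casting='unsafe', op_dtypes=[np.dtype('f4')]) as it:
    x = it.operands[0]               # float32 temporary, not `a` itself
    x[:] = [-1, -2, -3]              # `a` is still unchanged here

# Exiting wrote the temporary back into `a` and severed the link:
x[:] = 0                             # no longer affects `a`
print(a)                             # now holds -1, -2, -3
```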

""")

# nditer methods
@@ -524,6 +550,13 @@ def luf(lamdaexpr, *args, **kwargs):

""")

add_newdoc('numpy.core', 'nditer', ('close',
"""
close()

Resolve all writeback semantics in writeable operands.
Member: Perhaps reference the with statement here.

Member Author: fixed (in doc-nditer PR)
"""))


###############################################################################
3 changes: 3 additions & 0 deletions numpy/core/code_generators/cversions.txt
@@ -41,3 +41,6 @@
# Version 12 (NumPy 1.14) Added PyArray_ResolveWritebackIfCopy,
# PyArray_SetWritebackIfCopyBase and deprecated PyArray_SetUpdateIfCopyBase.
0x0000000c = a1bc756c5782853ec2e3616cf66869d8

# Version 13 (NumPy 1.15) Added NpyIter_Close
0x0000000d = 4386e829d65aafce6bd09a85b142d585
5 changes: 4 additions & 1 deletion numpy/core/code_generators/numpy_api.py
@@ -5,7 +5,8 @@

Whenever you change one index, you break the ABI (and the ABI version number
should be incremented). Whenever you add an item to one of the dict, the API
needs to be updated.
needs to be updated in both setup_common.py and by adding an appropriate
entry to cversion.txt (generate the hash via "python cversions.py".
Member: nit: missing paren

Member Author: What is "best practice" when fixing a merged PR, a new PR or modifying this one again?

Member: You can't modify an existing PR once it's merged. What I normally do is keep working in the same branch, but open a new PR.

Member Author: fixed (in doc-nditer PR)
When adding a function, make sure to use the next integer not used as an index
(in case you use an existing index or jump, the build will stop and raise an
@@ -349,6 +350,8 @@
'PyArray_ResolveWritebackIfCopy': (302,),
'PyArray_SetWritebackIfCopyBase': (303,),
# End 1.14 API
'NpyIter_Close': (304,),
# End 1.15 API
}

ufunc_types_api = {
3 changes: 2 additions & 1 deletion numpy/core/setup_common.py
@@ -40,7 +40,8 @@
# 0x0000000a - 1.12.x
# 0x0000000b - 1.13.x
# 0x0000000c - 1.14.x
C_API_VERSION = 0x0000000c
# 0x0000000d - 1.15.x
C_API_VERSION = 0x0000000d

class MismatchCAPIWarning(Warning):
pass