Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: Warn if out argument has refcnt 1 and owns its data. #9632

Closed
wants to merge 3 commits into from
Closed

MAINT: Warn if out argument has refcnt 1 and owns its data. #9632

wants to merge 3 commits into from

Conversation

MSeifert04
Copy link
Contributor

@MSeifert04 MSeifert04 commented Aug 31, 2017

Based on the feedback in #7244 but it doesn't actually solve it (see below) emit a RuntimeWarning in case the out argument seems dubious. Note that this test can lead to false positives in case someone actually catches the result:

res = np.add([1,2], [3,4], out=np.empty(2))

And it won't trigger if someone actually makes a new array and creates a view on that:

res = np.add([1,2], [3,4], out=np.empty(2).ravel())

with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always')
np.add(arr, arr, out=np.empty(3))
assert len(w) == 1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably needs to be skipped on non-refcounting Pythons (PyPy).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also just use assert_warns (or also suppress_warnings since you also match the text). I just like it a bit more:

with suppress_warnings() as sup:
    r = sup.record(RutimeWarning, ".*the result may be lost")
    ...
assert_(len(r) == 1)

but I do not care much about it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yes, you have to check for refcounting of the interpreter (I am not sure if the C-Code needs something here).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we allow assert statements nowadays? I something think we might, but we used to not use them (since they are more commonly used with pytest?)

@MSeifert04
Copy link
Contributor Author

MSeifert04 commented Aug 31, 2017

and no idea why this fails on Python 3.6. Did they change something in warnings.catch_warnings?

Also if that change would be too disruptive I won't mind if it's declined.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why it fails in 3.6, there were some minor changes, but I do not really see them affecting the tests/catch_warnings in any case.

I think I like the change as such, there is practically no use case for this case, so it probably almost always will be a programming oversight, in that sense, I should think a bit about what the warning should say.

The PyPy stuff needs to be fixed (I am not sure what Py_REFNT does on PyPy, but it might be best to just put the whole thing into an #ifndef block.

@@ -887,7 +892,9 @@ def test_unary_ufunc_where_same(self):
ufunc = np.invert

def check(a, out, mask):
c1 = ufunc(a, out=out.copy(), where=mask.copy())
with warnings.catch_warnings(record=True) as w:
warnings.simplefilter("always")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use:

with suppress_warnings() as sup:
    sup.filter(WarningType, [warning_message_regexp])

its not a big thing, but its nicer since it is more specific (especially if you put the message text).

with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always')
np.add(arr, arr, out=np.empty(3))
assert len(w) == 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also just use assert_warns (or also suppress_warnings since you also match the text). I just like it a bit more:

with suppress_warnings() as sup:
    r = sup.record(RutimeWarning, ".*the result may be lost")
    ...
assert_(len(r) == 1)

but I do not care much about it.

with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always')
np.add(arr, arr, out=np.empty(3))
assert len(w) == 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yes, you have to check for refcounting of the interpreter (I am not sure if the C-Code needs something here).

with warnings.catch_warnings(record=True) as w:
warnings.simplefilter('always')
np.add(arr, arr, out=np.empty(3))
assert len(w) == 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we allow assert statements nowadays? I something think we might, but we used to not use them (since they are more commonly used with pytest?)

@MSeifert04
Copy link
Contributor Author

MSeifert04 commented Sep 6, 2017

Interesting. It really doesn't warn on Python3.6. I'm not sure what the reason for this is though. Probably the out argument has a refcount of more than 1...

@MSeifert04
Copy link
Contributor Author

@seberg Thanks for all the suggestions. I should've investigated how NumPy does these warning tests before submitting the PR.

@MSeifert04
Copy link
Contributor Author

MSeifert04 commented Sep 6, 2017

okay, some investigation later ... it's really that a simple example like this:

>>> import numpy as np
>>> np.add([1,2,3], [1,2,3], out=np.empty(3))
array([ 2.,  4.,  6.])

doesn't warn on Python 3.6 because the reference count for out is 2. Strangely it seems like Python 3.6 somewhere adds one reference. Even in the normal case

>>> arr = np.empty(3)
>>> np.add([1,2,3], [1,2,3], out=arr)

The out argument has a reference count of 3 there.

@njsmith
Copy link
Member

njsmith commented Sep 6, 2017

@MSeifart04: Probably there's one reference for the value on the stack, and one for the value passed into numpy, or something like that.

I wonder if this means that @juliantaylor's temporary elision is broken on 3.6...

@charris
Copy link
Member

charris commented Sep 6, 2017

I think elision require big arrays. I don't see that here

In [14]: arr = np.empty(3)

In [15]: sys.getrefcount(arr)
Out[15]: 2

In [16]: for i in range(10):
    ...:     np.add([1,2,3], [1,2,3], out=arr)
    ...:     

In [17]: sys.getrefcount(arr)
Out[17]: 2

EDIT: Python 3.6.2

@MSeifert04
Copy link
Contributor Author

MSeifert04 commented Sep 6, 2017

@charris

I tested it with this change:

@@ -501,10 +501,11 @@ _set_out_array(PyObject *obj, PyArrayObject **store)
         /* Translate None to NULL */
         return 0;
     }
     if PyArray_Check(obj) {
 #ifndef PYPY_VERSION
+        printf("refcount: %i\n", (int)Py_REFCNT(obj));
         if (Py_REFCNT(obj) == 1 &&
                 (PyArray_FLAGS((PyArrayObject *)obj) & NPY_ARRAY_OWNDATA)) {
             if (PyErr_WarnEx(PyExc_RuntimeWarning,
                              "the output array seems to be a new array, if the "
                              "returned values of the ufunc are not assigned to "

with these results:

Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 12:30:02) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.add([1,2,3], [1,2,3], out=np.empty(3))
refcount: 2
array([ 2.,  4.,  6.])
>>> arr = np.empty(3)
>>> np.add([1,2,3], [1,2,3], out=arr)
refcount: 3
array([ 2.,  4.,  6.])

Based on this PR.

On Python 3.5 the same test gave different results:

Python 3.5.4 |Continuum Analytics, Inc.| (default, Aug 14 2017, 13:41:13) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.add([1,2,3], [1,2,3], out=np.empty(3))
refcount: 1
__main__:1: RuntimeWarning: the output array seems to be a new array, if the returned values of the ufunc are not assigned to a new variable the results may be lost
array([ 2.,  4.,  6.])
>>> arr = np.empty(3)
>>> np.add([1,2,3], [1,2,3], out=arr)
refcount: 2
array([ 2.,  4.,  6.])

@MSeifert04
Copy link
Contributor Author

@njsmith Yes, I was wondering. Not sure if it's maybe related to the _FastCall optimizations though.

@MSeifert04
Copy link
Contributor Author

Probably not a "good solution" but I skipped the assert on 3.6. At least it's passing the tests.

It would be possible to change it so that it asserts "no warning" on 3.6+ in case the reference counting is updated again and the out has the "expected reference count" again. Or should I just check for refcnt 2 in 3.6+?

@njsmith
Copy link
Member

njsmith commented Sep 9, 2017

I suspect x = ...; np.add(..., out=x) will have refcnt 2 on cython.

I dunno, if this is fragile enough that our choices are to either encode quirky interpreter details, or else give up on supporting it on the most recent versions of Python, then is it worth doing? Presumably in a few years then all the Python versions we support will be like 3.6...

@MSeifert04
Copy link
Contributor Author

then is it worth doing?

Probably not.

@MSeifert04 MSeifert04 closed this Sep 9, 2017
@njsmith
Copy link
Member

njsmith commented Sep 9, 2017

@MSeifert04 It was a nice idea; sorry it didn't work out.

@MSeifert04 MSeifert04 deleted the testbranch branch September 11, 2017 00:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants