-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Should cython object buffers check for NULL? #4858
Comments
I think this would be the right changes (probably, I did not test it thoroughly): diff --git a/Cython/Compiler/ExprNodes.py b/Cython/Compiler/ExprNodes.py
index c20a76bd4..19aa787e0 100644
--- a/Cython/Compiler/ExprNodes.py
+++ b/Cython/Compiler/ExprNodes.py
@@ -4578,17 +4578,17 @@ class BufferIndexNode(_IndexingBaseNode):
buffer_entry, ptrexpr = self.buffer_lookup_code(code)
if self.buffer_type.dtype.is_pyobject:
- # Must manage refcounts. Decref what is already there
- # and incref what we put in.
+ # Must manage refcounts. XDecref what is already there
+ # and incref what we put in (NumPy allows there to be NULL)
ptr = code.funcstate.allocate_temp(buffer_entry.buf_ptr_type,
manage_ref=False)
rhs_code = rhs.result()
code.putln("%s = %s;" % (ptr, ptrexpr))
- code.put_gotref("*%s" % ptr, self.buffer_type.dtype)
- code.putln("__Pyx_INCREF(%s); __Pyx_DECREF(*%s);" % (
+ code.put_xgotref("*%s" % ptr, self.buffer_type.dtype)
+ code.putln("__Pyx_INCREF(%s); __Pyx_XDECREF(*%s);" % (
rhs_code, ptr))
code.putln("*%s %s= %s;" % (ptr, op, rhs_code))
- code.put_giveref("*%s" % ptr, self.buffer_type.dtype)
+ code.put_xgiveref("*%s" % ptr, self.buffer_type.dtype)
code.funcstate.release_temp(ptr)
else:
# Simple case
@@ -4609,8 +4609,11 @@ class BufferIndexNode(_IndexingBaseNode):
# is_temp is True, so must pull out value and incref it.
# NOTE: object temporary results for nodes are declared
# as PyObject *, so we need a cast
- code.putln("%s = (PyObject *) *%s;" % (self.result(), self.buffer_ptr_code))
- code.putln("__Pyx_INCREF((PyObject*)%s);" % self.result())
+ res = self.result()
+ code.putln("%s = (PyObject *) *%s;" % (res, self.buffer_ptr_code))
+ # NumPy does (occasionally) allow NULL to denote None.
+ code.putln("if (%s == NULL) %s = Py_None;" % (res, res))
+ code.putln("__Pyx_INCREF((PyObject*)%s);" % res)
def free_subexpr_temps(self, code):
for temp in self.index_temps: I had a brief look at the test, but would have to look more to figure out how to get the NULLs in there best. |
While NumPy tends to not actively create object buffers initialized only with NULL (rather than filled with None), at least older versions of NumPy did do that. And NumPy guards against this. This guards against embedded NULLs in object buffers interpreting a NULL as None (and anticipating a NULL value also when setting the buffer for reference count purposes). Closes cythongh-4858
* BUG: Fortify object buffers against included NULLs While NumPy tends to not actively create object buffers initialized only with NULL (rather than filled with None), at least older versions of NumPy did do that. And NumPy guards against this. This guards against embedded NULLs in object buffers interpreting a NULL as None (and anticipating a NULL value also when setting the buffer for reference count purposes). Closes gh-4858
* BUG: Fortify object buffers against included NULLs While NumPy tends to not actively create object buffers initialized only with NULL (rather than filled with None), at least older versions of NumPy did do that. And NumPy guards against this. This guards against embedded NULLs in object buffers interpreting a NULL as None (and anticipating a NULL value also when setting the buffer for reference count purposes). Closes gh-4858
I guess the follow-up question is: is there an equivalent issue with object memoryviews? |
Oh, this only fixes |
I believe so (although I haven't actually tested what typed memoryviews do). I'll re-open this issue and so it covers them too |
So it looks like this did affect memoryviews too (which I'm slightly surprised at but I guess is good). So false alarm I think |
I'll just create a copy of the tests for memoryviews and then it can be closed properly |
I think we can close this now, thanks! Will try to remember to ping the pandas folks when I see the release, it sounds like they can remove some awful hacks then ;). |
Pandas seems to semi-regularly run into issues with NumPy's logic which currently says that object arrays may be filled with
NULL
at initialization so thatNULL
is accepted everywhere to have the same meaning asNone
.(See also pandas-dev/pandas#47097)
This is a bit weird, since NumPy also fills the array with
None
almost always, so in the few places where it doesn't it is unexpected!I have opened numpy/numpy#21817 to solve this in NumPy. The intention would be that NumPy for now should accept
NULL
, but defines it as incorrect and will never produce it on its own (with some "internal" exceptions).Now, I am not sure what the best solution is here and if you think that Cython should fix this (or we should do both), I can look into it.
In some sense, a fix in cython might be best... then pandas can just upgrade its Cython dependency and stop worrying about these oddities.
The text was updated successfully, but these errors were encountered: