WIP: MAINT: avoid numpy internals in {sg}et_array_base #2528

Merged
merged 3 commits into from Aug 10, 2018

Conversation

mattip
Contributor

@mattip mattip commented Aug 1, 2018

Following the discussion in issue #2498, it seems Cython accesses the base field of the ndarray struct directly in set_array_base and get_array_base. This bypasses most of NumPy's error checks and can violate the assumptions NumPy makes about the correctness of the base attribute.

This PR refactors the functions to use NumPy APIs. I am not sure it is correct: I could find no tests for the functionality, and truthfully I struggle to see when these functions could be used properly. I would prefer to deprecate them.

NumPy uses the base attribute internally when

  • creating a view, where self and base share the same data attribute, or
  • creating temporary buffer data in ufuncs via writeback semantics.

The simple code in these functions does none of the data attribute manipulations done by those two internal uses.
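
As a small illustration of the first internal use listed above, the following sketch shows how a view records its owner in `base` at the Python level. It is written as plain Python, which is also valid Cython; the names `owner` and `view` are placeholders chosen here, not taken from the PR.

```cython
import numpy as np

owner = np.arange(10)   # allocates and owns its data buffer
view = owner[2:5]       # a view that shares owner's data buffer

# An array that owns its data has no base; the view points back at owner,
# which keeps the shared buffer alive for as long as the view exists.
assert owner.base is None
assert view.base is owner
assert np.shares_memory(view, owner)
```

Here NumPy sets both the data pointer and `base` together when it creates the view, which is the coupling that a bare assignment to the struct field skips.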

@mattip
Contributor Author

mattip commented Aug 9, 2018

Any ideas why the C++ tests are failing but the C tests are passing?

@scoder
Contributor

scoder commented Aug 9, 2018

The tests fail with this error:

convolve2.cpp: In function "void __pyx_f_5numpy_set_array_base(PyObject*, PyObject*)":
convolve2.cpp:4563:94: error: cannot convert "PyObject*" {aka "_object*"} to "PyArrayObject*" {aka "tagPyArrayObject_fields*"} in argument passing
   (void)(PyArray_SetBaseObject(__pyx_v_arr, __pyx_v_base));

C isn't as strict as C++ here.

    else:
        return <object>arr.base
cdef inline object get_array_base(object arr):
    return <object>arr.base
Contributor

No need to cast here, since this is clearly using Python object access, which returns an … object! :)
But why isn't this using PyArray_BASE()? (It is probably not expected to occur in performance-critical code …)

Contributor Author

I wanted it to go through the Python-level attribute access, but maybe that is extreme. I will redo this with PyArray_BASE.

@@ -719,6 +719,7 @@ cdef extern from "numpy/arrayobject.h":
    object PyArray_CheckAxis (ndarray, int *, int)
    npy_intp PyArray_OverflowMultiplyList (npy_intp *, int)
    int PyArray_CompareString (char *, char *, size_t)
    int PyArray_SetBaseObject(object, object)
Contributor

The first argument is an ndarray.

Contributor Author

fixed

@@ -973,22 +974,12 @@ cdef extern from "numpy/ufuncobject.h":

    int _import_umath() except -1

cdef inline void set_array_base(object arr, object base):
    Py_INCREF(base)
    PyArray_SetBaseObject(arr, base)
Contributor

Just keep ndarray arr in the function signature of set_array_base(). That's what the signature was, and that's also what's needed for this call.

Contributor Author

ok
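
The setter discussed in this thread can be sketched roughly as follows, with the first parameter typed as an ndarray so that the generated call also compiles as C++. This is a minimal sketch, not the merged code: `cnp`, `_set_base_object`, and `sketch_set_array_base` are placeholder names chosen here to avoid clashing with the declarations in `numpy.pxd`. NumPy documents `PyArray_SetBaseObject` as stealing a reference to `base`, which is why the `Py_INCREF` comes first. Unlike the diff above, the sketch propagates the -1 error return instead of discarding it; that is a choice made for the example, not something the PR does.

```cython
from cpython.ref cimport Py_INCREF
cimport numpy as cnp

# Initialise the NumPy C-API table before any PyArray_* API function is called.
cnp.import_array()

cdef extern from "numpy/arrayobject.h":
    # Placeholder alias for PyArray_SetBaseObject (0 on success, -1 on error).
    # Typing the first argument as ndarray makes Cython pass a PyArrayObject*.
    int _set_base_object "PyArray_SetBaseObject" (cnp.ndarray arr, object base) except -1

cdef inline int sketch_set_array_base(cnp.ndarray arr, object base) except -1:
    # The call below steals a reference to base, so take one first;
    # otherwise the caller's reference would be consumed.
    Py_INCREF(base)
    return _set_base_object(arr, base)
```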

@mattip
Contributor Author

mattip commented Aug 9, 2018

Thanks for the help. I admit not knowing how the failing tests are generated, so I couldn't run them before pushing. I also left some lingering questions around the use of PyArray_BASE and refcounts.

@mattip
Contributor Author

mattip commented Aug 10, 2018

Reading the documentation helps; I am now running the tests locally.

@@ -395,7 +395,7 @@ cdef extern from "numpy/arrayobject.h":
    npy_intp PyArray_DIM(ndarray, size_t)
    npy_intp PyArray_STRIDE(ndarray, size_t)

    # object PyArray_BASE(ndarray) wrong refcount semantics
    object PyArray_BASE(ndarray) #wrong refcount semantics?
Contributor

The NumPy documentation isn't clear here, but I assume that PyArray_BASE() is just a macro that returns a borrowed reference? If so, then the correct return type here is PyObject * and not a normally refcounted object. You'll then have to cast it to <object> on use, which will turn it into an owned reference.

Contributor Author

Yes, it's just a macro. Reverting to the old logic.

    return <object>arr.base

    base = PyArray_BASE(arr)
    # Do we need to convert NULL -> None?
Contributor

I guess so, as it was done before.


    base = PyArray_BASE(arr)
    # Do we need to convert NULL -> None?
    # Do we need to incref base or is that done by cython?
Contributor

See my comment regarding the macro signature.
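
Putting the review comments on PyArray_BASE together, the getter could look roughly like the sketch below. It assumes `PyArray_BASE` returns a borrowed pointer, as suggested in the thread; `cnp`, `_borrowed_base`, and `sketch_get_array_base` are placeholder names chosen here so the example does not clash with `numpy.pxd`. Declaring the return type as `PyObject *` keeps Cython from treating the result as an owned reference, and the `<object>` cast is the point at which Cython takes its own reference, so no explicit incref is written.

```cython
from cpython.ref cimport PyObject
cimport numpy as cnp

cdef extern from "numpy/arrayobject.h":
    # Placeholder alias for the PyArray_BASE accessor. A PyObject* return type
    # tells Cython this is a raw, borrowed pointer that may be NULL.
    PyObject *_borrowed_base "PyArray_BASE" (cnp.ndarray arr)

cdef inline object sketch_get_array_base(cnp.ndarray arr):
    cdef PyObject *base = _borrowed_base(arr)
    if base is NULL:
        # No base has been set: report None, as the earlier code did.
        return None
    # Casting to <object> makes Cython hold a new reference of its own.
    return <object>base
```

Had the accessor been declared as returning `object`, Cython would assume it received a new reference and release it later, which is the refcount problem flagged in the comment on the declaration.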

@mattip
Contributor Author

mattip commented Aug 10, 2018

It seems the clang failures are unrelated.
