Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Create ufunc for struct array #462

Merged
merged 15 commits into from May 15, 2013

Conversation

Projects
None yet
3 participants
Contributor

jayvius commented Sep 27, 2012

This change allows ufuncs to be registered for structured arrays. For example, a ufunc could be registered to take two arrays of type 'u8,u8,u8' and return an array of type 'u8,u8,u8'.

The following are the key changes:

  1. Add new API method called PyUFunc_RegisterLoopForStructType. This method behaves the same as PyUFunc_RegisterLoopForType, except that instead of taking an array of type nums for the ufunc operands, it takes an array of struct dtype objects.

  2. Add arg_dtypes array to PyUFunc_Loop1d struct. This stores the dtype objects passed in through call to RegisterLoopForStructType.

  3. Add nargs int value to PyUFunc_Loop1d struct. This stores the number of input and output arguments to ufunc, so that dtype objects in PyUFunc_Loop1d struct can be DECREFed when PyUFUnc_Loop1d struct is destroyed.

  4. Add logic in ufunc_loop_matches function that checks struct dtype object in arg_dtypes if current operand being checked is type NPY_VOID.

The following is a trivial 1-d loop function example written in Cython that adds corresponding fields in two arrays of type 'u8,u8,u8':

cdef void add_uint64_triplet(char **args,
    npy_intp *dimensions,
    npy_intp *steps,
    void *innerloopdata):

    cdef npy_intp i
    cdef npy_intp is1=steps[0]
    cdef npy_intp is2=steps[1]
    cdef npy_intp os=steps[2]
    cdef npy_intp n=dimensions[0]
    cdef uint64_t *x, *y, *z

    cdef char *i1=args[0], *i2=args[1], *op=args[2]

    for i in range(n):

        # Initialize pointers to each operand
        x = <uint64_t*>i1
        y = <uint64_t*>i2
        z = <uint64_t*>op

        # Add each field in struct array record
        z[0] = x[0] + y[0]
        z[1] = x[1] + y[1]
        z[2] = x[2] + y[2]

        i1 += is1; i2 += is2; op += os


def test_ufunc:
    cdef PyArray_Descr *types[3]
    dtype = np.dtype('u8,u8,u8')

    types[0] = <PyArray_Descr*>dtype
    types[1] = <PyArray_Descr*>dtype
    types[2] = <PyArray_Descr*>dtype

    ufunc = PyUFunc_FromFuncAndData(NULL, NULL, NULL, 0,
                                        2, 1, PyUFunc_None,
                                        NULL, NULL, 0) 

    # Call new method for registering ufunc for struct array
    PyUFunc_RegisterLoopForStructType(<PyUFuncObject*>ufunc,
                                      <PyArray_Descr*>dtype,
                                      &add_uint64_triplet,
                                      <PyArray_Descr**>types,
                                      NULL)

    a = np.array([(1,2,3)], dtype=dtype)
    b = np.array([(1,2,3)], dtype=dtype)

    result = ufunc(a, b)

@87 87 commented on an outdated diff Oct 1, 2012

numpy/core/src/umath/ufunc_object.c
@@ -4226,6 +4269,100 @@ static int get_ufunc_arguments(PyUFuncObject *ufunc,
+/*UFUNC_API*/
@87

87 Oct 1, 2012

Contributor

I think you have to embed the UFUNC_API declaration into the comment below, like so:

/*UFUNC_API
 * This function ...
 */

Right now, it doesn't build, says Travis.

@charris charris commented on an outdated diff Apr 1, 2013

numpy/core/src/umath/ufunc_object.c
@@ -764,23 +764,56 @@ static int get_ufunc_arguments(PyUFuncObject *ufunc,
if (out_op[i] == NULL) {
return -1;
}
+
+ int type_num = PyArray_DESCR(out_op[i])->type_num;
@charris

charris Apr 1, 2013

Owner

Declaration of type_num needs to be moved to the top of block.

@charris charris commented on an outdated diff Apr 1, 2013

numpy/core/src/umath/ufunc_object.c
any_object = 1;
}
+
+ /*
+ * If any operand is a flexible dtype, check to see if any
+ * struct dtype ufuncs are registered. A ufunc has been registered
+ * for a struct dtype if ufunc's arg_dtypes array is not NULL.
+ */
+ if (PyTypeNum_ISFLEXIBLE(type_num) &&
+ !any_flexible_userloops &&
@charris

charris Apr 1, 2013

Owner

Need another indent for the continuation lines.

@charris charris commented on an outdated diff Apr 1, 2013

numpy/core/src/umath/ufunc_object.c
any_object = 1;
}
+
+ /*
+ * If any operand is a flexible dtype, check to see if any
+ * struct dtype ufuncs are registered. A ufunc has been registered
+ * for a struct dtype if ufunc's arg_dtypes array is not NULL.
+ */
+ if (PyTypeNum_ISFLEXIBLE(type_num) &&
+ !any_flexible_userloops &&
+ ufunc->userloops != NULL) {
+ PyUFunc_Loop1d *funcdata;
@charris

charris Apr 1, 2013

Owner

Broken indent. Any hard tabs here?

@charris

charris Apr 1, 2013

Owner

Never mind, the lack of indent in previous line confused me.

@charris charris commented on an outdated diff Apr 1, 2013

numpy/core/src/umath/ufunc_type_resolution.c
@@ -1181,7 +1181,7 @@
for (i = 0; i < nin; ++i) {
int type_num = dtypes[i]->type_num;
- if (type_num != last_userdef && PyTypeNum_ISUSERDEF(type_num)) {
+ if (type_num != last_userdef && (PyTypeNum_ISUSERDEF(type_num) || type_num == NPY_VOID)) {
@charris

charris Apr 1, 2013

Owner

Long line needs a break.

@charris charris commented on an outdated diff Apr 1, 2013

numpy/core/src/umath/ufunc_type_resolution.c
/*
* Copy the dtype from 'op' if the type_num matches,
* to preserve metadata.
*/
- if (op[i] != NULL && PyArray_DESCR(op[i])->type_num == type_nums[i]) {
+ } else if (op[i] != NULL && PyArray_DESCR(op[i])->type_num == type_nums[i]) {
@charris

charris Apr 1, 2013

Owner

Long line needs break.

@charris charris commented on an outdated diff Apr 1, 2013

numpy/core/src/umath/ufunc_type_resolution.c
@@ -1589,7 +1603,7 @@
for (i = 0; i < nin; ++i) {
int type_num = PyArray_DESCR(op[i])->type_num;
- if (type_num != last_userdef && PyTypeNum_ISUSERDEF(type_num)) {
+ if (type_num != last_userdef && (PyTypeNum_ISUSERDEF(type_num) || type_num == NPY_VOID)) {
@charris

charris Apr 1, 2013

Owner

Long line needs break.

@charris charris commented on an outdated diff Apr 1, 2013

numpy/core/src/umath/ufunc_type_resolution.c
&no_castable_output, &err_src_typecode,
&err_dst_typecode)) {
/* It works */
case 1:
- set_ufunc_loop_data_types(self, op, out_dtype, types);
+ set_ufunc_loop_data_types(self, op, out_dtype, types, NULL);
@charris

charris Apr 1, 2013

Owner

Long line needs break.

Owner

charris commented Apr 1, 2013

This looks reasonable to me, but it needs documention. The example might also provide both a test and example as this needs tests.

@charris charris commented on an outdated diff Apr 28, 2013

numpy/core/src/umath/ufunc_object.c
any_object = 1;
}
+
+ /*
+ * If any operand is a flexible dtype, check to see if any
+ * struct dtype ufuncs are registered. A ufunc has been registered
+ * for a struct dtype if ufunc's arg_dtypes array is not NULL.
+ */
+ if (PyTypeNum_ISFLEXIBLE(type_num) &&
+ !any_flexible_userloops &&
@charris

charris Apr 28, 2013

Owner

This and the next line should be indented another tab.

@charris charris commented on the diff Apr 28, 2013

numpy/core/src/umath/ufunc_object.c
+ key = PyInt_FromLong(type_num);
+ if (key == NULL) {
+ continue;
+ }
+ obj = PyDict_GetItem(ufunc->userloops, key);
+ Py_DECREF(key);
+ if (obj == NULL) {
+ continue;
+ }
+ funcdata = (PyUFunc_Loop1d *)NpyCapsule_AsVoidPtr(obj);
+ while (funcdata != NULL) {
+ if (funcdata->arg_dtypes != NULL) {
+ any_flexible_userloops = 1;
+ break;
+ }
+
@charris

charris Apr 28, 2013

Owner

Could drop blank line here.

@charris charris commented on an outdated diff Apr 28, 2013

numpy/core/src/umath/ufunc_object.c
@@ -4353,6 +4397,100 @@ static int get_ufunc_arguments(PyUFuncObject *ufunc,
#endif
+/*
+ * This function allows the user to register a 1-d loop for structured arrays
+ * with an already created ufunc. The ufunc is called whenever any of it's input
+ * arguments match the user_dtype argument.
+ * ufunc - ufunc object created from call to PyUFunc_FromFuncAndData
+ * user_dtype - struct dtype that ufunc will be registered with
+ * function - 1-d loop function pointer
+ * arg_dtypes - array of struct dtype objects describing the ufunc operands
+ * data - arbitrary data pointer passed in to loop function
+ */
@charris

charris Apr 28, 2013

Owner

Nice to see some documentation on the function. You could write this up like python function documentation, there is a plugin for doxygen to handle that.

@charris charris commented on an outdated diff Apr 28, 2013

numpy/core/src/umath/ufunc_object.c
+ */
+/*UFUNC_API*/
+NPY_NO_EXPORT int
+PyUFunc_RegisterLoopForStructType(PyUFuncObject *ufunc,
+ PyArray_Descr *user_dtype,
+ PyUFuncGenericFunction function,
+ PyArray_Descr **arg_dtypes,
+ void *data)
+{
+ int i;
+ int result = 0;
+ int *arg_typenums;
+ PyObject *key, *cobj;
+
+ if (user_dtype == NULL) {
+ PyErr_SetString(PyExc_TypeError, "unknown user defined struct dtype");
@charris

charris Apr 28, 2013

Owner

I've pretty much settled on putting the string on the next line in these cases, it makes later modification simpler and looks nice (IMHO).

@charris charris commented on an outdated diff Apr 28, 2013

numpy/core/src/umath/ufunc_object.c
+ for (i = 0; i < ufunc->nargs; i++) {
+ arg_typenums[i] = arg_dtypes[i]->type_num;
+ }
+ }
+ else {
+ for (i = 0; i < ufunc->nargs; i++) {
+ arg_typenums[i] = user_dtype->type_num;
+ }
+ }
+
+ result = PyUFunc_RegisterLoopForType(ufunc, user_dtype->type_num, function, arg_typenums, data);
+
+ if (result == 0) {
+ cobj = PyDict_GetItem(ufunc->userloops, key);
+ if (cobj == NULL) {
+ PyErr_SetString(PyExc_KeyError, "userloop for user dtype not found");
@charris

charris Apr 28, 2013

Owner

See comment above.

@charris charris commented on an outdated diff Apr 28, 2013

numpy/core/src/umath/ufunc_object.c
+ }
+
+ result = PyUFunc_RegisterLoopForType(ufunc, user_dtype->type_num, function, arg_typenums, data);
+
+ if (result == 0) {
+ cobj = PyDict_GetItem(ufunc->userloops, key);
+ if (cobj == NULL) {
+ PyErr_SetString(PyExc_KeyError, "userloop for user dtype not found");
+ result = -1;
+ }
+ else {
+ PyUFunc_Loop1d *current, *prev = NULL;
+ int cmp = 1;
+ current = (PyUFunc_Loop1d *)NpyCapsule_AsVoidPtr(cobj);
+ while (current != NULL) {
+ cmp = cmp_arg_types(current->arg_types, arg_typenums, ufunc->nargs);
@charris

charris Apr 28, 2013

Owner

Some long lines here.

@charris charris commented on an outdated diff Apr 28, 2013

numpy/core/src/umath/ufunc_object.c
+ }
+ else {
+ for (i = 0; i < ufunc->nargs; i++) {
+ current->arg_dtypes[i] = user_dtype;
+ Py_INCREF(current->arg_dtypes[i]);
+ }
+ }
+ current->nargs = ufunc->nargs;
+ }
+ else {
+ result = -1;
+ }
+ }
+ }
+
+ free(arg_typenums);
@charris

charris Apr 28, 2013

Owner

The free doesn't match the allocation.

Owner

charris commented May 3, 2013

@jayvius Need a rebase.

@charris charris and 1 other commented on an outdated diff May 9, 2013

doc/source/reference/c-api.ufunc.rst
@@ -140,6 +140,14 @@ Functions
in as *arg_types* which must be a pointer to memory at least as
large as ufunc->nargs.
+.. cfunction:: int PyUFunc_RegisterLoopForStructType(PyUFuncObject* ufunc,
@charris

charris May 9, 2013

Owner

Can this function also be used to register ordinary ufuncs as well as for the structured type? Just curious.,

@jayvius

jayvius May 10, 2013

Contributor

I haven't extensively tested it with user defined scalar types, but in theory it should work (I tried registering a ufunc for adding scalar rationals and it seemed to work okay).

@charris

charris May 10, 2013

Owner

I think that would be worth adding to the documentation then. It looks like a generalization of PyUFunc_RegisterLoopForType that allows registration of any dtype via PyArray_Descr. If so the name should probably be changed to reflect that generality. Maybe PyUFunc_RegisterLoopByDescr. See also issue #2506. Could add a test using the scalar rationals. Does the type still need to be registered first?

@charris charris commented on the diff May 9, 2013

doc/source/user/c-info.ufunc-tutorial.rst
+ #else
+ PyMODINIT_FUNC initstruct_ufunc_test(void)
+ #endif
+ {
+ PyObject *m, *add_triplet, *d;
+ PyObject *dtype_dict;
+ PyArray_Descr *dtype;
+ PyArray_Descr *dtypes[3];
+
+ #if defined(NPY_PY3K)
+ m = PyModule_Create(&moduledef);
+ #else
+ m = Py_InitModule("struct_ufunc_test", StructUfuncTestMethods);
+ #endif
+
+ if (m == NULL) {
@charris

charris May 9, 2013

Owner

Should just return m in the PY3K case.

@charris

charris May 9, 2013

Owner

nvm, just noticed end of function below..

@charris charris commented on the diff May 9, 2013

numpy/core/src/umath/struct_ufunc_test.c.src
+PyMODINIT_FUNC initstruct_ufunc_test(void)
+#endif
+{
+ PyObject *m, *add_triplet, *d;
+ PyObject *dtype_dict;
+ PyArray_Descr *dtype;
+ PyArray_Descr *dtypes[3];
+
+#if defined(NPY_PY3K)
+ m = PyModule_Create(&moduledef);
+#else
+ m = Py_InitModule("struct_ufunc_test", StructUfuncTestMethods);
+#endif
+
+ if (m == NULL) {
+#if defined(NPY_PY3K)
@charris

charris May 9, 2013

Owner

Should just return m in PY3K case.

@charris

charris May 9, 2013

Owner

Nvm, just noticed the end of this function below.

@charris charris commented on an outdated diff May 9, 2013

numpy/core/src/umath/ufunc_type_resolution.c
/*
* Copy the dtype from 'op' if the type_num matches,
* to preserve metadata.
*/
- if (op[i] != NULL && PyArray_DESCR(op[i])->type_num == type_nums[i]) {
+ } else if (op[i] != NULL &&
@charris

charris May 9, 2013

Owner
}
else if () {

@charris charris commented on an outdated diff May 9, 2013

numpy/core/src/umath/struct_ufunc_test.c.src
+
+ x = (uint64_t*)i1;
+ y = (uint64_t*)i2;
+ z = (uint64_t*)op;
+
+ z[0] = x[0] + y[0];
+ z[1] = x[1] + y[1];
+ z[2] = x[2] + y[2];
+
+ i1 += is1;
+ i2 += is2;
+ op += os;
+ }
+}
+
+/* This a pointer to the above function */
@charris

charris May 9, 2013

Owner

This is clear from the assignment. Might be better to say why it is stored here.

@charris charris commented on the diff May 10, 2013

numpy/core/src/umath/ufunc_object.c
+ int result = 0;
+ int *arg_typenums;
+ PyObject *key, *cobj;
+
+ if (user_dtype == NULL) {
+ PyErr_SetString(PyExc_TypeError,
+ "unknown user defined struct dtype");
+ return -1;
+ }
+
+ key = PyInt_FromLong((long) user_dtype->type_num);
+ if (key == NULL) {
+ return -1;
+ }
+
+ arg_typenums = PyArray_malloc(ufunc->nargs * sizeof(int));
@charris

charris May 10, 2013

Owner

Allocation success needs a check.

Owner

charris commented May 10, 2013

And needs a rebase. I suspect some of your commits step on each other.

Owner

charris commented May 12, 2013

What do you think about changing the name of the new functions since it isn't restricted to the structured dtype?

Contributor

jayvius commented May 12, 2013

Sounds good. I'll make that change today.

Owner

charris commented May 13, 2013

And now I'm thinking RegisterLoopForDescr would be more consistent :( What do you think? I can make the change if you agree.

BTW, github doesn't mail a notification after changes are pushed, so it is best to add a comment.

Owner

charris commented May 13, 2013

I can't quite figure out the test failures. One timed out with a double free error. I'm going to ping the Travis bot again.

@charris charris closed this May 13, 2013

@charris charris reopened this May 13, 2013

Owner

charris commented May 13, 2013

Python 2.6 gives this after test completion:

*** glibc detected *** python: corrupted double-linked list: 0x000000000561dd40 ***

And the other error looks legitimate also. I don't know what is going on here, it may be related to some other commit.

Contributor

jayvius commented May 13, 2013

I think RegisterLoopForDescr sounds better too. I'll go ahead and make the change when I figure out a fix for these build issues.

@charris charris commented on an outdated diff May 15, 2013

doc/source/reference/c-api.ufunc.rst
@@ -140,6 +140,16 @@ Functions
in as *arg_types* which must be a pointer to memory at least as
large as ufunc->nargs.
+.. cfunction:: int PyUFunc_RegisterLoopByDescr(PyUFuncObject* ufunc,
@charris

charris May 15, 2013

Owner

Needs the new name.

@charris charris commented on an outdated diff May 15, 2013

doc/source/user/c-info.ufunc-tutorial.rst
@@ -884,6 +884,171 @@ as well as all other properties of a ufunc.
}
#endif
+
+.. _`sec:Numpy-struct-dtype`:
+
+Example Numpy ufunc with structured array dtype arguments
+=========================================================
+
+This example shows how to create a ufunc for a structured array dtype.
+For the example we show a trivial ufunc for adding two arrays with dtype
+'u8,u8,u8'. The process is a bit different from the other examples since
+a call to PyUFunc_FromFuncAndData doesn't fully register ufuncs for
+custom dtypes and structured array dtypes. We need to also call
+PyUFunc_RegisterLoopByDescr to finish setting up the ufunc.
@charris

charris May 15, 2013

Owner

Name change needed.

Owner

charris commented May 15, 2013

Tests pass, yay. The function still needs the rename.

Contributor

jayvius commented May 15, 2013

Rename done

Owner

charris commented May 15, 2013

OK, in it goes. Now we can start the debugging ;) BTW, I suspect that the bento scripts may need to be fixed as Ralph did for the first commit in this series.

Thanks.

charris added a commit that referenced this pull request May 15, 2013

@charris charris merged commit c546a22 into numpy:master May 15, 2013

1 check passed

default The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment