Ctypes callback #190

Merged
merged 1 commit into from Jun 3, 2012

3 participants

@teoliphant
SciPy member

This Pull Request allows integrate.quad to by-pass the Python-layer when given a Ctypes function pointer of the correct signature. When the callable passed to integrate.quad is a ctypes function pointer, it is checked for the correct signature, and then the raw C-function is handed to the underlying integration routine which does not call-back into Python.

@rgommers
SciPy member

On the C code changes I can't really comment (except that it looks useful).

The test is going to fail on some systems, because ctypes is not always available. So the import should be conditional, and the test should be decorated with @skipif.

About the timing test, doesn't that belong under benchmarks instead of tests? Each module has a bench() function for this sort of thing.

@pv pv and 1 other commented on an outdated diff Apr 7, 2012
scipy/integrate/__quadpack.h
+}
+
+#define CHECK_FUNC(func_type, fcn) { \
+ func_type = get_func_type(fcn); \
+ if (func_type == ERROR) return NULL; \
+ if (func_type == NOTCALLABLE) { \
+ PyErr_SetString(PyExc_TypeError, "quad: first argument is not callable"); \
+ return NULL; \
+ } \
+ if (func_type == INVALID_CTYPE) { \
+ PyErr_SetString(PyExc_TypeError, \
+ "quad: first argument is a ctypes function pointer with incorrect signature"); \
+ return NULL; \
+ }}
+
+/* Just the first part of the ctypes CFuncPtrObject */
@pv
SciPy member
pv added a note Apr 7, 2012

This hack should be replaced with a call to ctypes.addressof via Python space -- I'd like to avoid mucking around with CPython internals.

@teoliphant
SciPy member

This part is not a hack --- look at get_func_type again. It's just making calls to the ctypes module to identify what fcn is (a callable, a ctype-function with the right signature, or whatnot).

It sounds like you might be talking about the _GET_FUNC macro which is assuming something about the structure of ctypes objects. We could use ctypes.addressof via Python to get the function pointer, but that is a lot of C-code and reference count manipulation when it's a simple pointer de-reference in C-space. It would be very surprising if ctypes ever changed it's ABI so that the pointer immediately following PyObject_HEAD is not the function pointer we are after.

I just don't see it as a major problem in this case.

@pv
SciPy member
pv added a note Apr 9, 2012

Yep, it's about the _GET_FUNC below. I agree with you that it's unlikely that the ctypes API will change. However, this call is not on a hot path, so I would rather prefer the code use public APIs, if the performance gain from the hack is not significant.

@teoliphant
SciPy member

This is going to come-up again on the NumPy list. I'm not excited about being dependent on a slow Python-only API to get this information out of CTypes whenever we need it --- so we are going to do something like this in NumPy.

One thought is to do this in NumPy and then have SciPy rely on the NumPy API. This seems like a reasonable path forward.

@teoliphant
SciPy member

To that end, I will create a function for getting the raw function pointer using the Python code: ctypes.cast(func, ctypes.c_void_p).value. This is a bit un-pleasant in C and I would rather replace the code with something more direct (like an API that is supported somewhere else --- perhaps in NumPy).

@pv
SciPy member
pv added a note Apr 9, 2012

Ok, sounds good. Doing Python on C level is indeed messy (a similar situation with no C-level APIs occurs also with I/O on Python 3). It's probably best to move grabbing the pointer to a helper routine in a separate utilities header file. Also using the direct access hack could be considered if it is better encapsulated this way. But anyway, this is a minor issue, so it's best not to waste too much time on it.

@teoliphant
SciPy member

I wrote a C-function to do the casting and my timing test is now 4x slower instead of 2x faster. Of course this is in a loop of 100 calls to the integration routine for a simple function.

This is really unfortunate. Let me see if I can play with it some more.

@teoliphant
SciPy member

I changed the code back to using the direct function-pointer access. It was much, much faster. It makes me also wonder how much it is costing to do the ctypes type checks as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@pv pv and 1 other commented on an outdated diff Apr 7, 2012
scipy/integrate/__quadpack.h
@@ -67,6 +67,94 @@
already_printed_python_error = 0;\
}
+#define VALID_CTYPE 1
@pv
SciPy member
pv added a note Apr 7, 2012

Maybe enum?

@teoliphant
SciPy member

Sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@pv pv and 1 other commented on an outdated diff Apr 7, 2012
scipy/integrate/__quadpack.h
- if (PyErr_Occurred()) {
- ier = 80; /* Python error */
- PyErr_Clear();
+ if (PyErr_Occurred()) {
+ ier = 80; /* Python error */
+ PyErr_Clear();
+ }
+ }
+ else { /* func_type == VALID_CTYPE */
+ /* Can't allow another thread to run because of the global variables
+ quadpack_raw_function and quad_function2 being used */
@pv
SciPy member
pv added a note Apr 7, 2012

Could also this be made re-entrant by storing the quadpack_raw_function in store_quadpack_globals, like is done for the Python function? Granted, it's less likely that someone calls quad again from C code, but it could happen.

A thunk library would make life easier...

@teoliphant
SciPy member

Yes, I thought about that. It would be easy enough to make it re-entrant and probably worth it, even with the low-probability event.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@teoliphant
SciPy member

Ralf and Pauli thanks for the feedback. I've commented on Pauli's notes in-line.

Ralf, thanks for the skipif suggestion. The timing test is not really a bench-mark. It's more of a "is this code calling the C-function pointer directly, or is it interpreting the Ctypes object as a Python-callback". So, I'm thinking of it as a test to see whether or not the C-function pointer is being called directly by QUADPACK. But, other ways to test this would be helpful. The speed test is a bit brittle.

@charris charris and 1 other commented on an outdated diff Apr 9, 2012
scipy/integrate/__quadpack.h
@@ -20,35 +20,35 @@
*/
#if defined(NO_APPEND_FORTRAN)
- #if defined(UPPERCASE_FORTRAN)
- /* nothing to do here */
- #else
- #define DQAGSE dqagse
- #define DQAGIE dqagie
- #define DQAGPE dqagpe
- #define DQAWOE dqawoe
- #define DQAWFE dqawfe
- #define DQAWSE dqawse
- #define DQAWCE dqawce
- #endif
+ #if defined(UPPERCASE_FORTRAN)
+ /* nothing to do here */
+ #else
+ #define DQAGSE dqagse
@charris
SciPy member
charris added a note Apr 9, 2012

Tabs?

@teoliphant
SciPy member

I'm not really sure what is going on here. Perhaps there are tabs in the original file. These have been replaced with spaces now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@charris charris commented on an outdated diff Apr 9, 2012
scipy/integrate/__quadpack.h
+ Returns INVALID_CTYPE if it is a CType Function but of the incorrect type
+ Returns NOTCALLABLE if it is not a Python callable
+ Returns ERROR if other error occurs.
+*/
+
+static int
+get_func_type(PyObject *func) {
+ PyObject *ctypes_module, *CFuncPtr, *check, *c_double;
+ int result;
+
+ if (!PyCallable_Check(func)) return NOTCALLABLE;
+ ctypes_module = PyImport_ImportModule("ctypes");
+ if (ctypes_module == NULL) return ERROR;
+ CFuncPtr = PyObject_GetAttrString(ctypes_module, "_CFuncPtr");
+ if (CFuncPtr == NULL) {
+ Py_DECREF(ctypes_module);
@charris
SciPy member
charris added a note Apr 9, 2012

Indentation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@charris charris and 1 other commented on an outdated diff Apr 9, 2012
scipy/integrate/__quadpack.h
+ goto fail;
+
+ return VALID_CTYPE;
+
+fail:
+ Py_DECREF(check);
+ Py_XDECREF(c_double);
+ return INVALID_CTYPE;
+}
+
+#define CHECK_FUNC(func_type, fcn) { \
+ func_type = get_func_type(fcn); \
+ if (func_type == ERROR) return NULL; \
+ if (func_type == NOTCALLABLE) { \
+ PyErr_SetString(PyExc_TypeError, "quad: first argument is not callable"); \
+ return NULL; \
@charris
SciPy member
charris added a note Apr 9, 2012

Putting control transfer in a macro is just sinful. I cry every time I see this in Numpy, and I'm sure that somewhere a puppy dies.

@teoliphant
SciPy member

That is a bit melo-dramatic. However, this is not general "control transfer" it's early exit from the function because of errors. It's semantically contained and easily identified upon reviewing the code. The point of any software engineering "ethics" should be to make code easier to write, read, and maintain. You always have to balance many things in that process. I find that zealous over-adherence to any one principle typically causes unintended effects that make things more difficult.

In this particular case, do you have an alternative suggestion that does not require removing the benefit of the macro in encapsulating code that is identical?

@charris
SciPy member
charris added a note Apr 9, 2012

What benefits? Modern compilers will inline small functions without prompting, plus you get type checking on the arguments and your eye can follow the flow of control without reference to some macro off in the woods. Really, I'm trying to break you of your macro habit, they simply aren't necessary for efficiency and when used like this obfuscate the code.

@teoliphant
SciPy member

I told you the benefit. It prevents writing a lot of similar code over and over again. I'm not convinced that being able to "follow the flow of control" actually matters when the flow on error is out-of-the-function.

Any desire to "break" a habit requires first seeing the need to. You do a lot of complaining about style, code re-writing, but not a lot of demonstrating a better approach. This rarely works and generally alienates the people you are purportedly trying to "help". It also takes a lot of time and resources -- resources that are already in short supply. You have to also look at how often someone will care about a particular piece of code. The fact that this bit of code (its supposedly ugly macros and all) has not been really touched in 10 years says something about the need to spend time agonizing over how it gets written.

I'm left to do the work of inferring what you would rather see. Are you preferring to see something like:

if (!check_func(fcn)) return NULL;

where check_func is a regular function? How else would you do things?

-Travis

@teoliphant
SciPy member

I've got code working that does the check_func and replaces that macro. I do not know how to replace the other macros and keep the benefit of storage variables and error-returning from within the other macros without major code re-design. Eventually, this C-code might be replaced with Cython anyway, and I'd rather not spend more time on it.

@teoliphant
SciPy member

The presumed check_func function and the get_func_type function have been merged into a single function (no macros). There was the need to create a macro to do the temporary function storage trick which is similar to the python-function code-path. If a better approach can be written, I would be happy to see it.

@teoliphant
SciPy member

Chuck: Sorry to be a bit prickly before on your observations about the use of Macros. My tendency is towards getting as quickly as possible to working code. Coupling that with most of the code being written when I knew less about the real reasons for object-oriented style and when it matters and when it doesn't often leaves me copying style instead of trying to improve it. I'm sure I'll continue to be more focused on getting something working rather than style, but I don't want to discourage you (or really anyone else as I know you are unflappable) from voicing your stylistic opinions.

@charris
SciPy member
charris added a note Apr 10, 2012

Not to worry. I figure it's my job to play the critic in these situations as your standing and reputation could easily intimidate some folks. Now I'm waiting for Mark to instruct me in the finer points...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@charris charris and 1 other commented on an outdated diff Apr 10, 2012
scipy/integrate/quadpack.h
@@ -60,6 +59,18 @@
quadpack_extra_arguments = store_quadpack_globals[1]; \
memcpy(&quadpack_jmpbuf, &store_jmp, sizeof(jmp_buf));
+#define INIT_CTYPES_FUNC(fcn) \
@charris
SciPy member
charris added a note Apr 10, 2012

In C++ this would be a class with a constructor and a restore method. I'd suggest a C implementation along the lines:

struct cfunc_store {
    void *qfunc;
    void *cfunc;
}

NPY_INLINE int
cfunc_store_init(struct cfunc_store *store, void *cfunc, void *qfunc)
{
    store->qfunc = qfunc;
    if ((store->cfunc = get_ctypes_function_pointer(cfunc)) == NULL) {
        return -1;
    }
    return 0;
}

NPY_INLINE void
cfunc_store_restore(struct cfunc_store *store)
{
    store->cfunc = store->qfunc;
}
@teoliphant
SciPy member

Thanks. This is helpful. In just trying to get something to work, I was just mirroring the code that was there (from 1999). I'm not sure if it's the best use of time or not, but I agree it's better to define a structure like this. I'm changing all the code to reflect this style.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@teoliphant teoliphant ENH: Add ability to use ctypes function pointers in quadpack and clea…
…n-up quadpack to use Object-style C instead of Macros.
9ba5fcc
@teoliphant
SciPy member

O.K. I've redone the PR to address the major concerns. The ctypes implementation dependent stuff is reduced to a single function which could be updated to use NumPy's function call when it is created.

@pv
SciPy member
pv commented May 19, 2012

Is there something additional that needs to be done to this? As far as I see, we don't need to wait here for the Cython/Numba/Numpy fast callback discussion to settle down. The quadpack wrapper code would call for some more reorganization, but that can maybe wait.

@teoliphant
SciPy member

I think this is ready to merge. I agree, we don't need to wait for the fast callback discussion.

@rgommers rgommers merged commit b2afa94 into scipy:master Jun 3, 2012
@rgommers
SciPy member

OK, merged to get it in for 0.11.

@rgommers
SciPy member

Ctypes is used in quadpack, this doesn't work. See http://projects.scipy.org/scipy/ticket/1670.

@teoliphant
SciPy member

I will fix this up. It's an import in the C-code. I thought the only reference to ctypes itself was commented out. But, there is one more place. I will submit a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment