-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT: Simplify mtrand.pyx helpers #6997
Conversation
Could someone take a look at this? I think it should be good to merge unless I've missed something in the simplification. I've just been rebasing this PR continuously with |
array_data = <double *>PyArray_DATA(array) | ||
|
||
with lock, nogil: | ||
for i from 0 <= i < multi.size: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some reason not to convert this ancient PyRex format loop to a modern for i in range(multi.size)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason at all. Done.
Have you tried to measure the performance on say a 2-input function + size when size is none and so must be computed from the initial broadcast? I would think that is the only concern - it is possible that the refactor might be faster since there can be fewer function calls. I rewrote these here and noticed the same thing in terms of the |
I ran the following command on my local build of
For my local build (version 1.10.2):
For my PR build:
That's approximately a 15% speed-up there. |
Travis + Appveyor are happy. If there is nothing else, this should be good to merge. |
Could you rewrite this using the Python API (instead of the C API) where it won't hurt performance? For instance, this has no performance loss against current master, and is much more readable for maintainers without a good grasp of C: cdef object cont2_array(rk_state *state, rk_cont2 func, object size,
ndarray oa, ndarray ob, object lock):
cdef double *array_data
cdef double *oa_data
cdef double *ob_data
cdef ndarray array "arrayObject"
cdef npy_intp i
cdef broadcast multi
if size is None:
multi = <broadcast>np.broadcast(oa, ob)
array = <ndarray>np.empty(multi.shape, dtype=np.float64)
else:
array = <ndarray>np.empty(size, dtype=np.float64)
multi = <broadcast>np.broadcast(oa, ob, array)
if multi.shape != array.shape:
raise ValueError("size is not compatible with inputs")
with lock, nogil:
for i from 0 <= i < multi.size:
oa_data = <double *>PyArray_MultiIter_DATA(multi, 0)
ob_data = <double *>PyArray_MultiIter_DATA(multi, 1)
array_data[i] = func(state, oa_data[0], ob_data[0])
PyArray_MultiIter_NEXT(multi)
return array |
@jaimefrio : Sure thing. BTW, are we to ignore the failing |
I think so, it seems to be a problem with nose having dropped Python 3,2 support and started using unicode literals, which don't work in 3.2. |
The test failure is unrelated to this PR. |
In fact, if you rebase -- yes, now would be a good time ;) -- the error should go away as we have dropped Python 3.2 support in master. |
array_data[i] = func(state, on_data[0], op_data[0]) | ||
PyArray_MultiIter_NEXT(multi) | ||
multi = <broadcast>np.broadcast(on, op) | ||
array = <ndarray>np.empty(multi.shape, dtype=int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is dtype=int
the correct value to always get a C long
? Hopefully Appveyor will tell us if this breaks for 64 bit Windows and its 32 bit longs...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was following what had been done in the else
block, but now I suddenly see that there is a gap in the testing there because the test_binomial
function (and I suspect other functions as well) never tests broadcasting and hence the else
block is not reached in the tests.
Is there perhaps a safer dtype
we could use? Perhaps np.long
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int
always translates to np.int_
which always translates to long (since ths is the python object, not some Cython stuff).That said, np.long is much more clear and I agree we should prefer that if we mean a C long equivalent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 - done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm...apparently np.long
is not the way to go, as Travis failed on the 32-bit build here. Reverting to np.int
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use int
. np.int
is an unnecessary alias.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. Done.
Refactored methods that broadcast arguments together by finding additional common ground between code in the if...else branches that involved a size parameter being passed in.
Is this change really free? Maybe due to a change coming in NumPy 1.11? Using Cython/ 0.23/NumPy 1.10 shows different C code for the C-API vs Using C-API __pyx_t_6 = __pyx_f_5numpy_PyArray_MultiIterNew3(((PyObject *)__pyx_v_a_arr), ((PyObject *)__pyx_v_b_arr), ((PyObject *)__pyx_v_randoms)); if (unlikely(!__pyx_t_6)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 161; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
__Pyx_GOTREF(__pyx_t_6);
if (!(likely(((__pyx_t_6) == Py_None) || likely(__Pyx_TypeTest(__pyx_t_6, __pyx_ptype_5numpy_broadcast))))) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 161; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
__pyx_v_it = ((PyArrayMultiIterObject *)__pyx_t_6);
__pyx_t_6 = 0; Using __pyx_t_6 = PyTuple_New(2); if (unlikely(!__pyx_t_6)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 163; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
__Pyx_GOTREF(__pyx_t_6);
__Pyx_INCREF(((PyObject *)__pyx_v_a_arr));
__Pyx_GIVEREF(((PyObject *)__pyx_v_a_arr));
PyTuple_SET_ITEM(__pyx_t_6, 0, ((PyObject *)__pyx_v_a_arr));
__Pyx_INCREF(((PyObject *)__pyx_v_b_arr));
__Pyx_GIVEREF(((PyObject *)__pyx_v_b_arr));
PyTuple_SET_ITEM(__pyx_t_6, 1, ((PyObject *)__pyx_v_b_arr));
__pyx_t_1 = __Pyx_PyObject_Call(((PyObject *)__pyx_ptype_5numpy_broadcast), __pyx_t_6, NULL); if (unlikely(!__pyx_t_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 163; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
__Pyx_GOTREF(__pyx_t_1);
__Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
__pyx_t_6 = __pyx_t_1;
__Pyx_INCREF(__pyx_t_6);
__Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
__pyx_v_it = ((PyArrayMultiIterObject *)__pyx_t_6);
__pyx_t_6 = 0; |
@bashtage No, it is not entirely free, but it is negligibly more expensive in the situations I tested. Most of the extra code in your listing seems to be creating the tuple with the arguments to pass to the Python function call, so if you are calling the function with a small And it really makes the code accessible to a larger audience, which is one of the selling points of Cython. |
I'm going to let AppVeyor do its thing, but will merge it as soon as it gives the green light, thanks for your patience, Greg. |
@jaimefrio : I see green everywhere now. |
MAINT: Simplify mtrand.pyx helpers
In it goes, thanks again! |
Extends from suggestions/comments for #6938 in which it was possible to simplify some of the
if...else
blocks when broadcasting was involved.