
MAINT: Simplify mtrand.pyx helpers #6997

Merged: 1 commit into numpy:master on Jan 20, 2016

Conversation

gfyoung
Contributor

@gfyoung gfyoung commented Jan 12, 2016

Follows up on suggestions/comments in #6938, where it was noted that some of the if...else blocks involved in broadcasting could be simplified.

@gfyoung gfyoung changed the title WIP, MAINT: Simplify mtrand.pyx WIP, MAINT: Simplify mtrand.pyx helpers Jan 12, 2016
@gfyoung gfyoung changed the title WIP, MAINT: Simplify mtrand.pyx helpers MAINT: Simplify mtrand.pyx helpers Jan 12, 2016
@gfyoung
Contributor Author

gfyoung commented Jan 14, 2016

Could someone take a look at this? I think it should be good to merge unless I've missed something in the simplification. I've just been rebasing this PR continuously with master.

array_data = <double *>PyArray_DATA(array)

with lock, nogil:
    for i from 0 <= i < multi.size:
Contributor

Is there some reason not to convert this ancient Pyrex-format loop to a modern for i in range(multi.size)?

Contributor Author

No reason at all. Done.
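For readers unfamiliar with the legacy syntax: `for i from 0 <= i < n` in Pyrex/older Cython iterates i over 0, 1, ..., n-1, exactly like Python's `range(n)`. A minimal pure-Python sketch of that equivalence:

```python
def legacy_loop(n):
    # Emulates the Pyrex form `for i from 0 <= i < n`:
    # start at 0, continue while the condition 0 <= i < n holds, step by 1.
    out = []
    i = 0
    while 0 <= i < n:
        out.append(i)
        i += 1
    return out

def modern_loop(n):
    # The modern Cython/Python form: for i in range(n)
    return list(range(n))

assert legacy_loop(5) == modern_loop(5) == [0, 1, 2, 3, 4]
```

The two forms compile to the same kind of counted C loop in Cython, so the change is purely stylistic.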

@bashtage
Contributor

Have you tried measuring the performance of, say, a 2-input function plus size when size is None and so must be computed from the initial broadcast? I would think that is the only concern; it is possible the refactor might even be faster, since there can be fewer function calls.

I rewrote these here and noticed the same thing in terms of the MultiIter_Next that you have refactored (I also moved virtually all parameter checks upstream, which is the extra code).

@gfyoung
Contributor Author

gfyoung commented Jan 15, 2016

I ran the following command on my local build of numpy and on my PR build of numpy:

python -m timeit -n 100 "import numpy as np;np.random.uniform(low=0, high=np.arange(1000000))"

np.random.uniform calls cont2_array when broadcasting, and this was one of the functions that I had modified.

For my local build (version 1.10.2):

100 loops, best of 3: 55.7 msec per loop

For my PR build:

100 loops, best of 3: 47.5 msec per loop

That's approximately a 15% speed-up.
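The quoted figure follows directly from the two timings: (55.7 − 47.5) / 55.7 ≈ 14.7%, i.e. roughly 15%.

```python
baseline = 55.7  # msec per loop, local numpy 1.10.2 build
patched = 47.5   # msec per loop, PR build

# Relative speed-up of the PR build over the baseline.
speedup = (baseline - patched) / baseline
print(f"{speedup:.1%}")  # ≈ 14.7%
```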

@gfyoung
Contributor Author

gfyoung commented Jan 19, 2016

Travis + Appveyor are happy. If there is nothing else, this should be good to merge.

@jaimefrio
Member

Could you rewrite this using the Python API (instead of the C API) where it won't hurt performance?

For instance, this has no performance loss against current master, and is much more readable for maintainers without a good grasp of C:

cdef object cont2_array(rk_state *state, rk_cont2 func, object size,
                        ndarray oa, ndarray ob, object lock):
    cdef double *array_data
    cdef double *oa_data
    cdef double *ob_data
    cdef ndarray array "arrayObject"
    cdef npy_intp i
    cdef broadcast multi

    if size is None:
        multi = <broadcast>np.broadcast(oa, ob)
        array = <ndarray>np.empty(multi.shape, dtype=np.float64)
    else:
        array = <ndarray>np.empty(size, dtype=np.float64)
        multi = <broadcast>np.broadcast(oa, ob, array)
        if multi.shape != array.shape:
            raise ValueError("size is not compatible with inputs")

    array_data = <double *>PyArray_DATA(array)
    with lock, nogil:
        for i from 0 <= i < multi.size:
            oa_data = <double *>PyArray_MultiIter_DATA(multi, 0)
            ob_data = <double *>PyArray_MultiIter_DATA(multi, 1)
            array_data[i] = func(state, oa_data[0], ob_data[0])
            PyArray_MultiIter_NEXT(multi)

    return array
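For context, the `size is None` branch lets `np.broadcast` compute the output shape from the inputs. The shape rule it applies can be sketched in pure Python (a simplified model of NumPy's broadcasting, assuming well-formed shape tuples):

```python
def broadcast_shape(a, b):
    """Simplified model of NumPy's broadcasting rule for two shapes."""
    # Right-align the shapes, padding the shorter one with 1s.
    ndim = max(len(a), len(b))
    a = (1,) * (ndim - len(a)) + tuple(a)
    b = (1,) * (ndim - len(b)) + tuple(b)
    out = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            out.append(da)
        elif da == 1:
            out.append(db)
        else:
            raise ValueError("shapes are not broadcastable")
    return tuple(out)

# A (3, 1) input against a (4,) input broadcasts to (3, 4):
print(broadcast_shape((3, 1), (4,)))  # (3, 4)
```

In the `else` branch, including the preallocated output array in the `np.broadcast` call both validates that `size` is compatible and gives a single iterator over all operands.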

@gfyoung
Contributor Author

gfyoung commented Jan 20, 2016

@jaimefrio : Sure thing. BTW, are we to ignore the failing 3.2 test on Travis?

@jaimefrio
Member

I think so; it seems to be a problem with nose having dropped Python 3.2 support and started using unicode literals, which don't work in 3.2.

@charris
Member

charris commented Jan 20, 2016

The test failure is unrelated to this PR.

@charris
Member

charris commented Jan 20, 2016

In fact, if you rebase -- yes, now would be a good time ;) -- the error should go away as we have dropped Python 3.2 support in master.

@gfyoung
Contributor Author

gfyoung commented Jan 20, 2016

@charris : I still see 3.2 in the .travis.yml file, but I will rebase as soon as #7063 is merged.

array_data[i] = func(state, on_data[0], op_data[0])
PyArray_MultiIter_NEXT(multi)
multi = <broadcast>np.broadcast(on, op)
array = <ndarray>np.empty(multi.shape, dtype=int)
Member

Is dtype=int the correct value to always get a C long? Hopefully AppVeyor will tell us if this breaks for 64-bit Windows and its 32-bit longs...

Contributor Author

I was following what had been done in the else block, but now I see there is a gap in the testing: test_binomial (and, I suspect, other functions as well) never tests broadcasting, so the else block is not reached in the tests.

Is there perhaps a safer dtype we could use? Perhaps np.long?

Member

int always translates to np.int_, which always translates to C long (since this is the Python object, not some Cython construct). That said, np.long is much clearer, and I agree we should prefer it if we mean a C long equivalent.

Contributor Author

👍 - done.

Contributor Author

Hmm...apparently np.long is not the way to go, as Travis failed on the 32-bit build here. Reverting to np.int.

Member

Just use int. np.int is an unnecessary alias.

Contributor Author

Fair enough. Done.
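The portability concern in this thread is that a C long is 4 bytes on 64-bit Windows (LLP64) but 8 bytes on 64-bit Linux/macOS (LP64). This can be checked from Python with the standard ctypes module:

```python
import ctypes

# Size of the platform's C long: 4 on Windows (even 64-bit builds),
# 8 on LP64 Unix platforms.
print(ctypes.sizeof(ctypes.c_long))
```

Because NumPy's default integer dtype follows the C long, an array created with dtype=int holds 32-bit integers on 64-bit Windows, which is what the AppVeyor builds exercise.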

Refactored methods that broadcast arguments
together by finding additional common ground
between code in the if...else branches that
involved a size parameter being passed in.
@bashtage
Contributor

For instance, this has no performance loss against current master, and is much more readable for maintainers without a good grasp of C

Is this change really free? Maybe due to a change coming in NumPy 1.11?

Using Cython 0.23 / NumPy 1.10 shows different generated C code for the C API vs numpy.broadcast.

Using C-API

    __pyx_t_6 = __pyx_f_5numpy_PyArray_MultiIterNew3(((PyObject *)__pyx_v_a_arr), ((PyObject *)__pyx_v_b_arr), ((PyObject *)__pyx_v_randoms)); if (unlikely(!__pyx_t_6)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 161; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __Pyx_GOTREF(__pyx_t_6);
    if (!(likely(((__pyx_t_6) == Py_None) || likely(__Pyx_TypeTest(__pyx_t_6, __pyx_ptype_5numpy_broadcast))))) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 161; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __pyx_v_it = ((PyArrayMultiIterObject *)__pyx_t_6);
    __pyx_t_6 = 0;

Using np.broadcast

    __pyx_t_6 = PyTuple_New(2); if (unlikely(!__pyx_t_6)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 163; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __Pyx_GOTREF(__pyx_t_6);
    __Pyx_INCREF(((PyObject *)__pyx_v_a_arr));
    __Pyx_GIVEREF(((PyObject *)__pyx_v_a_arr));
    PyTuple_SET_ITEM(__pyx_t_6, 0, ((PyObject *)__pyx_v_a_arr));
    __Pyx_INCREF(((PyObject *)__pyx_v_b_arr));
    __Pyx_GIVEREF(((PyObject *)__pyx_v_b_arr));
    PyTuple_SET_ITEM(__pyx_t_6, 1, ((PyObject *)__pyx_v_b_arr));
    __pyx_t_1 = __Pyx_PyObject_Call(((PyObject *)__pyx_ptype_5numpy_broadcast), __pyx_t_6, NULL); if (unlikely(!__pyx_t_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 163; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __Pyx_GOTREF(__pyx_t_1);
    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
    __pyx_t_6 = __pyx_t_1;
    __Pyx_INCREF(__pyx_t_6);
    __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
    __pyx_v_it = ((PyArrayMultiIterObject *)__pyx_t_6);
    __pyx_t_6 = 0;

@jaimefrio
Member

@bashtage No, it is not entirely free, but it is negligibly more expensive in the situations I tested. Most of the extra code in your listing seems to be creating the tuple of arguments to pass to the Python function call, so if you are calling the function with a small size, this will probably be somewhat slower. But it is a one-time overhead cost, and e.g. with size=1000 the difference is unnoticeable.

And it really makes the code accessible to a larger audience, which is one of the selling points of Cython.
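The amortization argument can be illustrated with a simple cost model (hypothetical numbers, chosen only to show the shape of the argument): if a call costs overhead + size * per_element, the overhead's share of the total vanishes as size grows.

```python
def relative_overhead(size, overhead=1.0, per_element=0.01):
    # Hypothetical cost model: a fixed setup cost (e.g. building the
    # argument tuple for a Python-level np.broadcast call) plus linear
    # per-element work inside the sampling loop.
    total = overhead + size * per_element
    return overhead / total

# The fixed cost dominates tiny calls but is negligible for large ones:
print(relative_overhead(1))     # ~0.99
print(relative_overhead(1000))  # ~0.09
```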

@jaimefrio
Member

I'm going to let AppVeyor do its thing, but will merge it as soon as it gives the green light, thanks for your patience, Greg.

@gfyoung
Contributor Author

gfyoung commented Jan 20, 2016

@jaimefrio : I see green everywhere now.

jaimefrio added a commit that referenced this pull request Jan 20, 2016
@jaimefrio jaimefrio merged commit 9ad54ae into numpy:master Jan 20, 2016
@jaimefrio
Member

In it goes, thanks again!

@gfyoung gfyoung deleted the mtrand_helpers_compress branch January 20, 2016 20:40