BUG: One Element Array Inputs Return Scalars in np.random #7055

gfyoung · 2016-01-18T20:38:57Z

Fixes issue raised in #4263 in which np.random methods returned scalars when passed one-element array inputs. This is because one-element ndarrays can be cast to integers / floats, which is what functions like PyFloat_AsDouble do before converting to the intended data type.

This commit changes the check used to determine whether the inputs are purely scalar by, as suggested by @rkern, converting all inputs to arrays and checking if the resulting shape is an empty tuple (scalar) or not (array).

gfyoung · 2016-01-18T20:39:47Z

PR is currently WIP because I still need to add a test to make sure that this behavior does not resurface. However, I just want to make sure that my changes aren't breaking any existing tests.

jaimefrio · 2016-01-19T00:39:50Z

numpy/random/mtrand/mtrand.pyx

@@ -1551,18 +1551,17 @@ cdef class RandomState:
        cdef double flow, fhigh, fscale
        cdef object temp

-        flow = PyFloat_AsDouble(low)
-        fhigh = PyFloat_AsDouble(high)
+        if np.array(low).shape == np.array(high).shape == ():


Is this double comparison valid Cython? Or does it end comparing True (or False) to ()?

It seems to have worked. I tested the methods briefly by hand (more formal testing is coming though!), and I didn't have any issues with the double (or even triple) comparison with regards to output data type.

I should add that while I do not know the validity of chained comparisons in Cython, I know it is valid in pure Python. I guess since you can compile pure Python with Cython, it should then be valid by extension? A quick look over the Cython docs does not indicate any issues with handling such chains.

Yup, it is the Python thing, guess what confused me was the double ==, but it does what it is expected to:

In [184]: 0 == 0 == 0 Out[184]: True In [185]: (0 == 0) == 0 Out[185]: False In [186]: 0 == (0 == 0) Out[186]: False

gfyoung · 2016-01-19T10:14:06Z

Tests have been added, and Travis + Appveyor are happy. Should be good to merge as is, as I can uncomment the randint test after a rebase in #6938.

jaimefrio · 2016-01-20T03:48:24Z

This change makes a call to np.random.uniform(0, 1) 5x slower in my system:

In [39]: %timeit np.random.uniform(0, 1)
# current master
The slowest run took 19.17 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 199 ns per loop
# this PR
The slowest run took 26.30 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 988 ns per loop

Not sure if this is really relevant, as bulk calling e.g. np.random.uniform(0, 1, 1000), only shows a much smaller slowdown (15us to 16us), but would again like to hear other opinions.

gfyoung · 2016-01-22T17:05:07Z

@jaimefrio : Should we wait longer, or can this be merged?

jaimefrio · 2016-01-23T05:08:27Z

You can get some of the speed back if you change the condition:

if np.array(low).shape == np.array(high).shape == ():

to:

if (not (isinstance(low, np.ndarray) and low.shape) and
                not (isinstance(high, np.ndarray) and high.shape))

In [2]: %timeit np.random.uniform(0, 1)
# this PR
1000000 loops, best of 3: 1.04 µs per loop
# my changes
1000000 loops, best of 3: 358 ns per loop

In [3]: %timeit np.random.uniform([0], [1])
# this PR
100000 loops, best of 3: 4.95 µs per loop
# my changes
100000 loops, best of 3: 3.76 µs per loop

Not sure if the extra complexity is worth the gains, what do you think?

gfyoung · 2016-01-23T05:12:06Z

@jaimefrio : IMHO, simplicity wins in this case. It seems that the performance scales relatively well from the timing analysis you provided earlier, so obfuscating the code in order to optimize the smallest input case possible does not seem worthwhile here.

jaimefrio · 2016-01-23T06:18:32Z

If we go this route, I think we should move the array creation up, rather than redoing it:

olow = <ndarray>PyArray_FROM_OTF(low, NPY_DOUBLE, NPY_ARRAY_ALIGNED)
ohigh = <ndarray>PyArray_FROM_OTF(high, NPY_DOUBLE, NPY_ARRAY_ALIGNED)
if olow.shape == ohigh.shape == ():
    ...

It will make a noticeable difference if low and high are large arrays.

gfyoung · 2016-01-23T14:44:26Z

Ah, yes, good point. Done.

homu · 2016-01-24T00:22:15Z

☔ The latest upstream changes (presumably #7082) made this pull request unmergeable. Please resolve the merge conflicts.

jaimefrio · 2016-01-24T00:25:16Z

This is going to be painful to rebase, but once you get it done, it should be ready to go.

…andom Fixes bug in np.random methods that would return scalars when passed one-element array inputs. This is because one-element ndarrays can be cast to integers / floats, which is what functions like PyFloat_AsDouble do before converting to the intended data type. This commit changes the check used to determine whether the inputs are purely scalar by converting all inputs to arrays and checking if the resulting shape is an empty tuple (scalar) or not (array). Closes gh-4263.

gfyoung · 2016-01-24T16:01:00Z

@jaimefrio : Rebase is done, and Travis + Appveyor are happy.

gfyoung · 2016-01-24T16:01:49Z

Accidentally clicked the wrong button when I wanted to comment. Whoops.

BUG: One Element Array Inputs Return Scalars in np.random

jaimefrio · 2016-01-25T00:20:01Z

Thanks Greg, one more in...

gfyoung · 2016-01-25T00:20:37Z

👍

gfyoung mentioned this pull request Jan 18, 2016

ENH: Allow Randint to Broadcast Arguments #6938

Closed

jaimefrio reviewed Jan 19, 2016
View reviewed changes

charris added 00 - Bug component: numpy.random labels Jan 19, 2016

gfyoung closed this Jan 24, 2016

gfyoung reopened this Jan 24, 2016

jaimefrio added a commit that referenced this pull request Jan 25, 2016

Merge pull request #7055 from gfyoung/single_elt_array_return_matching

d641eed

BUG: One Element Array Inputs Return Scalars in np.random

jaimefrio merged commit d641eed into numpy:master Jan 25, 2016

gfyoung deleted the single_elt_array_return_matching branch January 25, 2016 00:19

charris mentioned this pull request Sep 17, 2016

random.normal does not always preserve shape of size-1 arrays #7983

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: One Element Array Inputs Return Scalars in np.random #7055

BUG: One Element Array Inputs Return Scalars in np.random #7055

gfyoung commented Jan 18, 2016

gfyoung commented Jan 18, 2016

jaimefrio Jan 19, 2016

gfyoung Jan 19, 2016

gfyoung Jan 19, 2016

jaimefrio Jan 19, 2016

gfyoung commented Jan 19, 2016

jaimefrio commented Jan 20, 2016

gfyoung commented Jan 22, 2016

jaimefrio commented Jan 23, 2016

gfyoung commented Jan 23, 2016

jaimefrio commented Jan 23, 2016

gfyoung commented Jan 23, 2016

homu commented Jan 24, 2016

jaimefrio commented Jan 24, 2016

gfyoung commented Jan 24, 2016

gfyoung commented Jan 24, 2016

jaimefrio commented Jan 25, 2016

gfyoung commented Jan 25, 2016

BUG: One Element Array Inputs Return Scalars in np.random #7055

BUG: One Element Array Inputs Return Scalars in np.random #7055

Conversation

gfyoung commented Jan 18, 2016

gfyoung commented Jan 18, 2016

jaimefrio Jan 19, 2016

Choose a reason for hiding this comment

gfyoung Jan 19, 2016

Choose a reason for hiding this comment

gfyoung Jan 19, 2016

Choose a reason for hiding this comment

jaimefrio Jan 19, 2016

Choose a reason for hiding this comment

gfyoung commented Jan 19, 2016

jaimefrio commented Jan 20, 2016

gfyoung commented Jan 22, 2016

jaimefrio commented Jan 23, 2016

gfyoung commented Jan 23, 2016

jaimefrio commented Jan 23, 2016

gfyoung commented Jan 23, 2016

homu commented Jan 24, 2016

jaimefrio commented Jan 24, 2016

gfyoung commented Jan 24, 2016

gfyoung commented Jan 24, 2016

jaimefrio commented Jan 25, 2016

gfyoung commented Jan 25, 2016