Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: fix issue #8250 when np.array gets called on an invalid sequence #8570

Merged

Conversation

evanlimanto
Copy link
Contributor

@evanlimanto evanlimanto commented Feb 6, 2017

Fixes the issue in #8250. We should let np.random.shuffle should handle the intricacies of the different types passed in.

@stefanv
Copy link
Contributor

stefanv commented Feb 6, 2017

Thanks, @evanlimanto. Could you please add a test to verify this behavior?

@evanlimanto evanlimanto force-pushed the random-permutation-shuffle-fix branch 2 times, most recently from 2f062d1 to 41e789a Compare February 7, 2017 02:06
@evanlimanto
Copy link
Contributor Author

evanlimanto commented Feb 7, 2017

Added tests. I know I could've made a helper function (in duplicating the tests for shuffle), but I didn't want to disrupt the structure of the code.

@eric-wieser
Copy link
Member

Looks good to me. I'll merge in a few days if there are no other objections

@evanlimanto
Copy link
Contributor Author

Can this be merged soon?

@@ -4871,7 +4871,7 @@ cdef class RandomState:
if isinstance(x, (int, long, np.integer)):
arr = np.arange(x)
else:
arr = np.array(x)
arr = x
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hang on, this means that x is now modified in place, rather than copied first. This isn't ok. Perhaps a copy.copy here instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its also not good, that the tests did not pick it up.

@evanlimanto evanlimanto force-pushed the random-permutation-shuffle-fix branch from 41e789a to 39d7d8c Compare March 8, 2017 09:54
@evanlimanto
Copy link
Contributor Author

Fixed, and added a test to check for mutation.

@@ -4871,7 +4871,8 @@ cdef class RandomState:
if isinstance(x, (int, long, np.integer)):
arr = np.arange(x)
else:
arr = np.array(x)
import copy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this go at the top with the other imports?

desired = conv([0, 1, 9, 6, 2, 4, 5, 8, 7, 3])
assert_array_equal(actual, desired)
# check that array is not mutated.
assert_array_equal(alist, conv([1, 2, 3, 4, 5, 6, 7, 8, 9, 0]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be clearer immediately under the permutation call

…rray gets called on an invalid sequence.
@evanlimanto evanlimanto force-pushed the random-permutation-shuffle-fix branch from 39d7d8c to 7a73bad Compare March 11, 2017 01:02
@evanlimanto
Copy link
Contributor Author

Fixed, thanks!

@eric-wieser
Copy link
Member

Looks good to me, as do the tests. The source of array_copy shows that np.copy(arr) is basically equivalent to copy.copy(arr), so I think we're all good here.

@eric-wieser eric-wieser merged commit cf349c9 into numpy:master Mar 12, 2017
@eric-wieser
Copy link
Member

Thanks @evanlimanto

@jreback
Copy link

jreback commented Mar 12, 2017

This broke one of our tests downstream because this is now not returning an array (but a list). I don't think this is what you want?

You are testing this input, but not that it actually returns an ndarray (just that it is equal).

https://travis-ci.org/pandas-dev/pandas/jobs/210272568

In [5]: np.__version__
Out[5]: '1.13.0.dev0+cf349c9'

In [6]: np.random.permutation([(0, 1), (1, 0)])
Out[6]: [(0, 1), (1, 0)]
In [1]: np.__version__
Out[1]: '1.12.0'

In [2]: np.random.permutation([(0, 1), (1, 0)])
Out[2]: 
array([[1, 0],
       [0, 1]])

@seberg
Copy link
Member

seberg commented Mar 12, 2017

Hmmm, maybe we should just revert and just change the documentation, its a safer bet. Is there much gain in allowing general sequences here?

@charris
Copy link
Member

charris commented Mar 12, 2017

I think reversion would be the quick fix. I'm not sure there is a better fix.

@eric-wieser
Copy link
Member

How about the following?

try:
    arr = np.copy(arr)
except:
    arr = copy.copy(arr)

@charris
Copy link
Member

charris commented Mar 13, 2017

I think the problem is that an array should be returned by default, as previously happened. So at a minimum we still need arr = array(arr). Note that the example given does work if dtype=object is used in the array conversion.

In [7]: %paste
N = 4
A = np.arange(N)[:,None]
A = np.concatenate((A,A,A,A), axis = 1)
B = range(N)
c = list(zip(B,A))

## -- End pasted text --

In [8]: np.random.permutation(array(c, dtype=object))
Out[8]: 
array([[0, array([0, 0, 0, 0])],
       [3, array([3, 3, 3, 3])],
       [2, array([2, 2, 2, 2])],
       [1, array([1, 1, 1, 1])]], dtype=object)

So perhaps use that as the fallback, although that may also lead to hard to interpret errors.

charris added a commit to charris/numpy that referenced this pull request Mar 13, 2017
BUG: fix issue numpy#8250 where np.random.permutation fail.

This reverts commit 7a73bad.

Closes numpy#8776.
charris added a commit that referenced this pull request Mar 13, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants