Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for np.random.choice #2727

Closed
wants to merge 5 commits into from
Closed

Fixes for np.random.choice #2727

wants to merge 5 commits into from

Conversation

seberg
Copy link
Member

@seberg seberg commented Nov 13, 2012

As mentioned on the mailinglist this changes the default for the size argument to None and allows tuples as well, just as other random functions do. This part would need to be in the 1.7. released to avoid changing it lateron.

The second commit fixes a bug that I noticed while doing the first thing. I know that tests are missing (and if someone feels like adding them please do so), but I think it would be better to raise an Exception then to keep the current behaviour (unless I am seeing things wrong). However np.unique returns new sorted, but it must be kept exactly in the original order as far as I can see.

It does not adress a possible axis argument (see also #2724) as it should not matter for 1.7.

The size argument to random.choice should work like it does for all
other functions in random as well.
Random choice used np.unique to find new indices when replace
was False and p given. This is wrong since unique will sort the
indices. This solves the bug, but likely not ideal.
@njsmith
Copy link
Member

njsmith commented Nov 13, 2012

Looks reasonable, but needs tests like noted.

You need to regenerate the .c file from the .pyx, that doesn't happen automatically. (Another reason for tests, I think this means the current patch may actually not be tested...?)

@seberg
Copy link
Member Author

seberg commented Nov 13, 2012

Yes, had not really run the tests (at least not for all changes), and lots of small things I overlooked as I did not have the new things tested. Forgot to note in the commit, but thanks to @alan-isaac for pointing out the 0-d vs. scalar thing. The code to do this right is a bit awkward to be honest, maybe someone knows a better way, but the different branches need different special cases to give an array or scalar consistent here.

I have added tests for this too. I have run the tests locally (I think the ternary operator in the cython file should not matter for 2.4 as its translated to C). I did not commit mtrand.c since I am currently using cython 1.6, so it would be nice if someone else could do that after looking over this (though if that is inconvenient I guess I could also install a newer version).

Especially for the replace=False with p given case, someone should have a very careful look if it is correct, since the test for this will only make sure that behaviour for the test case does not change.

@seberg
Copy link
Member Author

seberg commented Nov 13, 2012

Sorry, did a rebase if anyone noticed. mtrand.c with a wrong version sneaked in there... Note that the Travis tests have to fail since mtrand.c is recreated.

@seberg
Copy link
Member Author

seberg commented Nov 17, 2012

Regenerated mtrand.c using cython 0.17.1. @certik just so its not forgotten, these changes are meant for 1.7.

@njsmith
Copy link
Member

njsmith commented Nov 17, 2012

@josef-pkt: I know you keep an eye on a lot of the RNG stuff, want to review this? (Note @seberg's comment above: "Especially for the replace=False with p given case, someone should have a very careful look if it is correct, since the test for this will only make sure that behaviour for the test case does not change.")

@certik
Copy link
Contributor

certik commented Dec 6, 2012

@seberg, see #2793.

@rgommers
Copy link
Member

rgommers commented Dec 7, 2012

See also http://projects.scipy.org/numpy/ticket/2246, maybe you could take that along?

@seberg
Copy link
Member Author

seberg commented Dec 7, 2012

@rgommers thats not hard, but I don't see much of a reason to try hard to squeeze it into 1.7.

@rgommers
Copy link
Member

rgommers commented Dec 7, 2012

Agreed, that's what I commented on the ticket too.

certik added a commit that referenced this pull request Dec 13, 2012
@seberg
Copy link
Member Author

seberg commented Dec 13, 2012

Closing here, since it was pushed into master, but I would feel better if someone had at least a look at the stuff and think its ok... I cannot see a flaw in the bug fix (without the sort the results are obviously wrong) and I looked at the results and they seem reasonable, but still...

@seberg seberg closed this Dec 13, 2012
@seberg
Copy link
Member Author

seberg commented Dec 14, 2012

@njsmith, @josef-pkt maybe it is not a huge issue, but I just wondered if the function as is is actually 100% right. The thing is it uses things like np.random.permutation to call other RNG functions. Is this correct, or could this mean that the wrong random state is used for some funnier use cases? I have really no idea if this can or cannot be a problem and self.permutation must be used instead?

Acutally only one case in there and that is with np.random.random(shape)

@njsmith
Copy link
Member

njsmith commented Dec 14, 2012

!!

If a method on RandomState is calling into the shared global RandomState via np.random.foo, then that is a huge bug!

Really we should have tests that scramble the global random seed, and then test that RandomState(<fixed seed>).<method> always returns the same value. It's critical that these methods be deterministic (given the seed) in order to allow people to, e.g., reproduce scientific results across numpy releases.

@certik certik mentioned this pull request Dec 15, 2012
@certik
Copy link
Contributor

certik commented Dec 15, 2012

Ok, so it looks like I merged in a huge bug. My apologies for that. I have created a high priority issue for this: #2826.
@seberg, would you have time to fix this? That would be a huge help. If not, I'll just revert these patches.

masasin added a commit to masasin/numpy that referenced this pull request Jul 7, 2016
Backwards compatible by adding an `axis` keyword to the end of the
signature for choice. Default is 0, which selects by row. The default
keyword arguments make it functionally equivalent to random.choice from
Python's standard library. Note, however, that a list of tuples will
have the tuples returned as lists.

Issue numpy#2724 was originally opened with that request, but numpy#2727 seems to
have closed it accidentally.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants