Stop modifying global numpy random seed #220

Closed
mheilman opened this issue Dec 10, 2014 · 4 comments

@mheilman
Contributor

In learner.train and learner.cross_validate, the global numpy random seed is set. This seems unnecessary since the randomized algorithms in scikit-learn generally take random seed arguments. Can we remove these?

np.random.seed(rand_seed)

np.random.seed(rand_seed)
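
To illustrate the side effect being described, here is a minimal sketch using only numpy. The function names (`train_with_global_seed`, `train_with_local_state`, `caller_draws`) and the default seed are hypothetical stand-ins, not the project's actual code. It shows that a library call which seeds the global generator also resets the caller's random stream, while a private `RandomState` leaves it alone.

```python
import numpy as np


def train_with_global_seed(rand_seed=123456789):
    # Hypothetical stand-in for a method that seeds the global generator.
    np.random.seed(rand_seed)
    return np.random.permutation(5)


def train_with_local_state(rand_seed=123456789):
    # Same shuffle, but drawn from a private RandomState object.
    rng = np.random.RandomState(rand_seed)
    return rng.permutation(5)


def caller_draws(train):
    # The caller draws a number, calls the library, then draws again.
    first = np.random.rand()
    train()
    second = np.random.rand()
    return first, second


# With the global seed, the caller's second draw is identical no matter
# how the caller seeded its own code beforehand.
np.random.seed(1)
global_a = caller_draws(train_with_global_seed)
np.random.seed(2)
global_b = caller_draws(train_with_global_seed)

# With a private RandomState, the caller's stream is left untouched.
np.random.seed(1)
local_a = caller_draws(train_with_local_state)
np.random.seed(2)
local_b = caller_draws(train_with_local_state)

print(global_a[1] == global_b[1])
print(local_a[1] == local_b[1])
```

Both variants produce the same permutation for the same seed, so switching to a private generator need not change the library's own results.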

@dan-blanchard
Contributor

Thanks for making this issue; this is something I meant to do a long time ago. We should switch to using random state objects like the scikit-learn documentation recommends. (I would provide links, but their site is actually down at the moment.)

@mheilman
Contributor Author

I think this link suffices:
http://c2.com/cgi/wiki?GlobalVariablesAreBad

@dan-blanchard
Contributor

Haha, true.

Here's what the scikit-learn page said though:

If your code relies on a random number generator, it should never use functions like numpy.random.random or numpy.random.normal. This approach can lead to repeatability issues in unit tests. Instead, a numpy.random.RandomState object should be used, which is built from a random_state argument passed to the class or function. The function check_random_state, below, can then be used to create a random number generator object.

check_random_state: create a np.random.RandomState object from a parameter random_state.

  • If random_state is None or np.random, then a randomly-initialized RandomState object is returned.
  • If random_state is an integer, then it is used to seed a new RandomState object.
  • If random_state is a RandomState object, then it is passed through.

For example:

>>> from sklearn.utils import check_random_state
>>> random_state = 0
>>> random_state = check_random_state(random_state)
>>> random_state.rand(4)
array([ 0.5488135 ,  0.71518937,  0.60276338,  0.54488318])
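
As a rough sketch of how the three rules quoted above could be applied in our own code, the helper below mirrors that description using only numpy. It is not scikit-learn's source, and both `check_random_state_sketch` and `shuffle_examples` are hypothetical names made up for illustration.

```python
import numbers

import numpy as np


def check_random_state_sketch(random_state=None):
    # Illustrative re-statement of the quoted rules, not the library code.
    if random_state is None or random_state is np.random:
        # No seed given: return a freshly, randomly initialized generator.
        return np.random.RandomState()
    if isinstance(random_state, numbers.Integral):
        # Integer: use it to seed a new generator.
        return np.random.RandomState(random_state)
    if isinstance(random_state, np.random.RandomState):
        # Already a generator: pass it through unchanged.
        return random_state
    raise ValueError("cannot build a RandomState from %r" % (random_state,))


def shuffle_examples(examples, random_state=None):
    # Hypothetical helper: shuffles a copy without touching the global seed.
    rng = check_random_state_sketch(random_state)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    return shuffled


first_run = shuffle_examples(range(10), random_state=0)
second_run = shuffle_examples(range(10), random_state=0)
print(first_run == second_run)
```

Passing the same integer gives the same shuffle on every call, and the global numpy generator is never reseeded along the way.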

@desilinguist
Member

Addressed by #245.
