Support RandomState._bit_generator as input in check_random_state for users with numpy version >= 1.17.0 #20669
Comments
Thanks for the proposal @zoj613, tsne.set_params(random_state=rng._bit_generator) The
@rkern What is your take on the fragility of the however based on @NicolasHug's comment, I'm not sure if this is the best option.
Note that the case in the code snippet would actually be Now for the opposite case, turning For detecting
Ugh, except for numpy 1.18.
Thank you @rkern for the detailed explanation, I was not aware that
I might be missing something but I don't see how the
The proposal is to not replace RandomState. See the snippet below @NicolasHug:
In [1]: import numpy as np
In [2]: new_rng = np.random.default_rng(0)
In [3]: old_rng = np.random.RandomState(new_rng.bit_generator)
In [4]: dir(old_rng)
Out[4]:
['__class__',
'__delattr__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__getstate__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__le__',
'__lt__',
'__ne__',
'__new__',
'__pyx_vtable__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__setattr__',
'__setstate__',
'__sizeof__',
'__str__',
'__subclasshook__',
'_bit_generator',
'_poisson_lam_max',
'beta',
'binomial',
'bytes',
'chisquare',
'choice',
'dirichlet',
'exponential',
'f',
'gamma',
'geometric',
'get_state',
'gumbel',
'hypergeometric',
'laplace',
'logistic',
'lognormal',
'logseries',
'multinomial',
'multivariate_normal',
'negative_binomial',
'noncentral_chisquare',
'noncentral_f',
'normal',
'pareto',
'permutation',
'poisson',
'power',
'rand',
'randint',
'randn',
'random',
'random_integers',
'random_sample',
'rayleigh',
'seed',
'set_state',
'shuffle',
'standard_cauchy',
'standard_exponential',
'standard_gamma',
'standard_normal',
'standard_t',
'tomaxint',
'triangular',
'uniform',
'vonmises',
'wald',
'weibull',
'zipf']
The code above uses
Oh I see, thanks. That's an interesting development. One good thing is that we shouldn't expect a breaking change (i.e. a different RNG) if one day we choose to natively support Generators:
In [39]: np.random.RandomState(np.random.default_rng(0).bit_generator).random(10)
Out[39]:
array([0.63696169, 0.26978671, 0.04097352, 0.01652764, 0.81327024,
0.91275558, 0.60663578, 0.72949656, 0.54362499, 0.93507242])
In [40]: np.random.default_rng(0).random(10)
Out[40]:
array([0.63696169, 0.26978671, 0.04097352, 0.01652764, 0.81327024,
0.91275558, 0.60663578, 0.72949656, 0.54362499, 0.93507242])
Is this correct @rkern? Also, does that mean that a
As far as I am aware, numpy uses the By nice properties, I'm assuming you're referring to the new methods and parameters added in the One nice property this proposal does bring is that random number generation in
I'm referring to #16988 (comment):
Yes, that too. The other bitgenerators are faster and higher quality than
Thanks for the replies, now I see that the underlying implementation is not always the same:
In [18]: np.random.RandomState(np.random.default_rng(0).bit_generator).standard_normal(10)
Out[18]:
array([-1.35785352, 0.80783308, 1.49674359, 0.6954632 , 0.72872164,
0.07306939, -0.78215698, 0.55381447, 0.12696293, 1.11212987])
In [19]: np.random.default_rng(0).standard_normal(10)
Out[19]:
array([ 0.12573022, -0.13210486, 0.64042265, 0.10490012, -0.53566937,
0.36159505, 1.30400005, 0.94708096, -0.70373524, -1.26542147])
and the 2 methods don't even have the same API (one has extra parameters). So we should expect RNG changes if we ever choose to natively support Generators.
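The two comparisons in this thread can be reproduced with one short script. This is only a minimal sketch, assuming numpy >= 1.17; the seed 0 and the sample size 10 come from the sessions above, while the helper name make_pair is purely illustrative and not part of numpy or scikit-learn.

```python
import numpy as np


def make_pair(seed=0):
    """Return a legacy RandomState and a Generator, both seeded identically.

    Hypothetical helper for illustration only.
    """
    # RandomState wrapping the bit generator of a freshly seeded Generator.
    legacy = np.random.RandomState(np.random.default_rng(seed).bit_generator)
    # An independent Generator created from the same seed.
    modern = np.random.default_rng(seed)
    return legacy, modern


# Uniform draws: compare the two streams element by element.
legacy, modern = make_pair()
same_uniform = np.array_equal(legacy.random(10), modern.random(10))

# Normal draws: compare again from freshly seeded objects.
legacy, modern = make_pair()
same_normal = np.allclose(legacy.standard_normal(10), modern.standard_normal(10))

print("uniform streams match:", same_uniform)
print("normal streams match:", same_normal)
```

As in the sessions quoted above, the uniform draws agree while the normal draws do not, so sharing a bit generator does not by itself make RandomState and Generator interchangeable.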
Yes, NEP 19 lays out the policy and the reasons for it. It was well-deliberated.
Correct, the EDIT: oops, didn't realize it's already answered above.
Describe the workflow you want to enable
Since numpy version 1.17.0, np.random.RandomState can accept the ._bit_generator attribute as input in the constructor. This can be a plus for those who use np.random.Generator in their code and want to use the same bitgenerator with sklearn's estimators. Currently this is not possible: passing the bit generator to an estimator as random_state leads to an error.
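For context, the numpy side of this workflow can be sketched without scikit-learn. The example below is not the snippet or the error from the original post; it is a minimal illustration, assuming numpy >= 1.17, of what "using the same bitgenerator" means: a RandomState built from a Generator's bit generator draws from the same underlying stream. The variable names are illustrative.

```python
import numpy as np

# A Generator as used in "new-style" numpy code.
new_rng = np.random.default_rng(0)

# A legacy RandomState that wraps the very same bit generator object.
old_rng = np.random.RandomState(new_rng.bit_generator)

# Both objects hold a reference to one bit generator, so they share state.
shares_bit_generator = old_rng._bit_generator is new_rng.bit_generator

# Drawing alternately from the two objects consumes a single stream:
# the interleaved uniform draws equal the first two draws of a fresh Generator.
first = new_rng.random()
second = old_rng.random()
reference = np.random.default_rng(0).random(2)
interleaved_matches = np.allclose([first, second], reference)

print("shared bit generator:", shares_bit_generator)
print("interleaved draws follow one stream:", interleaved_matches)
```

Because the state is shared, draws made inside an estimator would advance the user's Generator as well, which is the behaviour the proposal relies on.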
Describe your proposed solution
I propose we add a conditional in check_random_state that supports an instance of BitGenerator, see: scikit-learn/sklearn/utils/validation.py, lines 926 to 944 in 2beed55
something like
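A minimal sketch of such a conditional follows. It is not the snippet originally posted and not scikit-learn's actual implementation: the function name check_random_state_sketch is hypothetical, the first three branches are assumed to mirror the existing behaviour, and the BitGenerator branch is the assumed addition. The class is looked up with getattr so that the sketch degrades gracefully on numpy versions that do not expose np.random.BitGenerator.

```python
import numbers

import numpy as np


def check_random_state_sketch(seed):
    """Hypothetical variant of check_random_state that also accepts a BitGenerator."""
    if seed is None or seed is np.random:
        # Fall back to numpy's global RandomState singleton.
        return np.random.mtrand._rand
    if isinstance(seed, numbers.Integral):
        return np.random.RandomState(seed)
    if isinstance(seed, np.random.RandomState):
        return seed
    # Assumed addition: wrap a BitGenerator in a legacy RandomState.
    # getattr guards against numpy versions without np.random.BitGenerator.
    bit_generator_cls = getattr(np.random, "BitGenerator", None)
    if bit_generator_cls is not None and isinstance(seed, bit_generator_cls):
        return np.random.RandomState(seed)
    raise ValueError(
        "%r cannot be used to seed a numpy.random.RandomState instance" % seed
    )


# Usage: hand over the bit generator of an existing Generator.
rng = np.random.default_rng(0)
random_state = check_random_state_sketch(rng.bit_generator)
print(type(random_state).__name__)
```

Note that a Generator instance itself is still rejected in this sketch; only its bit generator is accepted, which keeps RandomState as the object estimators work with.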
Describe alternatives you've considered, if relevant
I know there is an issue regarding supporting the new numpy Generator interface, but I feel this is slightly different since it does not attempt to replace RandomState.