Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using integer random_state in IterativeImputer #15611

Closed
david-cortes opened this issue Nov 12, 2019 · 3 comments
Closed

Consider using integer random_state in IterativeImputer #15611

david-cortes opened this issue Nov 12, 2019 · 3 comments
Labels

Comments

@david-cortes
Copy link
Contributor

david-cortes commented Nov 12, 2019

Lots of algorithms that rely on randomization offer an argument random_state through which it's possible to pass them RNG seeds that would ensure reproducible results. The IterativeImputer class will take some regressor/classifier and set their attribute random_state to a NumPy MT19937 object class if the object has such attribute, in such a way that this object is modified implicitly once the regressor/classifier to which it was assigned produces some random number.

This is fine for SciKit-Learn’s own classes, but other libraries which provide different regressors/classifiers which are SciKit-Learn-compatible (at least according to tests) might not be able to use such RandomState objects (e.g. if they generate random numbers in C++ with some method other than MT19937).

For example, XGBoost and LightGBM SciKit-Learn-compatible classes will only accept integers as parameters for their random_state, and there’s potentially many others too which will only accept integers.

Would be nice if IterativeImputer would instead generate a random integer seed to pass to the regressor/classifier at each iteration, so that it could be used with more external libraries.

@rth
Copy link
Member

rth commented Nov 14, 2019

+1 there is a related discussion in #14042 about dropping non integer random_state object input althogether.

Would you like to make a PR for IterativeImputer ?

@david-cortes
Copy link
Contributor Author

Created a PR for setting random_state to an integer. Not sure if it'd be more appropriate to do this, or to avoid setting any random_state at all for the estimator inside InterativeImputer, since the estimator can also be passed a random_state in their initialization arguments.

david-cortes added a commit to david-cortes/scikit-learn that referenced this issue Nov 20, 2019
david-cortes added a commit to david-cortes/scikit-learn that referenced this issue Nov 20, 2019
david-cortes added a commit to david-cortes/scikit-learn that referenced this issue Dec 13, 2019
david-cortes added a commit to david-cortes/scikit-learn that referenced this issue Dec 13, 2019
david-cortes added a commit to david-cortes/scikit-learn that referenced this issue Dec 13, 2019
david-cortes added a commit to david-cortes/scikit-learn that referenced this issue Jan 11, 2020
david-cortes added a commit to david-cortes/scikit-learn that referenced this issue Jul 12, 2020
david-cortes added a commit to david-cortes/scikit-learn that referenced this issue Aug 16, 2020
@david-cortes
Copy link
Contributor Author

Fix was merged, closing the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants