Promote SKLEARN_SEED for consideration in __init__.py itself #15727

Open
yarikoptic opened this issue Nov 28, 2019 · 1 comment
@yarikoptic

Inspired by scikit-learn/enhancement_proposals#24, I decided to look into whether it is possible to seed the RNG used by sklearn "globally". Apparently it was even me who, after a heated discussion, added handling of the SKLEARN_SEED environment variable to the setup_module() fixture back in 2012 (4915d4d) to provide a means for reproducible testing.
AFAIK there is no generic handling of SKLEARN_SEED, or of any other environment variable, that defines the starting point of the RNG for an arbitrary script which (directly or indirectly) uses sklearn. It would be useful when a script uses scikit-learn functionality but has no explicit seed handling built in. Setting the seed via an environment variable makes it possible to get reproducible results by re-running that script with the same value, without modifying the script or any underlying library which interfaces with scikit-learn and/or other libraries.
This is the strategy we used in PyMVPA, and we have started to collect similar cases (see https://github.com/ReproNim/reproseed/blob/master/reproseed.sh#L24 for the list, which is short for now) in the hope of being able to seed all relevant tools at once whenever it is necessary to make some issue or result reproducible.
I wonder whether there would be any interest in pursuing this direction.
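
To make the idea concrete, here is a minimal sketch of what such handling could look like. It is not the code from the 2012 commit or anything sklearn ships: the helper name `seed_from_env` and its `environ` parameter are hypothetical, and it assumes that seeding NumPy's global RNG (which sklearn falls back to when `random_state=None`) plus Python's `random` module is enough for the script in question.

```python
import os
import random

import numpy as np


def seed_from_env(var_name="SKLEARN_SEED", environ=None):
    """Seed the global RNGs from an environment variable, if it is set.

    Hypothetical helper for illustration only. Returns the integer seed
    that was applied, or None when the variable is not defined.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(var_name)
    if raw is None:
        # Variable not set: leave the RNG state untouched.
        return None
    seed = int(raw)  # raises ValueError on a non-integer value
    np.random.seed(seed)  # global NumPy RNG, used when random_state=None
    random.seed(seed)  # Python's stdlib RNG
    return seed
```

With something like this called at import time, a script could be re-run reproducibly with e.g. `SKLEARN_SEED=42 python script.py`. Estimators that are given an explicit `random_state` would not be affected.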

@jnothman
jnothman commented Nov 28, 2019 via email

@cmarmo cmarmo added the RFC label Mar 29, 2022