[MRG] add support for lists of dictionaries to RandomizedSearchCV #14549

Merged
merged 5 commits into from Aug 6, 2019

Conversation

Member

amueller commented Aug 1, 2019

Follow-up to #12759 with a slightly simplified interface.
This makes the API of RandomizedSearchCV a superset of GridSearchCV which makes it more convenient to use.

Member

jnothman commented Aug 2, 2019

> This makes the API of RandomizedSearchCV a superset of GridSearchCV which makes it more convenient to use.

Awesome!

Member

thomasjpfan left a comment

Add a what's new entry?

Dictionary with parameters names (string) as keys and distributions
or lists of parameters to try. Distributions must provide a ``rvs``
method for sampling (such as those from scipy.stats.distributions).
If a list is given, it is sampled uniformly.
If a list of dicts is given, for each parameter, one of the dicts

thomasjpfan Aug 2, 2019

Member

This is slightly unclear. It looks like one of the dicts is first sampled uniformly, and then the parameters are sampled from that dict.
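
To make that two-stage reading concrete, here is a minimal sketch using only the standard library. `sample_params` and `UniformFloat` are hypothetical names made up for illustration, not scikit-learn's implementation. Each draw first picks one dict uniformly, then samples every parameter from that dict only, calling `rvs()` on distribution-like objects and choosing uniformly from lists.

```python
import random


class UniformFloat:
    """Hypothetical stand-in for a distribution object exposing ``rvs``."""

    def __init__(self, low, high, seed=0):
        self.low, self.high = low, high
        self._rng = random.Random(seed)

    def rvs(self):
        return self._rng.uniform(self.low, self.high)


def sample_params(param_distributions, n_iter, seed=0):
    """Illustrative two-stage sampler (an assumption, not the library code)."""
    rng = random.Random(seed)
    # A single dict is treated as a one-element list of dicts.
    if isinstance(param_distributions, dict):
        param_distributions = [param_distributions]
    samples = []
    for _ in range(n_iter):
        # Stage 1: pick one of the dicts uniformly at random.
        dist = rng.choice(param_distributions)
        # Stage 2: sample each parameter from the chosen dict only.
        params = {}
        for name in sorted(dist):
            value = dist[name]
            if hasattr(value, "rvs"):
                params[name] = value.rvs()
            else:
                params[name] = rng.choice(list(value))
        samples.append(params)
    return samples


spaces = [
    {"kernel": ["linear"], "C": UniformFloat(0.1, 10.0)},
    {"kernel": ["rbf"], "C": [1, 10, 100], "gamma": UniformFloat(1e-3, 1.0)},
]
draws = sample_params(spaces, n_iter=5)
```

Under this reading, a draw never mixes keys from different dicts: `gamma` only appears together with the `rbf` kernel.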

amueller added 2 commits Aug 2, 2019
Member

jnothman left a comment

Otherwise lgtm

for _ in range(self.n_iter):
    dist = self.param_distributions[
        rnd.randint(len(self.param_distributions))]

jnothman Aug 3, 2019

Member

This is an awkwardly NumPy way of expressing random.choice

amueller Aug 5, 2019

Author Member

fixed
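
For reference, the two spellings under discussion look roughly like this. The sketch uses Python's standard `random` module as a stand-in; NumPy's `RandomState` offers analogous `randint` and `choice` methods. The `param_distributions` list is a made-up example.

```python
import random

# Made-up list of parameter dicts, for illustration only.
param_distributions = [{"C": [1, 10]}, {"gamma": [0.1, 1.0]}]
rng = random.Random(0)

# Index-based spelling: draw an integer, then index into the list.
by_index = param_distributions[rng.randrange(len(param_distributions))]

# Direct spelling: let the generator pick an element.
by_choice = rng.choice(param_distributions)
```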

all_lists = all(
    all(not hasattr(v, "rvs") for v in dist.values())
    for dist in self.param_distributions)
rng = check_random_state(self.random_state)

amueller Aug 5, 2019

Author Member

renamed this to be more consistent with the rest of the library
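
The `all_lists` expression in the snippet above can be tried in isolation. The sketch below wraps it in a hypothetical helper, `only_lists`, and uses a stub class as a stand-in for a distribution with an `rvs` method. It reports whether every value in every dict is a plain collection of candidates rather than a distribution.

```python
class StubDistribution:
    """Hypothetical object that looks like a distribution (has ``rvs``)."""

    def rvs(self, random_state=None):
        return 0.5


def only_lists(param_distributions):
    """Return True if no value in any dict exposes an ``rvs`` method."""
    return all(
        all(not hasattr(v, "rvs") for v in dist.values())
        for dist in param_distributions)


# Every value is a list of candidates.
grid_like = [{"C": [1, 10]}, {"gamma": [0.1, 1.0]}]
# One value is a distribution-like object.
mixed = [{"C": [1, 10]}, {"gamma": StubDistribution()}]
```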

Contributor

NicolasHug left a comment

Super nitpick; feel free to merge without addressing.

@@ -210,6 +210,9 @@ Changelog
  plot model scalability (see learning_curve example).
  :pr:`13938` by :user:`Hadrien Reboul <H4dr1en>`.

- |Enhancement| :class:`model_selection.RandomizedSearchCV` now accepts lists
  of parameter distributions. :pr:`14549` by `Andreas Müller`_.

NicolasHug Aug 5, 2019

Contributor

maybe

lists of dicts to sample from multiple parameter spaces

?

amueller Aug 6, 2019

Author Member

I'm unconvinced ;)

for key in dist:
    if (not isinstance(dist[key], Iterable)
            and not hasattr(dist[key], 'rvs')):
        raise TypeError('Parameter value is not iterable '

NicolasHug Aug 5, 2019

Contributor

... must be an iterable or a distribution?

amueller Aug 6, 2019

Author Member

This is copied and pasted from ParameterGrid. I'm not sure your version is any clearer, and I think being semi-consistent between the two is good.
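
As a rough illustration of the check being discussed, the sketch below validates a list of dicts and raises `TypeError` for a value that is neither iterable nor a distribution. The helper name `check_distributions` and the exact error wording are assumptions for illustration, not the text used in scikit-learn.

```python
from collections.abc import Iterable


def check_distributions(param_distributions):
    """Hypothetical validator: each value must be iterable or have ``rvs``."""
    for dist in param_distributions:
        for key in dist:
            if (not isinstance(dist[key], Iterable)
                    and not hasattr(dist[key], "rvs")):
                # Illustrative message, not the library's wording.
                raise TypeError(
                    "Parameter {!r} must be an iterable or a distribution, "
                    "got {!r}".format(key, dist[key]))


check_distributions([{"C": [1, 10]}])  # passes silently

try:
    check_distributions([{"C": 1}])  # a bare int is neither
except TypeError as exc:
    error_message = str(exc)
```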

@jnothman jnothman merged commit 98e1c0f into scikit-learn:master Aug 6, 2019
17 checks passed
LGTM analysis: C/C++: No code changes detected
LGTM analysis: JavaScript: No code changes detected
LGTM analysis: Python: No new or fixed alerts
ci/circleci: deploy: Your tests passed on CircleCI!
ci/circleci: doc: Your tests passed on CircleCI!
ci/circleci: doc-min-dependencies: Your tests passed on CircleCI!
ci/circleci: lint: Your tests passed on CircleCI!
codecov/patch: 100% of diff hit (target 96.66%)
codecov/project: Absolute coverage decreased by -0.56% but relative coverage increased by +3.33% compared to 53f76d1
scikit-learn.scikit-learn: Build #20190805.23 succeeded
scikit-learn.scikit-learn (Linux py35_conda_openblas): succeeded
scikit-learn.scikit-learn (Linux py35_ubuntu_atlas): succeeded
scikit-learn.scikit-learn (Linux pylatest_conda_mkl_pandas): succeeded
scikit-learn.scikit-learn (Linux32 py35_ubuntu_atlas_32bit): succeeded
scikit-learn.scikit-learn (Windows py35_pip_openblas_32bit): succeeded
scikit-learn.scikit-learn (Windows py37_conda_mkl): succeeded
scikit-learn.scikit-learn (macOS pylatest_conda_mkl): succeeded
Member Author

amueller commented Aug 7, 2019

OH YEAH!
