DOC add a more complex example to gridsearch for nested parameters #14548
Codecov Report
@@ Coverage Diff @@
## master #14548 +/- ##
==========================================
- Coverage 96.66% 96.65% -0.02%
==========================================
Files 397 397
Lines 73015 73015
Branches 7980 7980
==========================================
- Hits 70583 70575 -8
- Misses 2419 2427 +8
Partials 13 13
doc/modules/grid_search.rst
Outdated
>>> search = GridSearchCV(pipe, param_grid, cv=5)
>>> search.fit(X, y)
GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('select', SelectKBest()),
This is quite wide. Does doing something like this hide too much?
GridSearchCV(cv=5,
estimator=Pipeline(steps=[...]),
param_grid={'model__base_estimator__max_depth': [2, 4, 6, 8],
'select__k': [1, 2]})
It would be nice to format this differently, but pytest doesn't allow me to.
We could change our repr lol.
I can also hide more with ellipsis but I think that's not super nice.
IGNORE_WHITESPACE doesn't allow changes in line breaks?? Who knew?
More practically, maybe just fit on the previous line and don't show the repr?
it allows replacing any whitespace by any other whitespace. There's no whitespace here to replace :-/
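A minimal sketch of the behaviour described in this thread, assuming the flag meant is the standard library's `doctest.NORMALIZE_WHITESPACE` (pytest reuses doctest's output checking). The repr strings are shortened stand-ins, not the real scikit-learn output.

```python
import doctest

checker = doctest.OutputChecker()
flags = doctest.NORMALIZE_WHITESPACE

# A line break in the expected output is fine where the actual output
# already has whitespace (here: the space after the comma).
want = "GridSearchCV(cv=5,\n    estimator=Pipeline())\n"
got = "GridSearchCV(cv=5, estimator=Pipeline())\n"
same_with_break = checker.check_output(want, got, flags)

# A line break cannot be introduced where the actual output has no
# whitespace at all (here: right after the opening parenthesis).
want_split = "estimator=Pipeline(\n    steps=[])\n"
got_joined = "estimator=Pipeline(steps=[])\n"
same_without_ws = checker.check_output(want_split, got_joined, flags)

print(same_with_break, same_without_ws)
```

The first comparison passes and the second fails, which is why a long single-token stretch of the repr cannot be rewrapped in the docs without changing the repr itself or eliding it.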
Co-Authored-By: Thomas J Fan <thomasjpfan@gmail.com>
doc/modules/grid_search.rst
Outdated
Here, ``<estimator>`` is the parameter name of the nested estimator,
in this case ``base_estimator``.
The `Pipeline` class has a slightly different notation, as explained in :ref:`pipeline`,
It's not Pipeline alone... ColumnTransformer, VotingClassifier, etc, do this too... It's more like "If, like a Pipeline, the meta-estimator is constructed as a collection of other estimators, ..."
The other thing that's confusing about this is that, according to `get_params`, the step names are parameter names of the Pipeline. They're just not constructor params...
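To make the point about `get_params` concrete, here is a small sketch. The step names `select` and `model` follow the doc example under review; using `LogisticRegression` as the final step is only an assumption for illustration.

```python
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

pipe = Pipeline(steps=[("select", SelectKBest()), ("model", LogisticRegression())])
params = pipe.get_params()

# `steps` is a real constructor argument of Pipeline.
print("steps" in params)
# The step names are reported as parameters too, although they are not
# constructor arguments.
print("select" in params, "model" in params)
# Parameters of each step are addressed as <step name>__<parameter>.
print("select__k" in params, "model__C" in params)

# The same keys are what set_params (and therefore a param_grid) accepts.
pipe.set_params(select__k=2)
print(pipe.named_steps["select"].k)
```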
Good that you've identified this... it's very hard to make sure there aren't gaps in the user guide when instead of fixing it, people go off writing books that cover the same material :D j/k
Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
doc/modules/grid_search.rst
Outdated
Here, ``<estimator>`` is the parameter name of the nested estimator,
in this case ``base_estimator``.
If the meta-estimator is constructed as a collection of estimators as in `Pipeline`, then
@jnothman better?
Pipeline should resolve to the Pipeline estimator doc, not to the user guide.
@jnothman if I had realized there was a gap here I would have fixed it before writing the book ;)
>>> search.fit(X, y)
GridSearchCV(cv=5,
             estimator=CalibratedClassifierCV(base_estimator=RandomForestClassifier(n_estimators=10)),
             param_grid={'base_estimator__max_depth': [2, 4, 6, 8]})
doc/modules/grid_search.rst
Outdated
-----------------------------------------
`GridSearchCV` and `RandomizedSearchCV` allow searching over parameters of
composite or nested estimators such as `Pipeline`, `ColumnTransformer`,
`VotingClasssifier` or `CalibratedClassifierCV`
Why is `VotingClassifier` not resolved? Did I misspell it? Also: we can fix that later :P
Pipeline should resolve to the Pipeline estimator doc, not to the user guide.
Why? Isn't the user guide better?
IMO, not when all the other links are estimators and when the link is actually a section title that often doesn't make sense in the context it's used in.
Maybe we should fix our title handles then? Hm, not sure, I don't have a strong opinion. I can use `pipeline.Pipeline`, that should work, right? Or use `:ref:pipeline.Pipeline`.
Yeah, I also think they all should link to comparable pages, i.e. the API page in this case.
:class:`ensemble.VotingClassifier` should work for VotingClassifier.
There was a typo ;)
Bad magic link resolution :(
Other than that LGTM.
Co-Authored-By: Nicolas Hug <contact@nicolas-hug.com>
Pipeline still points to the user guide, doesn't it?
…it-learn into metaestimator_search_steps # Conflicts: # doc/modules/grid_search.rst
doc/modules/grid_search.rst
Outdated
Composite estimators and parameter spaces
-----------------------------------------
`GridSearchCV` and `RandomizedSearchCV` allow searching over parameters of
composite or nested estimators such as `Pipeline`, `ColumnTransformer`,
`pipeline.Pipeline` here too?
done
Otherwise LGTM
Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
Related to what was brought up in #8710:
I don't think we explicitly describe how to do grid searches on meta-estimators, and `Pipeline` actually works differently from other meta-estimators (it uses the step names, not the parameter names).
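For reference, a runnable sketch of the kind of nested search this PR documents, combining both notations: step names for `Pipeline`, and the parameter name for `CalibratedClassifierCV`. The toy data from `make_classification` and the `random_state` values are assumptions for illustration. The nested argument is called `base_estimator` in the docs under review and `estimator` in newer scikit-learn releases, so the sketch looks the name up instead of hard-coding it.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Assumption: pick whichever name this scikit-learn version uses for the
# nested estimator argument of CalibratedClassifierCV.
available = CalibratedClassifierCV().get_params()
nested = "estimator" if "estimator" in available else "base_estimator"

forest = RandomForestClassifier(n_estimators=10, random_state=0)
calibrated = CalibratedClassifierCV(**{nested: forest})
pipe = Pipeline(steps=[("select", SelectKBest()), ("model", calibrated)])

param_grid = {
    # <step name>__<parameter> for a Pipeline step
    "select__k": [1, 2],
    # <step name>__<nested parameter name>__<parameter> for the wrapped forest
    f"model__{nested}__max_depth": [2, 4, 6, 8],
}

# Toy data, only here to make the search runnable.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
print(sorted(search.best_params_))
```

The keys of `search.best_params_` are the same double-underscore paths used in `param_grid`, so the chosen values can be passed straight back to `pipe.set_params(**search.best_params_)`.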