-
Notifications
You must be signed in to change notification settings - Fork 62
Initial implementation of VotingRegressor. #390
Conversation
…Remove unnecessary comments.
""" | ||
Tells if the predictor depends on other predictors. | ||
""" | ||
return self._has_implicit_predictors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return self._has_implicit_predictors [](start = 8, length = 36)
Didnt get it. What is the purpose of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method was added to the base class of predictors so that the code in Pipeline can determine which predictors have implicit predictors (the voting ensembles) and thus generate the graph in a different way.
This was done to avoid having if classname == "VotingRegressor"
line in Pipeline.
In reply to: 361724371 [](ancestors = 361724371)
pass | ||
|
||
def test_ensemble_supports_get_fit_info(self): | ||
# TODO: fill this in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: fill this in [](start = 8, length = 20)
One more test where Pipeline has few transforms before VotingClassifier #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/python/nimbusml/pipeline.py
Outdated
elif label_column: | ||
learner.label_column_name = label_column | ||
elif y is None: | ||
if label_column is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is crazy :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same code that was there before my change. The only modification I made was to unindent this code by 4 spaces to remove an unnecessary if
statement.
I will look in to what can be done to clean this up. Though, I might put that in a different pull request.
In reply to: 361725035 [](ancestors = 361725035)
…e used before the predictor.
models_regressionensemble | ||
|
||
|
||
class VotingRegressor(BasePredictor, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think VotingRegressor
may be the wrong name. Voting is one option to combine model outputs for classification. One doesn't ensembe using voting for regression; they tend to use {median, average, weighted average, stacking}, with options for model selection of {outlier removal, diversity, best-performance}.
I might suggest naming as EnsembleRegressor
, or RegressionEnsembler
:
class VotingRegressor(BasePredictor, | |
class EnsembleRegressor(BasePredictor, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, if you make a stacked ensemble regression example/unit-test, I'd recommend a final model of LogisticRegression w/ non-negative set. The non-negative helps by ensuring you don't bet against the sub-models: dotnet/machinelearning#1651 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name VotingRegressor
was chosen to match the class in scikit-learn
. See VotingRegressor
.
Though, everyone that has seen these changes has made similar comments about the name. Perhaps it would be best to change it.
There is already an EnsembleRegressor
class in NimbusML (which is actually being updated to offer the same functionality as this class; this class is just a temporary class to add support for multi-predictor ensembling functionality until the existing class modifications are completed). RegressionEnsembler
seems like it might be too similar to EnsembleRegressor
and might lead to confusion.
@justinormont Thanks for your input. I will continue looking for other names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a decision on naming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have decided to keep the naming. Arguments are:
- EnsembleRegressor already exists in NimbusML (where its restricted to the only base learner Online Gradient Descent)
- To be on par with scikit-learn where its named also VotingRegressor.
Once we are able to fix existing EnsembleRegressor to take any base learner we would obsolete this VotingRegressor in future.
In reply to: 370876925 [](ancestors = 370876925)
The cross validation additions should be refactored if/when the Pipeline/EntryPoint/Graph code is refactored. |
@@ -502,21 +508,39 @@ def fit( | |||
implicit_nodes) | |||
|
|||
# Add learner node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pls describe how CV is split when VotingEnsemble is used
Combine regression models into an ensemble | ||
|
||
:param estimators: The predictors to combine into an ensemble. | ||
:param model_combiner: The combiner used to combine the scores. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:param [](start = 3, length = 7)
Pls add bit more description and point to sklearn VotingRegressor for more info
No description provided.