Closed
Labels
- enhancement: An improvement to an existing feature.
- epic: Issues which are epics, containing other issues.
- needs design: Issues requiring design documentation.
Description
In #229 and a related Slack discussion today, @jeremyliweishih and @angela97lin were discussing whether we should require pipelines to have exactly one estimator.
Currently, we appear to expect a pipeline to be a single chain of components evaluated linearly, with exactly one estimator at the end. In PipelineBase::predict, we call transform on every component in the chain, then call predict on the estimator. self.estimator is currently used in predict, predict_proba and feature_importances in ways that are consistent with that expectation.
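To make the current expectation concrete, here is a minimal sketch of a linear pipeline whose last component is the only estimator. It is not the actual PipelineBase code: `Scaler`, `ThresholdEstimator` and `LinearPipeline` are hypothetical names made up for illustration, and plain lists stand in for real feature data.

```python
class Scaler:
    """Hypothetical transformer: divides every value by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def transform(self, X):
        return [[value / self.factor for value in row] for row in X]


class ThresholdEstimator:
    """Hypothetical estimator: predicts 1 when a row's sum exceeds a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, X):
        return [1 if sum(row) > self.threshold else 0 for row in X]


class LinearPipeline:
    """Sketch of the assumed structure: transformers in order, one estimator last."""

    def __init__(self, components):
        # Assumption: every component except the last is a transformer,
        # and the last component is the single estimator.
        *self.transformers, self.estimator = components

    def predict(self, X):
        # Call transform on every component in the chain...
        for component in self.transformers:
            X = component.transform(X)
        # ...then call predict on the estimator at the end.
        return self.estimator.predict(X)


pipeline = LinearPipeline([Scaler(10), ThresholdEstimator(1.0)])
predictions = pipeline.predict([[5, 4], [8, 7]])
print(predictions)
```

Anything that relies on `self.estimator` being the final component would need rethinking if this structure is relaxed.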
We can imagine pipelines where:
- The estimator isn't the final component in the pipeline
- There are multiple estimators in the pipeline (perhaps it's easier or clearer to do that than to define a custom component that implements it)
- There are multiple preprocessing pathways of components (which presumably eventually merge together)
- There's no estimator at all; the pipeline in question is just doing some preprocessing or feature extraction (this wouldn't be useful in the automl leaderboard as we currently envision it)
Questions:
- Which of those cases should we plan to support?
- What invariants should we check on pipeline structure?
- How can we change our code to enforce them but allow some of the cases described here?
- The answers here may take a while to implement. What, if anything, do we do in the meantime?
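As one possible starting point for the invariants question, the sketch below checks a linear list of components and reports which expectations are violated. Everything in it is an assumption for illustration: `check_pipeline_structure`, the `allow_no_estimator` flag, and the rule that "has a predict method" identifies an estimator are hypothetical, not existing code in this project.

```python
class DummyTransformer:
    """Hypothetical preprocessing component."""

    def transform(self, X):
        return X


class DummyEstimator:
    """Hypothetical estimator component."""

    def predict(self, X):
        return [0 for _ in X]


def check_pipeline_structure(components, allow_no_estimator=False):
    """Return a list of violated invariants (empty when the structure is valid)."""
    problems = []
    # Assumption: a component counts as an estimator if it defines predict.
    estimator_positions = [
        index
        for index, component in enumerate(components)
        if hasattr(component, "predict")
    ]
    if not estimator_positions:
        # Preprocessing-only pipelines are rejected unless explicitly allowed.
        if not allow_no_estimator:
            problems.append("pipeline has no estimator")
    else:
        if len(estimator_positions) > 1:
            problems.append("pipeline has more than one estimator")
        if estimator_positions[-1] != len(components) - 1:
            problems.append("final component is not an estimator")
    return problems


valid = check_pipeline_structure([DummyTransformer(), DummyEstimator()])
misplaced = check_pipeline_structure([DummyEstimator(), DummyTransformer()])
preprocessing_only = check_pipeline_structure(
    [DummyTransformer()], allow_no_estimator=True
)
```

Returning a list of problems rather than raising on the first one would let each case above be allowed or rejected independently while the design is still being decided. It does not cover multiple preprocessing pathways, which would need a graph rather than a list.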