Support training-only components in pipelines and component graphs #2776
Conversation
Codecov Report

```
@@            Coverage Diff            @@
##             main    #2776     +/-   ##
=========================================
+ Coverage    99.8%    99.8%    +0.1%
=========================================
  Files         298      298
  Lines       27681    27731      +50
=========================================
+ Hits        27613    27663      +50
  Misses         68       68
```
evalml/pipelines/component_graph.py (Outdated)

```diff
@@ -288,7 +285,9 @@ def transform(self, X, y=None):
                 "Cannot call transform() on a component graph because the final component is not a Transformer."
             )

-        outputs = self._compute_features(self.compute_order, X, y, False)
+        outputs = self._compute_features(
+            self.compute_order, X, y, fit=False, evaluate_training=True
```
I may be misunderstanding something here - why should `evaluate_training` be true during `transform`, but not during `fit`?
@eccabay Good question! I think the language here is a bit confusing, but `evaluate_training` is a flag for determining whether or not we want to compute "training-only" components, such as the samplers or the `DropRowsTransformer`, when we are not fitting.

We decided that there should be a difference between `transform`, which takes the pipeline and calls `transform` on every component, vs `predict`, where we want predictions on unseen data and should not evaluate training-only components. This enables us to keep our current behavior, but also allows users to use pipelines as just a sequence of transformations.

Does that make sense? (or did I confuse you further?) 😄
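To make the distinction concrete, here is a minimal sketch of the intended behavior, assuming a linear pipeline built from component names (the setup mirrors the tests discussed later in this thread; it is an illustration, not the exact code from this PR):

```python
import pandas as pd
from evalml.pipelines import BinaryClassificationPipeline

X = pd.DataFrame({"a": range(10), "b": [i % 3 for i in range(10)]})
y = pd.Series([0] * 5 + [1] * 5)

pipeline = BinaryClassificationPipeline(
    component_graph=["Drop Rows Transformer", "Logistic Regression Classifier"],
    parameters={"Drop Rows Transformer": {"indices_to_drop": [0, 9]}},
)

# fit always runs every component, training-only or not, so the
# estimator is trained on the 8 remaining rows.
pipeline.fit(X, y)

# predict should NOT run training-only components: no rows are dropped,
# and we get one prediction per input row.
preds = pipeline.predict(X)
assert len(preds) == 10
```

`transform`, by contrast, would run the training-only component and return 8 rows, though it requires a pipeline whose final component is a Transformer.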
Ah yeah, that makes a lot of sense. Thanks! 😃
I think the confusing thing though is that it's not clear from the name that `evaluate_training` only applies during `transform` and `predict`, and not `fit`.

Maybe this is what @eccabay was getting at, but when I read this code, I see `evaluate_training=False` in `ComponentGraph.fit`, which makes me think that the training-only components are not evaluated during fit. But that's not true, because `evaluate_training` only applies if `fit=False`:
```python
if fit:
    # During fit, every component runs, training-only or not.
    output = component_instance.fit_transform(x_inputs, y_input)
elif component_instance.training_only and evaluate_training is False:
    # Training-only component outside of fit: pass the data through untouched.
    output = x_inputs, y_input
else:
    output = component_instance.transform(x_inputs, y_input)
```
Maybe we should make this clearer somehow?
@angela97lin Thank you for doing this! I'm surprised this wasn't a bigger diff honestly!
I left some coverage suggestions as well as a comment on @eccabay's original question - maybe we can make the parameter name/docstring clearer about what's happening?
I do think this change (the addition of training-only components) is the kind of thing that will eventually merit a redesign/refactor of the component graph. I think right now we encode whether we should run training-only components via the method that's called:

- `fit` - yes
- `transform` - yes
- `predict` - no
- `compute_estimator_features` - no
- `fit_features` - yes
That kind of behavior is not really clear from the method names, and we have so many methods for generating features (`fit_features` vs `compute_estimator_features` vs `_fit_transform_features_helper` vs `_compute_features`) that it's hard to keep it all straight, especially in light of the training-only requirement.
I think this is good to merge but I wonder what your thoughts are on addressing the tech debt that I think has been accruing.
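One incremental option, sketched below with a hypothetical parameter name (this is an assumption about a possible rename, not the final API), is to make the name and docstring state explicitly when the flag is consulted:

```python
# Hypothetical sketch of a more explicit signature; evalml's eventual
# rename/refactor may look different.
def _compute_features(
    self, component_list, X, y=None, fit=False, evaluate_training_only_components=False
):
    """Transform (or fit and transform) the given components in order.

    Args:
        fit: If True, call fit_transform on every component. Training-only
            components ALWAYS run when fit=True; the flag below is ignored.
        evaluate_training_only_components: Consulted only when fit=False.
            If False, training-only components pass their inputs through
            unchanged instead of calling transform.
    """
```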
```diff
@@ -13,6 +13,7 @@ class DropRowsTransformer(Transformer):

     name = "Drop Rows Transformer"
     modifies_target = True
+    training_only = True
```
Let's make this an abstract method?
The data coming out of the pipeline could be completely wrong if we default to False and that default is wrong for the component.
I've made this abstract, but `Transformer` still sets this value to False, so any subclass of `Transformer` will have a default value of False... which might defeat the purpose of having this as abstract 😅

I think this is okay though, since most of our components will have `training_only` as False for now... but open to thoughts and concerns :)
```diff
@@ -2754,3 +2755,80 @@ def test_pipeline_transform_with_final_estimator(
             ),
         ):
             pipeline.transform(X, y)
+
+
+@patch("evalml.pipelines.components.LogisticRegressionClassifier.fit")
```
@angela97lin Can you please add coverage for `fit_features` and `compute_estimator_features`?
The expectation is that the training-only components are run during `fit_features` and not `compute_estimator_features`.
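Something along these lines could cover both (a sketch reusing the X/y from this thread; exact method locations like `pipeline.component_graph` are assumptions about the API):

```python
import numpy as np
import pandas as pd
from evalml.pipelines import BinaryClassificationPipeline

X = pd.DataFrame(
    {
        "a": [i for i in range(9)] + [np.nan],
        "b": [i % 3 for i in range(10)],
        "c": [i % 7 for i in range(10)],
    }
)
y = pd.Series([0] * 5 + [1] * 5)

pipeline = BinaryClassificationPipeline(
    component_graph=["Drop Rows Transformer", "Logistic Regression Classifier"],
    parameters={"Drop Rows Transformer": {"indices_to_drop": [0, 9]}},
)

# fit_features runs training-only components: two rows get dropped.
assert len(pipeline.component_graph.fit_features(X, y)) == 8

# compute_estimator_features skips them: all ten rows come back.
assert len(pipeline.component_graph.compute_estimator_features(X, y)) == 10
```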
Of course, great idea 😁
LGTM. Just a nit about using the same X, y three times, but very clear overall. Thanks for doing this!
```python
X = pd.DataFrame(
    {
        "a": [i for i in range(9)] + [np.nan],
        "b": [i % 3 for i in range(10)],
        "c": [i % 7 for i in range(10)],
    }
)
y = pd.Series([0] * 5 + [1] * 5)
```
Admittedly this is on my backburner's backburner - but are you able to either use an existing X/y pair for these tests or create a module-level test fixture for these additional tests?
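For what it's worth, a module-level fixture for this could look like the sketch below (the fixture name is hypothetical, not from evalml's test suite):

```python
import numpy as np
import pandas as pd
import pytest

@pytest.fixture
def X_y_with_nan_row():  # hypothetical name
    X = pd.DataFrame(
        {
            "a": [i for i in range(9)] + [np.nan],
            "b": [i % 3 for i in range(10)],
            "c": [i % 7 for i in range(10)],
        }
    )
    y = pd.Series([0] * 5 + [1] * 5)
    return X, y
```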
Thank you for holding me accountable, done!
```python
    parameters={"Drop Rows Transformer": {"indices_to_drop": [0, 9]}},
)
pipeline.fit(X, y)
assert len(mock_fit.call_args[0][0]) == 8
```
There are times when your brain is too big for me. This is one of them, haha. Why should this be true if the issue is completed successfully?
Hahaha, I think this issue is pretty confusing! But it's 8 here because during `fit` / training time, we evaluate all components, regardless of whether or not they're training-only. So the `DropRowsTransformer` will drop the two rows during fit --> a total of 8 rows left!
```python
)
pipeline.fit(X, y)
preds = pipeline.predict(X)
assert len(preds) == 10
```
Is the gist here that no rows have been dropped during prediction, since `DropRowsTransformer` is training-only and `predict` runs with `evaluate_training=False`?
@freddyaboulton I think your instinct is on point! It seems like there's a lot of confusion amongst everyone, which probably indicates that something is not super clear, either in the implementation or in the design 😅 I renamed …
Closes #2681