
Support training-only components in pipelines and component graphs #2776

Merged
14 commits merged into main from 2681_training_only on Sep 17, 2021

Conversation

angela97lin (Contributor)

Closes #2681

angela97lin self-assigned this Sep 13, 2021
codecov bot commented Sep 13, 2021

Codecov Report

Merging #2776 (54de18b) into main (d1e6afb) will increase coverage by 0.1%.
The diff coverage is 100.0%.


@@           Coverage Diff           @@
##            main   #2776     +/-   ##
=======================================
+ Coverage   99.8%   99.8%   +0.1%     
=======================================
  Files        298     298             
  Lines      27681   27731     +50     
=======================================
+ Hits       27613   27663     +50     
  Misses        68      68             
Impacted Files Coverage Δ
evalml/pipelines/component_graph.py 99.8% <100.0%> (-<0.1%) ⬇️
evalml/pipelines/components/component_base.py 100.0% <100.0%> (ø)
...valml/pipelines/components/estimators/estimator.py 100.0% <100.0%> (ø)
...ransformers/preprocessing/drop_rows_transformer.py 100.0% <100.0%> (ø)
...s/components/transformers/samplers/base_sampler.py 89.4% <100.0%> (+0.3%) ⬆️
...l/pipelines/components/transformers/transformer.py 100.0% <100.0%> (ø)
evalml/tests/component_tests/test_components.py 100.0% <100.0%> (ø)
...valml/tests/pipeline_tests/test_component_graph.py 100.0% <100.0%> (ø)
evalml/tests/pipeline_tests/test_pipelines.py 99.9% <100.0%> (+0.1%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@@ -288,7 +285,9 @@ def transform(self, X, y=None):
                 "Cannot call transform() on a component graph because the final component is not a Transformer."
             )
 
-        outputs = self._compute_features(self.compute_order, X, y, False)
+        outputs = self._compute_features(
+            self.compute_order, X, y, fit=False, evaluate_training=True
Contributor (@eccabay)

I may be misunderstanding something here - why should evaluate_training be true during transform, but not during fit?

Contributor Author (@angela97lin)

@eccabay Good question! I think the language here is a bit confusing, but evaluate_training is a flag for determining whether we want to compute "training-only" components, such as the samplers or the DropRowsTransformer, when we are not fitting.

We decided that there should be a difference between transform, which takes the pipeline and calls transform on every component, versus predict, where we want predictions on unseen data and should not evaluate training-only components. This keeps our current behavior while also letting users use pipelines as just a sequence of transformations.

Does that make sense? (or did I confuse you further?) 😄

Contributor (@eccabay)

Ah yeah, that makes a lot of sense. Thanks! 😃

Contributor (@freddyaboulton)

I think the confusing thing, though, is that it's not clear from the name that evaluate_training only applies during transform and predict, not during fit.

Maybe this is what @eccabay was getting at, but when I read this code, I see evaluate_training=False in ComponentGraph.fit, which makes me think that training-only components are not evaluated during fit. That's not true, though, because evaluate_training only applies if fit=False:

if fit:
    output = component_instance.fit_transform(x_inputs, y_input)
elif component_instance.training_only and evaluate_training is False:
    output = x_inputs, y_input
else:
    output = component_instance.transform(x_inputs, y_input)

Maybe we should make this clearer somehow?

freddyaboulton (Contributor) left a comment


@angela97lin Thank you for doing this! I'm surprised this wasn't a bigger diff honestly!

I left some coverage suggestions as well as a comment on @eccabay 's original question - maybe we can make the parameter name/docstring clearer about what's happening?

I do think this change (the addition of training-only components) is the kind of thing that will eventually merit a redesign/refactor of the component graph. I think right now we encode whether we should run training-only components via the method that's called:

  1. fit - yes
  2. transform - yes
  3. predict - no
  4. compute_estimator_features - no
  5. fit_features - yes

That kind of behavior is not really clear from the method names, and we have so many methods for generating features (fit_features vs compute_estimator_features vs _fit_transform_features_helper vs _compute_features) that it's hard to keep it all straight, especially in light of the training-only requirement.

I think this is good to merge but I wonder what your thoughts are on addressing the tech debt that I think has been accruing.
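For reference, a rough summary of that mapping in code form (the dictionary name is made up for illustration; only the fit and transform calls appear verbatim in this diff, and the remaining values are taken from the list above):

runs_training_only_components = {
    "fit": True,                          # _compute_features(..., fit=True)
    "transform": True,                    # _compute_features(..., fit=False, evaluate_training=True)
    "predict": False,                     # _compute_features(..., fit=False, evaluate_training=False), inferred
    "compute_estimator_features": False,
    "fit_features": True,
}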

@@ -13,6 +13,7 @@ class DropRowsTransformer(Transformer):
 
     name = "Drop Rows Transformer"
     modifies_target = True
+    training_only = True
Contributor

Let's make this an abstract method?

Contributor

The data coming out of the pipeline could be completely wrong if we default to False and that default turns out to be wrong for a component.

Contributor Author (@angela97lin)

I've made this abstract, but Transformer still sets this value to False, so any subclass of Transformer will have a default value of False... which might defeat the purpose of having this as abstract 😅

I think this is okay though, since most of our components will have training_only as False for now... but open to thoughts and concerns :)
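A simplified sketch of the arrangement described above (illustrative only, not the actual evalml class definitions; class bodies are omitted):

class ComponentBase:
    """Base component class; this PR makes training_only abstract so components must declare it."""


class Transformer(ComponentBase):
    training_only = False  # default inherited by most transformers


class DropRowsTransformer(Transformer):
    training_only = True  # evaluated during fit and transform, skipped during predict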

@@ -2754,3 +2755,80 @@ def test_pipeline_transform_with_final_estimator(
         ),
     ):
         pipeline.transform(X, y)
+
+
+@patch("evalml.pipelines.components.LogisticRegressionClassifier.fit")
Contributor (@freddyaboulton)

@angela97lin Can you please add coverage for fit_features and compute_estimator_features?

Contributor (@freddyaboulton)

The expectation is that training-only components are run during fit_features and not during compute_estimator_features.

Contributor Author (@angela97lin)

Of course, great idea 😁

chukarsten (Contributor) left a comment


LGTM. Just a nit about using the same X,y three times, but very clear overall. Thanks for doing this!

Comment on lines 2762 to 2769
X = pd.DataFrame(
    {
        "a": [i for i in range(9)] + [np.nan],
        "b": [i % 3 for i in range(10)],
        "c": [i % 7 for i in range(10)],
    }
)
y = pd.Series([0] * 5 + [1] * 5)
Contributor (@chukarsten)

Admittedly this is on my backburner's backburner - but are we able to either use an existing X/y pair for these tests or create a module-level test fixture for these additional tests?

Contributor Author (@angela97lin)

Thank you for holding me accountable, done!
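If the fixture route were taken, a module-level fixture for this data might look like the sketch below (the fixture name is hypothetical; the data is the X/y quoted above):

import numpy as np
import pandas as pd
import pytest


@pytest.fixture
def X_y_with_droppable_rows():
    # Same data as the snippet quoted above: ten rows, with a NaN in column "a".
    X = pd.DataFrame(
        {
            "a": [i for i in range(9)] + [np.nan],
            "b": [i % 3 for i in range(10)],
            "c": [i % 7 for i in range(10)],
        }
    )
    y = pd.Series([0] * 5 + [1] * 5)
    return X, y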

    parameters={"Drop Rows Transformer": {"indices_to_drop": [0, 9]}},
)
pipeline.fit(X, y)
assert len(mock_fit.call_args[0][0]) == 8
Contributor

There are times when your brain is too big for me. This is one of them, haha. Why should this be true if the issue is completed successfully?

Contributor Author (@angela97lin)

Hahaha, I think this issue is pretty confusing! But it's 8 here because during fit / training time, we evaluate all components, regardless of whether or not they're training_only. So the DropRowsTransformer will drop the two rows during fit --> a total of 8 rows left!
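Piecing together the fragments shown in this diff, the fit-time test being discussed looks roughly like the sketch below (the test name, pipeline class, and component graph are assumptions for illustration; the patch target, data, parameters, and assertion are the ones quoted in this conversation):

from unittest.mock import patch

import numpy as np
import pandas as pd

from evalml.pipelines import BinaryClassificationPipeline


@patch("evalml.pipelines.components.LogisticRegressionClassifier.fit")
def test_drop_rows_transformer_runs_during_fit(mock_fit):
    X = pd.DataFrame(
        {
            "a": [i for i in range(9)] + [np.nan],
            "b": [i % 3 for i in range(10)],
            "c": [i % 7 for i in range(10)],
        }
    )
    y = pd.Series([0] * 5 + [1] * 5)
    pipeline = BinaryClassificationPipeline(
        component_graph=["Drop Rows Transformer", "Logistic Regression Classifier"],
        parameters={"Drop Rows Transformer": {"indices_to_drop": [0, 9]}},
    )
    pipeline.fit(X, y)
    # Training-only components run at fit time, so rows 0 and 9 are dropped before the
    # (mocked) estimator is fit: 10 - 2 == 8 rows.
    assert len(mock_fit.call_args[0][0]) == 8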

)
pipeline.fit(X, y)
preds = pipeline.predict(X)
assert len(preds) == 10
Contributor

Is the gist here that no rows have been dropped during prediction since DropRowsTransformer is training-only?

Contributor Author (@angela97lin)

Yup!

angela97lin (Contributor Author) commented Sep 17, 2021

@freddyaboulton I think your instinct is on point! It seems like there's a lot of confusion amongst everyone, which probably indicates that something is not super clear, either in the implementation or in the design 😅

I renamed evaluate_training to evaluate_training_only_components to be as explicit as possible about what the flag does, and filed #2795 with some of my thoughts about the different methods. TL;DR: I wonder whether there's a need for all of these methods when, in reality, we've been using some just for time series work and internal implementations. I'll add test coverage in this PR for those methods regardless, though, since we still have them around 😁
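With the renamed flag, the dispatch quoted earlier reads roughly as follows (illustrative; this mirrors the snippet above rather than quoting the final source):

if fit:
    output = component_instance.fit_transform(x_inputs, y_input)
elif component_instance.training_only and evaluate_training_only_components is False:
    # Outside of fitting, training-only components (samplers, DropRowsTransformer) are
    # skipped and their inputs pass through unchanged.
    output = x_inputs, y_input
else:
    output = component_instance.transform(x_inputs, y_input)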

angela97lin merged commit 0440bec into main Sep 17, 2021
angela97lin deleted the 2681_training_only branch September 17, 2021 04:20
chukarsten mentioned this pull request Oct 1, 2021
Development
Successfully merging this pull request may close these issues: Support training-only components in ComponentGraph

4 participants