Refactor ComponentGraph methods #2902

Merged: 8 commits, Oct 14, 2021

Conversation

bchen1116
Contributor

fix #2795

@bchen1116 bchen1116 self-assigned this Oct 12, 2021
         self._feature_provenance = self._get_feature_provenance(X.columns)
         return self
 
-    def fit_features(self, X, y):
-        """Fit all components save the final one, usually an estimator.
+    def fit_and_transform_all_but_final(self, X, y):
Contributor Author

I think it's fine to leave this method where it is rather than moving it to time_series_base, because keeping it here makes it easy to implement other pipeline types in the future that could benefit from it. Although we currently use it only in the time_series_base file, I believe the method belongs with the ComponentGraph class.

Additionally, I wouldn't want to delete this method and have time_series_base call the underlying private method, since that would go against our public/private method design and naming conventions.
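To make the public/private point concrete, here is a minimal sketch, not evalml's actual implementation. The `ToyComponentGraph` and `Scale` classes and the private helper name `_run_all_but_final` are hypothetical and exist only for illustration; the two public method names are the ones discussed in this PR.

```python
class Scale:
    """Toy transformer (hypothetical): divides values by the max seen in fit."""

    def fit(self, X, y=None):
        self.max_ = max(X) or 1  # avoid dividing by zero
        return self

    def transform(self, X):
        return [x / self.max_ for x in X]


class ToyComponentGraph:
    """Hypothetical stand-in for a component graph with a linear order."""

    def __init__(self, components):
        self.components = components

    def _run_all_but_final(self, X, y, fit):
        # Private helper: callers outside this class should not use it directly.
        for component in self.components[:-1]:
            if fit:
                component.fit(X, y)
            X = component.transform(X)
        return X

    def fit_and_transform_all_but_final(self, X, y):
        # Public entry point: fit and transform everything except the last component.
        return self._run_all_but_final(X, y, fit=True)

    def transform_all_but_final(self, X, y=None):
        # Public entry point: transform only, using already fitted components.
        return self._run_all_but_final(X, y, fit=False)


# The final component (here a placeholder object) is never touched.
graph = ToyComponentGraph([Scale(), object()])
features = graph.fit_and_transform_all_but_final([1, 2, 4], None)
```

A pipeline class such as a time series pipeline would call only the two public methods, so the helper can change without affecting callers.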

codecov bot commented Oct 12, 2021

Codecov Report

Merging #2902 (2abcaf9) into main (3607e03) will not change coverage.
The diff coverage is 100.0%.

@@          Coverage Diff          @@
##            main   #2902   +/-   ##
=====================================
  Coverage   99.7%   99.7%           
=====================================
  Files        302     302           
  Lines      28394   28394           
=====================================
  Hits       28301   28301           
  Misses        93      93           
| Impacted Files | Coverage Δ |
|---|---|
| ...alml/model_understanding/permutation_importance.py | 100.0% <100.0%> (ø) |
| ...nderstanding/prediction_explanations/explainers.py | 100.0% <100.0%> (ø) |
| evalml/pipelines/classification_pipeline.py | 100.0% <100.0%> (ø) |
| evalml/pipelines/component_graph.py | 99.8% <100.0%> (ø) |
| evalml/pipelines/pipeline_base.py | 98.4% <100.0%> (ø) |
| .../pipelines/time_series_classification_pipelines.py | 99.0% <100.0%> (ø) |
| evalml/pipelines/time_series_pipeline_base.py | 99.1% <100.0%> (ø) |
| ...s/prediction_explanations_tests/test_explainers.py | 100.0% <100.0%> (ø) |
| ...understanding_tests/test_permutation_importance.py | 100.0% <100.0%> (ø) |
| ...ation_pipeline_tests/test_binary_classification.py | 100.0% <100.0%> (ø) |

... and 3 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3607e03...2abcaf9. Read the comment docs.

Collaborator

@jeremyliweishih jeremyliweishih left a comment

looks good but let's add the API changes as breaking changes!

@@ -10,6 +10,7 @@ Release Notes
     * Changes
         * Deleted scikit-learn ensembler :pr:`2819`
         * Refactored pipeline building logic out of ``AutoMLSearch`` and into ``IterativeAlgorithm`` :pr:`2854`
+        * Refactored names for methods in ``ComponentGraph`` :pr:`2902`
     * Documentation Changes
         * Updated ``install.ipynb`` to reflect flexibility for ``cmdstan`` version installation :pr:`2880`
     * Testing Changes
Collaborator

could we add the API changes to breaking changes?

Contributor Author

Good call!

Contributor

@freddyaboulton freddyaboulton left a comment

@bchen1116 I think this looks good to me! I think the new names for the ComponentGraph methods make sense.

Maybe we should get @angela97lin's thoughts on this before merging. One thing that comes to mind: maybe we should rename PipelineBase.compute_estimator_features to transform_all_but_final? No need to make this change unless we all agree, though.
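For context on what such a rename can look like, here is a hedged sketch of one common Python pattern: keeping the old name as a thin alias that warns and forwards to the new one. This is an illustration only, not what this PR does (the review thread treats the renames as breaking changes), and `ToyPipeline` is a hypothetical stand-in rather than evalml's `PipelineBase`.

```python
import warnings


class ToyPipeline:
    """Hypothetical stand-in for a pipeline class; not evalml's PipelineBase."""

    def transform_all_but_final(self, X):
        # Placeholder behavior: a real pipeline would run every component
        # except the last one and return the resulting features.
        return list(X)

    def compute_estimator_features(self, X):
        # Old name kept as an alias: warn, then forward to the new name.
        warnings.warn(
            "compute_estimator_features is deprecated; "
            "use transform_all_but_final instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.transform_all_but_final(X)


# Calling the old name still works, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = ToyPipeline().compute_estimator_features([1, 2])
```

The trade-off is that an alias keeps old callers working for a release or two, while a plain rename listed under breaking changes keeps the API surface smaller.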

Contributor

@angela97lin angela97lin left a comment

Thank you for doing this @bchen1116! Agreed with @freddyaboulton that I like transform_all_but_final but I see you've already made that change 😂

LGTM 🚢

@bchen1116 bchen1116 merged commit 3948c5d into main Oct 14, 2021
@chukarsten chukarsten mentioned this pull request Oct 14, 2021
@freddyaboulton freddyaboulton deleted the bc_2795_refactor branch May 13, 2022 15:03

Successfully merging this pull request may close these issues.

Refactor component graph impl
4 participants