Add universal error for predict/transform before fitting #969
Conversation
Codecov Report
@@            Coverage Diff            @@
##             main     #969      +/-  ##
==========================================
+ Coverage   99.68%   99.85%    +0.17%
==========================================
  Files         178      178
  Lines        9163     9282      +119
==========================================
+ Hits         9134     9269      +135
+ Misses         29       13       -16
==========================================
Continue to review full report at Codecov.
@dsherry PR should be ready, but there's a weird lint issue going on (it doesn't reproduce locally but fails on CircleCI).
@dsherry I think we should give pipelines the same treatment via metaclass wrappers for predict etc. as well, but I'm pushing that out of the scope of this PR. As it currently stands, if someone calls predict before fit on a pipeline, the error will surface from the component itself.
@jeremyliweishih This is so cool! Never seen metaclasses in action before. My only blocking comment is about whether there are better alternatives than defining a static NO_FITTING_REQUIRED list. I can see pros and cons, so I'm looking forward to hearing what you think. Apart from that, nothing blocking.
from evalml.exceptions import UnfitComponentError

NO_FITTING_REQUIRED = ['DropColumns', 'SelectColumns']
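For context, a minimal sketch of the metaclass approach under review, pieced together from the diff hunks quoted in this thread. The __new__ hook and the exact set of wrapped method names are assumptions; ComponentNotYetFittedError comes from a later hunk (the import above still shows the earlier UnfitComponentError name), and _is_fitted is assumed to be set elsewhere by a fit wrapper:

from functools import wraps


class ComponentNotYetFittedError(Exception):
    """Raised when predict/transform is called before fit."""


class ComponentBaseMeta(type):
    # Components that never need fitting, per the diff above.
    NO_FITTING_REQUIRED = ['DropColumns', 'SelectColumns']

    @classmethod
    def check_for_fit(cls, method):
        """Wrap a method so it raises if called before fit."""
        @wraps(method)
        def _check_for_fit(self, *args, **kwargs):
            klass = type(self).__name__
            # _is_fitted is assumed to be set to True by the fit wrapper.
            if not self._is_fitted and klass not in cls.NO_FITTING_REQUIRED:
                raise ComponentNotYetFittedError(
                    f'This {klass} is not fitted yet. You must fit {klass} '
                    f'before calling {method.__name__}.')
            return method(self, *args, **kwargs)
        return _check_for_fit

    def __new__(mcs, name, bases, dct):
        # Wrap whichever inference methods the class defines; the exact
        # set of wrapped names here is an assumption.
        for attr in ('predict', 'predict_proba', 'transform'):
            if attr in dct:
                dct[attr] = mcs.check_for_fit(dct[attr])
        return super().__new__(mcs, name, bases, dct)

Any class created with metaclass=ComponentBaseMeta would then get the guard automatically, without each component re-implementing the check.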
An alternative to this would be adding a _no_fitting_required flag to ComponentBase, like you did with _has_fit. Might be nice to let developers turn the check_for_fit functionality off if they need to, without having them modify this file. What do you think?
@freddyaboulton oh that's a neat idea. So like this?

class NonLearningComponent(Transformer):
    fitting_required = False
    ...

# the call below won't error
NonLearningComponent().predict(X)

I think that's cool. I think it would be good to do this in a separate PR since there's already a lot in here.
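If the flag approach were adopted, the wrapper could presumably consult it instead of the static list. A rough sketch, reusing ComponentNotYetFittedError from the sketch earlier; the fitting_required attribute name comes from the snippet above, and defaulting it to True would keep the guard on for every other component:

from functools import wraps


def check_for_fit(method):
    @wraps(method)
    def _check_for_fit(self, *args, **kwargs):
        # Per-class flag instead of a hard-coded NO_FITTING_REQUIRED list;
        # the default of True preserves current behavior elsewhere.
        if getattr(self, 'fitting_required', True) and not self._is_fitted:
            raise ComponentNotYetFittedError(
                f'{type(self).__name__} must be fitted before calling '
                f'{method.__name__}.')
        return method(self, *args, **kwargs)
    return _check_for_fit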
Yea that's what I was thinking and it makes sense to move it to another PR.
@freddyaboulton I like the idea! We used to have a field on components called needs_fitting; from what I recall we removed it and assumed that all components need "fitting" even if fitting doesn't do anything. From the components I had to alter (Drop Columns etc.), it seems like there are more components where a field denoting whether fitting is required would make things clearer.
I'll file another issue after this PR is merged to add that back in @dsherry!
@jeremyliweishih wow, it's so cool that you got this working!! Especially for the properties. Very nice 🎉
I think we should keep all the metaclass code in the same file; it's fine if that's component_base.py. I also think we should keep the wrapper-making functions as classmethods on the metaclass.
I left a few naming suggestions. I agree we should delete the old test coverage left over from individual components.
Will approve once we wrap up those conversations! I left other suggestions, but nothing else is blocking.
parameters = {'param_a': param_a, 'param_b': param_b}
super().__init__(parameters=parameters,
                 component_obj=None,
                 random_state=0)

def fit(self, X, y=None):
    self.is_fitted = True
    pass
I think this will cause issues with codecov. Try

def fit(self, X, y=None):
    """Docstring"""

I vaguely remember talking with @angela97lin a couple months back and realizing this would resolve a codecov issue we were seeing.
But it's weird because the exceptions use pass and are fine in codecov! Idk 🤷
Yeah... I remember using docstrings as a way to cover this case after reading this SO post: https://stackoverflow.com/questions/9202723/excluding-abstractproperties-from-coverage-reports
Not sure why exceptions are fine; maybe it's the way Python executes the code? That is, when we raise an exception, what's within the class gets scanned? Idk, no idea 🤷‍♀️
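For reference, the pattern from that post: make the docstring the entire method body instead of a bare pass, so there is no unexecuted statement for codecov to flag. A generic illustration, not evalml code:

from abc import ABC, abstractmethod


class Base(ABC):
    @abstractmethod
    def fit(self, X, y=None):
        """Fit the component.

        The docstring is the whole body: there is no `pass` statement
        left unexecuted, so coverage tools don't report a missed line.
        """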
trans.transform(X)


def test_all_components_check_fit(X_y_binary):
@jeremyliweishih what's the runtime of this test? If it's high, we could consider mocking the impl methods.
Because we don't care what happens in the actual fit/transform/predict, just that the wrappers get called correctly.
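A sketch of what that could look like, reusing the ComponentBaseMeta and ComponentNotYetFittedError names from the earlier sketch; the stub bodies stand in for mocked-out implementations, since only the wrapper behavior is under test:

import numpy as np
import pytest


class StubComponent(metaclass=ComponentBaseMeta):
    """Trivial fit/predict; the bodies stand in for mocks."""

    def __init__(self):
        self._is_fitted = False

    def fit(self, X, y=None):
        self._is_fitted = True
        return self

    def predict(self, X):
        return np.zeros(len(X))


def test_check_fit_wrapper():
    X = np.ones((10, 2))
    component = StubComponent()
    # predict before fit should raise through the metaclass wrapper...
    with pytest.raises(ComponentNotYetFittedError):
        component.predict(X)
    # ...and pass through to the stub implementation after fit.
    component.fit(X)
    assert component.predict(X).shape == (10,)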
👍 Cool stuff!!
Just left a blocking comment about updating tests :D
@jeremyliweishih This is awesome! Thanks for the hard work on this. My comments have been addressed. The only question I have is whether we need to check the X is None and y is None case in check_for_fit.
klass = type(self).__name__
if not self._is_fitted and klass not in cls.NO_FITTING_REQUIRED:
    raise ComponentNotYetFittedError(f'This {klass} is not fitted yet. You must fit {klass} before calling {method.__name__}.')
elif X is None and y is None:
Do we need to check this case?
@freddyaboulton I consolidated the property version (that I had before) into one method, so this branch is needed for properties.
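For context, a hedged sketch of how one wrapper could serve both regular methods and property getters, which is what the X is None and y is None branch distinguishes. The dispatch details are assumptions based on this exchange; NO_FITTING_REQUIRED and ComponentNotYetFittedError are from the hunks and sketches above:

from functools import wraps

NO_FITTING_REQUIRED = ['DropColumns', 'SelectColumns']


def check_for_fit(method):
    @wraps(method)
    def _check_for_fit(self, X=None, y=None):
        klass = type(self).__name__
        if not self._is_fitted and klass not in NO_FITTING_REQUIRED:
            raise ComponentNotYetFittedError(
                f'This {klass} is not fitted yet. You must fit {klass} '
                f'before calling {method.__name__}.')
        if X is None and y is None:
            # Property getters are invoked with no data arguments, so
            # fall through to the zero-argument call here.
            return method(self)
        if y is None:
            return method(self, X)   # e.g. predict(X) / transform(X)
        return method(self, X, y)    # e.g. fit_transform(X, y)
    return _check_for_fit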
Got it! Thanks for clarifying.
LGTM! Nice going
Fixes #851
Make new issues about pipelines / needs_fitting / mocking.