-
Notifications
You must be signed in to change notification settings - Fork 92
Polynomial Detrending Component #1992
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1992 +/- ##
=========================================
- Coverage 100.0% 100.0% -0.0%
=========================================
Files 276 278 +2
Lines 22571 22714 +143
=========================================
+ Hits 22565 22705 +140
- Misses 6 9 +3
Continue to review full report at Codecov.
|
b83517e to
5a6083b
Compare
| klass = type(self).__name__ | ||
| if not self._is_fitted and self.needs_fitting: | ||
| raise ComponentNotYetFittedError(f'This {klass} is not fitted yet. You must fit {klass} before calling {method.__name__}.') | ||
| elif method.__name__ == 'inverse_transform': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a short-term fix to make the component metaclass aware of the inverse_transform method. The current implementation will not work because if y is None, the method will run with only the X argument which is not what we want.
I think we can change this implementation to something like:
def _check_for_fit(self, *args, **kwargs):
klass = type(self).__name__
if not self._is_fitted and self.needs_fitting:
raise ComponentNotYetFittedError(f'This {klass} is not fitted yet. You must fit {klass} before calling {method.__name__}.')
return method(self, *args, **kwargs)I think this would be more maintainable in the long-term. I will file an issue for this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense! Just for my own edification, why would this new line be necessary? Why not raise a ValueError in transform for when y is None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ParthivNaresh I think we're currently raising an exception in inverse_transform if y is None. The problem is that the meta class doesn't pass the y argument if it's None to inverse_transform in the current implementation. This causes an error because we're expecting a y argument, even if it's None.
092a02e to
f27b798
Compare
| y_series = _convert_woodwork_types_wrapper(y_dt.to_series()) | ||
| y_t = self._component_obj.transform(y_series) | ||
| y_t = ww.DataColumn(pd.Series(y_t, index=y_series.index)) | ||
| return X, y_t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm returning X and y because our ComponentGraph assumes targets are returned as the second value in a tuple
|
FYI it's impossible to get 100% coverage on the patch: # We don't expect to install sktime in python 3.9. Let's verify it's not present:
if 'sktime' in module and is_running_py_39_or_above:
with pytest.raises(ModuleNotFoundError):
import_module(module)
continueThose lines only run when python 3.9 is installed and we don't measure coverage for python 3.9. But the fact that the 3.9 tests pass means that those lines are run 😄 |
f27b798 to
29b973d
Compare
ParthivNaresh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks great, nicely done!
| If 2, quadratic model is fit, etc. | ||
| random_seed (int): Seed for the random number generator. Defaults to 0. | ||
| """ | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we error check for degree if passed in as a float? Perhaps casting it to an int and then adding a test for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea!
| Arguments: | ||
| X (ww.DataTable, pd.DataFrame, optional): Ignored. | ||
| y (ww.DataColumn, pd.Series): Target variable to detrend. Ignored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is y ignored here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo 😂 Good catch!
| component_with_different_kwargs = component_class(diff_test_arg="test") | ||
| assert component.parameters['test_arg'] == "test" | ||
| assert component._component_obj.test_arg == "test" | ||
| if not isinstance(component, PolynomialDetrender): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is PolynomialDetrender left out of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raises an sklearn exception!
| klass = type(self).__name__ | ||
| if not self._is_fitted and self.needs_fitting: | ||
| raise ComponentNotYetFittedError(f'This {klass} is not fitted yet. You must fit {klass} before calling {method.__name__}.') | ||
| elif method.__name__ == 'inverse_transform': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense! Just for my own edification, why would this new line be necessary? Why not raise a ValueError in transform for when y is None?
|
|
||
| @pytest.mark.parametrize("input_type", ['np', 'pd', 'ww']) | ||
| @pytest.mark.parametrize("use_int_index", [True, False]) | ||
| @pytest.mark.parametrize("degree", [1, 2, 3]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice!
bchen1116
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me! The test coverage is solid.
angela97lin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, pretty darn cool stuff--the tests were great in understanding what's going on. Left some nitpicky comments per usual but 👍👍👍
| "\n", | ||
| "## Python 3.9 support\n", | ||
| "\n", | ||
| "Evalml can still be installed with pip in python 3.9 but note that `sktime`, one of our dependencies, will not be installed because that library does not yet support python 3.9. This means the ``PolynomialDetrending`` component will not be usable in python 3.9. You can try to install `sktime` [from source](https://www.sktime.org/en/latest/installation.html#building-from-source) in python 3.9 to use the ``PolynomialDetrending`` component but be warned that we only test it in python 3.7 and 3.8." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍👍
evalml/pipelines/components/transformers/preprocessing/polynomial_detrender.py
Show resolved
Hide resolved
| def test_polynomial_detrender_raises_value_error_target_is_none(ts_data): | ||
| X, y = ts_data | ||
|
|
||
| with pytest.raises(ValueError, match="y cannot be None for PolynomialDetrender!"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Realllly nit-picking now but do we want to add tests for fit/transform too for completeness :')
| if input_type == 'np': | ||
| X = X_input.values | ||
| y = y_input.values | ||
| elif input_type == 'ww': | ||
| X = ww.DataTable(X_input) | ||
| y = ww.DataColumn(y_input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use the make_data_type fixture for this I believe? hehe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tip! I didn't know about make_data_type! I'm not sure if it's worth it in this case because there is no pandas to numpy conversion and the woodwork conversion is straight-forward enough in this case. But I will keep it in mind in the future.
29b973d to
b9e825d
Compare
| category_encoders>=2.0.0 | ||
| statsmodels >= 0.12.2 | ||
| imbalanced-learn>=0.7.0 | ||
| sktime>=0.5.3;python_version<"3.9" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 yep good call putting this in requirements and note core-requirements, lgtm
|
|
||
| @pytest.fixture | ||
| def is_running_py_39_or_above(): | ||
| return sys.version_info >= (3, 9) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nifty!
| if module == 'sktime' and is_running_py_39_or_above: | ||
| with pytest.raises(ModuleNotFoundError): | ||
| import_module(module) | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. So the reason this doesn't appear in our coverage report is because we're currently only generating coverage on python 3.8. 👍
| detrender = detrend.Detrender(trend.PolynomialTrendForecaster(degree=degree)) | ||
|
|
||
| super().__init__(parameters=params, | ||
| component_obj=detrender, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we need to set component_obj for this component? It appears you're doing custom logic for fit/transform/fit_transform, and that was the main utility of passing component_obj down to the base class
dsherry
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚢 !
Pull Request Description
Fix #1944
After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of
docs/source/release_notes.rstto include this pull request by adding :pr:123.