-
Notifications
You must be signed in to change notification settings - Fork 89
Integrate Decomposition into AutoMLSearch #3781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…different classes. Moved the test to project the seasonal signal up to the parent Decomposer class. Moved the testing for the seasonal projection to the decomposer test module.
… to move that up to the base Decomposer class.
* Updated get_trend_df() to work out of sample. * Fixed transform() to work with in sample, but not spanning the sample. * Fixed inverse_transform to work with smaller than sample, in sample data.
…ple. Also updated test for transform to return same if y is None and moved that to parent class.
…etter reflect what's going on. Docstring changes.
…e seasonal sample to match the STLDecomposer.
…s and regression pipelines.
@@ -116,6 +117,16 @@ def _add_training_data_to_X_Y(self, X, y, X_train, y_train): | |||
self.gap, | |||
) | |||
|
|||
# Properly fill in the dates in the gap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a pre-existing bug where we fill in the gap between the training and testing data with the last row of training data, but that means creating an irregularly spaced time index.
X_schema = X.ww.schema | ||
y_schema = y.ww.schema | ||
X = X.set_index(X[self.time_index]) | ||
y = y.set_axis(X[self.time_index]) | ||
X.ww.init(schema=X_schema) | ||
y.ww.init(schema=y_schema) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels weird to be saving the time index in the data's indices when we're trying to drop it, but the decomposer needs the time index data to decompose successfully. If anyone has suggestions on a better or cleaner way to get around this I am very open to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we drop after fit or transform in the decomposer then? The problem is non-time-series native estimators will fail with a datetime column. Maybe we can amend the logic here to save the time index only when having the decomposer in the pipeline and then dropping the time index later?
Codecov Report
@@ Coverage Diff @@
## main #3781 +/- ##
=======================================
- Coverage 99.7% 99.7% -0.0%
=======================================
Files 343 343
Lines 35680 35754 +74
=======================================
+ Hits 35544 35617 +73
- Misses 136 137 +1
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to figure out when and where we're dropping the time index so that the pipelines don't fail for estimators that can't accept time indexes but everything else LGTM!
X_schema = X.ww.schema | ||
y_schema = y.ww.schema | ||
X = X.set_index(X[self.time_index]) | ||
y = y.set_axis(X[self.time_index]) | ||
X.ww.init(schema=X_schema) | ||
y.ww.init(schema=y_schema) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we drop after fit or transform in the decomposer then? The problem is non-time-series native estimators will fail with a datetime column. Maybe we can amend the logic here to save the time index only when having the decomposer in the pipeline and then dropping the time index later?
can we also add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm - we're good but think we should add to test_automl_supports_time_series_regression
still!
Closing in favor of #3785 |
Includes some other small code changes to allow integration to be successful!