Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates #3788

Merged
merged 5 commits into from Oct 27, 2022
Merged

Updates #3788

merged 5 commits into from Oct 27, 2022

Conversation

chukarsten
Copy link
Collaborator

@chukarsten chukarsten commented Oct 27, 2022

Small updates to fix some integration tests.

@chukarsten chukarsten marked this pull request as ready for review October 27, 2022 05:04
@codecov
Copy link

codecov bot commented Oct 27, 2022

Codecov Report

Merging #3788 (9e4986a) into main (cc4f652) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #3788     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        343     343             
  Lines      35793   35797      +4     
=======================================
+ Hits       35656   35660      +4     
  Misses       137     137             
Impacted Files Coverage Δ
evalml/pipelines/time_series_pipeline_base.py 99.2% <100.0%> (+0.1%) ⬆️
...peline_tests/test_time_series_baseline_pipeline.py 100.0% <100.0%> (ø)
.../tests/pipeline_tests/test_time_series_pipeline.py 99.9% <100.0%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@@ -151,7 +151,9 @@ def _drop_time_index(self, X, y):
if self.should_drop_time_index and self.time_index in X.columns:
X_schema = X.ww.schema
y_schema = y.ww.schema
index_name = X.index.name
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed to preserve the index name of the index here. The resulting error I was seeing was complaining about the index of X being renamed to "date" when "date" was also a column, so the subsequent access of X[self.time_index] was coming up as ambiguous. All these KeyError(0)'s here are that manifestation of that error that is fixed by these two lines.

@@ -121,7 +121,7 @@ def _add_training_data_to_X_Y(self, X, y, X_train, y_train):
time_index = self.pipeline_params["time_index"]
freq = pd.infer_freq(X_train[time_index])
correct_range = pd.date_range(
start=X_train["date"].iloc[-1],
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was just a bug that failed a single integration test.


# TODO: Replace this with a better test that reproduces the issue with bike_sharing
# causing the name of the internal X to shift.
X_train_t = clf._drop_time_index(X_train, y_train)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can someone file an issue to get some better testing around this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filed #3790 3790

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants