Skip to content

Statically set woodwork typing in tests#3697

Merged
eccabay merged 25 commits into
mainfrom
3651_ww_hardening
Sep 16, 2022
Merged

Statically set woodwork typing in tests#3697
eccabay merged 25 commits into
mainfrom
3651_ww_hardening

Conversation

@eccabay
Copy link
Copy Markdown
Contributor

@eccabay eccabay commented Sep 7, 2022

Closes #3651

This is not a woodwork 0.17.2 upgrade. The goal was simply to add explicit woodwork typing throughout the tests to make them more resilient to woodwork inference changes. The end result of this does have fewer tests failing in 0.17.2 since I used the upgrade as a check for general stability.
This does include some setting of woodwork types in components though, related to my previous work about being explicit about woodwork typing to reduce the number of times we re-infer within components (plus it makes tests pass 😁).

Changes in this PR:

  • Consolidation of the ts_data and get_ts_X_y test pytest fixtures into one fixture, named ts_data but with data from get_ts_X_y.
  • Updated the fixtures ts_data, imputer_test_data, X_y_binary, X_y_multi, X_y_regression, X_y_categorical_classification, and X_y_categorical_regression to explicitly set woodwork types.
  • Updated time series featurizer and data check test files to explicitly set woodwork types.
  • downcast_nullable_types now works with either a DataFrame or a Series as input data (instead of just dataframes), and there's now a test for it.
  • Removed a call to ww.init() in infer_feature_types in the case where we already have a valid schema for the input data. This has performance implications and there are performance test results for this branch.

Before vs after woodwork upgrade comparison:
Main/0.16.4 -> 0 failing tests
Main/0.17.2 -> 287 failing tests
Main /0.18.0 -> 291 failing tests

Branch/0.16.4 -> 0 failing tests
Branch/0.17.2 -> 23 failing tests
Branch/0.18.0 -> 22 failing tests

I apologize in advance for the length of this PR 😬

@codecov
Copy link
Copy Markdown

codecov Bot commented Sep 7, 2022

Codecov Report

Merging #3697 (56eb074) into main (3128e44) will decrease coverage by 0.1%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #3697     +/-   ##
=======================================
- Coverage   99.7%   99.7%   -0.0%     
=======================================
  Files        339     339             
  Lines      34465   34386     -79     
=======================================
- Hits       34338   34254     -84     
- Misses       127     132      +5     
Impacted Files Coverage Δ
...omponent_tests/test_stacked_ensemble_classifier.py 100.0% <ø> (ø)
...ansformers/preprocessing/time_series_featurizer.py 100.0% <100.0%> (ø)
evalml/pipelines/time_series_pipeline_base.py 100.0% <100.0%> (ø)
...ts/automl_tests/parallel_tests/test_automl_dask.py 96.1% <100.0%> (-0.2%) ⬇️
evalml/tests/automl_tests/test_automl.py 99.5% <100.0%> (-<0.1%) ⬇️
...ts/automl_tests/test_automl_iterative_algorithm.py 100.0% <100.0%> (ø)
.../automl_tests/test_automl_search_classification.py 96.4% <100.0%> (-<0.1%) ⬇️
...ests/automl_tests/test_automl_search_regression.py 95.0% <100.0%> (-<0.1%) ⬇️
...valml/tests/automl_tests/test_default_algorithm.py 100.0% <100.0%> (ø)
evalml/tests/automl_tests/test_engine_base.py 100.0% <100.0%> (ø)
... and 44 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@eccabay eccabay marked this pull request as ready for review September 7, 2022 17:19
Comment thread evalml/pipelines/components/transformers/imputers/time_series_imputer.py Outdated
@chukarsten chukarsten marked this pull request as draft September 13, 2022 17:30
@chukarsten
Copy link
Copy Markdown
Contributor

Moving back to draft

Comment on lines -261 to +263
logical_types={col: "Double" for col in cols_derived_from_categoricals},
logical_types={col: "Double" for col in lagged_features.columns},
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not confident in the correctness of this change, as to how we handle all types and if everything does in fact become a double here. If anyone knows better about this, please let me know.

@eccabay eccabay marked this pull request as ready for review September 13, 2022 19:38
Copy link
Copy Markdown
Contributor

@chukarsten chukarsten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job Becca. Really huge amount of work and very impressive that you lowered the sensitivity of the codebase to inference by so much!!

Comment on lines -4549 to -4565
if problem_type == ProblemTypes.TIME_SERIES_MULTICLASS:
X, y = ts_data_multi
elif problem_type == ProblemTypes.TIME_SERIES_BINARY:
X, y = ts_data_binary
else:
X, y = ts_data
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice refactoring 👍

ans["natural language col"] = pd.NA
X_df.iloc[-1, :] = ans
assert_frame_equal(result, X_df)
assert_frame_equal(result, X_df, check_dtype=False)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Classic

Copy link
Copy Markdown
Collaborator

@jeremyliweishih jeremyliweishih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on making the tests less reliant on inference. Just had a comment about nan behavior and down casting for series. Lmk what you think!



def test_test_downcast_nullable_types_can_handle_no_schema():
def test_downcast_nullable_types_series():
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to add the test case with NaNs as well?


X_bool_nullable_cols = X.ww.select("BooleanNullable")
X_int_nullable_cols = X.ww.select(["IntegerNullable", "AgeNullable"])
if isinstance(data, pd.Series):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the series contains null values should we still downcast into Boolean or Double? I think we should follow what we have and use ignore_null_cols to differentiate.

@eccabay eccabay merged commit fc982d7 into main Sep 16, 2022
@eccabay eccabay deleted the 3651_ww_hardening branch September 16, 2022 18:53
@chukarsten chukarsten mentioned this pull request Sep 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Audit the Unit Test Suite for Woodwork Inference Sensitivity

4 participants