Upgrade WW to 0.4.1 #2379
Codecov Report
@@ Coverage Diff @@
## main #2379 +/- ##
=======================================
+ Coverage 99.7% 99.7% +0.1%
=======================================
Files 283 283
Lines 25240 25244 +4
=======================================
+ Hits 25140 25144 +4
Misses 100 100
  X = X_df.copy()
  X.ww.init(logical_types={0: logical_type})
- except (ww.exceptions.TypeConversionError, TypeError):
+ except (ww.exceptions.TypeConversionError, TypeError, ValueError):
will be fixed by alteryx/woodwork#991
Thanks for filing!
  y = infer_feature_types(y)
- is_supported_type = y.ww.logical_type in numeric_and_boolean_ww + [
+ is_supported_type = type(y.ww.logical_type) in numeric_and_boolean_ww + [
I think this can be isinstance(y.ww.logical_type, tuple(numeric_and_boolean_ww)) if you'd like, here and in the other places where there is a similar check.
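A minimal sketch of the isinstance suggestion above. The classes are hypothetical stand-ins for Woodwork logical types so the snippet runs without Woodwork installed, and `is_supported` is an illustrative helper name, not something from the PR.

```python
# Hypothetical stand-ins for Woodwork logical types (assumption: the real
# classes behave like ordinary Python classes for isinstance purposes).
class LogicalType:
    pass


class Integer(LogicalType):
    pass


class Double(LogicalType):
    pass


class Boolean(LogicalType):
    pass


class Categorical(LogicalType):
    pass


numeric_and_boolean_ww = [Integer, Double, Boolean]


def is_supported(logical_type):
    """Return True if the instance belongs to one of the supported classes."""
    # isinstance accepts a tuple of classes, so the list is converted first.
    return isinstance(logical_type, tuple(numeric_and_boolean_ww))


# An instance is never a member of a list of classes, which is why the
# plain membership check stops working once logical types are instantiated.
old_style_check = Integer() in numeric_and_boolean_ww
```

With instantiated types, `is_supported(Integer())` holds while the old membership check does not, which is the gap the `type(...)` wrapping in the diff was closing.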
@tamargrey I think the other option is to use y.ww.logical_type.type_string and then update our lists, equality/member checks to use the type_strings. What are your thoughts? I think both do the same thing but using type_string may be backwards compatible?
@freddyaboulton I like that idea! It does seem like it would be backwards compatible, which is nice.
It's definitely different from how we've tended to compare logical types inside of Woodwork, so my one worry is that it's using the type_string for something that it wasn't intended for. I'm not sure that that's actually a problem, though.
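A rough sketch of the type_string alternative discussed above. The stand-in classes and their string values are assumptions made for illustration; whether Woodwork exposes `type_string` identically on classes and instances should be checked against the installed version.

```python
# Hypothetical stand-ins: each class carries a type_string attribute,
# mimicking the attribute mentioned in the thread. The string values
# here are assumed, not taken from Woodwork.
class Integer:
    type_string = "integer"


class Double:
    type_string = "double"


class Boolean:
    type_string = "boolean"


class Categorical:
    type_string = "categorical"


SUPPORTED_TYPE_STRINGS = {"integer", "double", "boolean"}


def is_supported(logical_type):
    """Compare by type_string instead of by class or instance identity."""
    # A class attribute is visible from both the class and its instances,
    # so in this sketch the same check covers both styles.
    return logical_type.type_string in SUPPORTED_TYPE_STRINGS
```

In this sketch the check gives the same answer for `Integer` and `Integer()`, which is the backwards-compatibility property the comment is hoping for.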
evalml/model_understanding/graphs.py
Outdated
  is_type = type(X.ww.logical_types[X.columns[feature]]) == ltype
  else:
- is_type = X.ww.logical_types[feature] == ltype
+ is_type = type(X.ww.logical_types[feature]) == ltype
These are also places where we can use isinstance, which I think is slightly better from a timing perspective than == checks.
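Timing aside, the two checks are not strictly equivalent: `type(x) == cls` only matches the exact class, while `isinstance` also accepts subclasses. The sketch below uses hypothetical classes (the subclass relationship is invented for illustration and is not a claim about Woodwork's type hierarchy).

```python
# Hypothetical classes for illustration only.
class Categorical:
    pass


class SpecialCategorical(Categorical):
    pass


ltype = Categorical
value = SpecialCategorical()

# Exact-class comparison ignores inheritance.
exact_match = type(value) == ltype

# isinstance walks the class hierarchy.
subclass_match = isinstance(value, ltype)
```

If any logical type in use subclasses another, switching between the two styles could change which branch is taken, so it may be worth confirming that before replacing the checks wholesale.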
freddyaboulton left a comment
@jeremyliweishih Thanks for this! I think it looks great. I have a question on the diff for _retain_woodwork_types... and I think it may be better to follow @tamargrey's suggestion to use isinstance, or my suggestion to use type_string, rather than wrapping everything in type. I'm ok with either of those options, but @tamargrey's opinion should take precedence over mine hehe.
  shap>=0.36.0
  texttable>=1.6.2
- woodwork==0.3.1
+ woodwork>=0.4.1
We can close PR #2373 right?
evalml/utils/woodwork_utils.py
Outdated
if ltypes_to_ignore is None:
    ltypes_to_ignore = []
old_logical_types = {
    k: v if inspect.isclass(v) else type(v) for k, v in old_logical_types.items()
Do we need isclass here? The types should always be instances from now on, right?
IIRC some tests were failing without this change but I'll double check this!
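For context, a small sketch of what the `inspect.isclass` guard in the diff does: it normalizes a mapping that may contain either classes or instances down to classes. The stand-in classes and the helper name `to_logical_type_classes` are hypothetical, not taken from evalml.

```python
import inspect


# Hypothetical stand-ins for logical types.
class Integer:
    pass


class Double:
    pass


def to_logical_type_classes(logical_types):
    """Map each column to a class, whether it was given a class or an instance."""
    return {
        column: ltype if inspect.isclass(ltype) else type(ltype)
        for column, ltype in logical_types.items()
    }


# A mixed mapping, as might occur if some callers still pass classes.
mixed = {"a": Integer, "b": Double()}
normalized = to_logical_type_classes(mixed)
```

If every caller is guaranteed to pass instances, the guard reduces to `type(ltype)`; it only matters while classes can still reach this code path, which may be what the failing tests were exercising.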
FYI I'll file a follow-up issue to use the new functionality added in this release:
@freddyaboulton can you also add using
Taking a slight calculated risk and merging this before the docs build completes. We have a lot queued right now. I hope #2404 will help us here! The build_conda_pkg failure is expected, due to the version mismatch with the latest branch in the conda feedstock repo.
Fixes #2327.
This PR mostly consists of fixing unit tests and functionality that deal with WW logical types, since they are now returned as instances rather than as classes.
Perf tests look good: no change in model performance, and only small changes in init time and fit time, all within a couple of percentage points. Ran with 50 iterations and 3 trials.
ww_upgrade_perf_results.zip