Provide details on how to fix error when ww schema is broken #2466

Merged
freddyaboulton merged 4 commits into main from 2353-raise-informative-ww-init-error-msg on Jul 7, 2021

@freddyaboulton
Contributor

Pull Request Description

Fixes #2353

This is the new stack trace, produced by the snippet below:

from evalml.pipelines import ComponentGraph
from evalml.demos import load_fraud
cg = ComponentGraph({"Imputer": ["Imputer"],
                     "DateTime": ["DateTime Featurization Component", "Imputer.x"],
                     "OneHot": ["One Hot Encoder", "DateTime.x"],
                     "TargetImputer": ["Target Imputer", "OneHot.x", "OneHot.y"],
                     "Logistic": ["Logistic Regression Classifier", "TargetImputer.x", "TargetImputer.y"]})
cg.instantiate({})
X, y = load_fraud(1000)
X["card_id"] = X["provider"].astype("category")
cg.fit(X, y)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-74772725e1f8> in <module>
      9 X, y = load_fraud(1000)
     10 X["card_id"] = X["provider"].astype("category")
---> 11 cg.fit(X, y)

~/sources/evalml/evalml/pipelines/component_graph.py in fit(self, X, y)
    194             y (pd.Series): The target training data of length [n_samples]
    195         """
--> 196         X = infer_feature_types(X)
    197         y = infer_feature_types(y)
    198         self._compute_features(self.compute_order, X, y, fit=True)

~/sources/evalml/evalml/utils/woodwork_utils.py in infer_feature_types(data, feature_types)
     75                     f"get rid of this message. This is a more detailed message about the mismatch: {ww_error}"
     76                 )
---> 77             raise ValueError(ww_error)
     78         data.ww.init(schema=data.ww.schema)
     79         return data

ValueError: Dataframe types are not consistent with logical types. This usually happens when a data transformation does not go through the ww accessor. Call df.ww.init() to get rid of this message. This is a more detailed message about the mismatch: dtype mismatch for column card_id between DataFrame dtype, category, and Integer dtype, int64
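The pattern behind the new message can be sketched without evalml or Woodwork installed. The example below is illustrative only: plain dictionaries stand in for the DataFrame dtypes and the Woodwork schema, and `find_dtype_mismatch` and `check_schema` are hypothetical helpers, not functions from either library. It shows the general idea of wrapping a terse low-level mismatch message in guidance on how to fix it.

```python
# Illustrative sketch only: plain dictionaries stand in for a DataFrame and
# its Woodwork schema. None of these names are evalml or Woodwork APIs.


def find_dtype_mismatch(actual_dtypes, schema_dtypes):
    """Return a low-level mismatch message, or None if the schema still holds."""
    for column, expected in schema_dtypes.items():
        actual = actual_dtypes.get(column)
        if actual != expected:
            return (
                f"dtype mismatch for column {column} between "
                f"DataFrame dtype, {actual}, and schema dtype, {expected}"
            )
    return None


def check_schema(actual_dtypes, schema_dtypes):
    """Raise a ValueError that explains how to repair a stale schema."""
    low_level_error = find_dtype_mismatch(actual_dtypes, schema_dtypes)
    if low_level_error is None:
        return
    if "dtype mismatch" in low_level_error:
        # Prepend the remediation advice and keep the original detail at the end.
        low_level_error = (
            "Dataframe types are not consistent with logical types. "
            "This usually happens when a data transformation does not go "
            "through the ww accessor. Call df.ww.init() to get rid of this "
            "message. This is a more detailed message about the mismatch: "
            f"{low_level_error}"
        )
    raise ValueError(low_level_error)


# The schema was recorded while card_id was int64; the column was later
# reassigned as a category without updating the schema.
schema = {"card_id": "int64", "amount": "float64"}
actual = {"card_id": "category", "amount": "float64"}

try:
    check_schema(actual, schema)
except ValueError as err:
    message = str(err)
    print(message)
```

In the reproduction above, the mismatch comes from assigning `X["card_id"]` directly on the DataFrame. As the message suggests, calling `X.ww.init()` after such an assignment should re-infer the schema from the current dtypes.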

@codecov

codecov bot commented Jul 2, 2021

Codecov Report

Merging #2466 (08a1084) into main (2ce178e) will increase coverage by 0.1%.
The diff coverage is 100.0%.

❗ Current head 08a1084 differs from pull request most recent head dd87381. Consider uploading reports for the commit dd87381 to get more accurate results

@@           Coverage Diff           @@
##            main   #2466     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        283     283             
  Lines      25566   25577     +11     
=======================================
+ Hits       25464   25475     +11     
  Misses       102     102             
Impacted Files                                          Coverage Δ
evalml/tests/utils_tests/test_woodwork_utils.py         100.0% <100.0%> (ø)
evalml/utils/woodwork_utils.py                          100.0% <100.0%> (ø)
evalml/objectives/fraud_cost.py                         100.0% <0.0%> (ø)
...alml/objectives/binary_classification_objective.py   100.0% <0.0%> (ø)
...alml/tests/objective_tests/test_fraud_detection.py   100.0% <0.0%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2ce178e...dd87381.

@freddyaboulton freddyaboulton changed the title from "Provide details on how to fix error when ww schema is broken by a data transformation" to "Provide details on how to fix error when ww schema is broken" Jul 2, 2021
@freddyaboulton freddyaboulton marked this pull request as ready for review July 6, 2021 15:36
Contributor

@bchen1116 bchen1116 left a comment

Nice fix! The message is clear!

Contributor

@angela97lin angela97lin left a comment

LGTM, thanks for doing this! 😁

@freddyaboulton freddyaboulton force-pushed the 2353-raise-informative-ww-init-error-msg branch from a65be86 to 0314145 Compare July 7, 2021 15:20
Contributor

@chukarsten chukarsten left a comment

Send it.

@freddyaboulton freddyaboulton force-pushed the 2353-raise-informative-ww-init-error-msg branch from 08a1084 to dd87381 Compare July 7, 2021 18:16
@freddyaboulton freddyaboulton merged commit aff4b75 into main Jul 7, 2021
@freddyaboulton freddyaboulton deleted the 2353-raise-informative-ww-init-error-msg branch July 7, 2021 18:44
@chukarsten chukarsten mentioned this pull request Jul 22, 2021

Development

Successfully merging this pull request may close these issues.

Update our Woodwork error when user has invalidated schema

4 participants