Rename HighlyNullDataCheck to NullDataCheck and update data check to return impute action for non-highly null columns. #3197

Merged
angela97lin merged 105 commits into update_data_check_action_API from 3144_null_data_check on Jan 13, 2022

@angela97lin commented Jan 7, 2022
Contributor

Closes #3144.

  • Renames HighlyNullDataCheck to NullDataCheck
  • Previously, HighlyNullDataCheck only detected highly null columns (columns in which more than a threshold percentage of values are null) and returned a suggested action to drop them. This PR expands the data check to detect columns with any null values. For columns that have null values but are not considered highly null (to prevent overlap), the data check now returns a warning and suggests an imputation action.
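
The split described above can be sketched roughly as follows. This is a minimal, standard-library illustration and not the evalml implementation: `classify_null_columns`, the dict-of-lists data layout, the use of `None` for nulls, and the `0.95` threshold are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence


def classify_null_columns(
    columns: Dict[str, Sequence[Optional[float]]],
    pct_null_threshold: float = 0.95,  # assumed value, not taken from the PR
) -> Dict[str, List[str]]:
    """Split columns that contain nulls into 'drop' and 'impute' suggestions."""
    suggestions: Dict[str, List[str]] = {"drop": [], "impute": []}
    for name, values in columns.items():
        if len(values) == 0:
            continue  # nothing to measure in an empty column
        null_fraction = sum(v is None for v in values) / len(values)
        if null_fraction == 0:
            continue  # no nulls: no warning and no action
        if null_fraction > pct_null_threshold:
            # Highly null: suggest dropping the column.
            suggestions["drop"].append(name)
        else:
            # Some nulls, but below the threshold: suggest imputing instead.
            suggestions["impute"].append(name)
    return suggestions


example = {
    "all_null": [None, None, None, None, None],
    "lots_of_null": [None, None, None, None, 5],
    "no_null": [1, 2, 3, 4, 5],
}
print(classify_null_columns(example))
```

Because each column lands in at most one bucket, the drop and impute suggestions never overlap, which is the property the description calls out.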

Note: this might get unwieldy for wide datasets in which many columns have just one or two null values. More UX to think about?

Comment thread evalml/data_checks/null_data_check.py Outdated
if below_highly_null_cols:
    results["warnings"].append(
        DataCheckWarning(
            message="Columns {} have null values".format(

Contributor

Nit: would it make more sense to say "Column(s)" instead of "Columns"?

Contributor Author

Ah I was trying to make this consistent with the highly null message, but I'll just change both to use "Column(s)" instead 😂

Comment thread evalml/data_checks/null_data_check.py Outdated
Comment on lines +268 to +273
if col_in_df.ww.schema.is_numeric
else ["mode"],
"type": "category",
"default_value": "mean"
if col_in_df.ww.schema.is_numeric
else "mode",

Contributor

Our linter definitely made this formatting, but it's kind of hard to read. Does it look cleaner/clearer if you pull the if/else logic out of this block?

@angela97lin commented Jan 13, 2022
Contributor Author

Yes, I think so! 😁 Edit: okay, not the best... but it's trying lol
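
For illustration, pulling the conditional out of the dict literal might look like the sketch below. It is an assumption-laden example rather than the code that was merged: `build_impute_parameter_spec` and the `"categories"` key are hypothetical names, the numeric category list is invented because the excerpt does not show it, and `most_frequent` is used in place of `mode` to match the later outcome of this review.

```python
from typing import Any, Dict, List


def build_impute_parameter_spec(is_numeric: bool) -> Dict[str, Any]:
    """Build the parameter description for one column's impute strategy."""
    # Decide once, up front, instead of repeating the conditional for
    # every key inside the dict literal.
    categories: List[str]
    if is_numeric:
        categories = ["mean", "most_frequent"]  # hypothetical list
        default_value = "mean"
    else:
        categories = ["most_frequent"]
        default_value = "most_frequent"

    return {
        "categories": categories,  # hypothetical key name
        "type": "category",
        "default_value": default_value,
    }


print(build_impute_parameter_spec(is_numeric=True))
print(build_impute_parameter_spec(is_numeric=False))
```

With the branch resolved before the literal, a formatter has no multi-line conditional expressions to wrap, so the dict stays readable.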

  )
  y = pd.Series([1, 0, 0, 1, 1])
- data_check = HighlyNullDataCheck()
+ data_check = NullDataCheck()

Contributor

@angela97lin I think this test is showing there's some logic missing in this PR:

  • Shouldn't the pipeline have an imputer for lots_of_null?
  • The ActionOption from the NullDataCheck has mode as a valid parameter, but our imputers use most_frequent?

Contributor Author

@freddyaboulton Haha sharp eye! This happens because make_pipeline_from_actions doesn't know how to handle the imputation action yet. That's coming in this PR: #3237

Separated out mostly because this was getting large already and wanted to make these PRs easier to review :)

Contributor Author

For more clarity:

If we were to put a debugger in, actions would look something like:

    [
        {
            'code': 'DROP_COL',
            'data_check_name': 'NullDataCheck',
            'metadata': {'columns': ['all_null'], 'rows': None, 'parameters': {}}
        },
        {
            'code': 'IMPUTE_COL',
            'data_check_name': 'NullDataCheck',
            'metadata': {
                'columns': ['lots_of_null'],
                'rows': None,
                'is_target': False,
                'parameters': {'impute_strategies': {'lots_of_null': {'impute_strategy': 'mean'}}}
            }
        }
    ]

which has the imputation information we want!

It's make_pipeline_from_actions that won't actually convert that action into anything useful in the pipeline right now, since it's not sure what to do with an IMPUTE_COL action where is_target is False (we currently support just target imputation). It's coming up though!
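
To make the shape of these action dicts concrete, here is a rough standard-library sketch of a consumer that handles both `DROP_COL` and feature-column `IMPUTE_COL` actions. It is not `make_pipeline_from_actions` and builds no pipeline; `apply_actions`, `fill_value`, and the dict-of-lists data layout are hypothetical, and only the action structure is taken from the comment above.

```python
from collections import Counter
from statistics import mean
from typing import Any, Dict, List


def fill_value(observed: List[Any], strategy: str) -> Any:
    """Compute the replacement value for nulls from the non-null values."""
    if strategy == "mean":
        return mean(observed)
    if strategy == "most_frequent":
        return Counter(observed).most_common(1)[0][0]
    raise ValueError(f"Unsupported impute strategy: {strategy!r}")


def apply_actions(
    columns: Dict[str, List[Any]], actions: List[Dict[str, Any]]
) -> Dict[str, List[Any]]:
    """Return a copy of `columns` with drop and impute actions applied."""
    result = {name: list(values) for name, values in columns.items()}
    for action in actions:
        metadata = action["metadata"]
        if action["code"] == "DROP_COL":
            for col in metadata["columns"]:
                result.pop(col, None)
        elif action["code"] == "IMPUTE_COL" and not metadata.get("is_target", False):
            # Feature-column imputation: one strategy per column.
            strategies = metadata["parameters"]["impute_strategies"]
            for col in metadata["columns"]:
                observed = [v for v in result[col] if v is not None]
                replacement = fill_value(observed, strategies[col]["impute_strategy"])
                result[col] = [replacement if v is None else v for v in result[col]]
    return result


actions = [
    {"code": "DROP_COL", "data_check_name": "NullDataCheck",
     "metadata": {"columns": ["all_null"], "rows": None, "parameters": {}}},
    {"code": "IMPUTE_COL", "data_check_name": "NullDataCheck",
     "metadata": {"columns": ["lots_of_null"], "rows": None, "is_target": False,
                  "parameters": {"impute_strategies": {"lots_of_null": {"impute_strategy": "mean"}}}}},
]
data = {
    "all_null": [None, None, None, None],
    "lots_of_null": [None, 2.0, None, 4.0],
    "no_null": [1, 2, 3, 4],
}
print(apply_actions(data, actions))
```

Target imputation (`is_target` set to True) is deliberately left out of the sketch, since the thread says that path was already supported.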

Contributor

@angela97lin Thanks for the context! Yep, I agree the actions are returning the right thing (I'm still not sure about mode though). I just thought it was in scope to modify make_pipeline_from_actions in this PR. I'll wait for #3237!

Contributor Author

@freddyaboulton Oops totally missed the mode comment, you're right 😝 Let me update that to most_frequent!
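
A mismatch like `mode` versus `most_frequent` is the kind of thing a small validation step can catch early. The sketch below is only an illustration: `validate_impute_strategies` is a hypothetical helper, and the accepted set is an assumption modelled on the strategy names scikit-learn's `SimpleImputer` uses, not something confirmed by this thread.

```python
from typing import Dict

# Assumed set of strategy names the downstream imputer understands.
ACCEPTED_STRATEGIES = frozenset({"mean", "median", "most_frequent"})


def validate_impute_strategies(impute_strategies: Dict[str, Dict[str, str]]) -> None:
    """Raise ValueError if any column asks for a strategy the imputer lacks."""
    for column, options in impute_strategies.items():
        strategy = options["impute_strategy"]
        if strategy not in ACCEPTED_STRATEGIES:
            raise ValueError(
                f"Column {column!r}: unknown impute strategy {strategy!r}; "
                f"expected one of {sorted(ACCEPTED_STRATEGIES)}"
            )


# Passes silently: the name matches what the imputer expects.
validate_impute_strategies({"lots_of_null": {"impute_strategy": "most_frequent"}})

# Rejected: "mode" describes the same idea but is not an accepted name.
try:
    validate_impute_strategies({"lots_of_null": {"impute_strategy": "mode"}})
except ValueError as err:
    print(err)
```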

@freddyaboulton left a comment
Contributor

Looks good to me @angela97lin !

@ParthivNaresh left a comment
Contributor

Looks good to me!

@angela97lin merged commit 90d4f04 into update_data_check_action_API Jan 13, 2022
@angela97lin deleted the 3144_null_data_check branch January 13, 2022 18:48
angela97lin added a commit that referenced this pull request Jan 25, 2022
…d add functionality to suggest and take action on columns with null values (#3182)

* Update `validate()` API (#3142)

* init

* init

* start updating tests

* add validation code for option

* remove data check updates for now

* start to clean up tests and add validate_parameter tests

* revert highly null dc

* add in more valueerror test checking

* fix test and logic for column parameters

* init

* update some tests to new API

* fix more tests

* fix doc and sparsity test

* fix integration tests

* fix doctests

* fix doctests

* fix ts splitting test and impl

* fix more ts splitting tests

* fix doctest for ts data check

* release notes

* update target leakage data check docstring for consistency

* doctest linting and cleanup

* linting

* fix merging main issues

* Update `validate()` API to use `DataCheckActionOption` instead of `DataCheckAction` (#3152)

* init

* init

* start updating tests

* add validation code for option

* remove data check updates for now

* start to clean up tests and add validate_parameter tests

* revert highly null dc

* add in more valueerror test checking

* fix test and logic for column parameters

* init

* update some tests to new API

* fix more tests

* fix doc and sparsity test

* fix integration tests

* fix doctests

* init, update no variance

* update highly null dc

* fix id column dc

* update target leakage dc

* update sparsity dc

* outliers and uniqueness dc

* update target dis dc

* update invalid target data check

* fix doctests

* fix ts splitting test and impl

* fix dc validate and tests

* fix more ts splitting tests

* fix data check actions notebook

* update to remove columns to drop

* freeze

* remove rows to drop

* update dc tests

* update doctests

* fix integration tests

* revert requirements

* update parameters to set to empty dict

* fix tests

* fix doctests

* retrigger

* revert parameter for data check option

* fix data check option test

* fix more tests from updating default parameters

* release notes

* use empty instead of none

* release notes

* cleanup unnecessary code

* move logic to data check option class and rename

* update integration tests

* add tests and some cleanup

* fix tests

* fix pipeline util test

* remove unnecessary conditional

* fix naming, need to fix tests

* fix tests

* fix doctest

* add new enums

* add logic for enums

* update files to use enum

* add in testing for invalid enum

* fix doctest by updating to_dict impl

* linting

* try with different base

* oops revert yaml

* fix tests

* remove outdated code

* add more tests

* Rename `HighlyNullDataCheck` to `NullDataCheck` and update data check to return impute action for non-highly null columns. (#3197)

* init

* init

* start updating tests

* add validation code for option

* remove data check updates for now

* start to clean up tests and add validate_parameter tests

* revert highly null dc

* add in more valueerror test checking

* fix test and logic for column parameters

* init

* update some tests to new API

* fix more tests

* fix doc and sparsity test

* fix integration tests

* fix doctests

* init, update no variance

* update highly null dc

* fix id column dc

* update target leakage dc

* update sparsity dc

* outliers and uniqueness dc

* update target dis dc

* update invalid target data check

* fix doctests

* fix ts splitting test and impl

* fix dc validate and tests

* fix more ts splitting tests

* fix data check actions notebook

* update to remove columns to drop

* freeze

* remove rows to drop

* update dc tests

* update doctests

* fix integration tests

* revert requirements

* update parameters to set to empty dict

* fix tests

* fix doctests

* retrigger

* revert parameter for data check option

* fix data check option test

* fix more tests from updating default parameters

* release notes

* use empty instead of none

* release notes

* cleanup unnecessary code

* move logic to data check option class and rename

* init rename

* retrigger

* revert release nots

* add new logic for detecting null cols

* update integration tests

* add tests and some cleanup

* fix tests

* fix pipeline util test

* remove unnecessary conditional

* fix test and iteration for null data check

* fix naming, need to fix tests

* fix tests

* fix doctest

* add new enums

* add logic for enums

* update files to use enum

* add in testing for invalid enum

* fix doctest by updating to_dict impl

* linting

* try with different base

* oops revert yaml

* fix tests

* remove outdated code

* oops fix merge

* add more tests

* fix null data check tests

* update to use per column strategy and fix tests

* fix tests for data checks

* fix tests and doctests

* release notes

* fix release notes

* oops update action code

* oops fix test

* update wording of messages

* move logic out of dict

* update mode to most_frequent for impute strategies

* oops fix linting and doctests

* Flatten data check action ``validate`` API (#3244)

* init

* more cleanup

* begin to clean up tests

* fix more tests

* updating naming to action_options and fixing more tests

* fix no variance

* fix another test

* fixing automl tests

* oops actually fix automl tests

* fix the other automl tests

* fixing notebook

* fix data check tests

* fix doctests and docs

* integration tests and cleanup

* cleanup based on comments

* linting

* clean up notebook and tests

* Update `make_pipeline_from_actions` to handle null column imputation (#3237)

* init

* init

* start updating tests

* add validation code for option

* remove data check updates for now

* start to clean up tests and add validate_parameter tests

* revert highly null dc

* add in more valueerror test checking

* fix test and logic for column parameters

* init

* update some tests to new API

* fix more tests

* fix doc and sparsity test

* fix integration tests

* fix doctests

* init, update no variance

* update highly null dc

* fix id column dc

* update target leakage dc

* update sparsity dc

* outliers and uniqueness dc

* update target dis dc

* update invalid target data check

* fix doctests

* fix ts splitting test and impl

* fix dc validate and tests

* fix more ts splitting tests

* fix data check actions notebook

* update to remove columns to drop

* freeze

* remove rows to drop

* update dc tests

* update doctests

* fix integration tests

* revert requirements

* update parameters to set to empty dict

* fix tests

* fix doctests

* retrigger

* revert parameter for data check option

* fix data check option test

* fix more tests from updating default parameters

* release notes

* use empty instead of none

* release notes

* cleanup unnecessary code

* move logic to data check option class and rename

* init rename

* retrigger

* revert release nots

* add new logic for detecting null cols

* update integration tests

* add tests and some cleanup

* fix tests

* fix pipeline util test

* remove unnecessary conditional

* fix test and iteration for null data check

* fix naming, need to fix tests

* fix tests

* fix doctest

* add new enums

* add logic for enums

* update files to use enum

* add in testing for invalid enum

* fix doctest by updating to_dict impl

* linting

* try with different base

* oops revert yaml

* fix tests

* remove outdated code

* oops fix merge

* add more tests

* fix null data check tests

* update to use per column strategy and fix tests

* fix tests for data checks

* fix tests and doctests

* release notes

* fix release notes

* init

* init and fix testing

* fix integration test

* updates test

* clean up docs

* lint notebook

* fix tests with types

* linting

* release notes

* update wording

* update impl for natural language and datetimes, remove old tests

* fix tests

* fix doctest

* release notes

* minor cleanup

* update release notes

* remove impute all

* linting
@angela97lin mentioned this pull request Jan 25, 2022
angela97lin added a commit that referenced this pull request Jan 26, 2022
…3260)

* init

* release notes

* oops delete file

* oops adding line removal

Co-authored-by: chukarsten <64713315+chukarsten@users.noreply.github.com>
angela97lin added a commit that referenced this pull request Jan 26, 2022
* init

* lint and release notes

* move release note

* clean up notebook

Development

Successfully merging this pull request may close these issues.

Update HighlyNullDataCheck to detect when columns have any null values

4 participants