Update validate() API #3142

Merged (33 commits) on Jan 5, 2022

@angela97lin (Contributor) commented Dec 10, 2021

Part of #3116. Note that this is not being merged to main but into a separate branch (update_data_check_action_API). It was separated out from #3152 to make it easier to review, and will be merged in afterwards to avoid multiple API changes.

This PR just updates the "actions" key returned by validate() to be a dictionary of the form "actions": {"action_list": [], "default_action": None}, rather than a list of actions.
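
To make the shape change concrete, here is a minimal sketch that uses plain dictionaries only. The `upgrade_actions` helper is hypothetical and not part of this PR or the library; it simply shows how a result in the old list form would map onto the new `action_list` / `default_action` form, assuming `default_action` stays `None` as it does throughout this PR.

```python
# Hypothetical helper (not part of the library): converts the old list-style
# "actions" value into the new dictionary-style value described in this PR.
def upgrade_actions(validate_output):
    """Return a copy of a validate()-style result with "actions" in the new shape."""
    actions = validate_output["actions"]
    upgraded = dict(validate_output)  # shallow copy so the input is left untouched
    if isinstance(actions, dict):
        # Already in the new shape; nothing to convert.
        return upgraded
    upgraded["actions"] = {"action_list": list(actions), "default_action": None}
    return upgraded


old_output = {"warnings": [], "errors": [], "actions": []}
new_output = upgrade_actions(old_output)
print(new_output["actions"])
```

Callers that previously iterated over `output["actions"]` would iterate over `output["actions"]["action_list"]` instead.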

@angela97lin self-assigned this Dec 10, 2021

codecov bot commented Dec 10, 2021

Codecov Report

Merging #3142 (6f974db) into update_data_check_action_API (1343a56) will not change coverage.
The diff coverage is 100.0%.

@@                     Coverage Diff                      @@
##           update_data_check_action_API   #3142   +/-   ##
============================================================
  Coverage                          99.7%   99.7%           
============================================================
  Files                               324     324           
  Lines                             31232   31232           
============================================================
  Hits                              31128   31128           
  Misses                              104     104           
Impacted Files Coverage Δ
...ta_checks_tests/test_class_imbalance_data_check.py 100.0% <ø> (ø)
...ta_checks_tests/test_datetime_format_data_check.py 100.0% <ø> (ø)
...s/data_checks_tests/test_highly_null_data_check.py 100.0% <ø> (ø)
...ts/data_checks_tests/test_id_columns_data_check.py 100.0% <ø> (ø)
..._checks_tests/test_multicollinearity_data_check.py 100.0% <ø> (ø)
...s/data_checks_tests/test_no_variance_data_check.py 100.0% <ø> (ø)
...ests/data_checks_tests/test_outliers_data_check.py 100.0% <ø> (ø)
...ests/data_checks_tests/test_sparsity_data_check.py 100.0% <ø> (ø)
...hecks_tests/test_target_distribution_data_check.py 100.0% <ø> (ø)
...ts/data_checks_tests/test_uniqueness_data_check.py 100.0% <ø> (ø)
... and 27 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1343a56...6f974db.

@angela97lin changed the base branch from main to update_data_check_action_API January 2, 2022 02:53
@angela97lin marked this pull request as ready for review January 3, 2022 15:49
@ParthivNaresh (Contributor) left a comment

Changes look great! I think going from data_check_output["actions"] to data_check_output["actions"]["action_list"] is another reason that we also need to look at a cleaner way to access warnings, errors, and actions from data checks for us and users, but that's another issue entirely. Great work!
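
One possible direction for the "cleaner way to access" idea is a thin wrapper over the returned dictionary. The sketch below is only an illustration: the `ValidationResult` class and its field names are hypothetical and do not exist in the library, and the type of `default_action` is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ValidationResult:
    """Hypothetical wrapper around the dictionary returned by validate()."""

    warnings: List[Dict[str, Any]] = field(default_factory=list)
    errors: List[Dict[str, Any]] = field(default_factory=list)
    action_list: List[Dict[str, Any]] = field(default_factory=list)
    default_action: Optional[Dict[str, Any]] = None  # assumed type; None in this PR

    @classmethod
    def from_dict(cls, output):
        """Build a result from the new-style dictionary, tolerating missing keys."""
        actions = output.get("actions", {})
        return cls(
            warnings=list(output.get("warnings", [])),
            errors=list(output.get("errors", [])),
            action_list=list(actions.get("action_list", [])),
            default_action=actions.get("default_action"),
        )


raw = {
    "warnings": [],
    "errors": [],
    "actions": {"action_list": [{"code": "IMPUTE_COL"}], "default_action": None},
}
result = ValidationResult.from_dict(raw)
print(result.action_list[0]["code"])
```

With a wrapper like this, `result.action_list` replaces `data_check_output["actions"]["action_list"]` at call sites.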

@@ -49,7 +49,7 @@ def test_search_data_check_error(
 pd.testing.assert_series_equal(target, infer_feature_types(y))


-def test_n_splits_passed_to_ts_splitting_data_check(ts_data):
+def test_n_splits_passed_to_ts_splitting_data_check():

Contributor:

noice

@bchen1116 (Contributor) left a comment

Left a nit that applies to multiple files, but otherwise LGTM!

Do we have a plan for filling in the default_action values? Currently, they're all None, and I'm curious 1) when their values are expected to change and 2) what values they'll hold. They don't seem too useful as of now.

@@ -42,7 +42,9 @@ def validate(self, X, y):
 ... "details": {"columns": None, "rows": None}
 ... }],
 ... "warnings": [],
-... "actions": []}
+... "actions": {"action_list":[], "default_action": None}}

Contributor:

Big nit: Sometimes there's a space between "action_list": and [] and sometimes there isn't. Can we standardize this to have the space? Occurs in multiple files.

Contributor Author:

sobs in wishing there were a doc linter
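
In the absence of a doc linter, a small script could at least flag the spacing inconsistency raised above. This is a rough sketch, not an existing tool: `find_missing_spaces` is a hypothetical name, and the regular expression only looks for a quoted key whose colon is immediately followed by a non-space character, so it may miss or over-report cases in real docstrings.

```python
import re

# Matches a closing quote followed by a colon and then a non-space character,
# e.g. "action_list":[] rather than "action_list": [].
MISSING_SPACE = re.compile(r'["\']:(?=\S)')


def find_missing_spaces(text):
    """Return (line_number, line) pairs where a key's colon lacks a trailing space."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if MISSING_SPACE.search(line):
            hits.append((number, line.strip()))
    return hits


docstring = '''
... "warnings": [],
... "actions": {"action_list":[], "default_action": None}}
'''
for number, line in find_missing_spaces(docstring):
    print(number, line)
```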

@eccabay (Contributor) left a comment

LGTM, just left some style nitpicks!

Out of curiosity, is there currently a plan laid out for passing useful information through the "default_action" entries?

Comment on lines 33 to 34
...
...

Contributor:

Are these necessary? I'm not sure I understand the details of doctest syntax.

Contributor Author:

Don't think so, removing!
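
For background on the doctest syntax in question: in a doctest, `>>>` starts a statement and each following `...` line continues that same statement, so a line that holds only `...` carries no code of its own. The sketch below is a standalone illustration; `describe` is a made-up function, not code from this PR.

```python
import doctest


def describe():
    """Return a small validation-style dictionary.

    Example:
        >>> describe() == {
        ...     "warnings": [],
        ...     "actions": {"action_list": [], "default_action": None},
        ... }
        True
    """
    return {"warnings": [], "actions": {"action_list": [], "default_action": None}}


# Parse and run only the example in describe's docstring.
parser = doctest.DocTestParser()
test = parser.get_doctest(describe.__doc__, {"describe": describe}, "describe", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results)
```

Because the dictionary literal here is compared with `==` rather than matched against printed output, spacing and quote style inside the `...` lines do not affect whether the doctest passes; they are purely a readability concern.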

@@ -115,12 +116,12 @@ def validate(self, X, y):
 ... 'num_null_rows': 2,
 ... 'pct_null_rows': 50.0},
 ... 'code': 'TARGET_HAS_NULL'}],
-... 'actions': [{'code': 'IMPUTE_COL',
+... 'actions': {"action_list": [{'code': 'IMPUTE_COL',

Contributor:

Mega nit: double quotes?

Contributor Author:

sobs in wishing there were a doc linter

Just went through and changed everything to double quotes :))

 ... 'data_check_name': 'InvalidTargetDataCheck',
 ... 'metadata': {'columns': None,
 ... 'rows': None,
 ... 'is_target': True,
-... 'impute_strategy': 'mean'}}]}
+... 'impute_strategy': 'mean'}}], "default_action": None}}

Contributor:

Another mega nit: can the "default_action" go on a new line for readability? (Same applies in a couple other places, but this is the most confusing one I think)

Contributor Author:

Done!

-... 'warnings': [{'message': "Columns 'leak', 'x' are 80.0% or more correlated with the target",
+... "warnings": [{'message': "Columns 'leak', 'x' are 80.0% or more correlated with the target",

Contributor:

I love that you fixed this, but now the other quotes are inconsistent 😭

Contributor Author:

You're right, I must have done a replace-all somewhere--I'll go back and try to fix the other places 😭

@angela97lin (Contributor Author):

@bchen1116 @eccabay Good question about the default action! As you're probably noticing, this PR still uses DataCheckActions as the output. The next step will be to update the internals to use DataCheckActionOption classes in #3152 and then add the default action, if needed. Just decided to break this down to make it easier to review, and then will merge the branch altogether :)

@angela97lin merged commit c939fa6 into update_data_check_action_API Jan 5, 2022
@angela97lin deleted the 3116_update_validate branch January 5, 2022 05:27
angela97lin added a commit that referenced this pull request Jan 25, 2022
…d add functionality to suggest and take action on columns with null values (#3182)

* Update `validate()` API (#3142)

* init

* init

* start updating tests

* add validation code for option

* remove data check updates for now

* start to clean up tests and add validate_parameter tests

* revert highly null dc

* add in more valueerror test checking

* fix test and logic for column parameters

* init

* update some tests to new API

* fix more tests

* fix doc and sparsity test

* fix integration tests

* fix doctests

* fix doctests

* fix ts splitting test and impl

* fix more ts splitting tests

* fix doctest for ts data check

* release notes

* update target leakage data check docstring for consistency

* doctest linting and cleanup

* linting

* fix merging main issues

* Update `validate()` API to use `DataCheckActionOption` instead of `DataCheckAction` (#3152)

* init

* init

* start updating tests

* add validation code for option

* remove data check updates for now

* start to clean up tests and add validate_parameter tests

* revert highly null dc

* add in more valueerror test checking

* fix test and logic for column parameters

* init

* update some tests to new API

* fix more tests

* fix doc and sparsity test

* fix integration tests

* fix doctests

* init, update no variance

* update highly null dc

* fix id column dc

* update target leakage dc

* update sparsity dc

* outliers and uniqueness dc

* update target dis dc

* update invalid target data check

* fix doctests

* fix ts splitting test and impl

* fix dc validate and tests

* fix more ts splitting tests

* fix data check actions notebook

* update to remove columns to drop

* freeze

* remove rows to drop

* update dc tests

* update doctests

* fix integration tests

* revert requirements

* update parameters to set to empty dict

* fix tests

* fix doctests

* retrigger

* revert parameter for data check option

* fix data check option test

* fix more tests from updating default parameters

* release notes

* use empty instead of none

* release notes

* cleanup unnecessary code

* move logic to data check option class and rename

* update integration tests

* add tests and some cleanup

* fix tests

* fix pipeline util test

* remove unnecessary conditional

* fix naming, need to fix tests

* fix tests

* fix doctest

* add new enums

* add logic for enums

* update files to use enum

* add in testing for invalid enum

* fix doctest by updating to_dict impl

* linting

* try with different base

* oops revert yaml

* fix tests

* remove outdated code

* add more tests

* Rename `HighlyNullDataCheck` to `NullDataCheck` and update data check to return impute action for non-highly null columns. (#3197)

* init

* init

* start updating tests

* add validation code for option

* remove data check updates for now

* start to clean up tests and add validate_parameter tests

* revert highly null dc

* add in more valueerror test checking

* fix test and logic for column parameters

* init

* update some tests to new API

* fix more tests

* fix doc and sparsity test

* fix integration tests

* fix doctests

* init, update no variance

* update highly null dc

* fix id column dc

* update target leakage dc

* update sparsity dc

* outliers and uniqueness dc

* update target dis dc

* update invalid target data check

* fix doctests

* fix ts splitting test and impl

* fix dc validate and tests

* fix more ts splitting tests

* fix data check actions notebook

* update to remove columns to drop

* freeze

* remove rows to drop

* update dc tests

* update doctests

* fix integration tests

* revert requirements

* update parameters to set to empty dict

* fix tests

* fix doctests

* retrigger

* revert parameter for data check option

* fix data check option test

* fix more tests from updating default parameters

* release notes

* use empty instead of none

* release notes

* cleanup unnecessary code

* move logic to data check option class and rename

* init rename

* retrigger

* revert release nots

* add new logic for detecting null cols

* update integration tests

* add tests and some cleanup

* fix tests

* fix pipeline util test

* remove unnecessary conditional

* fix test and iteration for null data check

* fix naming, need to fix tests

* fix tests

* fix doctest

* add new enums

* add logic for enums

* update files to use enum

* add in testing for invalid enum

* fix doctest by updating to_dict impl

* linting

* try with different base

* oops revert yaml

* fix tests

* remove outdated code

* oops fix merge

* add more tests

* fix null data check tests

* update to use per column strategy and fix tests

* fix tests for data checks

* fix tests and doctests

* release notes

* fix release notes

* oops update action code

* oops fix test

* update wording of messages

* move logic out of dict

* update mode to most_frequent for impute strategies

* oops fix linting and doctests

* Flatten data check action ``validate`` API (#3244)

* init

* more cleanup

* begin to clean up tests

* fix more tests

* updating naming to action_options and fixing more tests

* fix no variance

* fix another test

* fixing automl tests

* oops actually fix automl tests

* fix the other automl tests

* fixing notebook

* fix data check tests

* fix doctests and docs

* integration tests and cleanup

* cleanup based on comments

* linting

* clean up notebook and tests

* Update `make_pipeline_from_actions` to handle null column imputation (#3237)

* init

* init

* start updating tests

* add validation code for option

* remove data check updates for now

* start to clean up tests and add validate_parameter tests

* revert highly null dc

* add in more valueerror test checking

* fix test and logic for column parameters

* init

* update some tests to new API

* fix more tests

* fix doc and sparsity test

* fix integration tests

* fix doctests

* init, update no variance

* update highly null dc

* fix id column dc

* update target leakage dc

* update sparsity dc

* outliers and uniqueness dc

* update target dis dc

* update invalid target data check

* fix doctests

* fix ts splitting test and impl

* fix dc validate and tests

* fix more ts splitting tests

* fix data check actions notebook

* update to remove columns to drop

* freeze

* remove rows to drop

* update dc tests

* update doctests

* fix integration tests

* revert requirements

* update parameters to set to empty dict

* fix tests

* fix doctests

* retrigger

* revert parameter for data check option

* fix data check option test

* fix more tests from updating default parameters

* release notes

* use empty instead of none

* release notes

* cleanup unnecessary code

* move logic to data check option class and rename

* init rename

* retrigger

* revert release nots

* add new logic for detecting null cols

* update integration tests

* add tests and some cleanup

* fix tests

* fix pipeline util test

* remove unnecessary conditional

* fix test and iteration for null data check

* fix naming, need to fix tests

* fix tests

* fix doctest

* add new enums

* add logic for enums

* update files to use enum

* add in testing for invalid enum

* fix doctest by updating to_dict impl

* linting

* try with different base

* oops revert yaml

* fix tests

* remove outdated code

* oops fix merge

* add more tests

* fix null data check tests

* update to use per column strategy and fix tests

* fix tests for data checks

* fix tests and doctests

* release notes

* fix release notes

* init

* init and fix testing

* fix integration test

* updates test

* clean up docs

* lint notebook

* fix tests with types

* linting

* release notes

* update wording

* update impl for natural language and datetimes, remove old tests

* fix tests

* fix doctest

* release notes

* minor cleanup

* update release notes

* remove impute all

* linting
@angela97lin mentioned this pull request Jan 25, 2022
angela97lin added a commit that referenced this pull request Jan 26, 2022
…3260)

* Update `validate()` API (#3142)

* Update `validate()` API to use `DataCheckActionOption` instead of `DataCheckAction` (#3152)

* Rename `HighlyNullDataCheck` to `NullDataCheck` and update data check to return impute action for non-highly null columns. (#3197)

* Flatten data check action ``validate`` API (#3244)

* Update `make_pipeline_from_actions` to handle null column imputation (#3237)

* init

* release notes

* oops delete file

* oops adding line removal

Co-authored-by: chukarsten <64713315+chukarsten@users.noreply.github.com>
angela97lin added a commit that referenced this pull request Jan 26, 2022
* Update `validate()` API (#3142)

* Update `validate()` API to use `DataCheckActionOption` instead of `DataCheckAction` (#3152)

* Rename `HighlyNullDataCheck` to `NullDataCheck` and update data check to return impute action for non-highly null columns. (#3197)

* Flatten data check action ``validate`` API (#3244)

* Update `make_pipeline_from_actions` to handle null column imputation (#3237)

* init

* lint and release notes

* move release note

* clean up notebook