Flatten data check action validate API #3244
Merged
angela97lin merged 19 commits into update_data_check_action_API on Jan 18, 2022
Codecov Report
@@              Coverage Diff              @@
##   update_data_check_action_API   #3244   +/-   ##
=================================================
- Coverage    99.8%    99.8%    -0.0%
=================================================
  Files         326      326
  Lines       31589    31586       -3
=================================================
- Hits        31498    31495       -3
  Misses         91       91
eccabay (Contributor) reviewed on Jan 18, 2022:
I'm a big fan of this simplification!
freddyaboulton (Contributor) approved these changes on Jan 18, 2022:
@angela97lin I agree the new API format is easier to understand because it treats actions/warnings/errors the same. If the team is OK with the breaking change, then this looks good to me!
ParthivNaresh (Contributor) approved these changes on Jan 18, 2022:
Awesome, I love it!
angela97lin added a commit that referenced this pull request on Jan 25, 2022:
…d add functionality to suggest and take action on columns with null values (#3182) * Update `validate()` API (#3142) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * fix doctests * fix ts splitting test and impl * fix more ts splitting tests * fix doctest for ts data check * release notes * update target leakage data check docstring for consistency * doctest linting and cleanup * linting * fix merging main issues * Update `validate()` API to use `DataCheckActionOption` instead of `DataCheckAction` (#3152) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * init, update no variance * update highly null dc * fix id column dc * update target leakage dc * update sparsity dc * outliers and uniqueness dc * update target dis dc * update invalid target data check * fix doctests * fix ts splitting test and impl * fix dc validate and tests * fix more ts splitting tests * fix data check actions notebook * update to remove columns to drop * freeze * remove rows to drop * update dc tests * update doctests * fix integration tests * revert requirements * update parameters to set to empty dict * fix tests * fix doctests * retrigger * revert parameter for data check option * fix data check option test * fix more tests from updating default parameters * release notes * use empty instead of none * release 
notes * cleanup unnecessary code * move logic to data check option class and rename * update integration tests * add tests and some cleanup * fix tests * fix pipeline util test * remove unnecessary conditional * fix naming, need to fix tests * fix tests * fix doctest * add new enums * add logic for enums * update files to use enum * add in testing for invalid enum * fix doctest by updating to_dict impl * linting * try with different base * oops revert yaml * fix tests * remove outdated code * add more tests * Rename `HighlyNullDataCheck` to `NullDataCheck` and update data check to return impute action for non-highly null columns. (#3197) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * init, update no variance * update highly null dc * fix id column dc * update target leakage dc * update sparsity dc * outliers and uniqueness dc * update target dis dc * update invalid target data check * fix doctests * fix ts splitting test and impl * fix dc validate and tests * fix more ts splitting tests * fix data check actions notebook * update to remove columns to drop * freeze * remove rows to drop * update dc tests * update doctests * fix integration tests * revert requirements * update parameters to set to empty dict * fix tests * fix doctests * retrigger * revert parameter for data check option * fix data check option test * fix more tests from updating default parameters * release notes * use empty instead of none * release notes * cleanup unnecessary code * move logic to data check option class and rename * init rename * retrigger * revert release nots * add new logic for detecting null cols * update integration tests * add tests 
and some cleanup * fix tests * fix pipeline util test * remove unnecessary conditional * fix test and iteration for null data check * fix naming, need to fix tests * fix tests * fix doctest * add new enums * add logic for enums * update files to use enum * add in testing for invalid enum * fix doctest by updating to_dict impl * linting * try with different base * oops revert yaml * fix tests * remove outdated code * oops fix merge * add more tests * fix null data check tests * update to use per column strategy and fix tests * fix tests for data checks * fix tests and doctests * release notes * fix release notes * oops update action code * oops fix test * update wording of messages * move logic out of dict * update mode to most_frequent for impute strategies * oops fix linting and doctests * Flatten data check action ``validate`` API (#3244) * init * more cleanup * begin to clean up tests * fix more tests * updating naming to action_options and fixing more tests * fix no variance * fix another test * fixing automl tests * oops actually fix automl tests * fix the other automl tests * fixing notebook * fix data check tests * fix doctests and docs * integration tests and cleanup * cleanup based on comments * linting * clean up notebook and tests * Update `make_pipeline_from_actions` to handle null column imputation (#3237) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * init, update no variance * update highly null dc * fix id column dc * update target leakage dc * update sparsity dc * outliers and uniqueness dc * update target dis dc * update invalid target data check * fix doctests * fix ts splitting test and impl * fix dc 
validate and tests * fix more ts splitting tests * fix data check actions notebook * update to remove columns to drop * freeze * remove rows to drop * update dc tests * update doctests * fix integration tests * revert requirements * update parameters to set to empty dict * fix tests * fix doctests * retrigger * revert parameter for data check option * fix data check option test * fix more tests from updating default parameters * release notes * use empty instead of none * release notes * cleanup unnecessary code * move logic to data check option class and rename * init rename * retrigger * revert release nots * add new logic for detecting null cols * update integration tests * add tests and some cleanup * fix tests * fix pipeline util test * remove unnecessary conditional * fix test and iteration for null data check * fix naming, need to fix tests * fix tests * fix doctest * add new enums * add logic for enums * update files to use enum * add in testing for invalid enum * fix doctest by updating to_dict impl * linting * try with different base * oops revert yaml * fix tests * remove outdated code * oops fix merge * add more tests * fix null data check tests * update to use per column strategy and fix tests * fix tests for data checks * fix tests and doctests * release notes * fix release notes * init * init and fix testing * fix integration test * updates test * clean up docs * lint notebook * fix tests with types * linting * release notes * update wording * update impl for natural language and datetimes, remove old tests * fix tests * fix doctest * release notes * minor cleanup * update release notes * remove impute all * linting
angela97lin added a commit that referenced this pull request on Jan 26, 2022:
…3260) * Update `validate()` API (#3142) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * fix doctests * fix ts splitting test and impl * fix more ts splitting tests * fix doctest for ts data check * release notes * update target leakage data check docstring for consistency * doctest linting and cleanup * linting * fix merging main issues * Update `validate()` API to use `DataCheckActionOption` instead of `DataCheckAction` (#3152) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * init, update no variance * update highly null dc * fix id column dc * update target leakage dc * update sparsity dc * outliers and uniqueness dc * update target dis dc * update invalid target data check * fix doctests * fix ts splitting test and impl * fix dc validate and tests * fix more ts splitting tests * fix data check actions notebook * update to remove columns to drop * freeze * remove rows to drop * update dc tests * update doctests * fix integration tests * revert requirements * update parameters to set to empty dict * fix tests * fix doctests * retrigger * revert parameter for data check option * fix data check option test * fix more tests from updating default parameters * release notes * use empty instead of none * release notes * cleanup unnecessary code * move logic to data check option class and 
rename * update integration tests * add tests and some cleanup * fix tests * fix pipeline util test * remove unnecessary conditional * fix naming, need to fix tests * fix tests * fix doctest * add new enums * add logic for enums * update files to use enum * add in testing for invalid enum * fix doctest by updating to_dict impl * linting * try with different base * oops revert yaml * fix tests * remove outdated code * add more tests * Rename `HighlyNullDataCheck` to `NullDataCheck` and update data check to return impute action for non-highly null columns. (#3197) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * init, update no variance * update highly null dc * fix id column dc * update target leakage dc * update sparsity dc * outliers and uniqueness dc * update target dis dc * update invalid target data check * fix doctests * fix ts splitting test and impl * fix dc validate and tests * fix more ts splitting tests * fix data check actions notebook * update to remove columns to drop * freeze * remove rows to drop * update dc tests * update doctests * fix integration tests * revert requirements * update parameters to set to empty dict * fix tests * fix doctests * retrigger * revert parameter for data check option * fix data check option test * fix more tests from updating default parameters * release notes * use empty instead of none * release notes * cleanup unnecessary code * move logic to data check option class and rename * init rename * retrigger * revert release nots * add new logic for detecting null cols * update integration tests * add tests and some cleanup * fix tests * fix pipeline util test * remove unnecessary 
conditional * fix test and iteration for null data check * fix naming, need to fix tests * fix tests * fix doctest * add new enums * add logic for enums * update files to use enum * add in testing for invalid enum * fix doctest by updating to_dict impl * linting * try with different base * oops revert yaml * fix tests * remove outdated code * oops fix merge * add more tests * fix null data check tests * update to use per column strategy and fix tests * fix tests for data checks * fix tests and doctests * release notes * fix release notes * oops update action code * oops fix test * update wording of messages * move logic out of dict * update mode to most_frequent for impute strategies * oops fix linting and doctests * Flatten data check action ``validate`` API (#3244) * init * more cleanup * begin to clean up tests * fix more tests * updating naming to action_options and fixing more tests * fix no variance * fix another test * fixing automl tests * oops actually fix automl tests * fix the other automl tests * fixing notebook * fix data check tests * fix doctests and docs * integration tests and cleanup * cleanup based on comments * linting * clean up notebook and tests * Update `make_pipeline_from_actions` to handle null column imputation (#3237) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * init, update no variance * update highly null dc * fix id column dc * update target leakage dc * update sparsity dc * outliers and uniqueness dc * update target dis dc * update invalid target data check * fix doctests * fix ts splitting test and impl * fix dc validate and tests * fix more ts splitting tests * fix data check actions 
notebook * update to remove columns to drop * freeze * remove rows to drop * update dc tests * update doctests * fix integration tests * revert requirements * update parameters to set to empty dict * fix tests * fix doctests * retrigger * revert parameter for data check option * fix data check option test * fix more tests from updating default parameters * release notes * use empty instead of none * release notes * cleanup unnecessary code * move logic to data check option class and rename * init rename * retrigger * revert release nots * add new logic for detecting null cols * update integration tests * add tests and some cleanup * fix tests * fix pipeline util test * remove unnecessary conditional * fix test and iteration for null data check * fix naming, need to fix tests * fix tests * fix doctest * add new enums * add logic for enums * update files to use enum * add in testing for invalid enum * fix doctest by updating to_dict impl * linting * try with different base * oops revert yaml * fix tests * remove outdated code * oops fix merge * add more tests * fix null data check tests * update to use per column strategy and fix tests * fix tests for data checks * fix tests and doctests * release notes * fix release notes * init * init and fix testing * fix integration test * updates test * clean up docs * lint notebook * fix tests with types * linting * release notes * update wording * update impl for natural language and datetimes, remove old tests * fix tests * fix doctest * release notes * minor cleanup * init * release notes * oops delete file * oops adding line removal Co-authored-by: chukarsten <64713315+chukarsten@users.noreply.github.com>
angela97lin added a commit that referenced this pull request on Jan 26, 2022:
* Update `validate()` API (#3142) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * fix doctests * fix ts splitting test and impl * fix more ts splitting tests * fix doctest for ts data check * release notes * update target leakage data check docstring for consistency * doctest linting and cleanup * linting * fix merging main issues * Update `validate()` API to use `DataCheckActionOption` instead of `DataCheckAction` (#3152) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * init, update no variance * update highly null dc * fix id column dc * update target leakage dc * update sparsity dc * outliers and uniqueness dc * update target dis dc * update invalid target data check * fix doctests * fix ts splitting test and impl * fix dc validate and tests * fix more ts splitting tests * fix data check actions notebook * update to remove columns to drop * freeze * remove rows to drop * update dc tests * update doctests * fix integration tests * revert requirements * update parameters to set to empty dict * fix tests * fix doctests * retrigger * revert parameter for data check option * fix data check option test * fix more tests from updating default parameters * release notes * use empty instead of none * release notes * cleanup unnecessary code * move logic to data check option class and rename * 
update integration tests * add tests and some cleanup * fix tests * fix pipeline util test * remove unnecessary conditional * fix naming, need to fix tests * fix tests * fix doctest * add new enums * add logic for enums * update files to use enum * add in testing for invalid enum * fix doctest by updating to_dict impl * linting * try with different base * oops revert yaml * fix tests * remove outdated code * add more tests * Rename `HighlyNullDataCheck` to `NullDataCheck` and update data check to return impute action for non-highly null columns. (#3197) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * init, update no variance * update highly null dc * fix id column dc * update target leakage dc * update sparsity dc * outliers and uniqueness dc * update target dis dc * update invalid target data check * fix doctests * fix ts splitting test and impl * fix dc validate and tests * fix more ts splitting tests * fix data check actions notebook * update to remove columns to drop * freeze * remove rows to drop * update dc tests * update doctests * fix integration tests * revert requirements * update parameters to set to empty dict * fix tests * fix doctests * retrigger * revert parameter for data check option * fix data check option test * fix more tests from updating default parameters * release notes * use empty instead of none * release notes * cleanup unnecessary code * move logic to data check option class and rename * init rename * retrigger * revert release nots * add new logic for detecting null cols * update integration tests * add tests and some cleanup * fix tests * fix pipeline util test * remove unnecessary conditional * 
fix test and iteration for null data check * fix naming, need to fix tests * fix tests * fix doctest * add new enums * add logic for enums * update files to use enum * add in testing for invalid enum * fix doctest by updating to_dict impl * linting * try with different base * oops revert yaml * fix tests * remove outdated code * oops fix merge * add more tests * fix null data check tests * update to use per column strategy and fix tests * fix tests for data checks * fix tests and doctests * release notes * fix release notes * oops update action code * oops fix test * update wording of messages * move logic out of dict * update mode to most_frequent for impute strategies * oops fix linting and doctests * Flatten data check action ``validate`` API (#3244) * init * more cleanup * begin to clean up tests * fix more tests * updating naming to action_options and fixing more tests * fix no variance * fix another test * fixing automl tests * oops actually fix automl tests * fix the other automl tests * fixing notebook * fix data check tests * fix doctests and docs * integration tests and cleanup * cleanup based on comments * linting * clean up notebook and tests * Update `make_pipeline_from_actions` to handle null column imputation (#3237) * init * init * start updating tests * add validation code for option * remove data check updates for now * start to clean up tests and add validate_parameter tests * revert highly null dc * add in more valueerror test checking * fix test and logic for column parameters * init * update some tests to new API * fix more tests * fix doc and sparsity test * fix integration tests * fix doctests * init, update no variance * update highly null dc * fix id column dc * update target leakage dc * update sparsity dc * outliers and uniqueness dc * update target dis dc * update invalid target data check * fix doctests * fix ts splitting test and impl * fix dc validate and tests * fix more ts splitting tests * fix data check actions notebook * update 
to remove columns to drop * freeze * remove rows to drop * update dc tests * update doctests * fix integration tests * revert requirements * update parameters to set to empty dict * fix tests * fix doctests * retrigger * revert parameter for data check option * fix data check option test * fix more tests from updating default parameters * release notes * use empty instead of none * release notes * cleanup unnecessary code * move logic to data check option class and rename * init rename * retrigger * revert release nots * add new logic for detecting null cols * update integration tests * add tests and some cleanup * fix tests * fix pipeline util test * remove unnecessary conditional * fix test and iteration for null data check * fix naming, need to fix tests * fix tests * fix doctest * add new enums * add logic for enums * update files to use enum * add in testing for invalid enum * fix doctest by updating to_dict impl * linting * try with different base * oops revert yaml * fix tests * remove outdated code * oops fix merge * add more tests * fix null data check tests * update to use per column strategy and fix tests * fix tests for data checks * fix tests and doctests * release notes * fix release notes * init * init and fix testing * fix integration test * updates test * clean up docs * lint notebook * fix tests with types * linting * release notes * update wording * update impl for natural language and datetimes, remove old tests * fix tests * fix doctest * release notes * minor cleanup * init * lint and release notes * move release note * clean up notebook
Closes #3243. This is being merged into my DC branch, not main!
Lots of lines touched but not much logic.
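For readers unfamiliar with the change, here is a minimal sketch of what "flattening" a validation result could look like. It is purely illustrative and not the library's actual code: the function name `flatten_results`, the nested input layout, and every dict key except `action_options` (a name that appears in the commit log) are assumptions made for this example.

```python
from typing import Any, Dict, List


def flatten_results(nested: Dict[str, List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Merge separate warning/error/action lists into one flat list.

    Hypothetical helper: each output entry carries its own "level" and the
    "action_options" that belong to the same (assumed) "data_check_name".
    """
    flat = []
    for level, key in (("warning", "warnings"), ("error", "errors")):
        for message in nested.get(key, []):
            entry = dict(message)  # copy so the input is not mutated
            entry["level"] = level
            # Attach the actions raised by the same data check, if any.
            entry["action_options"] = [
                action
                for action in nested.get("actions", [])
                if action.get("data_check_name") == message.get("data_check_name")
            ]
            flat.append(entry)
    return flat


# Assumed "nested" layout: three parallel top-level lists.
nested = {
    "warnings": [{"message": "Column 'a' is mostly null", "data_check_name": "NullDataCheck"}],
    "errors": [{"message": "Target has no variance", "data_check_name": "NoVarianceDataCheck"}],
    "actions": [{"action": "drop_col", "columns": ["a"], "data_check_name": "NullDataCheck"}],
}

flat = flatten_results(nested)
for entry in flat:
    print(entry["level"], entry["message"], entry["action_options"])
```

With a flat list like this, a caller can handle warnings, errors, and their suggested actions in a single loop instead of cross-referencing three separate lists, which is the kind of simplification the reviewers describe.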