Return list of DataCheckAction objects #3072

Closed
chukarsten opened this issue Nov 17, 2021 · 4 comments
Labels: documentation (improvements or additions to documentation), enhancement (an improvement to an existing feature)

Comments

@chukarsten
Collaborator

In #3050 we updated our docs to remove manual data cleaning from the suggested user workflow and replaced it with our DataCheckActions API. We should continue refining the use of DCA to make it more streamlined. Adding an option to return a list of DataCheckAction objects that can be fed directly to make_pipeline_from_actions would reduce the effort required of a user who wants to 1) see how their data is unhealthy, 2) see what they should do about it, and 3) do those things.

I propose that search_iterative() get a parameter like action_return_type="dict" that accepts either "object" or "dict" and controls how the "actions" key of the results dictionary is populated. This would save the user from having to run the conversion themselves in Python.
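To make the proposal concrete, here is a minimal, library-independent sketch of how such a switch could behave. The `Action` class, its fields, and the `format_actions` helper are hypothetical stand-ins for illustration, not EvalML's actual classes or the real search_iterative() signature.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class Action:
    """Hypothetical stand-in for a DataCheckAction object."""

    action_code: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        # Plain-dict form: easy to read and serialize.
        return {"code": self.action_code, "metadata": dict(self.metadata)}


def format_actions(
    actions: List[Action], action_return_type: str = "dict"
) -> List[Union[Action, Dict[str, Any]]]:
    """Return actions as objects or dicts, depending on action_return_type."""
    if action_return_type == "object":
        return list(actions)
    if action_return_type == "dict":
        return [action.to_dict() for action in actions]
    raise ValueError(
        f"action_return_type must be 'object' or 'dict', got {action_return_type!r}"
    )


actions = [Action("DROP_COL", {"columns": ["id"]})]
as_dicts = format_actions(actions)  # default keeps the dict form
as_objects = format_actions(actions, action_return_type="object")
```

Defaulting to "dict" would keep existing callers unaffected, while "object" would let the result be passed straight on without a manual conversion step.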

Also part of this story is updating the docs here to make it even simpler.

@chukarsten added the documentation and enhancement labels Nov 17, 2021
@angela97lin
Contributor

Love this!

What do you think about having the validate method on data checks return objects instead, controlled by a return_type parameter? We currently return dicts because those are easy to read and serialize. Would that be too big of a lift?

Also note that this will be affected by the DCA UX update work 😅

@chukarsten
Collaborator Author

@angela97lin does the work you've done in the last two weeks change this?

@dsherry
Contributor

dsherry commented Dec 2, 2021

Discussion notes

  • This will be fixed by @angela97lin's upcoming data check / actions work
  • When she files issues for that work, she'll mark this as blocked on that work
  • We'll keep this issue open to track verification

@angela97lin
Contributor

@chukarsten Closing this out. After the data check API refactor, we now have make_pipeline_from_data_check_output, which takes the output of a data checks run directly and creates a pipeline to address the actions.

https://evalml.alteryx.com/en/latest/user_guide/data_check_actions.html
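As a rough illustration of the idea (going straight from check output to cleaning steps), here is a library-independent sketch. The output layout, the action codes, the step names, and the `steps_from_check_output` helper are all assumptions made for this example; see the linked guide for the real output format and function signature.

```python
from typing import Any, Dict, List

# Hypothetical mapping from action codes to cleaning-step names.
STEP_FOR_CODE = {
    "DROP_COL": "drop_columns",
    "DROP_ROWS": "drop_rows",
}


def steps_from_check_output(output: List[Dict[str, Any]]) -> List[str]:
    """Collect the unique cleaning steps suggested by the check messages, in order."""
    steps: List[str] = []
    for message in output:
        # Messages without suggested actions contribute nothing.
        for option in message.get("action_options", []):
            step = STEP_FOR_CODE.get(option["code"])
            if step is not None and step not in steps:
                steps.append(step)
    return steps


# Assumed shape of a data checks run: a list of messages with optional actions.
check_output = [
    {"message": "Column 'id' is likely an ID column", "action_options": [{"code": "DROP_COL"}]},
    {"message": "Rows 3 and 7 are outliers", "action_options": [{"code": "DROP_ROWS"}, {"code": "DROP_COL"}]},
    {"message": "Target is balanced"},
]
steps = steps_from_check_output(check_output)
```

The point of the sketch is that the user never converts between dicts and objects by hand: the raw check output is the only input.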
