Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update data checks to return DataCheckResults object #1444
Update data checks to return DataCheckResults object #1444
Changes from 18 commits
f560617
b6a66ac
29284c1
f50c267
86b92ad
a3e4b8f
8031cc2
ad8a15e
455dfdc
cbd9f42
e4e4b78
aa101da
61489de
82b3c3a
ac7181f
86273fe
2b8e910
c4f12ab
b08447c
677edf1
2a801f0
1038edd
96ceaef
bc49716
a30bd71
aaf52ea
76aaaff
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the downside of using
DataCheckMessageType
as the keys is that the user needs to import the enum in order to look at the warnings and results. I think that can be annoying, especially if the user is running automl and trying to accessdata_checks_results
:DataCheckResults
class.The benefits of 1 is thatL
if self._data_check_results[DataCheckMessageType.ERROR]
. The nice thing about this is that if we ever add data to the dict, theis_empty
wouldn't break users code downstream but the dict would because they'd be checkingif results == {DataCheckMessageType.WARNING: [], DataCheckMessageType.ERROR: []}
The downside is that it's another class but our docs already show how to get data from the data checks results.
The benefit of 2 is that it'd be a small change but would definitely lead to typos lol.
My vote is for 1 I think it'd be fine to keep it as-is for now too. I saw that you left
to use DataCheckMessageType as key?
in the PR description so wanted to offer my thoughts lol.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooo I like this suggestion a lot. I agree, the reason why I left the comment about using
DataCheckMessageType
as key was because as I was updating the code, I too felt the inconvenience / frustration of having to import the enum everywhere, but as @dsherry had mentioned, since we already have the enums in place we might as well use them.That being said, I like the idea of creating a separate
DataCheckResults
class, and havingwarnings
anderrors
as attributes 🤔 That way, the user doesn't need to directly type in the keys as strings, and any typos will result in anAttributeError
instead.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@freddyaboulton @angela97lin sure I'm on board with having a
DataCheckResults
class! This would make it easy to access the errors and warnings. We could also define ato_json
method which returns native python types instead ofDataCheckError
/DataCheckWarning
instances.Is your proposal that we do that instead of merging this PR?
UPDATE: we discussed this, further changes tracked by #1430.