Added parameter to InvalidTargetDataCheck to show only top unique values rather than all unique values #1485

Merged
angela97lin merged 6 commits into main from 1460_top_unique_values
Dec 3, 2020

angela97lin
Contributor

Closes #1460

@angela97lin angela97lin added this to the December 2020 milestone Dec 1, 2020
@angela97lin angela97lin self-assigned this Dec 1, 2020
@angela97lin angela97lin changed the title Update InvalidTargetDataCheck to show only top unique values rather than all unique values Added parameter to InvalidTargetDataCheck to show only top unique values rather than all unique values Dec 1, 2020
@angela97lin angela97lin changed the title Added parameter to InvalidTargetDataCheck to show only top unique values rather than all unique values Added parameter to InvalidTargetDataCheck to show only top unique values rather than all unique values Dec 1, 2020
codecov bot commented Dec 1, 2020

Codecov Report

Merging #1485 (5bed9de) into main (d05134f) will increase coverage by 0.1%.
The diff coverage is 100.0%.

@@            Coverage Diff            @@
##             main    #1485     +/-   ##
=========================================
+ Coverage   100.0%   100.0%   +0.1%     
=========================================
  Files         223      223             
  Lines       15135    15158     +23     
=========================================
+ Hits        15128    15151     +23     
  Misses          7        7             

| Impacted Files | Coverage Δ |
|---|---|
| evalml/data_checks/invalid_targets_data_check.py | 100.0% <100.0%> (ø) |
| ...ta_checks_tests/test_invalid_targets_data_check.py | 100.0% <100.0%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@freddyaboulton left a comment

@angela97lin I think this is good to merge! Thank you!

if self.n_unique is None:
    details = {"target_values": unique_values}
else:
    details = {"target_values": unique_values[:min(self.n_unique, len(unique_values))]}

Contributor

nit-pick: I think the min is redundant because if the end of the slice is larger than the list length, you'll get the whole list anyways.

Contributor Author

Ah, I think if self.n_unique > len(unique_values), you get an IndexError? Though I think I should have written len(unique_values)-1 on second thought.

Contributor

I don't think you'd get the index error when you slice a list. But it's a minor point and what you have now works and is clear!
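
For readers following the thread, here is a small sketch of the behavior under discussion. It uses a made-up three-element list and a made-up `n_unique` value rather than anything from evalml. It shows that a slice bound past the end of a list is clamped, that indexing a single out-of-range position is what raises `IndexError`, and that using `len(unique_values) - 1` as the bound would silently drop the last element.

```python
# Illustrative values only; these are not taken from evalml.
unique_values = ["a", "b", "c"]
n_unique = 10  # deliberately larger than len(unique_values)

# Slicing clamps out-of-range bounds, so this returns the whole list.
plain = unique_values[:n_unique]

# The explicit min() form from the diff gives the same result.
clamped = unique_values[:min(n_unique, len(unique_values))]

# Indexing (not slicing) an out-of-range position is what raises IndexError.
try:
    unique_values[n_unique]
    index_raised = False
except IndexError:
    index_raised = True

# Using len(unique_values) - 1 as the slice bound would drop the last element.
off_by_one = unique_values[:len(unique_values) - 1]
```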

Contributor Author

Woah, TIL slicing is magical and will just give you the whole list, thanks @freddyaboulton :')

(Will likely still keep this/something similar for clarity's sake though)
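
As a rough sketch of how the truncation discussed in this thread could look without `min`, the helper below mirrors the reviewed branch in a standalone function. The function name `target_value_details` and the positive-value check are assumptions made for illustration; they are not evalml's actual API or validation. The check is there because a negative bound would slice from the end of the list instead of raising.

```python
from typing import Any, Dict, List, Optional


def target_value_details(
    unique_values: List[Any], n_unique: Optional[int] = None
) -> Dict[str, List[Any]]:
    """Hypothetical helper mirroring the reviewed snippet; not evalml's API."""
    if n_unique is None:
        # No limit requested: report every unique value.
        return {"target_values": unique_values}
    if n_unique <= 0:
        # Assumed validation: a negative bound would otherwise drop items
        # from the end of the list rather than fail loudly.
        raise ValueError("n_unique must be a positive integer or None")
    # The slice bound is clamped to the list length, so no min() is needed.
    return {"target_values": unique_values[:n_unique]}
```

Calling `target_value_details([1, 2, 3], 100)` returns all three values, the same result the `min`-based version in the diff produces.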

docs/source/release_notes.rst (outdated, resolved)
@angela97lin angela97lin merged commit d1d2d52 into main Dec 3, 2020
@angela97lin angela97lin deleted the 1460_top_unique_values branch December 3, 2020 20:11
@dsherry dsherry mentioned this pull request Dec 29, 2020
Successfully merging this pull request may close these issues.

Show only top unique values rather than all unique values for InvalidTargetDataCheck
2 participants