Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop calculating top_values for Double columns with integer values #1692

Merged
merged 3 commits into from May 2, 2023

Conversation

tamargrey
Copy link
Contributor

@@ -1369,7 +1369,8 @@ def test_describe_callback(describe_df, mock_callback):
assert mock_callback.total_elapsed_time > 0


def test_describe_dict_extra_stats(describe_df):
@pytest.mark.parametrize("use_age", [True, False])
def test_describe_dict_extra_stats(use_age, describe_df):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to make sure to cover age usage and confirm its behavior matches int and double usage, but I recognize that this may be a bit overkill. If anyone would rather I make a completely separate test that more minimally tests age vs integer usage, lmk.

@tamargrey tamargrey marked this pull request as ready for review May 1, 2023 15:24
@codecov
Copy link

codecov bot commented May 1, 2023

Codecov Report

Merging #1692 (25f3778) into main (360fd10) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1692   +/-   ##
=======================================
  Coverage   98.79%   98.79%           
=======================================
  Files          98       98           
  Lines       11805    11809    +4     
=======================================
+ Hits        11663    11667    +4     
  Misses        142      142           
Impacted Files Coverage Δ
woodwork/statistics_utils/_get_describe_dict.py 100.00% <100.00%> (ø)
woodwork/tests/accessor/test_statistics.py 100.00% <100.00%> (ø)

@gsheni gsheni requested a review from a team May 1, 2023 16:31
Copy link
Contributor

@bchen1116 bchen1116 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why the install for 3.11 is failing, but changes LGTM

Comment on lines +177 to +180
if isinstance(
column.logical_type,
(Age, AgeNullable, Integer, IntegerNullable),
):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not something we need in this PR, but might be nice if we had an is_integer_type property on the column schema to use instead of having to maintain a list of logical types here.

@gsheni gsheni merged commit a9f8652 into main May 2, 2023
37 checks passed
@gsheni gsheni deleted the remove-double-top-values branch May 2, 2023 19:17
@ParthivNaresh ParthivNaresh mentioned this pull request May 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove top_values calculation for Double columns with integer values
4 participants