BUG: describe does not distinguish between Int64 and int64 #52576

Open
3 tasks done
phofl opened this issue Apr 10, 2023 · 2 comments
Labels
Bug ExtensionArray Extending pandas with custom dtypes or arrays. NA - MaskedArrays Related to pd.NA and nullable extension arrays Reduction Operations sum, mean, min, max, etc.

Comments

@phofl
Member

phofl commented Apr 10, 2023

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

df = pd.DataFrame({"A": pd.Series([1, 2, 3], dtype="Int64"), "B": pd.Series([1, 2, 3], dtype="int64")})
df.describe(include="int64")
         A    B
count  3.0  3.0
mean   2.0  2.0
std    1.0  1.0
min    1.0  1.0
25%    1.5  1.5
50%    2.0  2.0
75%    2.5  2.5
max    3.0  3.0


### Issue Description

``describe(include="int64")`` should only include ``B`` (NumPy ``int64``), not both ``A`` (nullable ``Int64``) and ``B``.

### Expected Behavior

Column ``A`` should not appear in the result.
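
Until this is resolved, one possible workaround is to pick the columns by an explicit dtype check and describe only those. This is a minimal sketch, not the pandas fix; the variable name `numpy_int64_cols` is made up for illustration, and the `isinstance(..., np.dtype)` test is an assumption about how to tell NumPy-backed columns from extension-dtype ones.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "A": pd.Series([1, 2, 3], dtype="Int64"),  # nullable extension dtype
        "B": pd.Series([1, 2, 3], dtype="int64"),  # plain NumPy dtype
    }
)

# Keep only columns whose dtype is a real NumPy dtype equal to int64.
# Extension dtypes such as Int64 are not np.dtype instances, so they are skipped.
numpy_int64_cols = [
    col
    for col in df.columns
    if isinstance(df[col].dtype, np.dtype) and df[col].dtype == np.dtype("int64")
]

result = df[numpy_int64_cols].describe()
print(result)
```

Because the columns are filtered before `describe` is called, the result does not depend on how `include=` treats nullable dtypes.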

### Installed Versions

<details>

main
</details>
@phofl phofl added Bug Needs Triage Issue that has not been reviewed by a pandas team member Reduction Operations sum, mean, min, max, etc. ExtensionArray Extending pandas with custom dtypes or arrays. NA - MaskedArrays Related to pd.NA and nullable extension arrays and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Apr 10, 2023
@jbrockmendel
Member

Notes for whoever comes along to handle this:

  1. In ``infer_dtype_from_object`` there is a kludge for ``BaseMaskedDtype``: ``if hasattr(dtype, "numpy_dtype"):``
  2. In ``select_dtypes`` there is a kludge for ``ArrowDtype``

I expect resolving this issue will require significant surgery on these two functions, so avoiding these kludges should be part of the goal.
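
To illustrate why a ``hasattr(dtype, "numpy_dtype")`` check blurs the two dtypes, here is a small sketch using only public attributes. It does not reproduce the pandas internals; `matches_exactly` is a hypothetical helper showing one way a stricter comparison could look, not a proposed implementation.

```python
import numpy as np
import pandas as pd

masked = pd.Series([1, 2, 3], dtype="Int64").dtype  # nullable Int64Dtype
plain = pd.Series([1, 2, 3], dtype="int64").dtype   # NumPy dtype

# Masked dtypes expose the NumPy dtype of their underlying data;
# plain NumPy dtypes have no such attribute.
has_attr_masked = hasattr(masked, "numpy_dtype")
has_attr_plain = hasattr(plain, "numpy_dtype")

# Unwrapping the masked dtype yields plain int64, so any comparison
# done after unwrapping can no longer tell Int64 and int64 apart.
unwrapped = masked.numpy_dtype


def matches_exactly(dtype, target):
    """Hypothetical helper: True only for a real NumPy dtype equal to target."""
    return isinstance(dtype, np.dtype) and dtype == np.dtype(target)


print(has_attr_masked, has_attr_plain, unwrapped)
print(matches_exactly(masked, "int64"), matches_exactly(plain, "int64"))
```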

@phofl
Member Author

phofl commented May 5, 2023

Yes, the kludge was a necessary fix for 2.0.1.
