[query] from_pandas needs to support Pandas missing types and support numpy nums better #11401

Merged 5 commits on Feb 24, 2022

@johnc1231 (Contributor) commented Feb 23, 2022

Pandas missingness is crazy. pd.isna tells you whether something is "missing", which means either NaN or the special pd.NA sentinel value. Floats use NaN, while the pandas nullable extension dtypes (Int64Dtype and Int32Dtype are the ones that matter here) use the NA value instead.

They don't have an easy way to distinguish between NA and NaN, so I first check pd.isna, then check whether the value is a float to tell the two cases apart.
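A minimal sketch of that two-step check, for illustration only. The helper name `classify_missing` is hypothetical and is not taken from the PR; treating NumPy float scalars such as `np.float32` as floats is also an assumption of this sketch.

```python
import numpy as np
import pandas as pd


def classify_missing(value):
    """Classify a scalar as 'NA', 'NaN', or 'present' (hypothetical helper)."""
    # Step 1: pd.isna is True for both NaN and pd.NA (and also None / NaT).
    if pd.isna(value):
        # Step 2: a missing value that is a float must be NaN;
        # np.floating is included so np.float32 NaN is handled too (assumption).
        if isinstance(value, (float, np.floating)):
            return "NaN"
        return "NA"
    return "present"


nullable_ints = pd.Series([1, None], dtype="Int64")  # missing entry is pd.NA
floats = pd.Series([1.0, None])                      # missing entry is NaN

na_result = classify_missing(nullable_ints[1])
nan_result = classify_missing(floats[1])
```

Note that anything missing that is not a float (including `None`) lands in the "NA" bucket here, which matches the order of checks described above.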

They also don't have a dedicated function to test whether something is an Int32Dtype, for some reason. So I try is_int64_dtype first, and if that fails I fall back to is_integer_dtype, which is true for both Int64Dtype and Int32Dtype.
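The fallback order could look roughly like the sketch below. The function names `_is_int64` and `classify_int_dtype` are hypothetical, not the PR's code. Two assumptions are labeled in the comments: `is_int64_dtype` is looked up defensively because newer pandas releases deprecate it, and every non-64-bit integer dtype is treated as 32-bit, since `is_integer_dtype` is also true for other integer widths.

```python
import warnings

import pandas as pd
from pandas.api import types as pdt


def _is_int64(dtype):
    """True for 64-bit integer dtypes, nullable or not (hypothetical helper)."""
    check = getattr(pdt, "is_int64_dtype", None)
    if check is None:
        # Assumption: if the helper is unavailable, compare the dtype name.
        return str(dtype).lower() == "int64"
    with warnings.catch_warnings():
        # Newer pandas versions warn that is_int64_dtype is deprecated.
        warnings.simplefilter("ignore")
        return bool(check(dtype))


def classify_int_dtype(dtype):
    """Return 'int64', 'int32', or None for non-integer dtypes."""
    if _is_int64(dtype):
        return "int64"
    # is_integer_dtype is true for Int64Dtype and Int32Dtype alike, so
    # reaching this branch means "integer, but not 64-bit".
    # Assumption: anything left over is treated as 32-bit.
    if pdt.is_integer_dtype(dtype):
        return "int32"
    return None


wide = classify_int_dtype(pd.Series([1, None], dtype="Int64").dtype)
narrow = classify_int_dtype(pd.Series([1, None], dtype="Int32").dtype)
```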

@johnc1231 johnc1231 marked this pull request as ready for review February 24, 2022 16:02
@danking danking merged commit 7fb79b0 into hail-is:main Feb 24, 2022

3 participants