
johnc1231 commented Feb 23, 2022

Pandas missingness is crazy. pd.isna tells you if something is "missing", which means either NaN or the special pd.NA sentinel value. For floats they use NaN, and of the dtypes I handle here only the pandas nullable integer dtypes, Int64Dtype and Int32Dtype, use the NA value.

They don't have an easy way to distinguish between NA and NaN, so I first check whether a value is missing with pd.isna, then check whether it's a float to tell the two cases apart: a missing float is NaN, and anything else missing is NA.
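A minimal sketch of that two-step check, for scalar values only. The function name `classify_missing` is hypothetical and is not taken from the PR. Note that a missing non-float such as `None` also lands in the NA branch.

```python
import pandas as pd


def classify_missing(value):
    """Classify a scalar as 'present', 'nan', or 'na' (hypothetical helper)."""
    # pd.isna is True for both float NaN and the pd.NA sentinel.
    if not pd.isna(value):
        return "present"
    # A missing float is NaN; numpy.float64 subclasses Python float,
    # so values pulled out of a float64 Series are covered as well.
    if isinstance(value, float):
        return "nan"
    # Anything else that is missing (pd.NA, None, ...) is treated as NA.
    return "na"


floats = pd.Series([1.0, None])               # float64: the gap is stored as NaN
ints = pd.Series([1, None], dtype="Int64")    # nullable integer: the gap is pd.NA

float_gap = classify_missing(floats[1])
int_gap = classify_missing(ints[1])
```

Here `float_gap` comes from the float64 column and `int_gap` from the nullable integer column, so the two kinds of missingness end up in different branches.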

They also don't have a dedicated helper for testing whether something is an Int32Dtype for some reason (there is no is_int32_dtype). So I use is_int64_dtype, and if that fails I fall back to is_integer_dtype, which is true for any integer dtype, including both Int64Dtype and Int32Dtype.
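A rough sketch of the same check order: 64-bit first, then a general integer fallback. This is not the PR's code. It swaps `is_int64_dtype` for explicit `isinstance` checks against `pd.Int64Dtype` and `pd.Int32Dtype`, on the assumption that this is an acceptable substitute, and the function name `describe_int_dtype` is hypothetical.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype


def describe_int_dtype(dtype):
    """Label a dtype by integer kind (hypothetical helper, not from the PR)."""
    # Check the 64-bit nullable dtype first, mirroring the order described above.
    if isinstance(dtype, pd.Int64Dtype):
        return "Int64"
    if isinstance(dtype, pd.Int32Dtype):
        return "Int32"
    # is_integer_dtype is True for every integer dtype, nullable or plain NumPy.
    if is_integer_dtype(dtype):
        return "other integer"
    return "not integer"


wide = describe_int_dtype(pd.Series([1, None], dtype="Int64").dtype)
narrow = describe_int_dtype(pd.Series([1, None], dtype="Int32").dtype)
floating = describe_int_dtype(pd.Series([1.0, None]).dtype)
```

The general `is_integer_dtype` branch stays last because it cannot tell the widths apart on its own.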

johnc1231 marked this pull request as ready for review February 24, 2022 16:02
danking merged commit 7fb79b0 into hail-is:main Feb 24, 2022