any/all reductions on boolean object-typed Series #27709
On implementing a boolean based ExtensionArray I stumbled on the case that boolean arrays with missing values (which can only be
The following case should return
pd.Series([False, None]).any(skipna=False)    # None
pd.Series([None, False]).any(skipna=False)    # False
pd.Series([False, np.nan]).any(skipna=False)  # nan
pd.Series([np.nan, False]).any(skipna=False)  # nan
Whereas when you do the same operation on
pd.Series([np.nan, 0.]).any(skipna=False)  # True
pd.Series([0, np.nan]).any(skipna=False)   # True
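The order-dependent results for the object-typed Series are consistent with a reduction that behaves like a left-to-right chain of Python's `or`, which returns one of its operands rather than a strict boolean. The sketch below is only a model of that behaviour, not pandas' actual implementation. `chained_or` is a hypothetical helper, and `math.nan` stands in for `np.nan` (both are plain Python floats).

```python
import math
from functools import reduce


def chained_or(values):
    """Fold values with Python's `or`, which returns an operand, not a bool.

    Hypothetical helper for illustration only; not part of pandas.
    """
    return reduce(lambda left, right: left or right, values)


# Mirrors the object-dtype results reported above.
print(chained_or([False, None]))      # None
print(chained_or([None, False]))      # False
print(chained_or([False, math.nan]))  # nan
print(chained_or([math.nan, False]))  # nan
```

Under this model the result is whichever operand `or` happens to stop on, which would explain why swapping the element order changes the outcome.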
As I have not found a unit test for the above-mentioned case with a boolean object column, I suspect that this is undefined behaviour rather than intended.
Three solutions come to my mind:
I've read through the open and closed PRs and issues and am still confused. The issues were in general more about support
# These return True
any([np.nan, False])
any([False, np.nan])
# These return False
any([None, False])
any([False, None])
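The difference in the built-in `any` comes down to truth-value testing: NaN is a non-zero float and therefore truthy, while `None` is always falsy. A minimal check, using `math.nan` as a stand-in for `np.nan` so that the snippet needs only the standard library:

```python
import math

nan = math.nan  # like np.nan, a plain Python float

# NaN is a non-zero float, so it is truthy; None is always falsy.
assert bool(nan) is True
assert bool(None) is False

nan_results = [any([nan, False]), any([False, nan])]
none_results = [any([None, False]), any([False, None])]
print(nan_results)   # [True, True]
print(none_results)  # [False, False]
```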
The above operations yield different results, we would want to have
I would therefore adjust the code according to option 2, but this would be a behaviour-breaking change.