Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd

# Each dict supplies only one key, so the missing entries become NaN.
df = pd.DataFrame([{"a": True}, {"b": True}])
df["a"] | df["b"]
Issue Description
The result of a bitwise OR (| or pd.Series.__or__()) should not depend on the order of the operands: OR should be commutative.
In this case, however, pandas breaks commutativity when one of the operands is NaN, which seems odd. NumPy doesn't show this inconsistency; it complains that NaN cannot be used with OR.
Pandas casts silently, without any warning.
There has been a Stack Overflow question about this for a long time, but I couldn't find an issue here: https://stackoverflow.com/questions/39000907/pandas-column-selection-non-commutative-bitwise-or-when-selecting-on-str-and-na
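To make the comparison concrete, here is a minimal sketch that evaluates the OR in both operand orders and then tries the same operation on plain NumPy float arrays. The names `left`, `right`, and `numpy_raised` are only illustrative, and the pandas output may differ between versions, so none is shown here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([{"a": True}, {"b": True}])

# Evaluate the OR in both operand orders.
left = df["a"] | df["b"]
right = df["b"] | df["a"]

# If OR were commutative, the two results would be identical.
print("a | b:", left.tolist())
print("b | a:", right.tolist())
print("commutative:", left.equals(right))

# NumPy refuses a bitwise OR on float arrays (the dtype needed to hold NaN).
try:
    np.array([1.0, np.nan]) | np.array([np.nan, 1.0])
    numpy_raised = False
except TypeError as exc:
    numpy_raised = True
    print("NumPy raised:", exc)
```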
Expected Behavior
In a bitwise OR, the result should always be True if either operand is True.
A warning or an error would be acceptable, but silently returning False is bad.
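As a possible workaround, not a fix for the behaviour reported here, the columns can be converted to the nullable "boolean" dtype before combining them. That dtype uses three-valued (Kleene) logic, in which True OR missing is True, so the result no longer depends on operand order. A sketch under that assumption (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame([{"a": True}, {"b": True}])

# Convert to the nullable boolean dtype; NaN becomes pd.NA.
a = df["a"].astype("boolean")
b = df["b"].astype("boolean")

# Kleene logic: True | NA is True, regardless of operand order.
ab = a | b
ba = b | a
print(ab.tolist())
print(ba.tolist())
```

With both operands missing, the result would be pd.NA rather than False, so the missing value stays visible instead of being silently cast.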
Actual behavior
> df
      a     b
0  True   NaN
1   NaN  True
> df["a"]|df["b"]
0     True
1    False
dtype: bool
Installed Versions
INSTALLED VERSIONS
commit : 2e218d1
python : 3.10.8.final.0
python-bits : 64
OS : Darwin
OS-release : 22.3.0
Version : Darwin Kernel Version 22.3.0: Thu Jan 5 20:48:54 PST 2023; root:xnu-8792.81.2~2/RELEASE_ARM64_T6000
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8
pandas : 1.5.3
numpy : 1.24.2
pytz : 2022.7.1
dateutil : 2.8.2
setuptools : 67.1.0
pip : 23.0
Cython : None
pytest : 7.2.1
hypothesis : None
...
xlrd : None
xlwt : None
zstandard : 0.19.0
tzdata : None