Description
Consider the following dataframe:
       b           c     d     e  f     g     h
0   6.25  2018-04-01  True   NaN  7  54.0  64.0
1  32.50  2018-04-01  True   NaN  7  54.0  64.0
2  16.75  2018-04-01  True   NaN  7  54.0  64.0
3  29.25  2018-04-01  True   NaN  7  54.0  64.0
4  21.75  2018-04-01  True   NaN  7  54.0  64.0
5  21.75  2018-04-01  True  True  7  54.0  64.0
6   7.75  2018-04-01  True  True  7  54.0  64.0
7  23.25  2018-04-01  True  True  7  54.0  64.0
8  12.25  2018-04-01  True  True  7  54.0  64.0
9  30.50  2018-04-01  True   NaN  7  54.0  64.0
(Copy the table above and run df = pd.read_clipboard() to create the dataframe.)
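For anyone who cannot use the clipboard, the same frame can be sketched directly in code. This is an assumed reconstruction of what pd.read_clipboard() produces from the table: column c is kept as plain strings, and column e is forced to object dtype (a mix of True and NaN) to match the dtype reported further down.

```python
import numpy as np
import pandas as pd

# Values copied from the table in the report; the dtypes are assumptions
# chosen to mirror what read_clipboard() is described as producing.
df = pd.DataFrame({
    "b": [6.25, 32.50, 16.75, 29.25, 21.75, 21.75, 7.75, 23.25, 12.25, 30.50],
    "c": ["2018-04-01"] * 10,   # dates left as strings
    "d": [True] * 10,           # plain bool column
    # object column mixing NaN and True, as in the report
    "e": pd.Series([np.nan] * 5 + [True] * 4 + [np.nan], dtype=object),
    "f": [7] * 10,
    "g": [54.0] * 10,
    "h": [64.0] * 10,
})

print(df.dtypes)
```

Printing the dtypes is a quick way to confirm that e really is object before trying the calls below.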
Finding the medians initially works with no problem:
df.median()
b 21.75
d 1.00
e 1.00
f 7.00
g 54.00
h 64.00
dtype: float64
However, if the first column, b, is dropped and the median is computed again, the median for column e disappears:
new_df = df.drop(columns=['b'])
new_df.median()
d 1.0
f 7.0
g 54.0
h 64.0
dtype: float64
This behavior is unexpected, especially since finding the median for column e by itself still works:
new_df['e'].median()
1.0
Using skipna=False does not make a difference:
new_df.median(skipna=False)
d 1.0
f 7.0
g 54.0
h 64.0
dtype: float64
It does make a difference for the original dataframe, where the median for column e becomes NaN:
df.median(skipna=False)
b 21.75
d 1.00
e NaN
f 7.00
g 54.00
h 64.00
dtype: float64
The datatype of column e is object in both df and new_df, and the only difference between the two dataframes is that new_df does not have column b. Adding the column back into new_df does not resolve the issue. This only occurs when the first column b is dropped. It does not occur if column e is a float or integer datatype.
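Since the problem does not occur when e has a float dtype, one possible workaround (a sketch, not a fix for the underlying behavior) is to cast the boolean-like columns to float explicitly and ask for numeric columns only. The frame below is an assumed reconstruction of new_df from the table in this report.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of new_df (the original frame without column b).
new_df = pd.DataFrame({
    "c": ["2018-04-01"] * 10,
    "d": [True] * 10,
    "e": pd.Series([np.nan] * 5 + [True] * 4 + [np.nan], dtype=object),
    "f": [7] * 10,
    "g": [54.0] * 10,
    "h": [64.0] * 10,
})

# Cast the bool column d and the object column e to float, so the
# reduction no longer has to guess how to treat them.
fixed = new_df.astype({"d": float, "e": float})

# numeric_only=True skips the string column c.
medians = fixed.median(numeric_only=True)
print(medians)
```

With the explicit cast, e is an ordinary float column (True becomes 1.0, NaN stays NaN), so it should show up in the result alongside d, f, g and h.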
This behavior is present in both pandas==0.22.0 and pandas==0.24.1.
I also created a Stack Overflow Question based on this issue.