Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
from pandas import Timestamp
import pandas as pd

print("pandas Version:", pd.__version__)

# dataframe: a string key, a datetime64[ns] column, and a float column
df = pd.DataFrame.from_dict({
    'filename': ['03_', '03_', '03_', '05_', '05_', '05_', '05_', '05_', '08_', '08_'],
    'date_time': [
        Timestamp('2022-05-24 12:10:56'), Timestamp('2022-05-24 12:11:24'), Timestamp('2022-05-24 12:11:51'),
        Timestamp('2022-05-24 12:41:54'), Timestamp('2022-05-24 12:42:21'), Timestamp('2022-05-24 12:42:49'),
        Timestamp('2022-05-24 12:43:16'), Timestamp('2022-05-24 12:43:44'), Timestamp('2022-05-24 12:57:30'),
        Timestamp('2022-05-24 12:57:58'),
    ],
    'r': [80466.36, 71467.12, 72641.21, 76961.35, 86747.23, 81995.81, 74451.46, 69401.51, 73670.12, 78180.65],
})

# df.info() prints its summary directly and returns None, so "None" follows the label in the output
print("df column types: ", df.info())
print('\nWorks with: df.groupby(["filename"]).agg(["mean"])\n', df.groupby(["filename"]).agg(["mean"]))
print('\nNot working with: df.groupby(["filename"]).agg("mean")\n', df.groupby(["filename"]).agg("mean"))
print('\nNot working with: df.groupby(["filename"]).mean()\n', df.groupby(["filename"]).mean())
OUT:
pandas Version: 1.3.5
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 3 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   filename   10 non-null     object
 1   date_time  10 non-null     datetime64[ns]
 2   r          10 non-null     float64
dtypes: datetime64[ns](1), float64(1), object(1)
memory usage: 368.0+ bytes
df column types: None
Works with: df.groupby(["filename"]).agg(["mean"]) #See date_time column appearing
                             date_time          r
                                  mean       mean
filename
03_      2022-05-24 12:11:23.666666752  74858.230
05_      2022-05-24 12:42:48.800000000  77911.472
08_      2022-05-24 12:57:44.000000000  75925.385
Not working with: df.groupby(["filename"]).agg("mean") #date_time column is gone
                  r
filename
03_       74858.230
05_       77911.472
08_       75925.385
Not working with: df.groupby(["filename"]).mean() #date_time column is gone
                  r
filename
03_       74858.230
05_       77911.472
08_       75925.385
Issue Description
I expected the same behavior from all three of these calls:
- df.groupby(["filename"]).agg(["mean"])
- df.groupby(["filename"]).agg("mean")
- df.groupby(["filename"]).mean()
Instead, when the DataFrame has a datetime64[ns] column, only .agg(["mean"]) returns the mean of that column. Both .agg("mean") and .mean() silently drop the datetime64[ns] column from the result.
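The difference between the three calls can be checked programmatically. The following is a minimal sketch, not part of the original report: the helper name dropped_columns and the small frame are hypothetical, and the result depends on the installed pandas version (on 1.3.5, as reported here, date_time would be listed as dropped; other versions may keep it).

```python
import pandas as pd


def dropped_columns(frame, key):
    """List the columns that agg(["mean"]) keeps but the other call forms drop.

    Hypothetical helper for illustration only.
    """
    grouped = frame.groupby([key])
    # agg(["mean"]) returns MultiIndex columns; level 0 holds the original column names.
    reference = grouped.agg(["mean"]).columns.get_level_values(0).unique()
    results = {
        'agg("mean")': grouped.agg("mean"),
        "mean()": grouped.mean(),
    }
    return {
        name: [col for col in reference if col not in result.columns]
        for name, result in results.items()
    }


# Small made-up frame with the same column types as the report.
df = pd.DataFrame(
    {
        "filename": ["03_", "03_", "05_", "05_"],
        "date_time": pd.to_datetime(
            [
                "2022-05-24 12:00:00",
                "2022-05-24 12:00:10",
                "2022-05-24 13:00:00",
                "2022-05-24 13:00:20",
            ]
        ),
        "r": [1.0, 3.0, 10.0, 20.0],
    }
)

report = dropped_columns(df, "filename")
print(report)
```

An empty list for a call form means it kept every column that agg(["mean"]) kept.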
Expected Behavior
I expect agg(["mean"]), agg("mean"), and mean() to behave the same: all three should keep the date_time column and return its per-group mean.
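Until the three forms agree, two possible workarounds are sketched below. This is an assumption on my side rather than a confirmed fix: it relies on GroupBy.mean accepting numeric_only=False in the installed pandas version, and on named aggregation routing each column through its own per-column mean. The small frame is made up for the example.

```python
import pandas as pd

# Small made-up frame with the same column types as the report.
df = pd.DataFrame(
    {
        "filename": ["03_", "03_", "05_", "05_"],
        "date_time": pd.to_datetime(
            [
                "2022-05-24 12:00:00",
                "2022-05-24 12:00:10",
                "2022-05-24 13:00:00",
                "2022-05-24 13:00:20",
            ]
        ),
        "r": [1.0, 3.0, 10.0, 20.0],
    }
)

# Option 1: select the columns explicitly and turn off numeric-only filtering.
explicit = df.groupby("filename")[["date_time", "r"]].mean(numeric_only=False)

# Option 2: named aggregation, which states the column/function pairs explicitly.
named = df.groupby("filename").agg(
    date_time=("date_time", "mean"),
    r=("r", "mean"),
)

print(explicit)
print(named)
```

Both results have flat column names, unlike agg(["mean"]), which returns a two-level column index. The datetime mean can be off by a few hundred nanoseconds because of floating-point rounding, as the 12:11:23.666666752 value in the output above suggests.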
Installed Versions
pandas : 1.3.5
numpy : 1.21.6
pytz : 2022.1
dateutil : 2.8.2
pip : 21.1.3
setuptools : 57.4.0
Cython : 0.29.30
pytest : 3.6.4
hypothesis : None
sphinx : 1.8.6
blosc : None
feather : 0.4.1
xlsxwriter : None
lxml.etree : 4.2.6
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.7.6.1 (dt dec pq3 ext lo64)
jinja2 : 2.11.3
IPython : 5.5.0
pandas_datareader: 0.9.0
bs4 : 4.6.3
bottleneck : 1.3.4
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.2.2
numexpr : 2.8.1
odfpy : None
openpyxl : 3.0.10
pandas_gbq : 0.13.3
pyarrow : 6.0.1
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.4.36
tables : 3.7.0
tabulate : 0.8.9
xarray : 0.20.2
xlrd : 1.1.0
xlwt : 1.3.0
numba : 0.51.2