BUG: reset_index() loses the frequency of a DatetimeIndex #59273
Comments
take |
I did some digging, and it seems this is intended; see pandas/pandas/core/internals/blocks.py, lines 2158 to 2160 in 18a3eec.
The above was added in this PR #41425, which mentions that "The long-term behavior is definitely going to always drop the freq (more specifically, DTA/TDA won't have freq, xref #31218). So this PR standardizes always-dropping." @annika-rudolph What do you think? |
Thanks for digging into this! It is what I suspected :) From a user perspective, I can say that frequencies in DatetimeIndex objects are quite important, even more so since some functionality (like business day and resample) will be dropped for PeriodIndex, which for us means that we recently moved everything to DatetimeIndex. Thus, it would be nice if DatetimeIndex could cover the same functionality as PeriodIndex and, specifically, if the frequency attribute could be retained in all transformations. It seems to me that the decision to always drop the frequency was taken some time ago (before deciding to drop PeriodIndex functionality?), so maybe it could be reconsidered? |
I encountered this issue and did some debugging. It is the reshape at pandas/pandas/core/internals/blocks.py, line 2313 in 1fa5025, that leads to the loss of freq.
But as mentioned by @aram-cinnamon, I think this behaviour is expected. |
With the deprecation of PeriodIndex functionality progressing and the current recommendation to use a DatetimeIndex in its place, I think the frequency property of a DatetimeIndex should not be dropped. If you drop the frequency property, the object no longer holds information on the periods, so it cannot be used to replace the PeriodIndex. @jbrockmendel it would be great to hear your opinion on that.
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
Calling reset_index() on an object with a DatetimeIndex causes the frequency to be lost. Although the newly created column is a DatetimeArray, it does not seem to carry the freq attribute. As a result, a reset_index() -> set_index() round trip cannot restore the original index, which potentially creates issues.
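A minimal sketch of the round trip described above. This is not the reporter's original example: the column name `ts`, the daily frequency, and the sample values are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data: a daily DatetimeIndex named "ts" (assumed names).
idx = pd.date_range("2024-01-01", periods=5, freq="D", name="ts")
df = pd.DataFrame({"value": range(5)}, index=idx)

# The original index carries a frequency.
print(df.index.freq)

# Move the index into a column and back again.
roundtrip = df.reset_index().set_index("ts")

# The timestamps survive the round trip ...
print(roundtrip.index.equals(df.index))

# ... but, per this issue, the freq attribute is reported as None here.
print(roundtrip.index.freq)
```

The index values compare equal because `Index.equals` looks at the timestamps, not at `freq`, which is why the loss is easy to miss until a frequency-dependent operation is applied.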
Expected Behavior
I would expect that reset_index().set_index() lets me recover the original index :)
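Until the behaviour changes, one possible workaround is to re-infer the frequency after the round trip. The sketch below is an assumption-laden illustration, not a fix proposed in this thread: it relies on the index being regular enough for pandas to infer a frequency, and the names `ts` and `value` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with a regular daily index (assumed names).
idx = pd.date_range("2024-01-01", periods=5, freq="D", name="ts")
df = pd.DataFrame({"value": range(5)}, index=idx)

roundtrip = df.reset_index().set_index("ts")

# Rebuild the index and ask pandas to infer the frequency from the values.
# freq="infer" only works when the timestamps are regularly spaced.
restored_index = pd.DatetimeIndex(roundtrip.index, freq="infer")
restored = roundtrip.set_axis(restored_index, axis=0)

print(restored.index.freq == df.index.freq)
```

Inference is only a stand-in for preserving the attribute: for irregular data no frequency can be inferred, so this does not cover every case the issue is concerned with.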
Installed Versions
INSTALLED VERSIONS
commit : bfe5be0
python : 3.10.12
python-bits : 64
OS : Linux
OS-release : 5.15.153.1-microsoft-standard-WSL2
Version : #1 SMP Fri Mar 29 23:14:13 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 0+untagged.34794.gbfe5be0
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
pip : 22.0.2
Cython : 3.0.10
sphinx : 7.3.7
IPython : 8.23.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
blosc : None
bottleneck : 1.3.8
fastparquet : 2024.2.0
fsspec : 2024.3.1
html5lib : 1.1
hypothesis : 6.100.1
gcsfs : 2024.3.1
jinja2 : 3.1.3
lxml.etree : 5.2.1
matplotlib : 3.8.4
numba : 0.59.1
numexpr : 2.10.0
odfpy : None
openpyxl : 3.1.2
psycopg2 : 2.9.9
pymysql : 1.4.6
pyarrow : 16.0.0
pyreadstat : 1.2.7
pytest : 8.1.1
python-calamine : None
pyxlsb : 1.0.10
s3fs : 2024.3.1
scipy : 1.13.0
sqlalchemy : 2.0.29
tables : 3.9.2
tabulate : 0.9.0
xarray : 2024.3.0
xlrd : 2.0.1
xlsxwriter : 3.2.0
zstandard : 0.22.0
tzdata : 2024.1
qtpy : None
pyqt5 : None