Description
Pandas version checks
-
I have checked that this issue has not already been reported.
-
I have confirmed this bug exists on the latest version of pandas.
-
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import numpy as np
import pandas as pd
if __name__ == '__main__':
values = [
-3.333333e-01,
0.,
1.000000e+15,
-4.874588e+14,
6.103516e-05,
0.,
1.100000e+00,
0.,
0.,
-8.581237e+14,
0.,
1.000000e+15,
0.,
0.,
0.,
0.,
0.,
-5.96e-03,
0.,
0.,
]
s = pd.Series(values, dtype=np.float64)
window = 8
ground_truth = np.sum(values[-window:]) / window
rolling_mean_1 = s.rolling(window=window).mean()
rolling_mean_2 = s.iloc[1:].rolling(window=window).mean()
# >>> ground_truth
# np.float64(-0.000745)
# >>> rolling_mean_1.iloc[-1]
# np.float64(-0.01637)
# >>> rolling_mean_2.iloc[-1]
# np.float64(-0.000745)
# The first number is ~22 times greater than both the last one and the ground_truth value, which is significant.
# >>> rolling_mean_1.iloc[-1] / rolling_mean_2.iloc[-1]
# np.float64(21.973154362416107)
# >>> rolling_mean_1.iloc[-1] / ground_truth
# np.float64(21.973154362416107)
assert np.allclose(rolling_mean_2.iloc[-1], ground_truth) # passes
assert np.allclose(rolling_mean_1.iloc[-1], ground_truth, rtol=0, atol=1e-2) # fails
Issue Description
While using pandas' .rolling(window=...).mean()
, there is an unexpected discrepancy when comparing two different ways of computing the rolling mean:
rolling_mean_1 = s.rolling(window=window).mean()
rolling_mean_2 = s.iloc[1:].rolling(window=window).mean()
The discrepancy is significant, as rolling_mean_1.iloc[-1]
is approximately 22
times greater than rolling_mean_2.iloc[-1]
, even though the values required to compute the result for the last row are the same in both cases. Additionally, the reference value ground_truth
is almost identical to rolling_mean_2.iloc[-1]
.
Likely, it's related to rounding errors, but I don't understand how they could cause such a large discrepancy in this case, given that the absolute values of the original data are not that large (<= 1e15
)
Expected Behavior
assert np.allclose(rolling_mean_1.iloc[-1], ground_truth)
should pass
Installed Versions
INSTALLED VERSIONS
commit : 0691c5c
python : 3.12.8
python-bits : 64
OS : Darwin
OS-release : 23.4.0
Version : Darwin Kernel Version 23.4.0: Wed Feb 21 21:44:54 PST 2024; root:xnu-10063.101.15~2/RELEASE_ARM64_T6031
machine : arm64
processor : arm
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.2.3
numpy : 2.2.2
pytz : 2025.1
dateutil : 2.9.0.post0
pip : None
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : 6.125.2
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : None
pyreadstat : None
pytest : 8.3.4
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.15.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.1
qtpy : None
pyqt5 : None