-
-
Notifications
You must be signed in to change notification settings - Fork 19.3k
Description
Pandas version checks
-
I have checked that this issue has not already been reported.
-
I have confirmed this issue exists on the latest version of pandas.
-
I have confirmed this issue exists on the main branch of pandas.
Reproducible Example
import numpy as np
import pandas as pd
rng = np.random.default_rng()
df = pd.DataFrame(rng.integers(0, 100, size=(100_000, 500)), columns=np.arange(0,500))
%time df.to_excel(r"\to_excel issues\test.xlsx")
New: Wall time: 18min 35s, 18min 29s, 15min 27s, 15min 25s
Old: Wall time: 14min 2s, 12min 10s,14min 7s
The differences aren't that large for this size but when you start getting bigger it grows exponentially (100k, 1k) already takes over an hour or over two hours.
Installed Versions
New:
INSTALLED VERSIONS
commit : a671b5a
python : 3.11.7.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19045
machine : AMD64
processor : AMD64 Family 25 Model 80 Stepping 0, AuthenticAMD
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 2.1.4
numpy : 1.26.4
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.2.2
pip : 23.3.1
Cython : None
pytest : 7.4.0
hypothesis : None
sphinx : 5.0.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.3
html5lib : None
pymysql : 1.4.6
psycopg2 : None
jinja2 : 3.1.3
IPython : 8.20.0
pandas_datareader : None
bs4 : 4.12.2
bottleneck : 1.3.7
dataframe-api-compat: None
fastparquet : None
fsspec : 2023.10.0
gcsfs : None
matplotlib : 3.8.0
numba : 0.59.0
numexpr : 2.8.7
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : 14.0.2
pyreadstat : None
pyxlsb : None
s3fs : 2023.10.0
scipy : 1.11.4
sqlalchemy : 2.0.25
tables : 3.9.2
tabulate : 0.9.0
xarray : 2023.6.0
xlrd : None
zstandard : 0.19.0
tzdata : 2023.3
qtpy : 2.4.1
pyqt5 : None
Prior Performance
Old:
INSTALLED VERSIONS
Wall time: 14min 2s, 12min 10s,14min 7s
commit : 2cb9652
python : 3.8.8.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : AMD64 Family 25 Model 80 Stepping 0, AuthenticAMD
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.2.4
numpy : 1.20.1
pytz : 2021.1
dateutil : 2.8.1
pip : 21.0.1
setuptools : 52.0.0.post20210125
Cython : 0.29.23
pytest : 6.2.3
hypothesis : None
sphinx : 4.0.1
blosc : None
feather : None
xlsxwriter : 1.3.8
lxml.etree : 4.6.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.3
IPython : 7.22.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : 1.3.2
fsspec : 0.9.0
fastparquet : None
gcsfs : None
matplotlib : 3.3.4
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : 9.0.0
pyxlsb : None
s3fs : None
scipy : 1.6.2
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.53.1