Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd

pd.DataFrame({'x': ['1372636858620000589']}).to_csv('x.csv', index=False)
# this works
pd.read_csv('x.csv', dtype_backend='numpy_nullable')['x'].round()
# this crashes
pd.read_csv('x.csv', dtype_backend='pyarrow')['x'].round()
Issue Description
The error message is:
ArrowInvalid: Integer value 1372636858620000589 not in range: -9007199254740992 to 9007199254740992
The accepted value range for int64[pyarrow] seems too narrow. Shouldn't it be -9223372036854775808 to 9223372036854775807?
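For context, the bounds in the error message equal ±2^53, the range in which every integer is exactly representable as a 64-bit float. That would be consistent with round() converting the column to double somewhere along the way, though this is an assumption and not something the traceback confirms. A minimal standard-library sketch of the arithmetic (the helper names are illustrative, not pandas or pyarrow APIs):

```python
# Bounds of a signed 64-bit integer.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1

# float64 has a 53-bit significand, so every integer with magnitude
# up to 2**53 is exactly representable; beyond that, gaps appear.
FLOAT64_EXACT_LIMIT = 2**53

# The value from the reproducible example.
value = 1372636858620000589


def fits_int64(n: int) -> bool:
    """Return True if n is within the signed 64-bit integer range."""
    return INT64_MIN <= n <= INT64_MAX


def survives_float64_round_trip(n: int) -> bool:
    """Return True if converting n to float64 and back is lossless."""
    return int(float(n)) == n


print(FLOAT64_EXACT_LIMIT)                 # the bound quoted in the error message
print(fits_int64(value))                   # the value is a valid int64
print(survives_float64_round_trip(value))  # but it cannot pass through float64 unchanged
```

So the value is a valid int64 but would lose precision in a cast to float64, which may be why a safe cast is rejected.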
Expected Behavior
I expect to be able to call round() on an int64[pyarrow] column just as on an Int64 column.
Installed Versions
INSTALLED VERSIONS
commit : 478d340
python : 3.9.16.final.0
python-bits : 64
OS : Darwin
OS-release : 22.4.0
Version : Darwin Kernel Version 22.4.0: Mon Mar 6 21:00:17 PST 2023; root:xnu-8796.101.5~3/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.0.0
numpy : 1.24.2
pytz : 2023.3
dateutil : 2.8.2
setuptools : 67.6.1
pip : 23.0.1
Cython : None
pytest : 7.3.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.9.6
jinja2 : 3.1.2
IPython : 8.12.0
pandas_datareader: None
bs4 : 4.12.1
bottleneck : None
brotli : None
fastparquet : 0.8.3
fsspec : 2023.4.0
gcsfs : 2023.4.0
matplotlib : 3.6.3
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 11.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2023.4.0
scipy : None
snappy : None
sqlalchemy : 2.0.9
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None