BUG: InvalidRange Error while reading csv file after modifications #47402
Labels
Bug
Closing Candidate
May be closeable, needs more eyeballs
IO Network
Local or Cloud (AWS, GCS, etc.) IO Issues
Needs Info
Clarification about behavior needed to assess issue
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
If a file is read twice from an ADLS Gen2 storage account (only cloud storage that I've tested with) in a single run and the file has changed in between the two reads, we get an InvalidRange error as pasted below. This has been tested with csv and excel, the number of rows were reduced between two consecutive reads.
Error:
Expected Behavior
Pandas should be able to read the file even after the changes.
Installed Versions
INSTALLED VERSIONS
commit : 2cb9652
python : 3.8.8.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.72-microsoft-standard-WSL2
Version : #1 SMP Wed Oct 28 23:40:43 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.2.4
numpy : 1.19.5
pytz : 2021.1
dateutil : 2.8.1
pip : 21.0.1
setuptools : 52.0.0.post20210125
Cython : 0.29.23
pytest : 6.2.3
hypothesis : None
sphinx : 4.0.1
blosc : None
feather : None
xlsxwriter : 1.3.8
lxml.etree : 4.6.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.3
IPython : 7.22.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : 1.3.2
fsspec : 2022.5.0
fastparquet : None
gcsfs : None
matplotlib : 3.3.4
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.6.2
sqlalchemy : 1.4.15
tables : 3.6.1
tabulate : 0.8.9
xarray : None
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.53.1
The text was updated successfully, but these errors were encountered: