read_fwf: IndexError: list index out of range when using dtype and usecols parameters #54868

Closed
anmyachev opened this issue Aug 30, 2023 · 1 comment · Fixed by #54881
Labels
IO CSV read_csv, to_csv Regression Functionality that used to work in a prior pandas version

@anmyachev
Contributor

Reproducer:

import pandas as pd
from io import BytesIO

pd.read_fwf(BytesIO(b"a       b           c          d\r\nid8141  360.242940  149.910199 11950.7\r\nid1594  444.953632  166.985655 11788.4\r\n"), usecols=["a", "b", "d"], dtype={"a": str})  # <- IndexError: list index out of range

pandas 2.1.0rc0

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "Miniconda3\envs\modin\lib\site-packages\pandas\io\parsers\readers.py", line 1393, in read_fwf
    return _read(filepath_or_buffer, kwds)
  File "Miniconda3\envs\modin\lib\site-packages\pandas\io\parsers\readers.py", line 611, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "Miniconda3\envs\modin\lib\site-packages\pandas\io\parsers\readers.py", line 1448, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "Miniconda3\envs\modin\lib\site-packages\pandas\io\parsers\readers.py", line 1723, in _make_engine
    return mapping[engine](f, **self.options)
  File "Miniconda3\envs\modin\lib\site-packages\pandas\io\parsers\python_parser.py", line 1324, in __init__
    PythonParser.__init__(self, f, **kwds)
  File "Miniconda3\envs\modin\lib\site-packages\pandas\io\parsers\python_parser.py", line 162, in __init__
    self._no_thousands_columns = self._set_no_thousand_columns()
  File "Miniconda3\envs\modin\lib\site-packages\pandas\io\parsers\python_parser.py", line 1186, in _set_no_thousand_columns
    and self.columns[i] in self.dtype
IndexError: list index out of range
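
The last frame of the traceback indexes `self.columns[i]`. One plausible reading, which is an assumption and not confirmed in this thread, is that `i` is a position in the full file header while `self.columns` has already been narrowed by `usecols`. The pure-Python sketch below reproduces that kind of mismatch without pandas. Every name in it (`header`, `col_indices`, `filtered_columns`, and both helper functions) is hypothetical and is not the pandas implementation.

```python
# Hypothetical stand-ins for the parser state; these are NOT pandas internals.
header = ["a", "b", "c", "d"]   # every column present in the file
usecols = ["a", "b", "d"]       # columns requested by the caller
dtype = {"a": str}              # per-column dtype mapping

# Positions of the requested columns in the ORIGINAL header: [0, 1, 3]
col_indices = [header.index(name) for name in usecols]
# Column list AFTER applying usecols: ["a", "b", "d"], so only 3 entries
filtered_columns = [header[i] for i in col_indices]


def str_columns_buggy(columns, col_indices, dtype):
    """Index the filtered list with positions taken from the full header."""
    found = set()
    for i in col_indices:
        # Assumed failure mode: i == 3 is past the end of a 3-item list.
        if columns[i] in dtype and dtype[columns[i]] is str:
            found.add(i)
    return found


def str_columns_safe(columns, col_indices, dtype):
    """Pair each original position with its filtered name instead."""
    found = set()
    for i, name in zip(col_indices, columns):
        if name in dtype and dtype[name] is str:
            found.add(i)
    return found


if __name__ == "__main__":
    try:
        str_columns_buggy(filtered_columns, col_indices, dtype)
    except IndexError as exc:
        print("buggy lookup failed:", exc)
    print("safe lookup:", str_columns_safe(filtered_columns, col_indices, dtype))
```

If this reading is right, it would also explain why the error needs both parameters: without `usecols` the two lists have the same length, and without `dtype` the lookup is never attempted.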
@dummerheld

I upgraded to the just-released 2.1.0, and a previously working pd.read_excel call exhibits the same problem.

@phofl phofl added Regression Functionality that used to work in a prior pandas version IO CSV read_csv, to_csv labels Aug 30, 2023
@phofl phofl added this to the 2.1.1 milestone Aug 30, 2023
3 participants