BUG: IndexError on read_csv/read_table when using usecols/names parameters and omitting last column #5766

wrenoud · 2013-12-23T20:57:58Z

Example code:

from StringIO import StringIO
import pandas as pd

names = ["a","b","c"]

data = """\
0,1,2
3,4,5
6,7,8"""

# usecols works as expected if all columns are named
print pd.read_csv(StringIO(data), header=None, usecols=[1,2], names=names)
print pd.read_csv(StringIO(data), header=None, usecols=[0,1], names=names)

# naming only columns selected with usecols works when last column is included
print pd.read_csv(StringIO(data), header=None, usecols=[1,2], names=names[1:])
# causes IndexError
print pd.read_csv(StringIO(data), header=None, usecols=[0,1], names=names[:-1])

Output:

   b  c
0  1  2
1  4  5
2  7  8

[3 rows x 2 columns]
   a  b
0  0  1
1  3  4
2  6  7

[3 rows x 2 columns]
   b  c
0  1  2
1  4  5
2  7  8

[3 rows x 2 columns]
Traceback (most recent call last):
  File "pandas_test2.py", line 18, in <module>
    print pd.read_csv(StringIO(data), header=None, usecols=[0,1], names=names[:-1])
  File "/home/weston/pandas/pandas/io/parsers.py", line 404, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/weston/pandas/pandas/io/parsers.py", line 212, in _read
    return parser.read()
  File "/home/weston/pandas/pandas/io/parsers.py", line 610, in read
    ret = self._engine.read(nrows)
  File "/home/weston/pandas/pandas/io/parsers.py", line 1050, in read
    data = self._reader.read(nrows)
  File "parser.pyx", line 727, in pandas.parser.TextReader.read (pandas/parser.c:6475)
  File "parser.pyx", line 749, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:6695)
  File "parser.pyx", line 824, in pandas.parser.TextReader._read_rows (pandas/parser.c:7517)
  File "parser.pyx", line 902, in pandas.parser.TextReader._convert_column_data (pandas/parser.c:8296)
  File "parser.pyx", line 1139, in pandas.parser.TextReader._get_column_name (pandas/parser.c:11353)
IndexError: list index out of range

print_versions.py output:

INSTALLED VERSIONS
------------------
Python: 2.7.3.final.0
OS: Linux 3.2.0-51-generic #77-Ubuntu SMP Wed Jul 24 20:21:10 UTC 2013 i686
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8

pandas: 0.13.0rc1-119-g2485e09
Cython: 0.15.1
Numpy: 1.6.1
Scipy: 0.9.0
statsmodels: Not installed
    patsy: Not installed
scikits.timeseries: Not installed
dateutil: 1.5
pytz: 2011k
bottleneck: Not installed
PyTables: Not Installed
    numexpr: Not Installed
matplotlib: 1.1.1rc
openpyxl: Not installed
xlrd: Not installed
xlwt: Not installed
xlsxwriter: Not installed
sqlalchemy: Not installed
lxml: Not installed
bs4: Not installed
html5lib: Not installed
bigquery: Not installed
apiclient: Not installed


This is an issue in read_csv/read_table when there is no header, both usecols and names are assigned, and the last column of the file is not included. The parser raises an IndexError after reaching the last column specified in usecols.
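One way to sidestep the failing path in the meantime might be to let usecols pick the columns and attach the names afterwards, instead of passing a shortened names list. A minimal sketch, written in Python 3 syntax (`io.StringIO` and `print()` rather than the Python 2 forms used in the report); the variable `df` is only for illustration:

```python
from io import StringIO

import pandas as pd

data = """\
0,1,2
3,4,5
6,7,8"""

# Select the first two columns by position without passing names,
# so no shortened names list is handed to the parser.
df = pd.read_csv(StringIO(data), header=None, usecols=[0, 1])

# Attach the desired labels after parsing.
df.columns = ["a", "b"]

print(df)
```

Passing the full names list together with usecols, as in the first two calls of the example, also avoids the error.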

wrenoud · 2013-12-23T23:32:19Z

This originated from the Stack Overflow question "IndexError when trying to read_table with pandas".

wrenoud · 2013-12-24T02:03:28Z

Updated the example to the simplest case.

ghost · 2013-12-25T00:27:34Z

I can repro.
Bisected to d05f3b1 #4406
Looking into it.

ghost · 2013-12-25T02:05:13Z

merged #5770, just in time for christmas. Cheers.
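For anyone checking a build that includes the fix, the failing call from the report can be turned into a small regression check. This is a sketch in Python 3 syntax, not the test that was merged; the `expected` frame is an assumption built by hand from the first two columns of the sample data:

```python
from io import StringIO

import pandas as pd

data = """\
0,1,2
3,4,5
6,7,8"""

# The call that raised IndexError in the report: names covers only the
# selected columns, and the last column of the file is left out.
result = pd.read_csv(StringIO(data), header=None, usecols=[0, 1], names=["a", "b"])

# Hand-built expectation: columns 0 and 1 of the data, labelled a and b.
expected = pd.DataFrame({"a": [0, 3, 6], "b": [1, 4, 7]})

# Raises AssertionError if the parsed frame differs from the expectation.
pd.testing.assert_frame_equal(result, expected)
```

On a build without the fix, the read_csv call itself should fail with the IndexError shown in the traceback before the comparison is reached.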

wrenoud mentioned this issue Dec 24, 2013

BUG: IndexError on read_csv/read_table when using usecols/names parameters and omitting last column #5767

Closed

ghost mentioned this issue Dec 25, 2013

BUG: Fix an issue with the csv cparser when usecols is used #4406

Merged

ghost closed this as completed Dec 25, 2013

