set_index breaks with multi keys but one empty #1971

hayd opened this Issue Sep 26, 2012 · 3 comments

Python for Data member
hayd commented Sep 26, 2012

Migrated from StackOverflow:

from pandas import DataFrame

# Column 'x' is never populated, so it is entirely NaN.
df = DataFrame([
    dict(a=1, p=0),
    dict(a=2, m=10),
    dict(a=3, m=11, p=20),
    dict(a=4, m=12, p=21)
], columns=('a', 'm', 'p', 'x'))
     a     m    p     x
  0  1   NaN    0   NaN
  1  2    10  NaN   NaN
  2  3    11   20   NaN
  3  4    12   21   NaN
# single-column index on an empty column works
# two-column index on non-empty columns works
df.set_index(['a', 'm'])
df.set_index(['a', 'p'])
df.set_index(['m', 'p'])

# but a two-column index including an empty column fails
df.set_index(['a', 'x'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/pandas-0.8.2.dev_f5a74d4_20120725-py2.7-macosx-10.8-x86_64.egg/pandas/core/", line 2328, in set_index
    if verify_integrity and not index.is_unique:
  File "properties.pyx", line 27, in pandas.lib.cache_readonly.__get__ (pandas/src/tseries.c:95395)
  File "/Library/Python/2.7/site-packages/pandas-0.8.2.dev_f5a74d4_20120725-py2.7-macosx-10.8-x86_64.egg/pandas/core/", line 227, in is_unique
    return self._engine.is_unique
  File "engines.pyx", line 186, in pandas.lib.IndexEngine.is_unique.__get__ (pandas/src/tseries.c:115047)
  File "engines.pyx", line 215, in pandas.lib.IndexEngine._do_unique_check (pandas/src/tseries.c:115456)
  File "engines.pyx", line 228, in pandas.lib.IndexEngine._ensure_mapping_populated (pandas/src/tseries.c:115629)
  File "engines.pyx", line 231, in pandas.lib.IndexEngine.initialize (pandas/src/tseries.c:115678)
  File "engines.pyx", line 212, in pandas.lib.IndexEngine._get_index_values (pandas/src/tseries.c:115414)
  File "/Library/Python/2.7/site-packages/pandas-0.8.2.dev_f5a74d4_20120725-py2.7-macosx-10.8-x86_64.egg/pandas/core/", line 247, in <lambda>
    return self._engine_type(lambda: self.values, len(self))
  File "/Library/Python/2.7/site-packages/pandas-0.8.2.dev_f5a74d4_20120725-py2.7-macosx-10.8-x86_64.egg/pandas/core/", line 1363, in values
    for lev, lab in zip(self.levels, self.labels)]
  File "/Library/Python/2.7/site-packages/pandas-0.8.2.dev_f5a74d4_20120725-py2.7-macosx-10.8-x86_64.egg/pandas/core/", line 348, in ndtake
    return arr.take(_ensure_platform_int(indexer), axis=axis, out=out)
IndexError: index -1 is out of bounds for axis 0 with size 0
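
The last frames of the traceback suggest that the index values are rebuilt by taking from each level array using integer labels, and that a label of -1 is being applied to a level of size 0. Below is a minimal pure-Python sketch of that failure mode, assuming an all-NaN column factorizes to an empty level with every label set to -1. The names `factorize`, `naive_values`, `na_aware_values` and `NA_LABEL` are hypothetical helpers written for illustration, not pandas internals, and the NA-aware variant is not necessarily how the actual fix works.

```python
import math

NA_LABEL = -1  # assumed sentinel meaning "missing value"


def is_na(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def factorize(values):
    """Split values into (levels, labels); NA gets NA_LABEL and no level entry."""
    levels, labels, positions = [], [], {}
    for value in values:
        if is_na(value):
            labels.append(NA_LABEL)
            continue
        if value not in positions:
            positions[value] = len(levels)
            levels.append(value)
        labels.append(positions[value])
    return levels, labels


def naive_values(levels, labels):
    """Rebuild values by indexing the level with each label, ignoring NA."""
    return [levels[label] for label in labels]


def na_aware_values(levels, labels):
    """Rebuild values, emitting NaN wherever the label is the NA sentinel."""
    return [float("nan") if label == NA_LABEL else levels[label]
            for label in labels]


nan = float("nan")

# Column 'x' from the report: entirely NaN, so the level is empty.
x_levels, x_labels = factorize([nan, nan, nan, nan])
try:
    naive_values(x_levels, x_labels)
    naive_failed = False
except IndexError:
    # Indexing an empty level with -1 is out of bounds.
    naive_failed = True

x_values = na_aware_values(x_levels, x_labels)

# Column 'm' from the report: one NaN, so the level is non-empty.
m_levels, m_labels = factorize([nan, 10, 11, 12])
m_naive = naive_values(m_levels, m_labels)  # -1 wraps around to the last level
```

In this sketch the partially empty column `m` does not raise, because -1 is a valid (wrap-around) index into a non-empty level, which would explain why only the fully empty column `x` triggers the `IndexError`. Note that the naive rebuild then returns the last level value in place of the missing one.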

wesm closed this in 64e8878 on Sep 27, 2012

wesm commented Sep 27, 2012

Fixed this, though there are other problems with using NA values in a MultiIndex that I can't solve right now.

wesm commented Sep 27, 2012

Indexing with NA values in a MultiIndex pretty much doesn't work at all.

yarikoptic added a commit to neurodebian/pandas that referenced this issue on Sep 27, 2012:
Merge tag 'v0.9.0rc2' into debian
Version 0.9.0 Release Candidate 2

* tag 'v0.9.0rc2':
  DOC: release notes, bump to RC2
  DOC: missed a few for release notes 0.9
  DOC: add a few more notes on bug fixes in release.rst
  BUG: repr fix for all-NA index level. close #1971
  BLD: don't link against math library on windows
  TST: kludge around test failure on win64 python 3.2.2
  BLD: link against math library explicitly. close #1955
  DOC: Add line about resetting to default index
  DOC: Adding details on normalization for variance functions.
  DOC: Specify default merge behavior for on = None
  BUG: PeriodIndex slicing by datetime fails when either end out-of-bounds #1977
  BUG: read_table unicode bug #1975
  BUG: BlockManager.iget fails with non-unique MultiIndex #1970
  Better error message for DataFrame.apply if axis is not 0 or 1
  TST: fix up tzlocal test cases
  DOC: add level option in Series.reset_index to release notes
  ENH: level parameter for Series.reset_index