No KeyError in partial indexing of unused label if placeholder is used in (only) some columns #20410

Closed
toobaz opened this issue Mar 19, 2018 · 3 comments · Fixed by #29760
Labels
good first issue Needs Tests Unit test(s) needed to prevent regressions

toobaz commented Mar 19, 2018

Code Sample, a copy-pastable example if possible

In [2]: mi = pd.MultiIndex(levels=[['a_lot', 'onlyone', 'notevenone'], [1970, '']],
   ...:            labels=[[1, 0], [1, 0]])
   ...: df = pd.DataFrame(-1, index=range(3), columns=mi)
   ...:            

In [3]: df
Out[3]: 
  onlyone a_lot
           1970
0      -1    -1
1      -1    -1
2      -1    -1

In [4]: df['notevenone']
Out[4]: 
Empty DataFrame
Columns: []
Index: [0, 1, 2]

Problem description

As in #19027 (though apparently a bit more subtle), a KeyError should be raised: no column uses the label notevenone, so its presence in the first level is only an implementation detail.
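To make the "implementation detail" concrete, here is a minimal sketch that compares the labels stored in the first level with the labels the columns actually use. It assumes a pandas version that accepts `codes=` (the later name for the `labels=` argument used in the sample above); the variable names `stored`, `used`, `unused`, and `trimmed` are illustrative only.

```python
import pandas as pd

# Same columns as in the report, built with `codes=` instead of `labels=`.
mi = pd.MultiIndex(
    levels=[["a_lot", "onlyone", "notevenone"], [1970, ""]],
    codes=[[1, 0], [1, 0]],
)
df = pd.DataFrame(-1, index=range(3), columns=mi)

# Labels stored in the first level versus labels that some column refers to.
stored = list(mi.levels[0])
used = list(mi.get_level_values(0))
unused = [label for label in stored if label not in used]
print("stored:", stored)
print("used:  ", used)
print("unused:", unused)

# remove_unused_levels() returns a MultiIndex whose levels keep only used labels.
trimmed = mi.remove_unused_levels()
print("trimmed level 0:", list(trimmed.levels[0]))
```

Since `notevenone` shows up only in `mi.levels[0]` and never in the column labels, looking it up should behave like looking up any other missing key.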

Expected Output

In [5]: df['reallynot']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/home/pietro/nobackup/repo/pandas/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3011             try:
-> 3012                 return self._engine.get_loc(key)
   3013             except KeyError:

/home/pietro/nobackup/repo/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5729)()

/home/pietro/nobackup/repo/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5575)()

/home/pietro/nobackup/repo/pandas/pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:22215)()

/home/pietro/nobackup/repo/pandas/pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:22169)()

KeyError: 'reallynot'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-5-845e63863b1b> in <module>()
----> 1 df['reallynot']

/home/pietro/nobackup/repo/pandas/pandas/core/frame.py in __getitem__(self, key)
   2511             return self._getitem_frame(key)
   2512         elif is_mi_columns:
-> 2513             return self._getitem_multilevel(key)
   2514         else:
   2515             return self._getitem_column(key)

/home/pietro/nobackup/repo/pandas/pandas/core/frame.py in _getitem_multilevel(self, key)
   2555 
   2556     def _getitem_multilevel(self, key):
-> 2557         loc = self.columns.get_loc(key)
   2558         if isinstance(loc, (slice, Series, np.ndarray, Index)):
   2559             new_columns = self.columns[loc]

/home/pietro/nobackup/repo/pandas/pandas/core/indexes/multi.py in get_loc(self, key, method)
   2227 
   2228         if not isinstance(key, tuple):
-> 2229             loc = self._get_level_indexer(key, level=0)
   2230 
   2231             # _get_level_indexer returns an empty slice if the key has

/home/pietro/nobackup/repo/pandas/pandas/core/indexes/multi.py in _get_level_indexer(self, key, level, indexer)
   2486         else:
   2487 
-> 2488             loc = level_index.get_loc(key)
   2489             if isinstance(loc, slice):
   2490                 return loc

/home/pietro/nobackup/repo/pandas/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3012                 return self._engine.get_loc(key)
   3013             except KeyError:
-> 3014                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   3015 
   3016         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

/home/pietro/nobackup/repo/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5729)()

/home/pietro/nobackup/repo/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5575)()

/home/pietro/nobackup/repo/pandas/pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:22215)()

/home/pietro/nobackup/repo/pandas/pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas/_libs/hashtable.c:22169)()

KeyError: 'reallynot'

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-6-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.23.0.dev0+653.g7273ea070
pytest: 3.0.6
pip: 9.0.1
setuptools: 33.1.1
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.18.1
pyarrow: None
xarray: None
IPython: 5.2.2
sphinx: None
patsy: 0.4.1+dev
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: None
lxml: 3.7.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

jreback commented Mar 20, 2018

yep

jreback added the Bug, Difficulty Intermediate, Indexing (related to indexing on series/frames, not to indexes themselves), and MultiIndex labels on Mar 20, 2018
jreback added this to the Next Major Release milestone on Mar 20, 2018
jreback commented Mar 20, 2018

You have a mixed-type level; maybe that is contributing.

mroeschke commented

Looks like this works on master. Could use a test:

In [318]: df['notevenone']
KeyError: 'notevenone'

In [319]: pd.__version__
Out[319]: '0.26.0.dev0+593.g9d45934af'
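A regression test along these lines might look like the sketch below. It is not the test that was merged: the function name is hypothetical, pytest is assumed as the test runner, and `codes=` is used in place of the older `labels=` argument.

```python
import pandas as pd
import pytest


def test_getitem_unused_level_label_raises():
    # Hypothetical test name; the columns mirror the ones in the report.
    mi = pd.MultiIndex(
        levels=[["a_lot", "onlyone", "notevenone"], [1970, ""]],
        codes=[[1, 0], [1, 0]],
    )
    df = pd.DataFrame(-1, index=range(3), columns=mi)

    # "notevenone" is stored in the first level but no column uses it,
    # so partial indexing should raise instead of returning an empty frame.
    with pytest.raises(KeyError, match="notevenone"):
        df["notevenone"]
```

The mixed second level (`1970` and `''`) is kept on purpose, since the placeholder in only some columns is what the issue title points at.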

mroeschke added the good first issue and Needs Tests (unit test(s) needed to prevent regressions) labels and removed the Bug, Difficulty Intermediate, Indexing (related to indexing on series/frames, not to indexes themselves), and MultiIndex labels on Oct 21, 2019
jreback modified the milestones: Contributions Welcome, 1.0 on Nov 22, 2019