Groupby level fails to enumerate groups #15155

Closed
watercrossing opened this Issue Jan 18, 2017 · 1 comment

watercrossing commented Jan 18, 2017 edited

Code Sample

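import numpy as np
import pandas as pd

# 10 rows; level "Index1" is a CategoricalIndex ("a" for rows 0-4, "b" for rows 5-9).
# Note: the `labels` argument is spelled `codes` from pandas 0.24 onwards.
test = pd.DataFrame(data=np.arange(2, 22, 2),
                    index=pd.MultiIndex(levels=[pd.CategoricalIndex(["a", "b"]), range(10)],
                                        labels=[[0]*5 + [1]*5, range(10)],
                                        names=["Index1", "Index2"]))
testGroupedBy = test.groupby(level=["Index1"])
testGroupedBy.get_group("a")  # raises AttributeError on pandas 0.19.2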

Problem description

Instead of returning the rows for group "a" as expected, get_group raises an AttributeError:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-1466-65ba7dd9f763> in <module>()
----> 1 testGroupedBy.get_group("a")

python2.7/site-packages/pandas/core/groupby.pyc in get_group(self, name, obj)
    614             obj = self._selected_obj
    615 
--> 616         inds = self._get_index(name)
    617         if not len(inds):
    618             raise KeyError(name)

python2.7/site-packages/pandas/core/groupby.pyc in _get_index(self, name)
    461     def _get_index(self, name):
    462         """ safe get index, translate keys for datelike to underlying repr """
--> 463         return self._get_indices([name])[0]
    464 
    465     @cache_readonly

python2.7/site-packages/pandas/core/groupby.pyc in _get_indices(self, names)
    428             return []
    429 
--> 430         if len(self.indices) > 0:
    431             index_sample = next(iter(self.indices))
    432         else:

python2.7/site-packages/pandas/core/groupby.pyc in __getattr__(self, attr)
    527 
    528         raise AttributeError("%r object has no attribute %r" %
--> 529                              (type(self).__name__, attr))
    530 
    531     plot = property(GroupByPlot)

AttributeError: 'DataFrameGroupBy' object has no attribute 'indices'

Expected Output

An alternative way to get this output is test.loc["a"], which yields the expected output below:

         0
Index2    
0        2
1        4
2        6
3        8
4       10
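
For reference, a minimal sketch of the expected behaviour on a pandas release that includes the fix. It is a variation on the code sample above, not the original: the index is built with MultiIndex.from_arrays instead of levels/labels, the level is passed as a plain string, and observed=True is added so that only categories present in the data are grouped. Those three choices are assumptions made to keep the example version-tolerant.

```python
import numpy as np
import pandas as pd

# Same data as the report: values 2, 4, ..., 20 with a categorical first level.
index = pd.MultiIndex.from_arrays(
    [pd.Categorical(["a"] * 5 + ["b"] * 5), range(10)],
    names=["Index1", "Index2"],
)
test = pd.DataFrame(data=np.arange(2, 22, 2), index=index)

# Group on the categorical level and enumerate / fetch groups.
grouped = test.groupby(level="Index1", observed=True)
group_names = sorted(grouped.groups)   # names of the groups found
group_a = grouped.get_group("a")       # rows whose Index1 is "a"

print(group_names)
print(group_a)
```

Unlike test.loc["a"], get_group keeps the full MultiIndex on the returned rows, so the two results carry the same values but differ in their index.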

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-431.11.2.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_GB.utf8
LANG: en_GB.utf8
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 32.3.1
Cython: None
numpy: 1.11.3
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: None
pandas_datareader: None

jreback commented Jan 18, 2017

This is a bug, and the real cause is hidden: the method below is missing on CategoricalIndex. Would you like to do a PR which adds it (and tests, of course)?

diff --git a/pandas/indexes/category.py b/pandas/indexes/category.py
index 2c89f72..e3ffa40 100644
--- a/pandas/indexes/category.py
+++ b/pandas/indexes/category.py
@@ -255,6 +255,9 @@ class CategoricalIndex(Index, base.PandasDelegate):
     def ordered(self):
         return self._data.ordered
 
+    def _reverse_indexer(self):
+        return self._data._reverse_indexer()
+
     def __contains__(self, key):
         hash(key)
         return key in self.values
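
The traceback suggests why the real cause is hidden: in Python, when a property getter raises AttributeError internally and the class also defines __getattr__, the fallback __getattr__ is called with the property's name, so the reported error names the property rather than the missing inner method. A minimal sketch of that mechanism follows; Grouper and FakeGroupBy are hypothetical stand-ins, not pandas classes, and this is only an illustration of how the masking appears to arise.

```python
class Grouper:
    """Hypothetical stand-in for an index type that lacks a helper method."""


class FakeGroupBy:
    """Hypothetical stand-in for a GroupBy object with a __getattr__ fallback."""

    def __init__(self, grouper):
        self.grouper = grouper

    @property
    def indices(self):
        # Raises AttributeError internally: Grouper has no _reverse_indexer.
        return self.grouper._reverse_indexer()

    def __getattr__(self, attr):
        # Called only after normal lookup fails, which includes the case
        # where a property getter itself raised AttributeError.
        raise AttributeError("%r object has no attribute %r" %
                             (type(self).__name__, attr))


try:
    FakeGroupBy(Grouper()).indices
except AttributeError as exc:
    message = str(exc)

print(message)
```

The printed message mentions 'indices' only; the original complaint about _reverse_indexer is lost, which matches the shape of the error in the report.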

jreback added this to the 0.20.0 milestone Jan 18, 2017

jreback closed this in 4c65d5f Jan 19, 2017

AnkurDedania added a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017

watercrossing + AnkurDedania: Bug in groupby.get_group on categoricalindex
closes #15155

Author: watercrossing <ingolf.becker@googlemail.com>

Closes #15163 from watercrossing/indexgroup and squashes the following commits:

742d4a5 [watercrossing] BUG: GroupBy.get_group failing with a categorical grouper (#15155)
a8f9ec5