SeriesGroupby.nunique raises an IndexError on empty Series #12553

Closed
Fixman opened this Issue Mar 7, 2016 · 4 comments

Fixman commented Mar 7, 2016

Code Sample, a copy-pastable example if possible

In [18]: b = pandas.Series()

In [19]: g = b.groupby(level = 0)

In [20]: g.nunique()
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-20-fbbfc3108eac> in <module>()
----> 1 g.nunique()

/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in nunique(self, dropna)
   2693 
   2694         out = np.add.reduceat(inc, idx).astype('int64', copy=False)
-> 2695         return Series(out if ids[0] != -1 else out[1:],
   2696                       index=self.grouper.result_index,
   2697                       name=self.name)

IndexError: index 0 is out of bounds for axis 0 with size 0

Expected Output

0

output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-79-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: 1.3.1
pip: 8.0.2
setuptools: 20.1.1
Cython: None
numpy: 1.10.4
scipy: 0.13.3
statsmodels: 0.5.0
IPython: 4.1.1
sphinx: None
patsy: 0.2.1
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: 3.1.1
numexpr: 2.2.2
matplotlib: 1.3.1
openpyxl: 1.7.0
xlrd: 0.9.2
xlwt: 0.7.5
xlsxwriter: None
lxml: None
bs4: 4.2.1
html5lib: 0.999
httplib2: 0.8
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None
Contributor

jreback commented Mar 7, 2016

ok, pull-requests are welcome to fix

jreback added this to the 0.18.1 milestone Mar 7, 2016

Is this supposed to return 0 or an empty Series? I am fixing it right now, but returning 0 seems awkward when other calls to nunique() return Series objects.

Contributor

jreback commented Mar 7, 2016

This is equivalent to the example below: an empty Series should be returned. You should only do the null check (the line where it errors) if the Series has a nonzero length.

In [1]: s = Series([])

In [2]: s.groupby(s.index).sum()
Out[2]: Series([], dtype: float64)
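
A minimal sketch of the length guard described above, not the code from the eventual fix. `drop_null_group` is a hypothetical helper, and plain lists stand in for the NumPy arrays `out` and `ids` seen in the traceback; the assumption (taken from the failing line) is that a leading id of -1 marks the null group.

```python
def drop_null_group(out, ids):
    """Hypothetical helper mirroring the failing line in the traceback.

    Drops the first count when the first group id is -1 (the null group).
    The len(ids) check runs first, so ids[0] is never read on empty input.
    """
    if len(ids) and ids[0] == -1:
        return out[1:]
    return out


# Empty input: no IndexError, the empty result passes straight through.
empty = drop_null_group([], [])

# Leading -1 id: the count for the null group is dropped.
with_null = drop_null_group([2, 1, 3], [-1, 0, 1])

# No null group: the counts are returned unchanged.
no_null = drop_null_group([1, 3], [0, 1])
```

The short-circuiting `and` is what keeps the empty case safe: `ids[0]` is only evaluated once `len(ids)` is known to be nonzero.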

Thanks, made a pull request:
pydata#12557
Edit: Output:

>>> reload(pandas)
<module 'pandas' from 'pandas/__init__.pyc'>
>>> pandas.Series().groupby(level = 0).nunique()
Series([], dtype: int64)
>>> pandas.Series([1],[1]).groupby(level = 0).nunique()
1    1
dtype: int64
>>>

jreback modified the milestone: 0.18.1, 0.18.2 Apr 25, 2016

thanasis2028 added a commit to thanasis2028/pandas that referenced this issue Jun 1, 2016

thanasis2028 BUG: Fix for issue #12553
Author:    Thanasis Katsios <thkatsios@gmail.com>
744e537

mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 27, 2016

mroeschke BUG: Error upon Series.Groupby.nunique with empty Series (#12553)
Modified tests
4185fcc

mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 30, 2016

mroeschke BUG: Error upon Series.Groupby.nunique with empty Series (#12553)
Modified tests

simplify tests
7214e2e

mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 30, 2016

mroeschke BUG: Error upon Series.Groupby.nunique with empty Series (#12553)
Modified tests

simplify tests

Add whatsnew
ed12ad0

mroeschke added a commit to mroeschke/pandas that referenced this issue Dec 1, 2016

mroeschke BUG: Error upon Series.Groupby.nunique with empty Series (#12553)
Modified tests

simplify tests

Add whatsnew

Moved len check
40505bb

jreback modified the milestone: 0.19.2, Next Major Release Dec 4, 2016

jreback closed this in c0e13d1 Dec 4, 2016

jorisvandenbossche added a commit that referenced this issue Dec 15, 2016

mroeschke + jorisvandenbossche BUG: Bug upon Series.Groupby.nunique with empty Series
closes #12553
closes #14770

(cherry picked from commit c0e13d1)
36dad84