SeriesGroupby.nunique raises an IndexError on empty Series #12553

Closed
Fixman opened this Issue Mar 7, 2016 · 4 comments

Fixman commented Mar 7, 2016

Code Sample, a copy-pastable example if possible

In [18]: b = pandas.Series()

In [19]: g = b.groupby(level = 0)

In [20]: g.nunique()
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-20-fbbfc3108eac> in <module>()
----> 1 g.nunique()

/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in nunique(self, dropna)
   2693 
   2694         out = np.add.reduceat(inc, idx).astype('int64', copy=False)
-> 2695         return Series(out if ids[0] != -1 else out[1:],
   2696                       index=self.grouper.result_index,
   2697                       name=self.name)

IndexError: index 0 is out of bounds for axis 0 with size 0

Expected Output

0

output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-79-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: 1.3.1
pip: 8.0.2
setuptools: 20.1.1
Cython: None
numpy: 1.10.4
scipy: 0.13.3
statsmodels: 0.5.0
IPython: 4.1.1
sphinx: None
patsy: 0.2.1
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: 3.1.1
numexpr: 2.2.2
matplotlib: 1.3.1
openpyxl: 1.7.0
xlrd: 0.9.2
xlwt: 0.7.5
xlsxwriter: None
lxml: None
bs4: 4.2.1
html5lib: 0.999
httplib2: 0.8
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None

Contributor

jreback commented Mar 7, 2016

OK, pull requests to fix this are welcome.

thanasis2028 commented Mar 7, 2016

Is this supposed to return 0 or an empty Series? I am fixing it right now, but returning 0 seems awkward when other calls to nunique() return Series objects.

Contributor

jreback commented Mar 7, 2016

This should be equivalent to the example below: an empty Series should be returned. You should only do that null check (the line where it errors) if the Series has a nonzero length.

In [1]: s = Series([])

In [2]: s.groupby(s.index).sum()
Out[2]: Series([], dtype: float64)
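
The length check suggested above can be sketched on its own, outside pandas. This is a minimal illustration and not necessarily the patch that was merged: `drop_null_group` is a hypothetical helper name, and `out` and `ids` stand in for the arrays of the same names in the traceback (per-group counts and sorted group ids, with -1 marking the null group).

```python
import numpy as np


def drop_null_group(out, ids):
    """Drop the leading count when the first group id is -1 (the null group).

    Hypothetical helper mirroring `out if ids[0] != -1 else out[1:]`,
    with a length check added in front of the `ids[0]` lookup.
    """
    # Only look at ids[0] when there is at least one id. An empty array has
    # no first element, which is what raised the IndexError in the report.
    if len(ids) and ids[0] == -1:
        return out[1:]
    return out


empty = np.array([], dtype="int64")
print(drop_null_group(empty, empty))  # empty input passes through unchanged
print(drop_null_group(np.array([2, 1, 3]), np.array([-1, 0, 1])))  # null group dropped
```

Because `len(ids)` is evaluated first, the `ids[0]` lookup is skipped for empty input and the empty counts array flows through to the Series constructor.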

thanasis2028 commented Mar 7, 2016

Thanks, made a pull request:
#12557
Edit: Output:

>>> reload(pandas)
<module 'pandas' from 'pandas/__init__.pyc'>
>>> pandas.Series().groupby(level = 0).nunique()
Series([], dtype: int64)
>>> pandas.Series([1],[1]).groupby(level = 0).nunique()
1    1
dtype: int64
>>>
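
The behaviour shown in the output above can also be written as a small regression check. This is a sketch that assumes a pandas version containing the fix. The explicit `dtype="float64"` is an addition not present in the original report, used so that constructing the empty Series does not depend on version-specific defaults.

```python
import pandas as pd

# Empty Series grouped by its (empty) index: expect an empty Series back,
# not an IndexError and not a scalar 0.
empty = pd.Series([], dtype="float64")
result = empty.groupby(level=0).nunique()
assert isinstance(result, pd.Series)
assert len(result) == 0

# Non-empty case from the comment above: one group containing one unique value.
single = pd.Series([1], index=[1]).groupby(level=0).nunique()
assert single.loc[1] == 1
```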

@jreback jreback modified the milestones: 0.18.1, 0.18.2 Apr 25, 2016

thanasis2028 added a commit to thanasis2028/pandas that referenced this issue Jun 1, 2016

BUG: Fix for issue #12553
Author:    Thanasis Katsios <thkatsios@gmail.com>

@jorisvandenbossche jorisvandenbossche modified the milestones: Next Major Release, 0.19.0 Sep 1, 2016

mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 17, 2016

mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 27, 2016

mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 30, 2016

mroeschke added a commit to mroeschke/pandas that referenced this issue Nov 30, 2016

BUG: Error upon Series.Groupby.nunique with empty Series (#12553)
Modified tests

simplify tests

Add whatsnew

mroeschke added a commit to mroeschke/pandas that referenced this issue Dec 1, 2016

BUG: Error upon Series.Groupby.nunique with empty Series (#12553)
Modified tests

simplify tests

Add whatsnew

Moved len check

@jreback jreback modified the milestones: 0.19.2, Next Major Release Dec 4, 2016

@jreback jreback closed this in c0e13d1 Dec 4, 2016

jorisvandenbossche added a commit that referenced this issue Dec 15, 2016
