KeyError: '__dummy__' for pd.crosstab in pandas #10291

Closed
songhuiming opened this Issue Jun 5, 2015 · 12 comments


I get KeyError: '__dummy__' when I run the following:

import numpy as np
import pandas as pd

np.random.seed(seed=99)
s = np.random.randint(1, 10, 200)
s = pd.Series(np.where(s > 9, np.nan, s))
s1 = s[:100]   # index labels 0..99
s2 = s[100:]   # index labels 100..199
pd.crosstab(s1, s2)

KeyError: '__dummy__'
Contributor

lexual commented Jun 6, 2015

Even simpler example. Perhaps something to do with the indices not overlapping at all.

s1 = pd.Series([1, 2, 3], index=[1, 2, 3])
s2 = pd.Series([4, 5, 6], index=[4, 5, 6])
pd.crosstab(s1, s2)
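
One way to check the non-overlapping-index hypothesis is sketched below. It assumes the intent is to pair the values by position, which the thread does not state; under that assumption, dropping the original index labels lets crosstab line the two Series up.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=[1, 2, 3])
s2 = pd.Series([4, 5, 6], index=[4, 5, 6])

# The two indexes share no labels, so there is nothing to align on.
shared = s1.index.intersection(s2.index)
print(len(shared))

# Assumption: the values are meant to be paired by position.
# Resetting both indexes to 0..n-1 gives crosstab matching labels.
by_position = pd.crosstab(s1.reset_index(drop=True), s2.reset_index(drop=True))
print(by_position)
```

With matching labels the call returns a regular 3x3 table, which is consistent with the error coming from the index mismatch rather than from the values.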
Contributor

lexual commented Jun 6, 2015

Yes, the NA group handling described at http://pandas.pydata.org/pandas-docs/stable/groupby.html#na-group-handling is the cause.

Because the two indexes have no labels in common, aligning the two Series leaves a NaN in every row, and groupby excludes any row whose key is NaN from its result.

You then end up with an empty dataframe, and that is the cause of the KeyError: the code accesses df['__dummy__'] on an empty dataframe.
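
The explanation above can be reproduced outside of crosstab. This is a minimal sketch, not the actual crosstab internals; the column names "row" and "col" are made up for illustration.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=[1, 2, 3])
s2 = pd.Series([4, 5, 6], index=[4, 5, 6])

# Building a frame from both Series aligns them on the union of their
# indexes, so every row is missing one of the two values.
aligned = pd.DataFrame({"row": s1, "col": s2})
rows_with_nan = aligned.isna().any(axis=1)

# groupby drops rows whose key is NaN by default, so grouping on both
# columns leaves no groups at all.
group_sizes = aligned.groupby(["row", "col"]).size()

print(aligned)
print(rows_with_nan.all(), len(group_sizes))
```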

Contributor

jreback commented Jun 7, 2015

yeh, this should just be an empty frame, as there are no cross-tabulations.

jreback added this to the Next Major Release milestone Jun 7, 2015

Contributor

lexual commented Jun 8, 2015

So this is not a bug?

should we:

  • raise exception
  • return an empty dataframe?
Contributor

jreback commented Jun 8, 2015

return an empty frame

I'm getting the same KeyError: '__dummy__' for my grouped data.

And I'm not really sure how to fix it / what you mean by 'return an empty frame.' Care to dumb it down/show precisely what you mean?

Thanks!

Contributor

jreback commented Jan 5, 2016

@dan7davis this needs a fix that would return an empty frame when catching the KeyError exception raised by the example above

https://github.com/pydata/pandas/blob/master/pandas/tools/pivot.py#L151, just need something like:

try:
    table = table[values[0]]
except KeyError:
    pass
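
For anyone who cannot patch pandas itself, the same idea can be sketched on the caller's side. The helper name crosstab_or_empty is hypothetical and not part of pandas. Catching KeyError this broadly could also hide unrelated errors, so treat it as a stopgap rather than a fix.

```python
import pandas as pd


def crosstab_or_empty(index, columns):
    """Hypothetical wrapper: return an empty frame when crosstab has nothing to tabulate."""
    try:
        return pd.crosstab(index, columns)
    except KeyError:
        # Affected pandas versions raise KeyError: '__dummy__' when the
        # inputs share no index labels; report "no cross-tabulations" instead.
        return pd.DataFrame()


s1 = pd.Series([1, 2, 3], index=[1, 2, 3])
s2 = pd.Series([4, 5, 6], index=[4, 5, 6])
result = crosstab_or_empty(s1, s2)
print(result.empty)
```

Versions of pandas that include the fix for this issue should never reach the except branch for this input.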

@jreback problem solved. thank you! really appreciate the alacrity

Contributor

jreback commented Jan 5, 2016

want to do a pull request to fix in master?

I'm (very) new to coding/python/GitHub, so unfortunately I have no idea what that means. But it sounds useful for me to know & helpful for others, so I'd be happy to learn/try.

Contributor

jreback commented Jan 6, 2016

contributing is a great way to learn; see our docs: http://pandas.pydata.org/pandas-docs/stable/contributing.html

any questions, pls ask.

jreback modified the milestone: 0.18.0, Next Major Release Feb 11, 2016

jreback closed this in dcc7cca Feb 12, 2016