BUG/API Raise ValueError for non-unique stack by name #6738

TomAugspurger
Contributor

Closes #6729

We should raise a ValueError when (un)stacking and referring to a non-unique index name. You can still stack and unstack by position like normal (no ambiguity there, unless your names are integers and that integer is duplicated, in which case we'll raise a ValueError).
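The rule described above can be sketched in plain Python. This is only an illustration, not the code in this PR: `resolve_level` is a hypothetical helper, the error messages are made up, and it assumes that a match by name takes precedence over a positional lookup.

```python
def resolve_level(names, level):
    """Map a level reference (a name or a position) to a position.

    Hypothetical helper for illustration; not the pandas implementation.
    """
    names = list(names)
    matches = names.count(level)
    if matches > 1:
        # Ambiguous: the same name labels more than one level.
        # This also covers integer names that are duplicated.
        raise ValueError(
            f"level name {level!r} is not unique; refer to the level by position"
        )
    if matches == 1:
        return names.index(level)
    if isinstance(level, int) and -len(names) <= level < len(names):
        # Not a name, so treat it as a position (negatives count from the end).
        return level % len(names)
    raise KeyError(f"level {level!r} not found")


print(resolve_level(["a", "a", "b"], "b"))  # unique name, resolves to 2
print(resolve_level(["a", "a", "b"], 0))    # position, resolves to 0
```

Calling `resolve_level(["a", "a", "b"], "a")` raises `ValueError`, while the positions `0` and `1` still work for the two levels that share the name.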

Right now I'm assuming that index names are always hashable, but that's not necessarily true, is it? Any good ways to check the uniqueness of the names? I didn't see anything on MultiIndex that did this.



def _reference_duplicate_name(names, name):
"""
Contributor


I would make this a method of MultiIndex

Contributor Author


That's what I was thinking.

I also changed the implementation a bit. My way was buggy since it would raise whenever there were duplicates and a name was in names (even if name wasn't duplicated). But I want it to only raise when name is one of the duplicates.
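The difference between the two versions might look like the following sketch. Both functions are hypothetical reconstructions from the description in this comment, not the code in the diff.

```python
def references_duplicate_buggy(names, name):
    # Flags whenever *any* name is duplicated and `name` is present,
    # even if `name` itself is unique.
    has_duplicates = len(set(names)) != len(names)
    return has_duplicates and name in names


def references_duplicate_fixed(names, name):
    # Flags only when `name` itself appears more than once.
    return list(names).count(name) > 1


names = ["a", "a", "b"]
print(references_duplicate_buggy(names, "b"))  # True: a false positive
print(references_duplicate_fixed(names, "b"))  # False
print(references_duplicate_fixed(names, "a"))  # True
```

Counting occurrences with `list.count` compares by equality, so this version of the check does not rely on the names being hashable.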

@jtratner
Contributor

@TomAugspurger my expectation would be that index names should be hashable. They're supposed to be key-esque, right? (plus tuples are still hashable). If you needed to, I'd catch the TypeError from trying to hash non-hashable object and just warn in that case.
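A minimal sketch of that suggestion, assuming a hash-based count and a fallback that warns and skips the check when a name cannot be hashed. The helper name `name_is_duplicated` and the warning text are hypothetical.

```python
import warnings
from collections import Counter


def name_is_duplicated(names, name):
    """Return True if `name` occurs more than once in `names`.

    Hypothetical helper. If a name cannot be hashed, warn and skip the check.
    """
    try:
        # Counter hashes every element, so unhashable names raise TypeError.
        return Counter(names)[name] > 1
    except TypeError:
        warnings.warn("index names are not all hashable; skipping uniqueness check")
        return False


print(name_is_duplicated(["a", "a", "b"], "a"))            # True
print(name_is_duplicated([("x", 1), ("x", 1)], ("x", 1)))  # True: tuples are hashable
```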

@jreback jreback added this to the 0.14.0 milestone Apr 5, 2014
@jreback
Contributor

jreback commented Apr 5, 2014

@TomAugspurger please rebase. Seems OK otherwise.

@jreback
Contributor

jreback commented Apr 9, 2014

@TomAugspurger this closes #6319 as well?

Should raise a ValueError when (un)stacking a DataFrame
on a nonunique level. Previous behavior was to raise
a KeyError (not deliberately). Closes pandas-dev#6729.
@TomAugspurger
Contributor Author

This only closes #6319 if we don't want to support pivots/crosstabs where the names are the same. But we should be able to support them. I can do a proper fix for #6319 for .14 probably.

I haven't merged this because my pushes to the remote branch aren't showing up here. All I need to do is rebase, so I may just do that as I'm merging. Should just be release notes / v0.14 that have conflicts. Sound OK?

@jreback
Contributor

jreback commented Apr 9, 2014

How do you push to the remote?
Do you have it set as a tracking branch?

@TomAugspurger
Contributor Author

Somewhere along the way I created two branches: origin:unstack-nonunique and origin:origin/unstack-nonunique. This PR is from origin:origin/unstack-nonunique, but when I push to that the changes don't appear.

Making a new PR at #6849

Labels
Error Reporting: Incorrect or improved errors from pandas
Reshaping: Concat, Merge/Join, Stack/Unstack, Explode
Development

Successfully merging this pull request may close these issues.

[API/ERR]: Better error message on unstack with non-unique index names
3 participants