BUG: reindex(columns=..) after get_dummies raises TypeError: values must be SparseArray #18914

Closed
ShadowGiraffe opened this Issue Dec 23, 2017 · 4 comments

ShadowGiraffe commented Dec 23, 2017

Code Sample

import pandas as pd

df = pd.DataFrame.from_items([('GDP', [1, 2]), ('Nation', ['AB', 'CD'])])
df = pd.get_dummies(df, columns=['Nation'], sparse=True)  # returns a SparseDataFrame
df.reindex(columns=['GDP'])  # fails with the TypeError below

TypeError: values must be SparseArray

Problem description

I'm doing a pandas upgrade from 0.19.x to 0.21.x for my project. The above code works under 0.19.x, but not under 0.21.x.

Contributor

jreback commented Dec 23, 2017

hmm that does look buggy.

cc @Licht-T

getitem works ok here

In [10]: df[['GDP']]
Out[10]: 
   GDP
0    1
1    2
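For anyone hitting this before a fix lands, the getitem path above can serve as a stopgap when every requested column already exists. The following is only a minimal sketch: it builds the frame from a dict instead of `from_items` (an assumption made so it also runs on newer pandas), and the names `dummies` and `selected` are illustrative.

```python
import pandas as pd

# Same data as the report, built from a dict so it runs on newer pandas too.
df = pd.DataFrame({"GDP": [1, 2], "Nation": ["AB", "CD"]})
dummies = pd.get_dummies(df, columns=["Nation"], sparse=True)

# Select the column through __getitem__ rather than reindex(columns=...).
selected = dummies[["GDP"]]
print(selected)
```

Unlike `reindex`, this raises a `KeyError` if a requested column is missing, so it is a substitute only when all columns are known to exist.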

@jreback jreback added this to the Next Major Release milestone Dec 23, 2017

Contributor

jreback commented Dec 23, 2017

@ShadowGiraffe an investigation / PR would be welcome

@jreback jreback changed the title from [BUG] reindex(columns=..) after get_dummies raises TypeError: values must be SparseArray to BUG: reindex(columns=..) after get_dummies raises TypeError: values must be SparseArray Dec 23, 2017

Contributor

hexgnu commented Dec 24, 2017

I think I figured out the problem. Inside get_dummies, not all columns are cast to sparse, yet all of them are marked as sparse, which causes problems down the line when trying to reindex.

Added a PR for that fix.
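One way to see which columns are actually stored sparsely is to inspect the per-column dtypes of the `get_dummies` result. This is a rough sketch and not the code from the PR: it assumes a newer pandas (0.24 or later) where `pd.SparseDtype` exists and `get_dummies(sparse=True)` returns a regular DataFrame with sparse columns, rather than the `SparseDataFrame` of 0.21. The name `is_sparse` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"GDP": [1, 2], "Nation": ["AB", "CD"]})
dummies = pd.get_dummies(df, columns=["Nation"], sparse=True)

# Map each column name to whether its dtype is a sparse dtype.
is_sparse = {
    col: isinstance(dtype, pd.SparseDtype)
    for col, dtype in dummies.dtypes.items()
}
print(is_sparse)
```

In this setup the untouched `GDP` column stays dense and only the generated dummy columns are sparse, which is the kind of mix described above.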

@jreback jreback modified the milestones: Next Major Release, 0.23.0 Jan 1, 2018

summerela commented Apr 29, 2018

I just ran into this issue with the latest version of pandas. Please let me know if you would like me to post my code.
