Unstack with mixed dtypes coerces everything to object #11847

Closed
potash opened this issue Dec 15, 2015 · 6 comments
Labels
Bug; Dtype Conversions (unexpected or buggy dtype conversions); Reshaping (concat, merge/join, stack/unstack, explode)
Milestone

Comments

@potash

potash commented Dec 15, 2015

Related to #2929: if I unstack a DataFrame with mixed dtypes, every column gets coerced to object, and I have to recast to get the original dtypes back, which is surprisingly slow (30 seconds for 400k rows and 400 np.float32 columns).

Is there any reason pandas doesn't keep the np.float32 dtype? It supports missing values, so even missing index/column positions shouldn't pose a problem.

@jreback
Contributor

jreback commented Dec 15, 2015

pls show a copy-pastable example

jreback added the Reshaping and Dtype Conversions labels Dec 15, 2015
@potash
Author

potash commented Dec 15, 2015

Ah, looking for an example helped me narrow down the bug. It is specific to passing a list of levels to unstack, even when that list only has a single entry. E.g. compare:

> import pandas as pd
> df = pd.DataFrame({'state':['IL', 'MI'], 'index':['a','a'], 'value1':[1.0,1.0], 'value2':['c','c'] })
> df.set_index(['state','index']).unstack(['index']).dtypes

        index
value1  a        object
value2  a        object
dtype: object

> df.set_index(['state','index']).unstack('index').dtypes
index
value1  a        float64
value2  a         object
dtype: object

So a workaround in my case, with multiple levels, is to replace unstack(['index1', 'index2']) with unstack('index1').unstack('index2'), and I checked that it does work.
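The workaround above can be sketched roughly as follows. This is only an illustration, not the reporter's actual data: the frame is made up, and the level names index1 and index2 simply mirror the placeholder names in the comment. On pandas releases that include the fix, passing a list of levels should preserve dtypes as well.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two index levels to unstack
df = pd.DataFrame({
    "state": ["IL", "IL", "MI", "MI"],
    "index1": ["a", "a", "a", "a"],
    "index2": ["x", "y", "x", "y"],
    "value1": [1.0, 2.0, 3.0, 4.0],
    "value2": ["c", "d", "e", "f"],
})
indexed = df.set_index(["state", "index1", "index2"])

# Workaround: unstack one level at a time instead of passing a list of levels
chained = indexed.unstack("index1").unstack("index2")

# Collect the dtypes of the numeric columns to confirm they were not coerced
value1_dtypes = list(chained["value1"].dtypes)
print(chained.dtypes)
```

Checking value1_dtypes after the call is a cheap way to confirm that the float columns survived the reshape without a recast.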

@jreback
Contributor

jreback commented Dec 15, 2015

so looks like what you want is: #9023

which is almost finished. in fact if you are looking for something to do...could use some updating :)

@jreback
Contributor

jreback commented Dec 15, 2015

i'll mark this as a bug, which may be independent. want to see if you can put in a fix with the existing framework?

@potash
Author

potash commented Dec 15, 2015

Thanks! I will try but I do not use pandas from master and I've never played with the source so it won't be quick.

@kordek
Contributor

kordek commented Aug 20, 2016

Picking this up to take a look

jreback modified the milestones: 0.19.2, Next Major Release on Nov 21, 2016
jorisvandenbossche pushed a commit that referenced this issue Dec 15, 2016
closes #11847

Changed the way in which the original data frame is copied (dropped use of .values, since it does not preserve dtypes).

Author: Pawel Kordek <pawel.kordek@gmail.com>

Closes #14053 from kordek/#11847 and squashes the following commits:

6a381ce [Pawel Kordek] BUG: GH11847 Unstack with mixed dtypes coerces everything to object

(cherry picked from commit d531718)
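
The commit message's point about .values can be illustrated with a small sketch. The patch itself is not shown in this thread, so this only demonstrates the general behavior being described, using a made-up two-column frame modeled on the example earlier in the issue.

```python
import pandas as pd

df = pd.DataFrame({"value1": [1.0, 1.0], "value2": ["c", "c"]})

# .values has to fit every column into a single NumPy array, so a float
# column next to a string column is upcast to object
raw = df.values
print(raw.dtype)

# Rebuilding a frame from that array does not restore float64
rebuilt = pd.DataFrame(raw, index=df.index, columns=df.columns)
print(rebuilt["value1"].dtype)

# Copying the frame itself keeps the per-column dtypes
copied = df.copy()
print(copied["value1"].dtype)
```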