to_dense does not preserve dtype in SparseArray #10648

Closed
ebolyen opened this Issue Jul 21, 2015 · 3 comments

ebolyen commented Jul 21, 2015

This isn't a huge deal, but it seems a little odd:

In [1]: import pandas as pd

In [2]: a = pd.SparseArray([True, False, False, False, True], fill_value=False, dtype=bool)

In [3]: a
Out[3]: 
[True, False, False, False, True]
Fill: False
IntIndex
Indices: array([0, 4], dtype=int32)

In [4]: a.dtype
Out[4]: dtype('bool')

In [5]: d = a.to_dense()

In [6]: d
Out[6]: array([ 1.,  0.,  0.,  0.,  1.])

In [7]: d.dtype
Out[7]: dtype('float64')

I would have expected d to retain the dtype of bool. I can cast down, but I am still wasting 7 bytes per element in the process.
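
To make the cost of the cast-down workaround concrete, here is a minimal sketch using plain NumPy. It assumes the dense result is the float64 array shown in `Out[6]`; the names `d` and `b` are illustrative only, and no pandas internals are involved.

```python
import numpy as np

# Dense result as reported above: float64 where bool was expected.
d = np.array([1.0, 0.0, 0.0, 0.0, 1.0])

# Workaround: cast back down to bool after densifying.
# astype returns a new array, so the float64 intermediate is still allocated first.
b = d.astype(bool)

print(d.dtype, d.itemsize, d.nbytes)  # float64 8 40
print(b.dtype, b.itemsize, b.nbytes)  # bool 1 5
```

The cast recovers the expected values and dtype, but only after the 8-byte-per-element array has already been built, which is the 7 extra bytes per element mentioned above.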

ebolyen commented Jul 21, 2015

INSTALLED VERSIONS

commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-57-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.16.2
nose: 1.3.7
Cython: 0.22.1
numpy: 1.9.2
scipy: 0.15.1
statsmodels: None
IPython: 3.2.0
sphinx: 1.2.2
patsy: None
dateutil: 2.4.2
pytz: 2015.4
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.4.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None

@jreback jreback added this to the Next Major Release milestone Jul 22, 2015

Contributor

jreback commented Jul 22, 2015

Probably a bug. Sparse is not really well tested and doesn't have a lot of dev support.
Pull requests are welcome.

@jreback jreback changed the title from `to_dense` does not preserve dtype in `SparseArray` to to_dense does not preserve dtype in SparseArray Jul 22, 2015

ebolyen commented Jul 22, 2015

I may be speaking too soon, but after skimming the code for this, it looks pretty simple to fix. I'll work on a PR!

ebolyen added a commit to ebolyen/pandas that referenced this issue Jul 22, 2015

BUG: to_dense now preserves dtype in SparseArray
Also fixes values and get_values.

closes pandas-dev#10648

@jreback jreback modified the milestones: 0.17.0, Next Major Release Jul 24, 2015

@jreback jreback modified the milestones: Next Major Release, 0.17.0 Sep 2, 2015

@jreback jreback modified the milestones: 0.18.1, Next Major Release Apr 3, 2016

@jreback jreback closed this in 2d13410 Apr 3, 2016
