SparseSeries.value_counts doesn't include fill_value counts #6749

Closed
bdanalytics opened this Issue Mar 31, 2014 · 3 comments

bdanalytics commented Mar 31, 2014

Based on a suggestion from a user at stackoverflow.com, I am reporting a bug / enhancement request for sparse data frames. Please let me know if you need any more information.

Thanks in advance.

I am encountering a TypeError with a pandas sparse data frame when I use the value_counts method. I have listed the versions of the packages that I am using.

Python 2.7.6 |Anaconda 1.9.1 (x86_64)| (default, Jan 10 2014, 11:23:15)
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

import pandas
print pandas.__version__
0.13.1
import numpy
print numpy.__version__
1.8.0

dense_df = pandas.DataFrame(numpy.zeros((10, 10))
,columns=['x%d' % ix for ix in range(10)])
dense_df['x5'] = [1.0, 0.0, 0.0, 1.0, 2.1, 3.0, 0.0, 0.0, 0.0, 0.0]
print dense_df['x5'].value_counts()
0.0 6
1.0 2
3.0 1
2.1 1
dtype: int64

sparse_df = dense_df.to_sparse(fill_value=0) # Tried fill_value=0.0 also
print sparse_df.density
0.04

print sparse_df['x5'].value_counts()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "//anaconda/lib/python2.7/site-packages/pandas/core/series.py", line 1156, in value_counts
normalize=normalize, bins=bins)
File "//anaconda/lib/python2.7/site-packages/pandas/core/algorithms.py", line 231, in value_counts
values = com._ensure_object(values)
File "generated.pyx", line 112, in pandas.algos.ensure_object (pandas/algos.c:38788)
File "generated.pyx", line 117, in pandas.algos.ensure_object (pandas/algos.c:38695)
File "//anaconda/lib/python2.7/site-packages/pandas/sparse/array.py", line 377, in astype
raise TypeError('Can only support floating point data for now')
TypeError: Can only support floating point data for now

Contributor

jreback commented Mar 31, 2014

OK, will mark this as an enhancement (FYI, I am Jeff!).

A PR would be welcome, as sparse needs some TLC.

@jreback jreback added this to the 0.15.0 milestone Mar 31, 2014

bdanalytics commented Mar 31, 2014

Thanks a lot for all your help, Jeff.

Please let me know if there is any way I can help. I am a bit of a Python newbie, so I am not very confident of getting the pull request right at this point.

Balaji

@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 3, 2015

Member

sinhrks commented Apr 9, 2016

On current master, value_counts no longer raises a TypeError, but the count for fill_value is not included in the result. Changing the title.

pd.SparseSeries([1, 2, 0, 0], fill_value=0).value_counts()
# 2    1
# 1    1
# dtype: int64
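
For reference, here is a rough sketch of the expected behaviour, using only the standard library. It is not the fix that was eventually merged and it does not use any pandas internals. The helper name `sparse_value_counts` and its parameters `sp_values`, `length` and `fill_value` are made up for illustration. The idea is simply that every position not explicitly stored holds the fill value, so its count is the total length minus the number of stored values.

```python
from collections import Counter


def sparse_value_counts(sp_values, length, fill_value):
    """Count values of a sparse vector, including the implicit fill value.

    Hypothetical helper for illustration only (not a pandas API):
      sp_values  -- the explicitly stored values
      length     -- total logical length of the vector
      fill_value -- the value that is left unstored
    """
    counts = Counter(sp_values)
    # Every position that is not stored explicitly holds fill_value.
    n_fill = length - len(sp_values)
    if n_fill > 0:
        counts[fill_value] += n_fill
    return counts


# Same data as the SparseSeries example: [1, 2, 0, 0] with fill_value=0,
# so only 1 and 2 are stored explicitly.
counts = sparse_value_counts(sp_values=[1, 2], length=4, fill_value=0)
for value, count in counts.most_common():
    print("%s %d" % (value, count))
```

With this input the fill value 0 is counted twice, alongside one occurrence each of 1 and 2, which is what the dense equivalent reports.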

@sinhrks sinhrks modified the milestones: 0.18.1, Next Major Release Apr 9, 2016

@sinhrks sinhrks changed the title from sparse data frame value_counts not working to SparseSeries.value_counts doesn't include fill_value counts Apr 9, 2016

@jreback jreback closed this in 8439d28 Apr 29, 2016

nps added a commit to nps/pandas that referenced this issue May 17, 2016

BUG: SparseSeries.value_counts ignores fill_value
closes pandas-dev#6749

Author: sinhrks <sinhrks@gmail.com>

Closes pandas-dev#12835 from sinhrks/sparse_valuecounts and squashes the following commits:

2392e7c [sinhrks] Move dtype handling to algorithm
2f46f73 [sinhrks] BUG: SparseSeries.value_counts ignores fill_value