BUG: Crosstab margins ignoring dropna #12577

Closed
nickeubank opened this issue Mar 9, 2016 · 4 comments
Labels
Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Reshaping Concat, Merge/Join, Stack/Unstack, Explode
@nickeubank
Contributor

crosstab also has a bug -- with margins=True, it counts rows containing np.nan in the margin totals even when dropna=True (the default).

Appears independent of #12569 and #4003

import numpy as np
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 2, 2, 2, np.nan], 'b': [3, 3, 4, 4, 4, 4]})
pd.crosstab(df.a, df.b, margins=True)  # dropna is left at its default, True
Out[233]: 
b    3  4  All
a             
1.0  1  0    1
2.0  1  3    4
All  2  4    6

Expected Output

Out[233]: 
b    3  4  All
a             
1.0  1  0    1
2.0  1  3    4
All  2  3    5
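
For comparison, here is a minimal sketch of one way to obtain the expected margins by hand: drop the rows with missing values first, then tabulate. This is not from the report itself; the names `clean` and `expected` are introduced here only for illustration, and it assumes the same `df` as above.

```python
import numpy as np
import pandas as pd

# Same frame as in the report: one row has a missing value in column 'a'.
df = pd.DataFrame({'a': [1, 2, 2, 2, 2, np.nan], 'b': [3, 3, 4, 4, 4, 4]})

# Hypothetical workaround: remove incomplete rows before tabulating,
# so the margins can only be computed from the rows that are shown.
clean = df.dropna(subset=['a', 'b'])
expected = pd.crosstab(clean.a, clean.b, margins=True)

# The 'All' row should read 2, 3, 5, matching the expected output above.
print(expected)
```

Dropping the rows up front sidesteps the question of how the margins are computed, since the NaN row never reaches crosstab.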

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.4.4.final.0
python-bits: 64
OS: Darwin
OS-release: 15.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: 1.3.7
pip: 8.1.0
setuptools: 20.2.2
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.16.1
statsmodels: None
IPython: 4.0.1
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.4.3
openpyxl: 2.2.6
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.7.7
lxml: 3.4.4
bs4: 4.4.1
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None
Jinja2: 2.8

@jreback jreback added Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Reshaping Concat, Merge/Join, Stack/Unstack, Explode Difficulty Novice labels Mar 9, 2016
@jreback jreback added this to the 0.18.1 milestone Mar 9, 2016
@sakulkar

Shouldn't the expected output be?

Out[233]: 
b    3  4  All
a             
1.0  1  0    1
2.0  1  3    4
All  2  3    5

OXPHOS added a commit to OXPHOS/pandas that referenced this issue Mar 14, 2016
To fix bug pandas-dev#12577: Crosstab margins ignoring dropna
@nickeubank
Contributor Author

@sakulkar how is that different from the issue report? Looks like we're saying the same thing.

@jreback
Contributor

jreback commented Mar 14, 2016

I edited the top @nickeubank (it was incorrect)

@nickeubank
Contributor Author

Oops thanks
