pivot_table drops columns for aggregate functions that return all None, even if dropna=False #22159

Closed
tompollard opened this Issue Aug 1, 2018 · 0 comments

@tompollard commented Aug 1, 2018

pivot_table drops the columns produced by an aggregate function that returns None for every group, even if dropna=False. I'm using pandas 0.23.3; the full output of pd.show_versions() is posted at the end of this comment. #21969 seems to be related.

Code Sample

import pandas as pd
# tested with pd.__version__ == '0.23.3'

# create sample dataframe to be pivoted
df = pd.DataFrame({'fruit': ['apple', 'orange', 'peach', 'apple', 'peach'],
                   'size': [1, 1.5, 1, 2, 1],
                   'taste': [7, 8, 6, 6, 10]})

# create three aggregate functions, one which returns only None
def return_one(x): 
    return 1

def return_sum(x):
    return sum(x)

def return_none(x):
    return None 

# reshape the dataframe
df2 = pd.pivot_table(df, columns='fruit',
                     aggfunc=[return_sum, return_none, return_one],
                     dropna=False)

# df2 does not include the output 
# of return_none, which is unexpected

#      return_sum              return_one             
# fruit      apple orange peach      apple orange peach
# size         3.0    1.5   2.0        1.0    1.0   1.0
# taste       13.0    8.0  16.0        1.0    1.0   1.0

Problem description

The dropna argument of pivot_table is True by default, so by default the output of pivot_table does not include "columns whose entries are all NaN".

If dropna is set to False, I would expect the resulting dataframe to include columns whose entries are all NaN (or None). This is not the case. As shown in the example, the columns are dropped even if dropna=False.

Expected Output

      return_sum             return_none              return_one             
fruit      apple orange peach      apple orange peach      apple orange peach
size         3.0    1.5   2.0       None   None  None       1.0    1.0   1.0
taste       13.0    8.0  16.0       None   None  None       1.0    1.0   1.0
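
Possible workaround (a sketch only, not a fix for the underlying behaviour): if the aggregate function names and the fruit categories are known up front, the missing block can be put back by reindexing the columns of the pivoted result. The helper name restore_missing_columns is hypothetical, and df2 below is rebuilt by hand from the output printed above rather than produced by pivot_table, so the snippet does not depend on how a given pandas version treats return_none. Note that the restored cells are NaN, not None.

```python
import pandas as pd

# Rebuild the pivoted frame as printed in the report: the return_none block is absent.
observed_cols = pd.MultiIndex.from_product(
    [["return_sum", "return_one"], ["apple", "orange", "peach"]],
    names=[None, "fruit"],
)
df2 = pd.DataFrame(
    [[3.0, 1.5, 2.0, 1.0, 1.0, 1.0],
     [13.0, 8.0, 16.0, 1.0, 1.0, 1.0]],
    index=["size", "taste"],
    columns=observed_cols,
)


def restore_missing_columns(pivoted, func_names, categories):
    """Hypothetical helper: make sure every (function, category) column exists.

    Columns missing from `pivoted` are added and filled with NaN.
    """
    full = pd.MultiIndex.from_product(
        [func_names, categories], names=list(pivoted.columns.names)
    )
    return pivoted.reindex(columns=full)


restored = restore_missing_columns(
    df2,
    ["return_sum", "return_none", "return_one"],
    ["apple", "orange", "peach"],
)
print(restored)
```

This only restores the shape of the expected output; it assumes the caller can list the function names and categories, which pivot_table itself would normally infer.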

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.23.3
pytest: 3.3.2
pip: 18.0
setuptools: 39.0.1
Cython: 0.28.4
numpy: 1.14.0
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.7
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: 1.0.5
lxml: None
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.6
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
