BUG: Inconsistent behavior of index values for DataFrame.apply and Series.apply #36189

Closed
2 of 3 tasks
YarShev opened this issue Sep 7, 2020 · 6 comments · Fixed by #36231
Labels: Apply, Bug, Regression

Comments

@YarShev
Contributor

YarShev commented Sep 7, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

import pandas as pd

pdf = pd.DataFrame([[4, 9]] * 3, columns=['A', 'B'])
pdf.apply(["sum", lambda df: df.sum(), lambda df: df.sum()])  # lambda labels get suffixes _0, _1, ...
#              A   B
# sum         12  27
# <lambda_0>  12  27
# <lambda_1>  12  27

s = pd.Series([4] * 3)
s.apply(["sum", lambda df: df.sum(), lambda df: df.sum()])  # lambda labels get no suffix
# sum         12
# <lambda>    12
# <lambda>    12
# dtype: int64

Problem description

Could anyone explain this, please? Is this expected behavior? Why does the DataFrame result's index carry a numeric suffix for lambdas while the Series result's index does not? The same behavior is observed for agg. This inconsistency does not occur in pandas==1.0.5.
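One way to sidestep the label difference, sketched here under assumptions rather than offered as a fix: pass named functions instead of lambdas, so the labels come from each function's own name. The helpers `total` and `peak` are hypothetical names introduced only for this illustration.

```python
import pandas as pd


def total(values):
    # Hypothetical named replacement for `lambda df: df.sum()`.
    return values.sum()


def peak(values):
    # Hypothetical second aggregation, named so its label is unambiguous.
    return values.max()


pdf = pd.DataFrame([[4, 9]] * 3, columns=["A", "B"])
s = pd.Series([4] * 3)

df_result = pdf.agg(["sum", total, peak])
s_result = s.agg(["sum", total, peak])

# Both results are labelled by function name, so the labels should match.
print(list(df_result.index))
print(list(s_result.index))
```

Because the labels no longer depend on how anonymous functions are named, they should not be affected by whether lambda names are mangled.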

Output of pd.show_versions()

pandas : 1.1.1
numpy : 1.18.4
pytz : 2020.1
dateutil : 2.8.1
pip : 20.1.1
setuptools : 41.2.0
Cython : None
pytest : 5.4.2
hypothesis : None
sphinx : None
blosc : None
feather : 0.4.1
xlsxwriter : None
lxml.etree : 4.5.0
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.14.0
pandas_datareader: None
bs4 : 4.9.1
bottleneck : None
fsspec : 0.7.3
fastparquet : None
gcsfs : None
matplotlib : 3.2.1
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : 0.16.0
pytables : None
pyxlsb : None
s3fs : 0.4.2
scipy : 1.4.1
sqlalchemy : 1.3.17
tables : 3.6.1
tabulate : None
xarray : 0.15.1
xlrd : 1.2.0
xlwt : None
numba : None

YarShev added the Bug and Needs Triage labels Sep 7, 2020
@phofl
Member

phofl commented Sep 8, 2020

Hi,

thanks for your report.

This was introduced in commit 0534b00.

@charlesdong1991 do you remember why this was not added for Series too?

@charlesdong1991
Member

charlesdong1991 commented Sep 8, 2020

In this case, the index shouldn't change for the DataFrame. IMHO the Series behaviour is the correct one, which means no suffix.

I think this line should be moved back to groupby/generic.py, and then all should be good:

func = maybe_mangle_lambdas(func)

I can come up with a small patch for this, hopefully on Thursday when I have some free time. I think the issue can be fixed in 1.1.3.

I just tested a bit, seems okay, set to WIP first in case any other tests fail.
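To illustrate where labels such as <lambda_0> come from, here is a simplified, pandas-free sketch of lambda-name mangling. It is not the actual maybe_mangle_lambdas implementation; `mangle_lambda_names` and `labels` are hypothetical helpers written only for this example.

```python
from functools import partial


def mangle_lambda_names(funcs):
    """Hypothetical helper: give every anonymous function a unique name."""
    if len(funcs) <= 1:
        return list(funcs)
    mangled = []
    counter = 0
    for func in funcs:
        # Strings such as "sum" have no __name__, so they pass through as-is.
        if getattr(func, "__name__", None) == "<lambda>":
            func = partial(func)  # wrap so the original lambda stays untouched
            func.__name__ = f"<lambda_{counter}>"
            counter += 1
        mangled.append(func)
    return mangled


def labels(funcs):
    """Hypothetical helper: the label each entry would get in a result index."""
    return [f if isinstance(f, str) else f.__name__ for f in funcs]


funcs = ["sum", lambda x: sum(x), lambda x: sum(x)]
mangled = mangle_lambda_names(funcs)

print(labels(funcs))    # ['sum', '<lambda>', '<lambda>']
print(labels(mangled))  # ['sum', '<lambda_0>', '<lambda_1>']
```

Applying a step like this only on the groupby path, as suggested above, would leave the plain DataFrame and Series labels as <lambda>.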

@charlesdong1991
Member

take

charlesdong1991 added the Apply and Groupby labels and removed the Needs Triage label Sep 8, 2020
@YarShev
Contributor Author

YarShev commented Sep 9, 2020

@charlesdong1991, thanks a lot!

@simonjayhawkins
Member

> In this case, the index shouldn't change for the DataFrame. IMHO the Series behaviour is the correct one, which means no suffix.

To clarify: the DataFrame index is incorrect and should be the same as in 1.0.5, so this should be labelled as a regression?

simonjayhawkins added the Regression label Sep 9, 2020
simonjayhawkins added this to the 1.1.3 milestone Sep 9, 2020
@charlesdong1991
Member

Yeah, thanks @simonjayhawkins, this should be labelled as a regression!
