Undocumented behavior change (1.0.5->1.1.0) when using arithmetic operations on dataframes #36702

Closed
2 of 3 tasks
mhaselsteiner opened this issue Sep 28, 2020 · 5 comments · Fixed by #37132
Labels
Error Reporting Incorrect or improved errors from pandas Numeric Operations Arithmetic, Comparison, and Logical operations

Comments

@mhaselsteiner

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

pandas 1.0.5:

(pd.DataFrame({'x':[1,2,],'y':[1,2]})+[pd.Series([1,1]),pd.Series([1,1]) ]).iloc[0,0]
Out[37]: 2

pandas 1.1.0:

>>> (pd.DataFrame({'x':[1,2,],'y':[1,2]})+[pd.Series([1,1]),pd.Series([1,1]) ]).iloc[0,0]
0    2
1    2
dtype: int64

Problem description

When applying arithmetic operations between a DataFrame and a list of Series, pandas 1.0.5 and earlier give different results than pandas 1.1.0 and later. Starting with 1.1.0 we seem to get another level of indexing: although the DataFrame's columns and index return the same values, iterating over it returns different objects.

Since the return type of the chained iloc has now changed from float/int to Series, this breaks functions that use the arithmetic operator like this.

Expected Output

Output of pd.show_versions()

using the latest pandas version:

INSTALLED VERSIONS
------------------
commit : 2a7d332
python : 3.8.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-48-generic
Version : #52-Ubuntu SMP Thu Sep 10 10:58:49 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.1.2
numpy : 1.19.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.3
setuptools : 49.6.0.post20200917
Cython : None
pytest : 6.0.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.18.1
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.2
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

using 1.0.3

INSTALLED VERSIONS
------------------
commit : None
python : 3.8.3.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-48-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.0.3
numpy : 1.18.5
pytz : 2020.1
dateutil : 2.8.1
pip : 20.1.1
setuptools : 47.3.1.post20200616
Cython : None
pytest : 5.4.3
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.15.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.2.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.4.3
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.18
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None
@mhaselsteiner mhaselsteiner added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 28, 2020
@AlexKirko
Member

AlexKirko commented Sep 29, 2020

Confirmed this on the latest commit in master. It seems to me that this change could have been caused by #31296 or #33600, but that is only at first glance: I'm not familiar with these PRs, and we changed a ton of stuff in 1.1.0.

@jbrockmendel , could those PRs have caused this?

@jreback
Contributor

jreback commented Sep 29, 2020

I don't think this was ever anything but accidentally working. How exactly do you align these Series? This is very, very odd. In fact, I would say we should raise here.

This is very different from, say, lists of ndarrays, which we do support because no alignment is involved.
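
To illustrate the "no alignment involved" point, here is a minimal sketch that strips the index from each Series before the addition. It assumes the intent is plain elementwise addition, and it passes a single 2D ndarray rather than a list, so it does not depend on how any particular pandas version treats list operands. The names `values` and `first` are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "y": [1, 2]})
ser = pd.Series([1, 1])

# Drop the Series indexes up front: a plain ndarray carries no labels,
# so pandas has nothing to align and adds position by position.
values = np.array([ser.to_numpy(), ser.to_numpy()])  # shape (2, 2)

result = df + values

# The chained lookup returns a scalar, not a Series.
first = result.iloc[0, 0]
print(type(first), first)
```

Because the operand has the same shape as the frame, the result keeps the frame's index and columns.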

@jbrockmendel
Member

@mhaselsteiner formatting your example would make it clearer to the reader what's going on:

df = pd.DataFrame({'x': [1, 2], 'y': [1, 2]})
ser = pd.Series([1, 1])

result = df + [ser, ser]
result.iloc[0, 0]

Like @jreback said, the solution is to not do this.
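
One possible alternative, sketched under the assumption that each Series is meant to line up with one column of the frame: build a second DataFrame with explicit column labels and let pandas align the two frames on both axes. The name `other` is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "y": [1, 2]})
ser = pd.Series([1, 1])

# State the intended alignment explicitly: one Series per column label.
other = pd.DataFrame({"x": ser, "y": ser})

# DataFrame + DataFrame aligns on index and columns.
result = df.add(other)

print(result.iloc[0, 0])  # a scalar, since every cell holds one number
```

If the Series were instead meant to be rows, the second frame would need to be built differently, which is exactly the ambiguity the list form hides.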

@dsaxton dsaxton added Usage Question and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 29, 2020
@dsaxton dsaxton changed the title BUG: undocumented behavior change (1.0.5->1.1.0) when using arithmetic operations on dataframes Undocumented behavior change (1.0.5->1.1.0) when using arithmetic operations on dataframes Sep 29, 2020
@AlexKirko AlexKirko added DataFrame DataFrame data structure Error Reporting Incorrect or improved errors from pandas Series Series data structure and removed DataFrame DataFrame data structure Error Reporting Incorrect or improved errors from pandas Series Series data structure labels Sep 30, 2020
@AlexKirko
Member

Makes sense that we should raise when the user tries something like this. I'll take a look at how it can be implemented.
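
Until pandas itself raises, user code could guard against this pattern. The sketch below is a hypothetical user-side helper, `add_checked`, and is not how pandas implements or will implement the check; the choice of `TypeError` and the message text are assumptions as well.

```python
import pandas as pd


def add_checked(df, other):
    """Hypothetical helper: add `other` to `df`, refusing lists of Series."""
    if isinstance(other, (list, tuple)) and any(
        isinstance(el, pd.Series) for el in other
    ):
        # A list of Series has no well-defined alignment with a DataFrame.
        raise TypeError(
            "cannot add a list of Series to a DataFrame; "
            "build a DataFrame or ndarray operand explicitly"
        )
    return df + other


df = pd.DataFrame({"x": [1, 2], "y": [1, 2]})
ser = pd.Series([1, 1])

try:
    add_checked(df, [ser, ser])
except TypeError as err:
    message = str(err)
    print(message)

# Scalars (and other unambiguous operands) still pass through.
shifted = add_checked(df, 1)
```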

@AlexKirko
Member

take

@jreback jreback added this to the 1.2 milestone Oct 15, 2020
@jreback jreback added Error Reporting Incorrect or improved errors from pandas Numeric Operations Arithmetic, Comparison, and Logical operations and removed Usage Question labels Oct 15, 2020