Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Series.combine with fill_value gives unexpected results #31142

Open
jamesharrop opened this issue Jan 19, 2020 · 7 comments
Open

Series.combine with fill_value gives unexpected results #31142

jamesharrop opened this issue Jan 19, 2020 · 7 comments
Labels

Comments

@jamesharrop
Copy link

Code Sample, a copy-pastable example if possible

import pandas as pd

a = pd.Series([1, 2, 3], index=["a", "b", "c"])
b = pd.Series([10, 20, 30], index=[0, "e", "f"])
c = b.combine(a, lambda x,y: x+y, fill_value=0)
print(c)

Problem description

This gives the output:
0 11
a 1
b 2
c 3
e 20
f 30
dtype: int64

Expected Output

0 10
a 1
b 2
c 3
e 20
f 30
dtype: int64

With higher numbers in the index it gives expected output i.e.:

a = pd.Series([1, 2, 3], index=["a", "b", "c"])
b = pd.Series([10, 20, 30], index=[4, "e", "f"])
c = b.combine(a, lambda x,y: x+y, fill_value=0)
print(c)

4 10
a 1
b 2
c 3
e 20
f 30
dtype: int64

Output of pd.show_versions()

INSTALLED VERSIONS ------------------ commit : None python : 3.7.6.final.0 python-bits : 64 OS : Windows OS-release : 10 machine : AMD64 processor : Intel64 Family 6 Model 142 Stepping 10, GenuineIntel byteorder : little LC_ALL : None LANG : None LOCALE : None.None

pandas : 0.25.3
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 19.3.1
setuptools : 44.0.0.post20200106
Cython : 0.29.14
pytest : 5.3.2
hypothesis : 4.54.2
sphinx : 2.3.1
blosc : None
feather : None
xlsxwriter : 1.2.7
lxml.etree : 4.4.2
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.11.1
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.3.1
fastparquet : None
gcsfs : None
lxml.etree : 4.4.2
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.2
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.2
sqlalchemy : 1.3.12
tables : 3.6.1
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.7

@MarcoGorelli
Copy link
Member

Thanks for the report!

Can confirm this reproduces on master, are you interesting in investigating why this might be happening / submitting a pull request?

@jamesharrop
Copy link
Author

Thanks for your reply. Yes I'll go ahead and investigate it.

@jamesharrop
Copy link
Author

It looks like:
def combine(self, other, func, fill_value=None) -> "Series"
checks for key values by using:
def get(self, key, default=None)
which uses self[key]
which will accept an index value instead of a key and unexpectedly return the item at that index.

I could submit a pull request changing it to self.loc[key] but I don't know if that would cause problems elsewhere?

@MarcoGorelli
Copy link
Member

I don't know if that would cause problems elsewhere?

I haven't had a look at this yet so I don't know, but to check locally whether this'll break any existing functionality you can run

pytest pandas

@jamesharrop
Copy link
Author

Thanks, the tests produce about 20 errors, at least some of them because they expect .get(key) to accept index values. I will think about other options

@jamesharrop
Copy link
Author

Could add another get function which uses .loc which is stricter about accepting keys only, this does work without errors locally but is not very elegant. I could submit as a pull request for review

@MarcoGorelli
Copy link
Member

I could submit as a pull request for review

Go ahead :) Please refer to the contributing guide if you're unsure how to get started

@simonjayhawkins simonjayhawkins added this to the Contributions Welcome milestone Apr 24, 2020
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants