Deprecate fill_value argument for df.add #32608

WillAyd · 2020-03-11T02:20:30Z

This came up in the submission for #13488

This is a documented argument but strangely not universally implemented for all argument types. Both @jbrockmendel and I agreed it is probably better to deprecate and push users towards using something like what @jreback suggested in #13488 (comment)

jorisvandenbossche · 2020-03-11T12:04:36Z

Some questions / possible counter-arguments:

It is implemented for DataFrames (df1.add(df2, fill_value=..). So would we deprecate that as well? (I suppose so, but not fully clear from your description) Then we are deprecating something that works fine and is documented, just because it is not implemented for Series.
The alternative that you link to is df.fillna(0).add(s). But, this is actually not fully equivalent (for better or worse, fill_value will keep the ouput NaN if both left and right are NaN)
You mention add, but I suppose the same goes for the other flexible binary operators (mul, sub, ..)?

TomAugspurger · 2020-03-11T12:18:54Z

+1 to all of Joris' questions.

fill_value will keep the ouput NaN if both left and right are NaN)

FWIW, I've found this behavior useful in the past (if not immediately obvious).

jbrockmendel · 2020-03-11T15:51:00Z

Strength of my opinion: I agree with Will, but wouldn't have brought it up on my own.

FWIW1: We have a total of 6 tests that pass fill_value to any of the DataFrame flex methods.

FWIW2: the place I've found this most troublesome is with complex dtypes, where numpy's behavior doesnt match the builtin complex. (xref #28050)

It is implemented for DataFrames (df1.add(df2, fill_value=..). So would we deprecate that as well?
You mention add, but I suppose the same goes for the other flexible binary operators (mul, sub, ..)?

I think that is the idea, yes, on both.

fill_value will keep the ouput NaN if both left and right are NaN

Do we have an idea how often this is used? I agree that if a user wants this, there isnt a simple "do X instead".

WillAyd · 2020-03-11T23:17:05Z

It is implemented for DataFrames (df1.add(df2, fill_value=..). So would we deprecate that as well? (I suppose so, but not fully clear from your description) Then we are deprecating something that works fine and is documented, just because it is not implemented for Series.

Yea I was thinking across the board deprecation

The alternative that you link to is df.fillna(0).add(s). But, this is actually not fully equivalent (for better or worse, fill_value will keep the ouput NaN if both left and right are NaN)

Maybe @TomAugspurger can offer more insight, but what use cases are there for this? Would have never thought to use this as a quasi-XOR fillna

You mention add, but I suppose the same goes for the other flexible binary operators (mul, sub, ..)?

Yes I think would make sense to do across the board if we wanted to do at all

osvill · 2020-03-18T13:55:53Z

Do we have an idea how often this is used?

I thought I'd jump in as I'm using this feature a lot at the moment. Maybe it is a very special case, but there it is important if operating (add, sub, mul, div) with two dataframes with same index and columns that the result distinguishes whether both values are NaN (resulting in NaN), only one (use fill_value) or neither.

On this topic I found the following behavior curious:
import pandas as pd
df = pd.DataFrame( {'a': [None, 1.0, 2.0], 'b': [None, None, 4.0]})
#THIS WORKS!
df.sub(df.rename(columns={'a':'b', 'b':'a'}), fill_value=0)
#THIS DOESN'T WORK!
df.sub(df['a'], axis=0, fill_value=0)

I know in the second example I subtract a pd.Series but I find it strange.

jbrockmendel · 2020-05-12T23:22:49Z

@WillAyd shoould we add this to the agenda for tomorrow's call?

WillAyd · 2020-05-13T00:39:29Z

Sure! Not a huge sticking point for me but can always be discussed if there is interest. FYI I probably won’t be able to make tomorrow’s call

…

On May 12, 2020, at 4:23 PM, jbrockmendel ***@***.***> wrote: @WillAyd <https://github.com/WillAyd> shoould we add this to the agenda for tomorrow's call? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#32608 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAEU4UPDKFEFOUTGHMI6RITRRHK5NANCNFSM4LFL7DAA>.

jbrockmendel · 2020-09-25T22:37:01Z

Another option is to enable the case that currently raises NotImplementedError (i.e. frame.add(series, fill_value=foo))

One way to do that would be to tile the series to the same shape as the frame and then use the frame path. Easy+cheap for non-EA dtypes. Would be the same for EA with 2D EAs. For EAs we can cobble together a non-copying solution for axis=1, but that doesnt work for axis=0.

WillAyd added the Deprecate Functionality to remove in pandas label Mar 11, 2020

WillAyd added this to the Contributions Welcome milestone Mar 11, 2020

jorisvandenbossche added the Needs Discussion Requires discussion from core team before further action label Mar 11, 2020

This comment has been minimized.

Sign in to view

jbrockmendel added the Numeric Operations Arithmetic, Comparison, and Logical operations label Sep 17, 2020

mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022

mroeschke mentioned this issue May 11, 2023

ENH: df.add fill_value NotImplementedError #13488

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deprecate fill_value argument for df.add #32608

Deprecate fill_value argument for df.add #32608

WillAyd commented Mar 11, 2020

jorisvandenbossche commented Mar 11, 2020

TomAugspurger commented Mar 11, 2020

jbrockmendel commented Mar 11, 2020

WillAyd commented Mar 11, 2020

This comment has been minimized.

osvill commented Mar 18, 2020

jbrockmendel commented May 12, 2020

WillAyd commented May 13, 2020 via email

jbrockmendel commented Sep 25, 2020

Deprecate fill_value argument for df.add #32608

Deprecate fill_value argument for df.add #32608

Comments

WillAyd commented Mar 11, 2020

jorisvandenbossche commented Mar 11, 2020

TomAugspurger commented Mar 11, 2020

jbrockmendel commented Mar 11, 2020

WillAyd commented Mar 11, 2020

This comment has been minimized.

osvill commented Mar 18, 2020

jbrockmendel commented May 12, 2020

WillAyd commented May 13, 2020 via email

jbrockmendel commented Sep 25, 2020