BUG: negating pd.Na and None #42862

Open
2 of 3 tasks
attack68 opened this issue Aug 3, 2021 · 8 comments
Labels
Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Numeric Operations Arithmetic, Comparison, and Logical operations

Comments

@attack68
Contributor

attack68 commented Aug 3, 2021

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of pandas.
  • I have confirmed this bug exists on the master branch of pandas.

I don't really know how to describe this, or whether it is expected, but the behaviour seems strange to me:

>>> df = pd.DataFrame({"a": [pd.NA, -1, None], "b": [np.nan, -1, 1]})
>>> print(df)
      a     b
0  <NA>   NaN
1    -1  -1.0
2  None   1.0

>>> print(df.dtypes)
a    object
b   float64

>>> print(df * -1)
     a    b
0  NaN  NaN
1    1  1.0
2  NaN -1.0

>>> print((df * -1).dtypes)
a    object
b   float64

>>> print(-df)
TypeError: bad operand type for unary -: 'NoneType'

I guess there are two things here:

  1. (-df) is not the same as (df * -1)
  2. the operation keeps the dtype as object, but <NA> and None are both displayed as NaN
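
A minimal sketch of where the TypeError may come from, assuming that unary negation on an object column ends up negating each stored Python object in turn (an assumption about the internals, not something confirmed in this thread). The helper `try_negate` is hypothetical and exists only for this illustration; the values are those of column "a" above.

```python
import operator

import pandas as pd

# Values taken from column "a" of the DataFrame in the report.
values = [pd.NA, -1, None]


def try_negate(value):
    """Return the negated value, or the exception name if negation fails.

    Hypothetical helper, used only for this illustration.
    """
    try:
        return operator.neg(value)
    except TypeError as exc:
        return type(exc).__name__


# Negate each element on its own, as plain Python objects.
results = [try_negate(v) for v in values]
print(results)
```

Only None fails here: pd.NA and -1 both support unary minus, which matches the 'NoneType' named in the error message.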
@attack68 attack68 added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Aug 3, 2021
@KrishnaSai2020
Contributor

KrishnaSai2020 commented Aug 3, 2021

With df * -1 it defaults to NaN, which is expected since None * -1 is invalid, but surely -df should do the same. Maybe because there's no * sign, DataFrame.multiply doesn't register it, so Python kicks in, and since multiplying None by -1 is invalid in Python, it raises an error.

@attack68
Contributor Author

attack68 commented Aug 3, 2021

With df * -1 it defaults to NaN, which is expected since None * -1 is invalid, but surely -df should do the same. Maybe because there's no * sign, DataFrame.multiply doesn't register it, so Python kicks in, and since multiplying None by -1 is invalid in Python, it raises an error.

ok, but why is pd.NA * -1 = np.nan and what is the value of -pd.NA?

@KrishnaSai2020
Contributor

KrishnaSai2020 commented Aug 3, 2021

For consistency, maybe, i.e. to make all values floats, since pd.NA stores an integer value and np.nan registers as a float. As for -pd.NA, it comes up as <NA>, which it should.

@KrishnaSai2020
Contributor

It seems strange but makes sense. Sort of anyway.

@mzeitlin11
Member

I think currently pd.NA in object type data is just not well-defined (xref #33066 and #32931). In general, I think the current behavior for nullable types makes sense under the semantics that pd.NA just signifies an unknown value:

In [4]: ser = pd.Series([1, None], dtype="Int64")

In [5]: ser * -1
Out[5]:
0      -1
1    <NA>
dtype: Int64

So IMO object type data should match this and just propagate NA
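
For comparison, a short sketch of the same check on the nullable dtype from the example above: with Int64, unary negation and multiplication by -1 should give the same result, with the missing entry left as <NA>. The variable names `negated` and `multiplied` are chosen here for illustration only.

```python
import pandas as pd

# Same nullable-integer Series as in the example above.
ser = pd.Series([1, None], dtype="Int64")

negated = -ser          # unary minus
multiplied = ser * -1   # explicit multiplication

print(negated)
print(negated.equals(multiplied))
```

This is the behaviour the object-dtype column in the original report does not reproduce.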

@attack68
Contributor Author

attack68 commented Aug 3, 2021

So IMO object type data should match this and just propagate NA

I agree, I think pd.NA = -pd.NA = pd.NA [*/+/-/**] float/int/complex
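
A rough check of that identity on the scalar itself, as a sketch rather than an exhaustive test. The operand values (2, 2.5, 1 + 2j) are arbitrary picks for int, float and complex. One caveat: the pandas missing-data guide documents that raising pd.NA to the power 0 returns 1 rather than pd.NA, so the identity does not hold for every operand of `**`.

```python
import operator

import pandas as pd

# Binary operators named in the comment above.
binary_ops = {
    "*": operator.mul,
    "+": operator.add,
    "-": operator.sub,
    "**": operator.pow,
}

# Arbitrary non-zero int, float and complex operands.
operands = [2, 2.5, 1 + 2j]

# True wherever `pd.NA <op> operand` gives back pd.NA.
propagates = {
    (symbol, type(x).__name__): op(pd.NA, x) is pd.NA
    for symbol, op in binary_ops.items()
    for x in operands
}

print(all(propagates.values()))  # binary operators with non-zero operands
print(-pd.NA is pd.NA)           # unary minus
print(pd.NA ** 0)                # documented exception to propagation
```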

@KrishnaSai2020
Contributor

+1 to redefining pd.NA

@mzeitlin11
Member

+1 to redefining pd.NA

To be clear on the scope here, I'm not recommending a redefinition of pd.NA, this is how it already works. There are just problems that are specific to object type data

@mroeschke mroeschke added Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Numeric Operations Arithmetic, Comparison, and Logical operations and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Aug 21, 2021