Boolean "~" operator ignored after "|" #60

EricPrideaux · 2018-07-26T14:07:02Z

Hi kieferk,

I am an R user learning how to use dfply. I may have spotted an issue: it appears that Boolean ~ isn't evaluated after Boolean | if applied in the syntax below.

My code:

# Import
import pandas as pd
import numpy as np
from dfply import *

# Create data frame and mask it
df  = pd.DataFrame({'a':[np.nan,2,3,4,5],'b':[6,7,8,9,np.nan],'c':[5,4,3,2,1]})
df2 = (df >>
        mask((X.a.isnull()) | ~(X.b.isnull())))
print(df)
print(df2)

Here is the original data frame, df:

       a    b    c
    0  NaN  6.0  5
    1  2.0  7.0  4
    2  3.0  8.0  3
    3  4.0  9.0  2
    4  5.0  NaN  1

And here is the result of the piped mask, df2:

         a    b    c
      0  NaN  6.0  5
      4  5.0  NaN  1

However, I expect this instead:

         a    b    c
      0  NaN  6.0  5
      1  2.0  7.0  4
      2  3.0  8.0  3
      3  4.0  9.0  2

I don't understand why the | and ~ operators return the rows in which column "a" is NaN or column "b" is NaN, rather than the rows in which column "a" is NaN or column "b" is not NaN.
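For reference, a minimal plain-pandas sketch of the same filter (no dfply involved, so this only shows what the expression means in pandas itself) selects the expected rows 0 to 3:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 2, 3, 4, 5],
                   'b': [6, 7, 8, 9, np.nan],
                   'c': [5, 4, 3, 2, 1]})

# Keep rows where "a" is NaN or "b" is not NaN.
# "~" binds more tightly than "|", so no extra parentheses are needed.
keep = df['a'].isnull() | ~df['b'].isnull()
expected = df[keep]
print(expected)
```

This suggests the difference comes from how the symbolic X expression is evaluated rather than from pandas.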

By the way, I also tried np.logical_or():

df  = pd.DataFrame({'a':[np.nan,2,3,4,5],'b':[6,7,8,9,np.nan],'c':[5,4,3,2,1]})
df2 = (df >>
        mask(np.logical_or(X.a.isnull(),~X.b.isnull())))
print(df)
print(df2)

But this resulted in an error:

mask(np.logical_or(X.a.isnull(),~X.b.isnull())))
ValueError: invalid __array_struct__


kieferk · 2018-08-28T01:08:34Z

This is a tricky one. I'll have to dive into it a little bit to see what's going on. The ~ usage on the symbolic is one of the more complicated parts of the code and it's been a while since I wrote that.

kieferk · 2018-08-28T04:06:48Z

Ok so this is definitely a bug, but I'm gonna need to think about how I'll fix it. Essentially the problem is that the inversion is not propagating through properly in the chain of operations, and unfortunately it's not a trivial fix as far as I can tell right now. I'll let you know when I come up with a solution.
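To make "the inversion is not propagating" concrete, here is a toy sketch of one way such a bug can arise in a deferred-evaluation object. It is purely hypothetical: the `Deferred` class, `column` helper, and `or_fixed` method are invented for illustration and are not dfply's actual code. The assumption is that inversion is stored as a flag and only applied at evaluation time, while the `|` operator combines the raw underlying functions.

```python
class Deferred:
    """Toy deferred expression over a context (a dict of lists of bools).

    Hypothetical illustration only; this is not dfply's implementation.
    """

    def __init__(self, func, inverted=False):
        self.func = func          # maps a context to a list of bools
        self.inverted = inverted  # inversion is stored as a flag

    def __invert__(self):
        # "~expr" only flips the flag; nothing is computed yet.
        return Deferred(self.func, inverted=not self.inverted)

    def evaluate(self, ctx):
        # The flag is applied here, at evaluation time.
        values = self.func(ctx)
        return [not v for v in values] if self.inverted else values

    def __or__(self, other):
        # Buggy combination: it calls the raw functions, so the
        # `inverted` flag of either operand is silently dropped.
        return Deferred(
            lambda ctx: [x or y for x, y in zip(self.func(ctx), other.func(ctx))]
        )

    def or_fixed(self, other):
        # Evaluating each operand first honours the flags.
        return Deferred(
            lambda ctx: [x or y for x, y in zip(self.evaluate(ctx), other.evaluate(ctx))]
        )


def column(name):
    """Deferred lookup of one column in the context."""
    return Deferred(lambda ctx: ctx[name])


# Null masks matching columns "a" and "b" of the data frame in this issue.
ctx = {'a_null': [True, False, False, False, False],
       'b_null': [False, False, False, False, True]}

buggy = (column('a_null') | ~column('b_null')).evaluate(ctx)
fixed = column('a_null').or_fixed(~column('b_null')).evaluate(ctx)
print(buggy)  # [True, False, False, False, True]
print(fixed)  # [True, True, True, True, False]
```

In this toy model the buggy result keeps rows 0 and 4, the same rows reported in the issue, while honouring the flag keeps rows 0 to 3. Whether the real cause has this shape is an assumption.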

EricPrideaux · 2018-08-28T10:51:48Z

Hi Kieferk,
Many thanks for your update. I look forward to your solution and will keep an eye out!

andrewkho · 2018-10-31T17:48:37Z

Just wanted to chime in that I have also come across this bug, same scenario when using mask except my case was e.g. mask(X.bool_col1 & (~X.bool_col2))

andrewkho · 2018-10-31T17:53:10Z

Also wanted to add that in the case of &, you can use mask(condA, ~condB), and alternatively, the - sign for inversion also works, e.g. mask(condA & -condB)
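For the null checks in the original report, another option that may be worth trying is to avoid ~ altogether, since pandas provides notnull() as the complement of isnull(). The sketch below only verifies the equivalence in plain pandas; it does not exercise dfply.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 2, 3, 4, 5],
                   'b': [6, 7, 8, 9, np.nan],
                   'c': [5, 4, 3, 2, 1]})

# Same condition written with and without the "~" operator.
with_invert = df['a'].isnull() | ~df['b'].isnull()
without_invert = df['a'].isnull() | df['b'].notnull()

print(with_invert.equals(without_invert))
print(df[without_invert])
```

If the symbolic X forwards method calls the same way, mask(X.a.isnull() | X.b.notnull()) might sidestep the problem, though that is an untested assumption.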

kieferk · 2019-01-18T18:50:28Z

Sorry I've been inactive for a while since work has been very busy. I am going to dive back in and try to tackle this over the weekend.

I am hoping I can resolve this "elegantly" but from what I can see it may require some substantial code re-writing. I'll keep you posted.

jstrong-tios · 2019-07-30T21:28:21Z

interestingly, passing the invert operator to make_symbolic results in correct behavior (fwiw):

from operator import inv # inv(x) == ~x

df['a'].isnull() | (~df['b'].isnull())
#        m
# 0   True
# 1   True
# 2   True
# 3   True
# 4  False

df >> transmute(m = X.a.isnull() | inv(X.b.isnull()))
#        m
# 0   True
# 1  False
# 2  False
# 3  False
# 4   True

df >> transmute(m = X.a.isnull() | make_symbolic(inv)(X.b.isnull()))
#        m
# 0   True
# 1   True
# 2   True
# 3   True
# 4  False

antonio-yu · 2020-11-24T03:05:13Z

Hi kieferk,

My friends and I were very excited and thankful when we encountered this dplyr-style package.
We use filter_by a lot for filtering Chinese data by Boolean values.
We look forward to your solution for this Boolean bug.

EricPrideaux changed the title ~~Boolean ~ operators ignored after |~~ Boolean "~" operator ignored after "|" Jul 26, 2018
