
[BUG/pandas-compat]: Handling of type promotion and division/mod by zero for boolean columns #12162

Open
wence- opened this issue Nov 16, 2022 · 0 comments
Labels: 0 - Backlog (in queue waiting for assignment), bug (something isn't working), Python (affects Python cuDF API)

Comments

wence- (Contributor) commented Nov 16, 2022

After #12074, most type promotions between columns of mixed types (and non-mixed types) match pandas. The exception is columns with boolean dtypes.

Pandas has decided to disallow division and exponentiation when both operands are booleans (https://github.com/pandas-dev/pandas/blob/d13c9e034ce8a1d738766c4b1cec80c76f5523be/pandas/core/ops/array_ops.py#L503).
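A minimal sketch of what that check looks like from the user's side, assuming a pandas version that still contains the linked bool-arithmetic check. The helper name `try_bool_division` is made up for illustration. Both operands are `True` so that division by zero does not come into play.

```python
import pandas as pd


def try_bool_division():
    """Attempt bool / bool on two equal-length Series and describe the outcome."""
    left = pd.Series([True, True])
    right = pd.Series([True, True])
    try:
        result = left / right
    except NotImplementedError as exc:
        # pandas refuses true division when both operands have bool dtype.
        return f"NotImplementedError: {exc}"
    # Only reached if the installed pandas no longer performs the check.
    return f"{result.dtype}: {result.tolist()}"


outcome = try_bool_division()
print(outcome)
```

The same refusal is what shows up as `NotImplemented` in the bool/bool cell of the pandas division tables in this report.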

Aside: I kind of disagree with this since this is all perfectly well defined (excepting the usual caveat of division by zero).

When only one of the operands is bool, the status quo depends on the dtype of the other operand:

Pandas behaviour:

For a % b, with a == 1, b == 0, for various dtypes

| dtype-a \ dtype-b | bool | int | float |
|---|---|---|---|
| bool | int8(0) (or ZeroDivisionError [1]) | float64(NaN) | float64(NaN) |
| int | int64(0) (or ZeroDivisionError [1]) | float64(NaN) | float64(NaN) |
| float | float64(NaN) (or ZeroDivisionError [1]) | float64(NaN) | float64(NaN) |

For a / b (or a // b), with a == 1, b == 0

| dtype-a \ dtype-b | bool | int | float |
|---|---|---|---|
| bool | NotImplemented (or ZeroDivisionError [1]) | float64(inf) | float64(inf) |
| int | float64(inf) (or ZeroDivisionError [1]) | float64(inf) | float64(inf) |
| float | float64(inf) (or ZeroDivisionError [1]) | float64(inf) | float64(inf) |

For a % b, with a == 0, b == 0, for various dtypes

| dtype-a \ dtype-b | bool | int | float |
|---|---|---|---|
| bool | int8(0) (or ZeroDivisionError [1]) | float64(NaN) | float64(NaN) |
| int | int64(0) (or ZeroDivisionError [1]) | float64(NaN) | float64(NaN) |
| float | float64(NaN) (or ZeroDivisionError [1]) | float64(NaN) | float64(NaN) |

For a / b (or a // b), with a == 0, b == 0

| dtype-a \ dtype-b | bool | int | float |
|---|---|---|---|
| bool | NotImplemented (or ZeroDivisionError [1]) | float64(NaN) | float64(NaN) |
| int | float64(NaN) (or ZeroDivisionError [1]) | float64(NaN) | float64(NaN) |
| float | float64(NaN) (or ZeroDivisionError [1]) | float64(NaN) | float64(NaN) |

cuDF behaviour:

For a % b, with a == 1, b == 0, for various dtypes

| dtype-a \ dtype-b | bool | int | float |
|---|---|---|---|
| bool | bool(0) | float64(NaN) | float64(NaN) |
| int | int64(2**32 - 1) | float64(NaN) | float64(NaN) |
| float | float64(NaN) | float64(NaN) | float64(NaN) |

For a // b, with a == 1, b == 0

| dtype-a \ dtype-b | bool | int | float |
|---|---|---|---|
| bool | bool(1) | float64(inf) | float64(inf) |
| int | int64(2**32 - 1) | float64(inf) | float64(inf) |
| float | float64(inf) | float64(inf) | float64(inf) |

For a / b, with a == 1, b == 0

| dtype-a \ dtype-b | bool | int | float |
|---|---|---|---|
| bool | float32(inf) | float64(inf) | float64(inf) |
| int | float64(inf) | float64(inf) | float64(inf) |
| float | float64(inf) | float64(inf) | float64(inf) |

For a % b, with a == 0, b == 0, for various dtypes

| dtype-a \ dtype-b | bool | int | float |
|---|---|---|---|
| bool | bool(0) | float64(NaN) | float64(NaN) |
| int | int64(2**32 - 1) | float64(NaN) | float64(NaN) |
| float | float64(NaN) | float64(NaN) | float64(NaN) |

For a // b, with a == 0, b == 0

| dtype-a \ dtype-b | bool | int | float |
|---|---|---|---|
| bool | bool(False) | float64(NaN) | float64(NaN) |
| int | int64(2**32 - 1) | float64(NaN) | float64(NaN) |
| float | float64(NaN) | float64(NaN) | float64(NaN) |

For a / b, with a == 0, b == 0

| dtype-a \ dtype-b | bool | int | float |
|---|---|---|---|
| bool | float32(NaN) | float64(NaN) | float64(NaN) |
| int | float64(NaN) | float64(NaN) | float64(NaN) |
| float | float64(NaN) | float64(NaN) | float64(NaN) |
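The grids above can be regenerated with a small probe. The following is a rough sketch, not the script used for this report: the `probe` helper, the `DTYPES` mapping, and the choice of `int64`/`float64` as the concrete "int" and "float" dtypes are all assumptions made for illustration. It uses equal-length, single-element Series, so it exercises the non-footnote branch of the pandas tables.

```python
import operator

import pandas as pd

# Assumed concrete dtypes for the "bool", "int" and "float" rows and columns.
DTYPES = {"bool": "bool", "int": "int64", "float": "float64"}


def probe(op, a_value, b_value, lib=pd):
    """Apply op to one-element Series for every dtype pair.

    Returns {(dtype_a, dtype_b): (result dtype or exception name, value)}.
    """
    results = {}
    for a_name, a_dtype in DTYPES.items():
        for b_name, b_dtype in DTYPES.items():
            a = lib.Series([a_value], dtype=a_dtype)
            b = lib.Series([b_value], dtype=b_dtype)
            try:
                out = op(a, b)
                results[(a_name, b_name)] = (str(out.dtype), out.iloc[0])
            except Exception as exc:
                # Record refusals (for example the bool/bool division case).
                results[(a_name, b_name)] = (type(exc).__name__, None)
    return results


table = probe(operator.mod, 1, 0)
for key, value in table.items():
    print(key, value)
```

Swapping in `operator.truediv` or `operator.floordiv`, or `a_value=0`, covers the other grids. Passing `lib=cudf` should give the cuDF side on a machine with a GPU, on the assumption that `cudf.Series` accepts the same constructor arguments. That path is untested here.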

Footnotes

  1. If the operands have different lengths, pandas raises a ZeroDivisionError (see https://github.com/pandas-dev/pandas/issues/49699).

GregoryKimball added the bug, 0 - Backlog, and Python labels on Nov 19, 2022