[BUG] Pymod operation is missing null values
#5938
Comments
I guess it makes sense for a modulo zero operation to return null since it's undefined behavior but this is somewhere we would differ from Pandas. |
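To make the "modulo zero returns null" idea concrete, here is a minimal pure-Python sketch. It is not cudf or libcudf code: the helper name `mod_with_nulls` is hypothetical, and `None` stands in for a null entry.

```python
def mod_with_nulls(numerators, denominators):
    """Elementwise Python-style modulo over two equal-length lists.

    Hypothetical helper: None models a null. The result is None when
    either operand is None (nulls propagate) or when the divisor is 0.
    """
    out = []
    for a, b in zip(numerators, denominators):
        if a is None or b is None or b == 0:
            out.append(None)
        else:
            out.append(a % b)
    return out


result = mod_with_nulls([5, -7, 3, None], [3, 2, 0, 4])
print(result)
```

The third element has a zero divisor and the fourth has a null numerator, so both come back as `None` rather than an arbitrary value.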
This issue has been marked rotten due to no recent activity in the past 90d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. |
cc @brandon-b-miller for more binaryop null things |
Binary operations in libcudf are designed so that nulls always propagate. As a result, although we have created a specialized mod operator However, in this instance there is some precedent for solving this at the libcudf level. Specifically, all the @jrhemstad I'd be curious what you think about this. The tl;dr is that in pandas |
"PYMOD" is a misnomer. It wasn't intended to be 100% compliant with Python's See: https://stackoverflow.com/questions/13683563/whats-the-difference-between-mod-and-remainder I wanted to rename
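The modulo vs. remainder distinction discussed here can be illustrated in plain Python, independent of cudf. This sketch assumes the common convention that "modulo" means the floored variant (result takes the sign of the divisor, as Python's `%` does) and "remainder" means the truncated variant (result takes the sign of the dividend, as C's `%` and `math.fmod` do). The function names are illustrative only.

```python
import math


def floored_mod(a, b):
    # Python's % on ints is floored: the result has the sign of the divisor.
    return a % b


def truncated_rem(a, b):
    # math.fmod follows C semantics: the result has the sign of the dividend.
    return int(math.fmod(a, b))


mod_neg_dividend = floored_mod(-7, 3)
rem_neg_dividend = truncated_rem(-7, 3)
mod_neg_divisor = floored_mod(7, -3)
rem_neg_divisor = truncated_rem(7, -3)
print(mod_neg_dividend, rem_neg_dividend, mod_neg_divisor, rem_neg_divisor)
```

The two agree whenever both operands have the same sign, which is why the difference tends to surface only with negative inputs.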
This is the correct answer. It's no different than #7389 (comment) imo. |
Found the original conversation: #1985 (review)
I still agree with my past self :) |
I'm fine with that explanation, although to @harrism's point we'd definitely need to document how we define remainder vs modulo, since at best it's a nonstandard but common distinction. So perhaps we should do that rename while also addressing this issue here :) I think resolving this issue then requires two things:
@brandon-b-miller would you be up for handling this? I'm hoping it would be a near carbon copy of #11441 except with a different fill value. I can take on 1 myself if you don't want to. |
happy to handle making the change 👍 |
The type normalisation applied before heading into libcudf previously had slightly unexpected consequences for large int64 values. If not providing a `cudf.Scalar`, a bare `int64` scalar would be cast to `uint64` and then normal numpy type promotion would unify to `float64`. This is lossy, since int64 to float64 is neither surjective nor injective. To avoid this, try very hard to maintain the dtype of the object coming in, and match pandas behaviour by applying numpy type promotion rules via `numpy.result_type`.
- Closes #5938
- Closes #7389
- Closes #12072
- Closes #12092
Authors:
- Lawrence Mitchell (https://github.com/wence-)
Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)
URL: #12074
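The claim that the int64 to float64 conversion is lossy can be checked with plain Python integers and floats (Python's `float` is an IEEE 754 double, i.e. float64). This is only an illustration of the arithmetic, not the cudf code path, and the variable names are made up for the example.

```python
INT64_MAX = 2**63 - 1

# float64 has a 53-bit significand, so INT64_MAX is not representable
# and rounds to the nearest double, which is 2**63.
as_float = float(INT64_MAX)
round_trip = int(as_float)

# Not injective: two distinct integers map to the same float64.
collides = float(2**53) == float(2**53 + 1)

# Not surjective: a value such as 0.5 is a float64 with no integer preimage.
has_integer_preimage = (0.5).is_integer()

print(round_trip == INT64_MAX, collides, has_integer_preimage)
```

Because the round trip lands on `2**63`, which is outside the int64 range, any comparison or arithmetic performed after such a promotion can give a different answer from the integer computation.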
Describe the bug
When there is `0` in the denominator of a `pymod` operation, we seem to be returning a garbage value instead of a `null` value.
Steps/Code to reproduce bug
Expected behavior
Return either `nan` or `null` in place of garbage values.
Environment overview (please complete the following information)
Additional context
This looks similar to #5722, but I'm not sure whether they share the same code flows, so I'm just putting it out there.