[BUG] Integer promotion fixes needed for NumPy 2 for comparison operators #16282
Comments
If this will be fixed in CuPy 13.3.0, is there anything more for us to do here?
This needs to be fixed in cuDF itself to avoid users running into annoying limitations. It isn't a difficult thing as such: we need custom promotion logic for comparisons, so getting a good-enough fix should be quick. But I opened an issue because I thought someone deeper in cudf may have a better thought on the best approach.
When Python integers are compared to a series of integers, the result can always be correctly defined no matter the value of the Python integer.

This was always a very mild issue. But with NumPy 2 behavior no longer upcasting the computation result type based on the value, even things like:

```
cudf.Series([1, 2, 3], dtype="int8") < 1000
```

would fail. (Similar paths could be taken for other integer scalars, but these would mostly be nice for performance.)

N.B. NumPy/pandas also support exact comparisons when mixing e.g. uint64 and int64. This is another rare exception that cudf currently does not support.

Closes gh-16282

Authors:
- Sebastian Berg (https://github.com/seberg)

Approvers:
- Matthew Roeschke (https://github.com/mroeschke)

URL: #16532
While NumPy 2 and pandas promotions should now work mostly correctly on the main branch (tests pass with `cupy==12.3.1` (not actually released) and `numpy==2.0.0`), the comparison operators are still a problem.

As of now the following new test would just fail:
Because when comparing an integer series and a 0-D object, we would assume it is OK to use the series' dtype even when the integer does not fit.
The solution to this should not be very complicated. For integer comparison operations, change `normalize_binop_value` to:

- check whether `series.dtype.type(other_integer)` fits (if yes, OK)

I may look into it now that the other things are merged. But if anyone has a preference for the approach, happy to hear it.
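To make the idea concrete, here is a minimal pure-Python sketch of the kind of range check described above. It is not cudf's actual `normalize_binop_value`; the helper name `plan_integer_comparison`, the operator names, and the `("constant", ...)` / `("cast", ...)` return convention are all made up for illustration, and null handling is ignored. It only shows that when the Python integer does not fit the series dtype, the comparison result is the same for every row and can be decided without casting.

```python
# Inclusive value ranges of a few fixed-width integer dtypes
# (np.iinfo reports the same bounds).
INT_BOUNDS = {
    "int8": (-2**7, 2**7 - 1),
    "uint8": (0, 2**8 - 1),
    "int64": (-2**63, 2**63 - 1),
    "uint64": (0, 2**64 - 1),
}

# Result of `column <op> scalar` for every row when the scalar lies
# above / below the whole value range of the column dtype.
RESULT_IF_SCALAR_ABOVE = {"lt": True, "le": True, "ne": True,
                          "gt": False, "ge": False, "eq": False}
RESULT_IF_SCALAR_BELOW = {"lt": False, "le": False, "ne": True,
                          "gt": True, "ge": True, "eq": False}


def plan_integer_comparison(op, dtype_name, scalar):
    """Hypothetical helper: decide how `column <op> scalar` can be evaluated.

    Returns ("cast", scalar) if the scalar fits the column dtype, so the
    usual same-dtype comparison is safe. Otherwise returns
    ("constant", result), because every row compares the same way.
    """
    low, high = INT_BOUNDS[dtype_name]
    if scalar > high:
        return ("constant", RESULT_IF_SCALAR_ABOVE[op])
    if scalar < low:
        return ("constant", RESULT_IF_SCALAR_BELOW[op])
    return ("cast", scalar)


# An int8 column compared with 1000: no value can reach 1000,
# so `column < 1000` holds for every row.
print(plan_integer_comparison("lt", "int8", 1000))
# A scalar that fits is simply cast to the column dtype.
print(plan_integer_comparison("lt", "int8", 100))
```

An alternative would be to promote the column to a wider dtype, but that does not help once the scalar is outside the `int64`/`uint64` range, whereas the constant-result path works for any Python integer.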
N.B. for series to series comparisons, pandas/numpy now compare `uint64` and `int64` correctly by value (for scalars the above fixes already do that when they work). One could fix that, but it is a separate issue and seems lower priority.
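As a rough illustration of what comparing `int64` and `uint64` by value involves, the sketch below uses plain Python integers to stand in for the fixed-width values. The function name `less_int64_uint64` is hypothetical, and this is not taken from NumPy, pandas, or cudf; it only shows one way to get an exact answer with a sign test plus an unsigned comparison, and why going through `float64` as a common type is not exact.

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1


def less_int64_uint64(a, b):
    """Exact `a < b` for `a` in the int64 range and `b` in the uint64 range.

    Only uses steps a fixed-width kernel also has available:
    a sign test and an unsigned 64-bit comparison.
    """
    if not (INT64_MIN <= a <= INT64_MAX and 0 <= b <= UINT64_MAX):
        raise ValueError("operands outside the int64 / uint64 ranges")
    if a < 0:
        # Every negative int64 is smaller than every uint64.
        return True
    # `a` is non-negative, so reinterpreting its bits as uint64
    # keeps its value and the unsigned comparison is exact.
    return (a & UINT64_MAX) < b


# Exact comparison: 2**63 - 1 is smaller than 2**63.
print(less_int64_uint64(INT64_MAX, 2**63))
# Through float64 the two values become indistinguishable,
# because 2**63 - 1 is not representable and rounds to 2**63.
print(float(INT64_MAX) == float(2**63))
```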