[BUG] NumPy 1.24: test regression `ValueError: cannot convert float NaN to integer` #423

mgorny · 2022-12-22T15:43:36Z

Describe the bug
When NumPy is upgrade to 1.24, the following test fails:

________________________________________________________ test_move[move_rank] _________________________________________________________

func = <built-in function move_rank>

    @pytest.mark.parametrize("func", bn.get_functions("move"), ids=lambda x: x.__name__)
    def test_move(func):
        """Test that bn.xxx gives the same output as a reference function."""
        fmt = (
            "\nfunc %s | window %d | min_count %s | input %s (%s) | shape %s | "
            "axis %s | order %s\n"
        )
        fmt += "\nInput array:\n%s\n"
        aaae = assert_array_almost_equal
        func_name = func.__name__
        func0 = eval("bn.slow.%s" % func_name)
        if func_name == "move_var":
            decimal = 3
        else:
            decimal = 5
        for i, a in enumerate(arrays(func_name)):
            axes = range(-1, a.ndim)
            for axis in axes:
                windows = range(1, a.shape[axis])
                for window in windows:
                    min_counts = list(range(1, window + 1)) + [None]
                    for min_count in min_counts:
                        actual = func(a, window, min_count, axis=axis)
>                       desired = func0(a, window, min_count, axis=axis)

a          = array([], shape=(2, 0), dtype=int64)
aaae       = <function assert_array_almost_equal at 0x7faf70006ca0>
actual     = array([], shape=(2, 0), dtype=float64)
axes       = range(-1, 2)
axis       = 0
da         = dtype('float32')
dd         = dtype('float32')
decimal    = 5
desired    = array([], shape=(1, 0, 2), dtype=float32)
err_msg    = '\nfunc move_rank | window 1 | min_count None | input a53 (float32) | shape (1, 0, 2) | axis 2 | order C,F\n\nInput array:\n[]\n\n dtype mismatch %s %s'
fmt        = '\nfunc %s | window %d | min_count %s | input %s (%s) | shape %s | axis %s | order %s\n\nInput array:\n%s\n'
func       = <built-in function move_rank>
func0      = <function move_rank at 0x7faf70ab9d00>
func_name  = 'move_rank'
i          = 57
min_count  = 1
min_counts = [1, None]
tup        = ('move_rank', 1, 'None', 'a53', 'float32', '(1, 0, 2)', ...)
window     = 1
windows    = range(1, 2)

bottleneck/.venv/lib/python3.11/site-packages/bottleneck/tests/move_test.py:33: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
bottleneck/.venv/lib/python3.11/site-packages/bottleneck/slow/move.py:110: in move_rank
    return move_func(lastrank, a, window, min_count, axis=axis)
        a          = array([], shape=(2, 0), dtype=int64)
        axis       = 0
        min_count  = 1
        window     = 1
bottleneck/.venv/lib/python3.11/site-packages/bottleneck/slow/move.py:148: in move_func
    y[tuple(idx2)] = func(a[tuple(idx1)], axis=axis, **kwargs)
        a          = array([], shape=(2, 0), dtype=int64)
        axis       = 0
        func       = <function lastrank at 0x7faf70ab9ee0>
        i          = 0
        idx1       = [slice(0, 1, None), slice(None, None, None)]
        idx2       = [0, slice(None, None, None)]
        kwargs     = {}
        mc         = 1
        min_count  = 1
        win        = 1
        window     = 1
        y          = array([], shape=(2, 0), dtype=float64)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

a = array([], shape=(1, 0), dtype=int64), axis = 0

    def lastrank(a, axis=-1):
        """
        The ranking of the last element along the axis, ignoring NaNs.
    
        The ranking is normalized to be between -1 and 1 instead of the more
        common 1 and N. The results are adjusted for ties.
    
        Parameters
        ----------
        a : ndarray
            Input array. If `a` is not an array, a conversion is attempted.
        axis : int, optional
            The axis over which to rank. By default (axis=-1) the ranking
            (and reducing) is performed over the last axis.
    
        Returns
        -------
        d : array
            In the case of, for example, a 2d array of shape (n, m) and
            axis=1, the output will contain the rank (normalized to be between
            -1 and 1 and adjusted for ties) of the the last element of each row.
            The output in this example will have shape (n,).
    
        Examples
        --------
        Create an array:
    
        >>> y1 = larry([1, 2, 3])
    
        What is the rank of the last element (the value 3 in this example)?
        It is the largest element so the rank is 1.0:
    
        >>> import numpy as np
        >>> from la.afunc import lastrank
        >>> x1 = np.array([1, 2, 3])
        >>> lastrank(x1)
        1.0
    
        Now let's try an example where the last element has the smallest
        value:
    
        >>> x2 = np.array([3, 2, 1])
        >>> lastrank(x2)
        -1.0
    
        Here's an example where the last element is not the minimum or maximum
        value:
    
        >>> x3 = np.array([1, 3, 4, 5, 2])
        >>> lastrank(x3)
        -0.5
    
        """
        a = np.array(a, copy=False)
        ndim = a.ndim
        if a.size == 0:
            # At least one dimension has length 0
            shape = list(a.shape)
            shape.pop(axis)
            r = np.empty(shape, dtype=a.dtype)
>           r.fill(np.nan)
E           ValueError: cannot convert float NaN to integer

a          = array([], shape=(1, 0), dtype=int64)
axis       = 0
ndim       = 2
r          = array([], dtype=int64)
shape      = [0]

bottleneck/.venv/lib/python3.11/site-packages/bottleneck/slow/move.py:236: ValueError

Downgrading to 1.23.5 makes tests pass.

To Reproduce
To assist in reproducing the bug, please include the following:

git clone https://github.com/pydata/bottleneck
pip install bottleneck pytest
python -c "import bottleneck;bottleneck.test()"

Expected behavior
Tests passing ;-).

Additional context
Tested on a825e20.

The text was updated successfully, but these errors were encountered:

https://build.opensuse.org/request/show/1045128 by user mcepl + dimstar_suse Forwarded request #1045057 from bnavigator - Skip a failing test -- gh#pydata/bottleneck#423 - Add rpmlintrc

nileshpatra · 2023-01-14T20:11:48Z

This is now reported also as a bug in corresponding debian package which need to be addressed soonish. A hacky way of getting this moving would be to create the r array with float dtype and later cast it back.

--- a/bottleneck/slow/move.py
+++ b/bottleneck/slow/move.py
@@ -232,8 +232,9 @@
         # At least one dimension has length 0
         shape = list(a.shape)
         shape.pop(axis)
-        r = np.empty(shape, dtype=a.dtype)
+        r = np.empty(shape, dtype=float)
         r.fill(np.nan)
+        r = r.astype(a.dtype)
         if (r.ndim == 0) and (r.size == 1):
             r = np.nan
         return r

I know this looks ugly, but just a suggestion which kind of works for a few combinations that I tried.

rdbisme · 2023-01-16T10:10:11Z

@nileshpatra Can you just check the performance impact of your fix? You can run the fix with and without the fix on the previous numpy version and see how that impacts.

if it's negligible, I think we can go on with the PR.

nileshpatra · 2023-01-16T19:30:05Z

Hi @rdbisme I ran the benchmark with the current HEAD of bottleneck (master branch) and numpy 1.23.5 (wherein the code works fine). As it seems, the benchmark numbers are not consistent across runs (there are deltas of +/- 15% and I do not have the time/energy to do more runs/analysis on this), but the runs with and without patches do not differ much. Let me know what you think about this.

Without the patch:

Bottleneck performance benchmark
    Bottleneck 1.3.5.post0.dev3; Numpy 1.23.5
    Speed is NumPy time divided by Bottleneck time
    NaN means approx one-fifth NaNs; float64 used

              no NaN     no NaN      NaN       no NaN      NaN    
               (100,)  (1000,1000)(1000,1000)(1000,1000)(1000,1000)
               axis=0     axis=0     axis=0     axis=1     axis=1  
nansum         38.9        1.1        1.8        1.5        2.1
nanmean       112.0        1.7        2.1        2.3        2.4
nanstd        200.1        2.1        2.2        2.7        2.5
nanvar        184.7        2.2        2.1        2.7        2.4
nanmin         22.7        0.4        0.4        0.5        0.6
nanmax         23.9        0.4        0.4        0.5        0.6
median        101.1        1.2        5.8        1.0        5.7
nanmedian     106.7        4.3        4.9        4.1        4.8
ss             10.6        0.9        0.8        1.2        1.2
nanargmin      80.7        2.6        5.4        2.0        6.1
nanargmax      77.1        2.4        5.2        1.9        6.3
anynan          8.7        0.4       40.8        0.8       34.4
allnan         12.5      129.8       94.0      113.4       73.4
rankdata       49.2        2.0        2.1        2.4        2.3
nanrankdata    52.0        2.2        2.0        2.4        2.4
partition       2.4        1.2        1.6        1.0        1.5
argpartition    3.0        1.0        1.3        1.0        1.5
replace         9.8        1.5        1.6        1.5        1.6
push         1565.2       10.8       10.2       23.9       14.3
move_sum     2650.8       72.9      164.6      268.8      230.0
move_mean    6062.1      113.7      189.2      267.3      250.3
move_std     8331.0      134.8      235.1      177.6      293.9
move_var    10452.3      174.0      292.3      281.0      368.3
move_min     1131.9        7.7        7.3       13.0       16.9
move_max     1071.0        7.9        7.4       13.3       17.5
move_argmin  3073.9       42.6       80.1       47.2       89.0
move_argmax  3031.7       43.7       79.2       53.2       86.1
move_median  1646.1      133.7      133.7      144.1      147.7
move_rank     569.3        1.5        1.6        2.0        2.0

With the patch:

>>> bn.bench()
Bottleneck performance benchmark
    Bottleneck 1.3.5.post0.dev4; Numpy 1.23.5
    Speed is NumPy time divided by Bottleneck time
    NaN means approx one-fifth NaNs; float64 used

              no NaN     no NaN      NaN       no NaN      NaN    
               (100,)  (1000,1000)(1000,1000)(1000,1000)(1000,1000)
               axis=0     axis=0     axis=0     axis=1     axis=1  
nansum         42.3        1.1        1.8        1.5        2.1
nanmean       125.9        1.8        2.2        2.5        2.4
nanstd        217.9        2.2        2.1        2.8        2.4
nanvar        207.4        2.1        2.1        2.7        2.5
nanmin         22.5        0.4        0.4        0.5        0.6
nanmax         22.8        0.4        0.4        0.5        0.6
median        111.5        1.3        5.8        1.0        5.6
nanmedian     115.6        4.7        5.6        4.4        5.2
ss              9.1        0.8        0.8        1.2        1.2
nanargmin      72.5        2.6        5.5        1.9        7.0
nanargmax      82.6        2.4        5.4        1.9        6.3
anynan          9.5        0.4       40.8        0.8       35.3
allnan         13.2      139.5       90.0      110.3       72.3
rankdata       48.8        2.1        2.1        2.3        2.3
nanrankdata    55.7        2.5        2.2        2.5        2.5
partition       2.8        1.2        1.7        1.0        1.5
argpartition    3.5        1.0        1.3        1.0        1.5
replace         9.6        1.6        1.6        1.5        1.5
push         1617.2       11.0        9.5       22.4       14.5
move_sum     2810.9       73.2      166.9      269.3      228.4
move_mean    6681.1      120.2      195.3      272.4      254.2
move_std     8567.2      141.3      243.7      167.2      294.5
move_var    10914.3      204.9      293.9      287.0      384.6
move_min     1121.8        7.0        7.2       13.1       17.7
move_max     1074.7        7.3        8.2       14.0       17.6
move_argmin  3146.1       42.5       78.9       44.7       85.8
move_argmax  2974.9       42.5       83.5       44.7       87.1
move_median  1817.4      158.1      152.6      145.9      146.2
move_rank     568.9        1.5        1.4        2.1        1.9

nileshpatra · 2023-01-18T04:39:35Z

@rdbisme We'd need to fix this in debian package soonish since it affects some key packages. And hence, could you please comment if the above report looks OK to you or you'd like something more?

rdbisme · 2023-01-18T23:06:10Z

@nileshpatra Sounds good. Feel free to open a PR and we'll get it merged fast! :)

Closes pydata#423

nileshpatra · 2023-01-19T03:17:58Z

@rdbisme done, please consider to check.

mgorny added the bug label Dec 22, 2022

mgorny assigned qwhelan Dec 22, 2022

qwhelan removed their assignment Jan 14, 2023

nileshpatra added a commit to nileshpatra/bottleneck that referenced this issue Jan 19, 2023

BUG: Fix build/tests with numpy 1.24

d837bd5

Closes pydata#423

nileshpatra mentioned this issue Jan 19, 2023

BUG: Fix build/tests with numpy 1.24 #425

Merged

rdbisme closed this as completed in e512ad9 Jan 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] NumPy 1.24: test regression `ValueError: cannot convert float NaN to integer` #423

[BUG] NumPy 1.24: test regression `ValueError: cannot convert float NaN to integer` #423

mgorny commented Dec 22, 2022

nileshpatra commented Jan 14, 2023 •

edited

Loading

rdbisme commented Jan 16, 2023

nileshpatra commented Jan 16, 2023

nileshpatra commented Jan 18, 2023

rdbisme commented Jan 18, 2023

nileshpatra commented Jan 19, 2023

[BUG] NumPy 1.24: test regression ValueError: cannot convert float NaN to integer #423

[BUG] NumPy 1.24: test regression ValueError: cannot convert float NaN to integer #423

Comments

mgorny commented Dec 22, 2022

nileshpatra commented Jan 14, 2023 • edited Loading

rdbisme commented Jan 16, 2023

nileshpatra commented Jan 16, 2023

nileshpatra commented Jan 18, 2023

rdbisme commented Jan 18, 2023

nileshpatra commented Jan 19, 2023

[BUG] NumPy 1.24: test regression `ValueError: cannot convert float NaN to integer` #423

[BUG] NumPy 1.24: test regression `ValueError: cannot convert float NaN to integer` #423

nileshpatra commented Jan 14, 2023 •

edited

Loading