
Fix flaky RuntimeWarning during array reductions #10030

Merged
jrbourbeau merged 1 commit into dask:main from jrbourbeau:array-reductions-use-ones-not-empty
Mar 8, 2023


@jrbourbeau
Member

@jrbourbeau jrbourbeau commented Mar 7, 2023

This is something of a shot in the dark. We occasionally (pretty rarely) see a flaky RuntimeWarning: invalid value encountered in reduce emitted in test_reductions_2D (see full traceback below). The warning occurs while we're determining the output meta dtype using np.empty, whose contents are uninitialized; in the traceback below the array happened to hold nan. This PR suggests we use np.ones instead of np.empty so the input values are deterministic. Again, I'm not sure this actually fixes the flaky warning, but it might, and the change seems harmless either way.

I'll run CI here a few times to see whether the test_reductions_2D error shows up again.

Traceback:
____________________________ test_reductions_2D[f4] ____________________________
[gw3] linux -- Python 3.8.16 /usr/share/miniconda3/envs/test-environment/bin/python3.8

dtype = 'f4'

    @pytest.mark.slow
    @pytest.mark.parametrize("dtype", ["f4", "i4"])
    def test_reductions_2D(dtype):
        x = np.arange(1, 122).reshape((11, 11)).astype(dtype)
        a = da.from_array(x, chunks=(4, 4))
    
        b = a.sum(keepdims=True)
        assert b.__dask_keys__() == [[(b.name, 0, 0)]]
    
        reduction_2d_test(da.sum, a, np.sum, x)
        reduction_2d_test(da.mean, a, np.mean, x)
        reduction_2d_test(da.var, a, np.var, x, False)  # Difference in dtype algo
        reduction_2d_test(da.std, a, np.std, x, False)  # Difference in dtype algo
        reduction_2d_test(da.min, a, np.min, x, False)
        reduction_2d_test(da.max, a, np.max, x, False)
        reduction_2d_test(da.any, a, np.any, x, False)
        reduction_2d_test(da.all, a, np.all, x, False)
    
        reduction_2d_test(da.nansum, a, np.nansum, x)
>       reduction_2d_test(da.nanmean, a, np.mean, x)

dask/array/tests/test_reductions.py:219: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
dask/array/tests/test_reductions.py:158: in reduction_2d_test
    assert same_keys(da_func(darr, axis=1), da_func(darr, axis=1))
dask/array/reductions.py:734: in nanmean
    dt = getattr(np.mean(np.empty(shape=(1,), dtype=a.dtype)), "dtype", object)
<__array_function__ internals>:5: in mean
    ???
/usr/share/miniconda3/envs/test-environment/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3440: in mean
    return _methods._mean(a, axis=axis, dtype=dtype,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

a = array([nan], dtype=float32), axis = None, dtype = None, out = None
keepdims = False

    def _mean(a, axis=None, dtype=None, out=None, keepdims=False, *, where=True):
        arr = asanyarray(a)
    
        is_float16_result = False
    
        rcount = _count_reduce_items(arr, axis, keepdims=keepdims, where=where)
        if rcount == 0 if where is True else umr_any(rcount == 0, axis=None):
            warnings.warn("Mean of empty slice.", RuntimeWarning, stacklevel=2)
    
        # Cast bool, unsigned int, and int to float64 by default
        if dtype is None:
            if issubclass(arr.dtype.type, (nt.integer, nt.bool_)):
                dtype = mu.dtype('f8')
            elif issubclass(arr.dtype.type, nt.float16):
                dtype = mu.dtype('f4')
                is_float16_result = True
    
>       ret = umr_sum(arr, axis, dtype, out, keepdims, where=where)
E       RuntimeWarning: invalid value encountered in reduce

/usr/share/miniconda3/envs/test-environment/lib/python3.8/site-packages/numpy/core/_methods.py:179: RuntimeWarning

EDIT: Also xref #8892, which is related
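
For context, the dtype probe that appears in the traceback above (reductions.py:734) can be looked at in isolation. The following is a minimal sketch, not the actual dask code: the helper names probe_dtype_empty and probe_dtype_ones are hypothetical, and only the getattr(np.mean(np.empty(...)), "dtype", object) expression is taken from the traceback. It suggests that the resulting dtype depends only on the input dtype, not on the values in the buffer, so swapping np.empty for np.ones should not change the inferred dtype.

```python
import warnings

import numpy as np


def probe_dtype_empty(dtype):
    # Mirrors the expression from the traceback. The buffer is uninitialized,
    # so any RuntimeWarning it may trigger is silenced here.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", RuntimeWarning)
        return getattr(np.mean(np.empty(shape=(1,), dtype=dtype)), "dtype", object)


def probe_dtype_ones(dtype):
    # Same probe, but on a buffer with known contents (all ones).
    return getattr(np.mean(np.ones(shape=(1,), dtype=dtype)), "dtype", object)


# The two dtypes used by test_reductions_2D.
for dt in ("f4", "i4"):
    print(dt, probe_dtype_empty(dt), probe_dtype_ones(dt))
```

Both helpers should report float32 for "f4" and float64 for "i4"; the only difference is what data np.mean gets to see while computing the answer.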

@jrbourbeau jrbourbeau changed the title Fix flaky RuntimeWarning during array reductions [WIP] Fix flaky RuntimeWarning during array reductions Mar 7, 2023
@github-actions github-actions bot added the array label Mar 7, 2023
Contributor

@j-bennet j-bennet left a comment


Looks good. Still makes me wonder why the test is only failing intermittently.

@jrbourbeau
Member Author

jrbourbeau commented Mar 7, 2023

np.empty can sometimes give arrays with interesting values:

In [1]: import numpy as np

In [2]: np.empty([2, 2], dtype="float32")
Out[2]:
array([[1.67e-43, 1.36e-43],
       [1.60e-43, 1.54e-43]], dtype=float32)

In [10]: np.empty([1, 2], dtype="float64")[0][0]
Out[10]: 2.058335917824e-312

My hunch (this is really just a guess) is that some downstream operations don't like some of the values np.empty can produce. Using np.ones would give us the same dtype-determining functionality, but on predictable data.
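
To illustrate the hunch, here is a small sketch of how the same reduction can warn or stay quiet depending purely on the values it is given. It does not reproduce the flaky failure itself: the array named unlucky is a hand-built stand-in (inf plus -inf is an invalid floating-point operation), not something np.empty is known to have returned, and the helper name mean_warnings is hypothetical.

```python
import warnings

import numpy as np


def mean_warnings(values):
    """Return the RuntimeWarning messages emitted while computing np.mean(values)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        np.mean(values)
    return [
        str(w.message) for w in caught if issubclass(w.category, RuntimeWarning)
    ]


# Stand-in for unlucky buffer contents: summing inf and -inf is invalid.
unlucky = np.array([np.inf, -np.inf], dtype="f4")

# What np.ones always provides: finite, well-behaved values.
predictable = np.ones(shape=(2,), dtype="f4")

print("unlucky:", mean_warnings(unlucky))
print("predictable:", mean_warnings(predictable))
```

The first call is expected to report at least one RuntimeWarning and the second none, which is the property the np.ones change relies on.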

Contributor

@j-bennet j-bennet left a comment


👍

@jrbourbeau jrbourbeau changed the title [WIP] Fix flaky RuntimeWarning during array reductions Fix flaky RuntimeWarning during array reductions Mar 8, 2023
@jrbourbeau
Member Author

Okay, I've run CI 6 times here and haven't seen the test_reductions_2D failure. It's a pretty rare failure, so 6 runs may not be enough, but the changes here seem harmless regardless. Let's give it a try, and I'll keep an eye out for test_reductions_2D failures.

@jrbourbeau jrbourbeau merged commit 5f1fc42 into dask:main Mar 8, 2023
@jrbourbeau jrbourbeau deleted the array-reductions-use-ones-not-empty branch March 8, 2023 20:50