binary() output dtype differs across backends (float64 on numpy/dask vs float32 on cupy) #3511

Description

@brendancol

binary() returns a different result dtype depending on the array backend.

On numpy and dask+numpy the output keeps the input dtype, because _cpu_binary allocates np.empty(data.shape, dtype=data.dtype). On cupy and dask+cupy the output is always float32, because _run_cupy_binary allocates cupy.empty(..., dtype='f4').

Reproducer with a float64 input:

numpy      -> float64
dask       -> float64
cupy       -> float32
dask+cupy  -> float32

Every other classifier in classify.py routes through _cpu_bin, which fixes the output to float32, so binary is the only classifier whose result dtype depends on the backend. An output that preserves the input dtype also cannot hold the NaN sentinel that binary writes for non-finite cells when the input is an integer array.
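The integer-dtype point can be checked in plain NumPy. This is a minimal sketch, not code from classify.py: the variable names are made up for illustration, and a compiled kernel may fail differently (for example by writing an arbitrary integer instead of raising).

```python
import numpy as np

# Buffer allocated with an integer dtype, as would happen when the
# output dtype is copied from an integer input.
int_out = np.empty(3, dtype=np.int32)
try:
    int_out[0] = np.nan  # NaN has no integer representation
    stored_nan_in_int = True
except ValueError:
    stored_nan_in_int = False

# Buffer allocated as float32, as the cupy path already does.
float_out = np.empty(3, dtype=np.float32)
float_out[0] = np.nan  # float32 can represent NaN

print(stored_nan_in_int, np.isnan(float_out[0]))
```

In plain NumPy the integer assignment raises ValueError, while the float32 buffer stores the sentinel without trouble.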

Proposed fix: allocate float32 in _cpu_binary so all four backends agree.
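A rough sketch of the effect of that change, using a pure-NumPy stand-in rather than the real kernel. `binary_sketch` and its `preserve_input_dtype` flag are hypothetical names introduced only for this illustration; the actual _cpu_binary body is not shown in this issue and may be structured differently.

```python
import numpy as np

def binary_sketch(data, values, preserve_input_dtype=False):
    """Hypothetical stand-in for _cpu_binary (not the library's code)."""
    # Current CPU behaviour: copy the input dtype.
    # Proposed behaviour: always allocate float32.
    out_dtype = data.dtype if preserve_input_dtype else np.float32
    out = np.empty(data.shape, dtype=out_dtype)

    # 1 where the cell is one of `values`, 0 otherwise.
    out[...] = np.isin(data, values)

    # NaN sentinel for non-finite cells; only float inputs can have any.
    if np.issubdtype(data.dtype, np.floating):
        out[~np.isfinite(data)] = np.nan
    return out

data = np.array([[1.0, 2.0], [np.nan, 3.0]], dtype=np.float64)

current = binary_sketch(data, [1, 3], preserve_input_dtype=True)
proposed = binary_sketch(data, [1, 3])

print(current.dtype, proposed.dtype)
```

With the float32 allocation the CPU result dtype no longer follows the input, which would match what the cupy path is described as returning above.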

Found by the accuracy sweep (Cat 5, backend inconsistency).
