Skip to content

normalize dask paths use boolean fancy indexing, forcing materialization #1124

@brendancol

Description

@brendancol

Describe the bug

_run_dask_numpy_rescale (line 69), _run_dask_cupy_rescale (line 111), _run_dask_numpy_standardize (line 227), and _run_dask_cupy_standardize (line 257) all do data[finite_mask] where both are dask arrays. Dask does not support boolean fancy indexing lazily — it materializes the masked values into a single chunk.

Expected behavior

Use da.nanmin()/da.nanmax() for rescale and da.nanmean()/da.nanstd() for standardize. These are lazy per-chunk reductions that never materialize the full array.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workinghigh-priorityoomOut-of-memory risk with large datasets

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions