Skip to content

reproject vertical_crs path crashes on dask, cupy, and 3D multi-band inputs #2025

@brendancol

Description

@brendancol

The vertical datum shift inside reproject(..., src_vertical_crs=..., tgt_vertical_crs=...) only works for the 2-D in-memory numpy backend. Three valid input shapes raise on call:

1. Dask backend

_apply_vertical_shift calls .copy(), slices result[r0:r1], then assigns via boolean fancy indexing (strip_data[is_valid] += ...). Dask arrays do not support __setitem__ with a boolean mask, so the call raises before any output is produced.

Minimal repro:

import numpy as np, xarray as xr, dask.array as da
from xrspatial.reproject import reproject

data = da.from_array(np.random.rand(64, 64) * 100, chunks=(32, 32))
raster = xr.DataArray(
    data, dims=['y','x'],
    coords={'y': np.linspace(40, 30, 64), 'x': np.linspace(-80, -70, 64)},
    attrs={'crs':'EPSG:4326'},
)
reproject(raster, 'EPSG:4326',
          src_vertical_crs='EGM96', tgt_vertical_crs='ellipsoidal')
# ValueError: operands could not be broadcast together with shapes (nan,) (4096,)

2. CuPy backend

_apply_vertical_shift calls Numba JIT functions (_interp_geoid_2d) that only accept numpy arrays. With a cupy-backed raster, np.empty((n_rows, out_w)) plus the JIT call sees a numpy strip output but the result array is cupy, and the boolean-index update raises.

Repro yields TypeError: Unsupported type <class 'numpy.ndarray'> when result[is_valid] += ... mixes cupy result with numpy update.

3. 3D multi-band

For a (y, x, band) source, the reprojected result is (H, W, B) but is_valid has shape (n_rows, W) and N_strip has shape (n_rows, W). Broadcasting fails:

ValueError: operands could not be broadcast together with shapes (64,64,3) (64,64)

Why it matters

The public reproject() signature documents src_vertical_crs and tgt_vertical_crs as supported alongside dask, cupy, and multi-band inputs. There are no warnings or guards; users get a hard crash on three of the four backends.

Suggested fix

Rework _apply_vertical_shift to:

  1. Bail out and apply the shift via dask.array.map_blocks for dask inputs.
  2. Move cupy results back to numpy before the JIT geoid lookup (or use a cupy-native interpolation), then convert back.
  3. Loop over band slices for 3-D results so the geoid undulation array broadcasts correctly per band.

Cover each branch with a small test in xrspatial/tests/test_reproject.py.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions