Skip to content

zonal: rechunk apply 3D dask output along stacked axis (#2526)#2532

Merged
brendancol merged 1 commit into
mainfrom
deep-sweep-performance-zonal-2026-05-27
May 28, 2026
Merged

zonal: rechunk apply 3D dask output along stacked axis (#2526)#2532
brendancol merged 1 commit into
mainfrom
deep-sweep-performance-zonal-2026-05-27

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

  • zonal.apply on a 3D dask values array left output chunks at size 1 along the stacked axis because da.stack(layers, axis=2) was not followed by rechunk().
  • Rechunk the stacked output back to values_data.chunks[2] in both _apply_dask_numpy and _apply_dask_cupy.
  • Adds regression test_apply_dask_3d_axis2_rechunked_2526 and updates the sweep-performance state row for zonal.

Fixes #2526.

Background

Graph probe on a 256x256 raster with chunks=(64, 64, 3) and 3 bands previously yielded:

"apply_dask_3d": {
  "task_count": 176,
  "tasks_per_chunk": 11.0,
  "output_chunks_axis2": [1, 1, 1]
}

The documented anti-pattern from CLAUDE.md is da.stack without a following rechunk — see existing zonal hypsometric_integral audit notes that called out the same shape.

Test plan

  • pytest xrspatial/tests/test_zonal.py (126 passed)
  • New regression test asserts result.data.chunks[2] == values.data.chunks[2]
  • Existing test_apply_3d[dask+numpy] still passes

`zonal.apply` on a 3D dask `values` array calls `da.stack(layers,
axis=2)` to assemble per-band `map_blocks` outputs.  `da.stack`
creates unit chunks along the new axis, and the function did not
rechunk afterward, so the result had `chunks[2] = (1, 1, ..., 1)`
regardless of the input chunking.

Reproducing on 256x256 input with chunks=(64, 64, 3) confirmed
output chunks of `(1, 1, 1)` instead of `(3,)`.  Downstream
operations that touch all bands together paid scheduling overhead
proportional to the band count.

Fix: rechunk the stacked output back to the input's axis-2 chunking
in both `_apply_dask_numpy` and `_apply_dask_cupy`.

Adds regression test `test_apply_dask_3d_axis2_rechunked_2526` and
updates the sweep-performance state row for `zonal`.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 27, 2026
@brendancol
Copy link
Copy Markdown
Contributor Author

Review

Targeted fix, low risk. The da.stack -> rechunk({2: ...}) pattern is the documented remediation for unit-chunk-along-stack-axis (also referenced in the in-tree hypsometric_integral audit).

Verified

  • rechunk({2: values_data.chunks[2]}) only touches axis 2; axes 0/1 keep zones_data.chunks.
  • meta=np.array(()) / meta=cupy.array(()) from map_blocks propagates through da.stack and rechunk, so dtype/backend metadata is preserved.
  • N=1 band edge case: rechunk target becomes (1,), which matches what da.stack already produces - no-op, harmless.
  • Both _apply_dask_numpy (line 1691-1694) and _apply_dask_cupy (line 1734-1737) get the fix.
  • pytest xrspatial/tests/test_zonal.py -> 126 passed locally.

Notes (not blocking)

  • Regression test exercises _apply_dask_numpy only. The cupy path uses the same construction so the fix is symmetric, and this matches the existing test_apply_3d parametrization (['numpy', 'dask+numpy']). cupy parity test can land separately when the cupy CI runner exercises it.
  • No benchmark added; existing zonal benchmark already covers apply.

Looks ready to merge.

@brendancol brendancol merged commit 0d454eb into main May 28, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

zonal.apply 3D dask path leaves axis-2 chunked at size 1 after da.stack

1 participant