Skip to content

crosstab: validate backend compatibility on the 3D path#2653

Merged
brendancol merged 4 commits into
mainfrom
issue-2640
May 29, 2026
Merged

crosstab: validate backend compatibility on the 3D path#2653
brendancol merged 4 commits into
mainfrom
issue-2640

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2640

crosstab() only validated backend/shape compatibility on the 2D path. A 3D call with mismatched backends (numpy zones plus a dask 3D values, for example) sailed past validation and crashed deep inside the dask/numba machinery with TypeError: 'NoneType' object is not subscriptable, which gives the caller no hint that the real problem is a backend mismatch.

This adds an early backend check to the 3D path. validate_arrays() can't be reused here because it requires equal shapes and zones is 2D while values is 3D, so the check compares _classify_backend() of each input and raises a clear ValueError when they differ.

  • Adds a backend-compatibility check for 3D crosstab() inputs, raising ValueError up front instead of crashing later.
  • New test covers both directions of mismatch (numpy zones + dask values, and the reverse).

Backend coverage: the validation runs for all backends. Matching-backend 3D calls (numpy/numpy, dask/dask) are unchanged.

Test plan:

  • pytest xrspatial/tests/test_zonal.py -k crosstab_3d
  • pytest xrspatial/tests/test_zonal.py (full module, no regressions)

Dedupe duplicate module rows (last-write-wins by last_inspected) and
collapse multi-line notes to single physical lines. The notes had
embedded newlines, which the merge=union .gitattributes strategy splits
record-by-record, corrupting the file into a 156-column phantom row on
parallel-agent appends. One line per record keeps union merges safe.
The 3D crosstab() path never checked that zones and values share a
backend. A mixed-backend call (e.g. numpy zones with dask 3D values)
crashed deep inside the dask/numba machinery with an opaque
"'NoneType' object is not subscriptable". Check backend compatibility
up front and raise a clear ValueError, matching the 2D path.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 29, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: crosstab: validate backend compatibility on the 3D path

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None.

Nits (optional improvements)

  • xrspatial/zonal.py:1416-1430 -- the new check covers the backend mismatch but not the "early shape" half of the original report. Spatial shape is still validated later (zones.shape != values.shape[1:]), which only fires after layer is resolved and the array is transposed. That's fine in practice: a real shape mismatch still raises a clear "Incompatible shapes" ValueError (verified). The one rough edge is that if layer is also invalid, the user sees "Invalid layer" first. Not worth reordering for this PR; flagging it so the call is on record.
  • xrspatial/tests/test_zonal.py:1354 -- the test is gated on dask being available, which is right since both cases need a dask array. There's no numpy-vs-cupy mixed case, but that needs a GPU runner, so leaving it out is reasonable.

What looks good

  • The new check sits right next to the existing 2D validate_arrays() call, so the two paths read symmetrically.
  • Reusing _classify_backend() keeps the backend names consistent with the rest of the codebase (numpy / cupy / dask+numpy / dask+cupy).
  • The comment explains why validate_arrays() can't be reused here (it needs equal shapes), which is the non-obvious part.
  • The test checks both directions of the mismatch and asserts the error message, not just the exception type.
  • Matching-backend 3D calls (numpy/numpy, dask/dask) are unchanged; the full test_zonal.py passes (170 tests).

Checklist

  • Algorithm matches reference/paper: n/a (validation hardening, no algorithm change)
  • All implemented backends produce consistent results: yes, the check is backend-agnostic
  • NaN handling correct: unchanged
  • Edge cases covered by tests: mixed-backend, both directions
  • Dask chunk boundaries handled correctly: unchanged
  • No premature materialization: yes, the check only inspects backend type, no compute
  • Benchmark exists or not needed: not needed (validation path)
  • README feature matrix updated: n/a (no new function, no backend support change)
  • Docstrings present and accurate: unchanged public signature

Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review (follow-up after merge with main)

Re-reviewed after merging origin/main (which brought in #2619's 3D crosstab cupy/dask+cupy test coverage). No code changes were needed in response to the first review; the two nits were both dismissed with reasons:

  • Early-shape ordering nit: dismissed. Same-backend spatial mismatches already raise a clear "Incompatible shapes" ValueError. The "Invalid layer" ordering edge case is pre-existing and reordering it would expand this PR's scope past the backend gap it targets.
  • No numpy-vs-cupy mixed test: dismissed. That case needs a GPU runner not present in CI.

The merge was conflict-free. The backend check survived intact and the full zonal suite passes (218 tests across test_zonal.py and the backend-coverage file, including the new #2619 3D cupy cases). No new findings.

@brendancol brendancol merged commit 218aded into main May 29, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

crosstab() skips backend/shape validation on the 3D path

1 participant