crosstab: validate backend compatibility on the 3D path#2653
Merged
Conversation
Dedupe duplicate module rows (last-write-wins by last_inspected) and collapse multi-line notes to single physical lines. The notes had embedded newlines, which the merge=union .gitattributes strategy splits record-by-record, corrupting the file into a 156-column phantom row on parallel-agent appends. One line per record keeps union merges safe.
The 3D crosstab() path never checked that zones and values share a backend. A mixed-backend call (e.g. numpy zones with dask 3D values) crashed deep inside the dask/numba machinery with an opaque "'NoneType' object is not subscriptable". Check backend compatibility up front and raise a clear ValueError, matching the 2D path.
brendancol
commented
May 29, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
PR Review: crosstab: validate backend compatibility on the 3D path
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
- xrspatial/zonal.py:1416-1430 -- the new check covers the backend mismatch but not the "early shape" half of the original report. Spatial shape is still validated later (
zones.shape != values.shape[1:]), which only fires afterlayeris resolved and the array is transposed. That's fine in practice: a real shape mismatch still raises a clear "Incompatible shapes" ValueError (verified). The one rough edge is that iflayeris also invalid, the user sees "Invalidlayer" first. Not worth reordering for this PR; flagging it so the call is on record. - xrspatial/tests/test_zonal.py:1354 -- the test is gated on dask being available, which is right since both cases need a dask array. There's no numpy-vs-cupy mixed case, but that needs a GPU runner, so leaving it out is reasonable.
What looks good
- The new check sits right next to the existing 2D
validate_arrays()call, so the two paths read symmetrically. - Reusing
_classify_backend()keeps the backend names consistent with the rest of the codebase (numpy / cupy / dask+numpy / dask+cupy). - The comment explains why
validate_arrays()can't be reused here (it needs equal shapes), which is the non-obvious part. - The test checks both directions of the mismatch and asserts the error message, not just the exception type.
- Matching-backend 3D calls (numpy/numpy, dask/dask) are unchanged; the full test_zonal.py passes (170 tests).
Checklist
- Algorithm matches reference/paper: n/a (validation hardening, no algorithm change)
- All implemented backends produce consistent results: yes, the check is backend-agnostic
- NaN handling correct: unchanged
- Edge cases covered by tests: mixed-backend, both directions
- Dask chunk boundaries handled correctly: unchanged
- No premature materialization: yes, the check only inspects backend type, no compute
- Benchmark exists or not needed: not needed (validation path)
- README feature matrix updated: n/a (no new function, no backend support change)
- Docstrings present and accurate: unchanged public signature
brendancol
commented
May 29, 2026
Contributor
Author
brendancol
left a comment
There was a problem hiding this comment.
PR Review (follow-up after merge with main)
Re-reviewed after merging origin/main (which brought in #2619's 3D crosstab cupy/dask+cupy test coverage). No code changes were needed in response to the first review; the two nits were both dismissed with reasons:
- Early-shape ordering nit: dismissed. Same-backend spatial mismatches already raise a clear "Incompatible shapes" ValueError. The "Invalid
layer" ordering edge case is pre-existing and reordering it would expand this PR's scope past the backend gap it targets. - No numpy-vs-cupy mixed test: dismissed. That case needs a GPU runner not present in CI.
The merge was conflict-free. The backend check survived intact and the full zonal suite passes (218 tests across test_zonal.py and the backend-coverage file, including the new #2619 3D cupy cases). No new findings.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2640
crosstab()only validated backend/shape compatibility on the 2D path. A 3D call with mismatched backends (numpyzonesplus a dask 3Dvalues, for example) sailed past validation and crashed deep inside the dask/numba machinery withTypeError: 'NoneType' object is not subscriptable, which gives the caller no hint that the real problem is a backend mismatch.This adds an early backend check to the 3D path.
validate_arrays()can't be reused here because it requires equal shapes and zones is 2D while values is 3D, so the check compares_classify_backend()of each input and raises a clearValueErrorwhen they differ.crosstab()inputs, raisingValueErrorup front instead of crashing later.Backend coverage: the validation runs for all backends. Matching-backend 3D calls (numpy/numpy, dask/dask) are unchanged.
Test plan:
pytest xrspatial/tests/test_zonal.py -k crosstab_3dpytest xrspatial/tests/test_zonal.py(full module, no regressions)