
Fail loudly on unsupported GeoTIFF feature combinations (PR 5 of epic #2340) #2355

Merged
brendancol merged 4 commits into main from issue-2349
May 24, 2026


@brendancol
Contributor

Closes #2349. Parent epic #2340 (PR 5 of 6).

Summary

Surface a typed error at the entry point for the unsupported feature combinations listed in the release contract, instead of letting them slip through as silent no-ops or wrong output.

  • New cases rejected at parse time: VRTDataset subClass (Warped / Pansharpened / Processed / Derived), VRTRasterBand subClass (DerivedRasterBand), and unknown VRTRasterBand child elements (KernelFilteredSource, MaskBand, PansharpeningOptions, pixel-function tags, plus a catch-all for unrecognised tags). Known informational children keep passing.
  • New cases rejected at write time: write_vrt refuses sources with a rotated source transform, refuses cross-source AREA_OR_POINT mismatch, and refuses cross-source nodata mismatch unless the caller pins the mosaic nodata via the nodata kwarg.
  • New UnsupportedGeoTIFFFeatureError exported from xrspatial.geotiff. Subclasses ValueError so existing except ValueError callers keep catching the case.
  • Each message names the feature, names the offending input (kwarg / source path / band number), and points at SUPPORTED_FEATURES plus epic #2340 (GeoTIFF release contract and feature tiering).
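
To illustrate why subclassing ValueError keeps existing callers working, here is a minimal sketch. The class below is a local stand-in for the exception exported from xrspatial.geotiff, and `reject_feature` and `legacy_caller` are hypothetical helpers written for this example only; the message wording is an assumption, not the text the library emits.

```python
class UnsupportedGeoTIFFFeatureError(ValueError):
    """Local stand-in for the exception exported from xrspatial.geotiff."""


def reject_feature(feature: str, offending_input: str) -> None:
    # Hypothetical helper: builds a message that names the feature, names
    # the offending input, and points at SUPPORTED_FEATURES plus the epic.
    raise UnsupportedGeoTIFFFeatureError(
        f"Unsupported feature {feature!r} (from {offending_input}). "
        "See xrspatial.geotiff.SUPPORTED_FEATURES and epic #2340."
    )


def legacy_caller() -> str:
    # Pre-existing code that only knows about ValueError still catches it.
    try:
        reject_feature("VRTDataset subClass", "source path 'mosaic.vrt'")
    except ValueError as exc:
        return str(exc)
    return ""
```

Callers that want to treat the unsupported-feature case separately can catch the narrower class before a generic `except ValueError`.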

Out of scope

  • Experimental opt-in plumbing (PR 4).
  • Docstring tier audits (PR 3).
  • SUPPORTED_FEATURES reshape (PR 1).
  • Docs page (PR 2).

Backend coverage

  • numpy / cupy / dask+numpy / dask+cupy share parse_vrt and write_vrt; the new gates run before backend dispatch, so all four hit the same error.

Test plan

  • New regression tests under xrspatial/geotiff/tests/test_unsupported_features_2349.py cover each rejected case plus the existing rotated 6-tuple writer refusal and the rotated VRT GeoTransform read refusal.
  • Full VRT test set (-k vrt) still passes (709 tests).
  • Full rotated test set (-k rotated) still passes (99 tests).
  • Writer / drop-rotation / VRT-tiled-validation suites still pass.

Epic #2340 PR 5: detect unsupported GeoTIFF / VRT feature
combinations at the entry point and raise an actionable typed error
naming the feature and pointing at SUPPORTED_FEATURES.

Cases newly rejected:

* VRTDataset subClass attributes (VRTWarpedDataset,
  VRTPansharpenedDataset, etc.) rejected at parse time. The reader
  has no warp / pansharpen pipeline and silent dispatch on whatever
  simple sources happen to be embedded would drop the subclass
  semantics.
* VRTRasterBand subClass attribute (VRTDerivedRasterBand etc.)
  rejected at parse time. No pixel-function evaluator exists.
* Unknown VRTRasterBand child elements raise rather than silently
  skip. Known informational children (Description, UnitType,
  Offset, Scale, Metadata, ColorTable, ...) keep passing; known
  output-altering children (KernelFilteredSource, MaskBand,
  PansharpeningOptions, PixelFunction*, ...) and any unrecognised
  tag raise.
* write_vrt now refuses sources whose declared transform carries a
  non-zero skew term (rotated_affine), refuses cross-source
  AREA_OR_POINT mismatch, and refuses cross-source nodata mismatch
  unless the caller pins the mosaic nodata via the nodata kwarg.
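
The band-level gate described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in _vrt.py: the set contents are abbreviated, the `ComplexSource` and `PixelFunctionType` entries are assumed from the GDAL VRT format, `check_band_children` is a hypothetical name, and the exception class is a local stand-in.

```python
import xml.etree.ElementTree as ET


class UnsupportedGeoTIFFFeatureError(ValueError):
    """Local stand-in for the exception exported from xrspatial.geotiff."""


# Illustrative membership only; the real sets may differ.
_INFORMATIONAL_BAND_TAGS = frozenset(
    {"Description", "UnitType", "Offset", "Scale", "Metadata", "ColorTable"}
)
_SOURCE_BAND_TAGS = frozenset({"SimpleSource", "ComplexSource"})
_UNSUPPORTED_BAND_TAGS = frozenset(
    {"KernelFilteredSource", "MaskBand", "PansharpeningOptions",
     "PixelFunctionType"}
)


def check_band_children(band: ET.Element, band_number: int) -> None:
    # A subClass attribute (e.g. VRTDerivedRasterBand) is rejected outright.
    sub_class = band.get("subClass")
    if sub_class is not None:
        raise UnsupportedGeoTIFFFeatureError(
            f"band {band_number}: VRTRasterBand subClass={sub_class!r}"
        )
    for child in band:
        if child.tag in _INFORMATIONAL_BAND_TAGS or child.tag in _SOURCE_BAND_TAGS:
            continue  # known and safe to pass through
        # Known output-altering tags and unrecognised tags both raise.
        kind = "unsupported" if child.tag in _UNSUPPORTED_BAND_TAGS else "unrecognised"
        raise UnsupportedGeoTIFFFeatureError(
            f"band {band_number}: {kind} VRTRasterBand child <{child.tag}>"
        )
```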

New UnsupportedGeoTIFFFeatureError exported from xrspatial.geotiff.
Subclasses ValueError so existing ``except ValueError`` callers keep
catching the cases.

Regression test suite covers each gate plus the existing rotated
6-tuple writer refusal and the rotated VRT GeoTransform read
refusal so future refactors cannot regress them back to silent
fallback.

Backend coverage: numpy / dask+numpy / cupy / dask+cupy share the
parse_vrt and write_vrt entry points; the gates run before the
backend dispatch, so all four hit the same error.
github-actions bot added the performance label (PR touches performance-sensitive code) on May 24, 2026
Contributor Author

@brendancol left a comment


PR Review: Fail loudly on unsupported GeoTIFF feature combinations (PR 5 of epic #2340)

Blockers

None.

Suggestions

  • Dataset-level <MaskBand> slips through. Per the GDAL VRT spec, <MaskBand> sits as a child of <VRTDataset> (sibling of <VRTRasterBand>), not inside a band. The scan at _vrt.py:504-534 only walks band children, so a VRT carrying a dataset-level mask band still silently drops the mask. Putting MaskBand in _UNSUPPORTED_BAND_TAGS does not help. Either sweep the dataset root for MaskBand / GCPList after the subClass check at _vrt.py:390, or note that the gap is deliberate.

  • The three frozensets at _vrt.py:490-502 are rebuilt on every band iteration. Hoist them to module scope.

  • <OverviewList> and <Overview> (band-level VRT overview declarations) are in none of the three sets. The new "raise on unknown" branch at _vrt.py:520-534 will now reject a VRT that declares external overviews on a band -- the previous parser ignored those. If ignoring was intentional, add both tags to _INFORMATIONAL_BAND_TAGS. If rejecting is intentional, add a regression test.
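
A small sketch of the first suggestion, using only the standard library. It shows why a walk over band children cannot see a dataset-level `<MaskBand>`, and what a root-level sweep might look like. The sample XML, the set name `_UNSUPPORTED_DATASET_TAGS`, and the function `find_unsupported_dataset_children` are hypothetical and written for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written sample: MaskBand is a sibling of VRTRasterBand, not a child.
VRT_XML = """
<VRTDataset rasterXSize="4" rasterYSize="4">
  <VRTRasterBand dataType="Byte" band="1">
    <SimpleSource/>
  </VRTRasterBand>
  <MaskBand>
    <VRTRasterBand dataType="Byte"/>
  </MaskBand>
</VRTDataset>
"""

root = ET.fromstring(VRT_XML)

# What a band-children loop sees: only tags nested in top-level bands.
band_child_tags = [
    child.tag for band in root.findall("VRTRasterBand") for child in band
]

# Hypothetical dataset-level deny-list, assumed for this sketch.
_UNSUPPORTED_DATASET_TAGS = frozenset({"MaskBand", "GCPList"})


def find_unsupported_dataset_children(dataset: ET.Element) -> list:
    # Sweep the direct children of the dataset root.
    return [c.tag for c in dataset if c.tag in _UNSUPPORTED_DATASET_TAGS]
```

Here `band_child_tags` contains only `SimpleSource`, while the root-level sweep reports `MaskBand`.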

Nits

  • The new write-side checks at _vrt.py:1705-1828 are interleaved with the legacy pixel-size / dtype / CRS checks. A helper or even a comment block grouping the new gates would make the cross-source policy easier to follow. Not a correctness issue.

  • Error messages embed :data:`xrspatial.geotiff.SUPPORTED_FEATURES` in raw RST. Callers see this in tracebacks, not in rendered Sphinx output, so the :data: role is dead weight. Plain `xrspatial.geotiff.SUPPORTED_FEATURES` reads better.

What looks good

  • 13 tests cover each rejected case plus regression pins for the existing rotated-write and rotated-VRT-read paths.
  • UnsupportedGeoTIFFFeatureError subclasses ValueError, so existing except ValueError callers keep catching the case.
  • The mixed-nodata gate has an explicit nodata= opt-out and the test covers it.
  • Each new check has a comment naming the silent-failure mode it replaces.
  • The rotated 6-tuple writer refusal and the rotated VRT GeoTransform read refusal both pick up regression pins.

Checklist

  • VRT subClass attribute and GeoTransform skew terms match the GDAL VRT spec.
  • All four backends share parse_vrt and write_vrt, so the new gates hit uniformly.
  • No NaN-handling regressions; the new checks are metadata-only.
  • Edge cases covered (asymmetric nodata, derived band subClass, unknown band child).
  • No premature materialization; the gates are O(num_sources) attribute reads.
  • [n/a] Benchmark not needed (validation-only, runs before IO).
  • [n/a] README feature matrix unchanged (no new functions added).
  • Docstring present on UnsupportedGeoTIFFFeatureError.

…le (#2349)

Review fixes for PR 5 of epic #2340:

* Add dataset-level sweep for ``<MaskBand>``, ``<GCPList>``, and
  dataset-level ``<PansharpeningOptions>``. The band-children loop
  never saw these because they sit as siblings of
  ``<VRTRasterBand>``; without the new sweep a VRT carrying a
  dataset-level mask band silently dropped the mask.
* Hoist the three band-children classification frozensets
  (``_INFORMATIONAL_BAND_TAGS`` / ``_SOURCE_BAND_TAGS`` /
  ``_UNSUPPORTED_BAND_TAGS``) to module scope so parse_vrt does
  not rebuild them on every band iteration.
* Add ``OverviewList`` and ``Overview`` to the informational set so
  the new "raise on unknown band child" branch does not regress
  VRTs with band-level overview declarations -- the previous
  parser ignored these and the new gate now ignores them too,
  matching real-world GDAL-emitted VRTs.
* Group the new write-side cross-source checks (rotated source
  transform, mixed AREA_OR_POINT registration, mixed nodata) into
  three module-level helpers
  (``_check_no_rotated_source_transforms`` /
  ``_check_no_mixed_raster_type`` / ``_check_no_mixed_nodata``) so
  the cross-source policy is easy to read end-to-end. The pre-#2349
  pixel-size / dtype / band-count / CRS checks stay inline because
  they predate this PR.
* Drop the ``:data:`` Sphinx role from error messages. Callers see
  these strings in tracebacks, not rendered docs, so the plain
  ``xrspatial.geotiff.SUPPORTED_FEATURES`` reads better.

Three new tests pin the dataset-level MaskBand / GCPList refusals
and the OverviewList allow-list.
…2349)

test_features.TestPublicAPI.test_all_lists_supported_functions pins
the exact public surface of ``xrspatial.geotiff.__all__``. The new
``UnsupportedGeoTIFFFeatureError`` added in this PR's earlier
commit needs to be in the expected set too.

``float('nan') != float('nan')`` evaluates to True in plain Python, so
the cross-source nodata check would flag two perfectly consistent
NaN-sentinel sources as a mismatch. NaN nodata is the standard
sentinel for float32 / float64 GeoTIFFs, so this would have made
the new fail-closed gate too aggressive on real-world inputs.

Compare via ``math.isnan`` so two NaNs are equal. Add a regression
test that mosaics two sources both carrying NaN nodata and pins
that the writer no longer rejects them.
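
A minimal sketch of a NaN-aware comparison of this kind. The function name and the handling of `None` (no declared nodata) are assumptions for illustration; the actual helper in the codebase may differ.

```python
import math


def _is_nan(value) -> bool:
    # Only floats can be NaN; ints and None never are.
    return isinstance(value, float) and math.isnan(value)


def nodata_values_agree(a, b) -> bool:
    """Return True when two nodata sentinels should count as equal."""
    if a is None or b is None:
        # Assumption: "no nodata declared" only agrees with itself.
        return a is None and b is None
    if _is_nan(a) and _is_nan(b):
        # Plain == would report a mismatch here.
        return True
    return a == b
```

With this, two float sources that both declare NaN nodata pass the gate, while NaN against a finite sentinel still counts as a mismatch.
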
Contributor Author

@brendancol left a comment


Follow-up review (post-revision)

Re-reviewed after the address-review commits (510e515, 0ff6f06, 88e5904).

Disposition of prior findings

  • Dataset-level <MaskBand> / <GCPList> / <PansharpeningOptions> sweep -- fixed at _vrt.py:458-475. Two new regression tests pin the rejection.
  • Frozensets hoisted to module scope -- fixed at _vrt.py:366-400.
  • <OverviewList> and <Overview> band children kept informational -- fixed at _vrt.py:376. Regression test added.
  • Write-side checks grouped into helpers (_check_no_rotated_source_transforms, _check_no_mixed_raster_type, _check_no_mixed_nodata) -- done at _vrt.py:1646-1749.
  • :data: Sphinx role dropped from error messages -- done.

New findings from this pass

Self-caught while re-reviewing

Two sources both declaring NaN nodata would have hit the new mixed-nodata gate falsely because float('nan') != float('nan') is True in plain Python. Fixed in 88e5904 by routing the comparison through _nodata_values_agree (math.isnan-aware). Regression test pins the round-trip.

Blockers

None.

Suggestions

None remaining.

Nits

None remaining.

Verification

  • 17 tests in test_unsupported_features_2349.py pass.
  • Full xrspatial/geotiff/tests/ suite: 5358 passed, 68 skipped, 1 xfailed, 1 xpassed.

Ready for CI. No outstanding review items.

@brendancol merged commit 9c40df4 into main on May 24, 2026
6 of 7 checks passed
