You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The reader has tightened over time (test_crs_fail_closed_1929.py, test_crs_fail_closed_gpu_1929.py, test_coord_regularity_1720.py, test_rotated_transform_attr_1764.py) but several "guess and continue" paths remain. The reviewer's position, which I agree with: for foundational I/O, refusing ambiguous input is better than manufacturing a plausible-looking raster.
Unparseable CRS strings — partial WKT may be accepted and emitted unchanged on read, leading to mismatched crs vs crs_wkt attrs.
Rotated transforms — currently flagged in attrs but the reader does not reject; downstream functions assume axis-aligned and may produce wrong results.
Non-uniform coords on write — to_geotiff accepts coords that disagree with the implied transform; sentinel checks exist but coverage is uneven.
Mixed band metadata — VRT inputs with bands that declare different nodata sentinels are silently flattened.
Conflicting crs and crs_wkt on write — no consistency check.
Conflicting nodata aliases — attrs['nodata'] set to one value and attrs['nodatavals'] set to another. _resolve_nodata_attr picks one and ignores the other.
Proposal
Each of the above becomes a hard error by default, with an opt-in escape hatch where the legacy behavior was documented.
Design:
Add a validation layer in _validation.py (which already exists) and call it from every read and write entry point. The validator raises with a specific error class per case (subclasses of a new GeoTIFFAmbiguousMetadataError). Each case gets its own test in test_crs_fail_closed_* style — one new test file per case is fine, but prefer adding to existing fail-closed files where the topic matches.
unparseable CRS strings — raise on read, raise on write. No silent passthrough.
rotated transforms — read raises by default; opt-in allow_rotated=True keeps existing attr-flag behavior for users who want the array without geo-aware downstream ops.
mixed band metadata in VRT — read raises; opt-in band_nodata='first' retains current.
conflicting crs/crs_wkt — write raises if both are set and disagree (compare EPSG → WKT after canonicalization).
conflicting nodata aliases — write raises if attrs['nodata'] and attrs['nodatavals'] disagree. _FillValue is a CF alias and continues to be deprioritized.
Usage:
Users see a typed error pointing at the ambiguous field. Existing legitimate workflows surface through explicit opt-in kwargs.
Value:
A user complaining that "xrspatial gave me the wrong CRS" almost always traces to a guess we should have refused. Reviewer's words: "refusing ambiguous input is better than manufacturing plausible-looking rasters."
Stakeholders and Impacts
Read paths and write paths gain a validation step.
Downstream code that relied on the guess may need to set the new opt-in kwargs.
Existing fail-closed tests get extended; no expected removal.
Drawbacks
This is a behavior change. Some files that read today will start raising. Each case needs a release-note entry and a migration suggestion.
Alternatives
Per-case warnings instead of errors. Already tried — wkt_only_crs_warning is one example. Warnings get filtered out and users continue to be surprised.
Unresolved Questions
Single base error class with subclasses, or one error class per case? Probably subclasses, but worth confirming for except clarity.
Should the opt-in kwargs go on every entry point or live on a central strict_metadata= flag (defaulting to True)?
Additional Notes or Context
Source of this proposal: code review feedback on the geotiff module, suggestion 4 of 5. Each case should land as a separate PR per the project convention of one fix per PR.
Reason or Problem
The reader has tightened over time (
test_crs_fail_closed_1929.py,test_crs_fail_closed_gpu_1929.py,test_coord_regularity_1720.py,test_rotated_transform_attr_1764.py) but several "guess and continue" paths remain. The reviewer's position, which I agree with: for foundational I/O, refusing ambiguous input is better than manufacturing a plausible-looking raster.Specific cases still loose:
to_geotiff(crs=True)writes EPSG=1 silently (already filed as geotiff: to_geotiff(crs=True) writes EPSG=1 silently; bool slips through int check #1971).crsvscrs_wktattrs.to_geotiffaccepts coords that disagree with the implied transform; sentinel checks exist but coverage is uneven.crsandcrs_wkton write — no consistency check.attrs['nodata']set to one value andattrs['nodatavals']set to another._resolve_nodata_attrpicks one and ignores the other.Proposal
Each of the above becomes a hard error by default, with an opt-in escape hatch where the legacy behavior was documented.
Design:
Add a validation layer in
_validation.py(which already exists) and call it from every read and write entry point. The validator raises with a specific error class per case (subclasses of a newGeoTIFFAmbiguousMetadataError). Each case gets its own test intest_crs_fail_closed_*style — one new test file per case is fine, but prefer adding to existing fail-closed files where the topic matches.Per-case rough plan:
allow_rotated=Truekeeps existing attr-flag behavior for users who want the array without geo-aware downstream ops.band_nodata='first'retains current.crs/crs_wkt— write raises if both are set and disagree (compare EPSG → WKT after canonicalization).attrs['nodata']andattrs['nodatavals']disagree._FillValueis a CF alias and continues to be deprioritized.Usage:
Users see a typed error pointing at the ambiguous field. Existing legitimate workflows surface through explicit opt-in kwargs.
Value:
A user complaining that "xrspatial gave me the wrong CRS" almost always traces to a guess we should have refused. Reviewer's words: "refusing ambiguous input is better than manufacturing plausible-looking rasters."
Stakeholders and Impacts
Drawbacks
This is a behavior change. Some files that read today will start raising. Each case needs a release-note entry and a migration suggestion.
Alternatives
Per-case warnings instead of errors. Already tried —
wkt_only_crs_warningis one example. Warnings get filtered out and users continue to be surprised.Unresolved Questions
exceptclarity.strict_metadata=flag (defaulting to True)?Additional Notes or Context
Source of this proposal: code review feedback on the geotiff module, suggestion 4 of 5. Each case should land as a separate PR per the project convention of one fix per PR.