Promote user-defined CRS WKT to attrs['crs_wkt'] on read (#1632) #1635
…rib#1632)

GeoTIFFs with GeoKey *CSTypeGeoKey == 32767 store their CRS WKT in the GeoTIFF citation rather than referencing an EPSG code. extract_geo_info parsed the citation into geo_info.crs_name but left geo_info.crs_wkt unset, so _populate_attrs_from_geo_info emitted attrs['crs_name'] only. to_geotiff consults attrs['crs'] / attrs['crs_wkt'] but not 'crs_name', so a read -> write round-trip silently dropped the projection on user-defined CRS files. All four read backends were affected.

The fix promotes the citation to geo_info.crs_wkt when the citation parses as WKT (top-level WKT 1 / WKT 2 root keywords: PROJCS[, GEOGCS[, PROJCRS[, GEOGCRS[, COMPD_CS[, COMPOUNDCRS[, BOUNDCRS[, VERT_CS[, VERTCRS[, LOCAL_CS[, ENGCRS, PARAMETRICCRS, TIMECRS, DERIVEDPROJCRS). crs_name stays set for back-compat. Citations that are plain names (e.g. 'NAD83 / UTM Zone 12N') are unchanged because _looks_like_wkt rejects them.
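The WKT gate described above can be sketched as a prefix check on the citation string. This is a minimal illustration, not the code in _geotags.py: the names `looks_like_wkt_sketch` and `WKT_ROOT_KEYWORDS` are hypothetical, and three behaviours are assumptions of this sketch rather than facts from the PR: matching case-insensitively, allowing whitespace before the delimiter, and accepting `(` as well as `[` as the opening delimiter.

```python
import re

# Root keywords as listed in the commit message (WKT 1 and WKT 2).
WKT_ROOT_KEYWORDS = (
    "PROJCS", "GEOGCS", "PROJCRS", "GEOGCRS", "COMPD_CS", "COMPOUNDCRS",
    "BOUNDCRS", "VERT_CS", "VERTCRS", "LOCAL_CS", "ENGCRS",
    "PARAMETRICCRS", "TIMECRS", "DERIVEDPROJCRS",
)

# Assumption of this sketch: a root keyword only counts when it is
# followed by an opening delimiter, so a plain name that merely starts
# with one of these words is not mistaken for WKT.
_WKT_ROOT_RE = re.compile(
    r"\s*(?:" + "|".join(WKT_ROOT_KEYWORDS) + r")\s*[\[(]",
    re.IGNORECASE,
)


def looks_like_wkt_sketch(text):
    """Return True when text starts with a known WKT root keyword."""
    if not isinstance(text, str):
        return False
    # re.match anchors at the start of the string.
    return _WKT_ROOT_RE.match(text) is not None
```

A prefix check like this is cheap and does not need a CRS library at read time. The trade-off is that it does not validate the rest of the string, so a malformed WKT body would still pass the gate.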
Pull request overview
This PR fixes a GeoTIFF read-path regression where user-defined CRS GeoTIFFs (GeoKey *CSTypeGeoKey == 32767) were losing their CRS WKT on read, causing open_geotiff(...) -> to_geotiff(...) round-trips to silently drop projection information. The fix promotes WKT-looking citations to attrs['crs_wkt'] (while keeping attrs['crs_name'] for backward compatibility), and adds regression tests covering backend parity and round-trip behavior.
Changes:
- Add a lightweight _looks_like_wkt(...) heuristic and use it in extract_geo_info to populate crs_wkt from the citation when EPSG is not resolved.
- Add a comprehensive regression test module covering WKT detection, all four read backends (numpy/dask/cupy/dask+cupy), and read→write→read round-trips.
- Update internal sweep metadata CSV entry for issue tracking.
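
The promotion step and its effect on a round-trip can be sketched with plain Python objects. Everything here is a hypothetical stand-in, not the library's code: `GeoInfoSketch`, `promote_citation`, `attrs_from_geo_info`, and `crs_for_write` are invented for illustration, the WKT gate is abbreviated to two keywords, storing an EPSG integer under `attrs['crs']` is an assumption, and the sample WKT string is a truncated placeholder rather than a valid CRS definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeoInfoSketch:
    """Hypothetical stand-in for the geo_info object described in the PR."""
    crs_epsg: Optional[int] = None
    crs_name: Optional[str] = None
    crs_wkt: Optional[str] = None


def _starts_like_wkt(text):
    # Abbreviated gate: only two of the root keywords, for brevity.
    return text.lstrip().upper().startswith(("PROJCS[", "GEOGCS["))


def promote_citation(info):
    """Copy a WKT-looking citation into crs_wkt when no EPSG code resolved."""
    if info.crs_epsg is None and info.crs_wkt is None and info.crs_name:
        if _starts_like_wkt(info.crs_name):
            info.crs_wkt = info.crs_name  # crs_name is left in place
    return info


def attrs_from_geo_info(info):
    """Emit only the keys that are set (assumed attrs layout)."""
    attrs = {}
    if info.crs_epsg is not None:
        attrs["crs"] = info.crs_epsg
    if info.crs_wkt is not None:
        attrs["crs_wkt"] = info.crs_wkt
    if info.crs_name is not None:
        attrs["crs_name"] = info.crs_name
    return attrs


def crs_for_write(attrs):
    # Per the PR text, the writer consults 'crs' / 'crs_wkt', never 'crs_name'.
    return attrs.get("crs", attrs.get("crs_wkt"))


wkt = 'PROJCS["Custom Albers", GEOGCS["NAD83"]]'  # placeholder, not full WKT
before = attrs_from_geo_info(GeoInfoSketch(crs_name=wkt))
after = attrs_from_geo_info(promote_citation(GeoInfoSketch(crs_name=wkt)))
```

In this sketch `before` carries only `crs_name`, so `crs_for_write(before)` finds nothing to write, which corresponds to the dropped projection. `after` carries both `crs_wkt` and `crs_name`, so the writer lookup succeeds while existing consumers of `crs_name` keep working.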
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| xrspatial/geotiff/_geotags.py | Adds WKT-prefix detection and promotes citation WKT to crs_wkt for user-defined CRS reads. |
| xrspatial/geotiff/tests/test_user_defined_crs_wkt_1632.py | New regression tests for user-defined CRS WKT preservation across backends and round-trips. |
| .claude/sweep-metadata-state.csv | Updates sweep metadata entry to reflect issue #1632. |

    import pytest
    import xarray as xr

    import tifffile

Fixed in eb600ba: switched to pytest.importorskip.
This matches the convention used elsewhere in xrspatial/geotiff/tests/, and the test module no longer hard-fails collection on machines without tifffile installed.
Summary
- Promote the citation WKT to attrs['crs_wkt'] in extract_geo_info.
- The fix reaches all four read backends through the _populate_attrs_from_geo_info funnel.
- crs_name stays set for back-compat. The new _looks_like_wkt gate keeps plain names ("NAD83 / UTM Zone 12N") out of crs_wkt.

Fixes #1632.
Test plan
- test_user_defined_crs_wkt_1632.py covers _looks_like_wkt (positive + negative), backend parity (eager / dask / cupy / dask+cupy), the read -> write -> read round trip, and a non-regression check on EPSG-based files.
- test_attrs_parity_1548.py and test_nodata_attr_aliases_1582.py still pass (no key-set changes for EPSG files).
- … deepcopy recursion errors unrelated to this change.