Summary
Add Hypothesis-driven property tests for the GeoTIFF write/read round trip.
Round-trip coverage today is a long list of incident-specific files (test_metadata_round_trip_1484.py, test_descending_coords_1716.py, test_no_georef_writer_round_trip_1949.py, test_int_coords_round_trip_hotfix_1962.py, test_round_trip_invariants.py) plus one fuzz file (test_fuzz_hypothesis_1661.py) that covers dtype/codec/predictor but skips the metadata axes. The metadata axes (coord dtype, axis direction, degenerate shape, CRS/transform presence, nodata encoding, band axis position) are where most of the recent round-trip bugs have come from.
Why property tests instead of more example tests
Combinatorial blowup. Six axes with 3-4 values each is roughly 700-4,000 combinations (3^6 = 729 to 4^6 = 4096). Not feasible by hand.
The bugs behind the existing 200+ geotiff test files cluster around interactions between two axes (integer coords with degenerate shape, descending y with rotated transform, float nodata with int dtype). Single-axis example tests miss those.
Hypothesis shrinking gives a minimal failing case for free when a future writer change breaks one of the corners.
Property space
One strategy per axis, then a composite strategy that draws one value from each.
Coord dtype: sampled_from(['int32', 'int64', 'float32', 'float64'])
Axis direction: sampled_from(['asc_asc', 'asc_desc', 'desc_asc', 'desc_desc'])
Shape: sampled_from([(1, 1), (1, 8), (8, 1), (4, 5), (16, 16)])
CRS/transform presence: sampled_from(['crs_only', 'transform_only', 'both', 'neither'])
Nodata encoding: sampled_from(['in_range', 'out_of_range', 'fractional', 'nan', 'none'])
Band axis position: sampled_from(['band_first', 'band_last', 'no_band'])
Data dtype: sampled_from(['uint8', 'int16', 'int32', 'float32', 'float64'])
CRS: sampled_from([4326, 3857, 32633, 26910, None]) (None paired with crs_only=False)
Per-draw filtering for illegal combinations (e.g. nodata=nan with int dtype must promote to float on read, which is documented behaviour, not a failure).
Round-trip invariant
After two cycles:
assert da1 == da2 under semantic equality:
data: np.array_equal with NaN-aware compare
dtype: identical
dims: identical (including band axis position)
coords: per-axis np.allclose for float, np.array_equal for int; direction preserved
attrs['crs']: same int EPSG, or both absent
attrs['transform']: same GeoTransform tuple to 1e-9 relative tolerance, or both flagged no-georef via the marker from #2120 (geotiff: to_geotiff silently strips georef on int64 step-1 user coords)
attrs['nodata']: same value, or both absent; NaN compares equal to NaN
Fixed-point (da1 == da2) works because the writer is deterministic given the same input attrs. One full cycle is enough to surface drift.
File location
xrspatial/geotiff/tests/test_roundtrip_properties.py
Skip the module if hypothesis is not installed, same pattern as test_fuzz_hypothesis_1661.py.
Hypothesis profile
Default profile: settings(max_examples=200, deadline=None, suppress_health_check=[HealthCheck.too_slow])
CI profile registered as ci with max_examples=50 and derandomize=True for reproducibility
Seed printed on failure (Hypothesis default)
Backend scope
numpy and dask+numpy for the first pass. cupy and dask+cupy share the same writer/reader code but need a CUDA runner, so leave those for a follow-up once the numpy invariants are pinned.
Out of scope
Byte-for-byte file equality. The writer is allowed to reorder IFD tags, change strip layout, etc. The test_golden_corpus_*.py files cover byte stability where it matters.
Performance or timing assertions.
VRT, COG, overviews. Those have their own round-trip suites.
GPU code paths.
Tie-in
This file is meant to back-stop several adjacent contracts. Cross-reference them in the docstring:
xrspatial/geotiff/tests/test_backend_parity_matrix.py
test_masked_nodata_attr_2092.py and #2092 (Bug: attrs['masked_nodata'] reports True when masking was disabled)
Dev dependency
hypothesis is already imported by test_fuzz_hypothesis_1661.py via pytest.importorskip. It is not declared in pyproject.toml or setup.py. Adding it to a test extras group is a small follow-up but not blocking. The new file uses the same importorskip guard.
Acceptance
New file xrspatial/geotiff/tests/test_roundtrip_properties.py exists
ci and default profiles registered
test_backend_parity_matrix.py
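As a rough illustration of the attrs part of the round-trip invariant described in this issue, the sketch below compares the crs, transform, and nodata entries of two attrs dicts using only the standard library. The helper names (nodata_equal, transform_equal, attrs_semantically_equal) are hypothetical and not part of xrspatial. "Absent" is modelled as a missing key or None, and the no-georef marker from #2120 is not modelled at all, so treat this as a starting point under those assumptions, not as the final comparison logic.

```python
import math

# Relative tolerance for GeoTransform entries, taken from the invariant in this issue.
REL_TOL = 1e-9


def nodata_equal(a, b):
    """Same value, or both absent (None); NaN compares equal to NaN."""
    if a is None or b is None:
        return a is None and b is None
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
    return a == b


def transform_equal(a, b, rel_tol=REL_TOL):
    """Same tuple to a relative tolerance, or both absent (None).

    math.isclose uses abs_tol=0 by default, so a zero entry (for example
    a rotation term) only matches an exact zero on the other side.
    """
    if a is None or b is None:
        return a is None and b is None
    if len(a) != len(b):
        return False
    return all(math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b))


def attrs_semantically_equal(attrs1, attrs2):
    """Hypothetical helper: compare only the three attrs named in the invariant."""
    return (
        attrs1.get("crs") == attrs2.get("crs")
        and transform_equal(attrs1.get("transform"), attrs2.get("transform"))
        and nodata_equal(attrs1.get("nodata"), attrs2.get("nodata"))
    )


if __name__ == "__main__":
    first = {
        "crs": 4326,
        "transform": (30.0, 0.0, 500000.0, 0.0, -30.0, 4100000.0),
        "nodata": float("nan"),
    }
    second = dict(first, nodata=float("nan"))
    print(attrs_semantically_equal(first, second))
```

Keeping the three comparisons as separate small functions would let a failing property test report which attr drifted, which fits the goal of getting a minimal failing case from Hypothesis shrinking.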