Epic: Consolidate the GeoTIFF test suite
Background
xrspatial/geotiff/tests/ has grown to 354 test files / ~97k lines, plus one related file under xrspatial/tests/. The growth pattern is one file per GitHub issue: test_allow_rotated_geotiff_2115.py, test_backend_full_parity_2211.py, test_vrt_missing_sources_policy_1799.py, and so on.
The shape of the problem:
- Four backend-parity files, four attrs-contract files, three VRT-validation files, three rotated-CRS files. Each set covers nearly the same surface area with slightly different cases.
- Past "consolidated matrix" PRs landed without removing their predecessors. _1799 is a strict subset of _2367 but both still ship.
- File names read like a list of closed tickets. It is hard to look at the directory and see what is tested, what is duplicated, or what is missing.
- Some clusters run to 28 KB or more where a single parametrized test would do the same job.
Goal
Restructure xrspatial/geotiff/tests/ from 354 files down to ~50, organised by concern. Drop issue numbers from filenames (git log and PR descriptions remain the audit trail). This is a tests-only restructure, with no changes to xrspatial/geotiff/ source modules.
Target layout
xrspatial/geotiff/tests/
├── conftest.py (slim: fixtures + marker registration only)
├── _helpers/
│ ├── tiff_builders.py (make_minimal_tiff, relocated from conftest)
│ ├── tiff_surgery.py (relocated from current top-level)
│ └── markers.py (requires_gpu, requires_loopback, requires_integration)
├── read/
│ ├── test_basic.py
│ ├── test_dtypes.py
│ ├── test_compression.py
│ ├── test_tiling.py
│ ├── test_endianness.py
│ ├── test_nodata.py
│ ├── test_crs.py (rotated, dropped, missing, EPSG variants)
│ ├── test_coords.py
│ └── test_streaming.py
├── write/
│ ├── test_basic.py
│ ├── test_cog.py
│ ├── test_bigtiff.py
│ └── test_overview.py
├── vrt/
│ ├── test_validation.py
│ ├── test_missing_sources.py
│ ├── test_metadata.py
│ ├── test_window.py
│ └── test_dtype_conversion.py
├── attrs/
│ ├── test_contract.py
│ └── test_roundtrip.py
├── parity/
│ ├── test_backend_matrix.py
│ └── test_pixel_equality.py
├── release_gates/
│ └── test_stable_features.py
├── integration/
│ ├── test_http_sources.py
│ ├── test_dask_pipeline.py
│ └── test_gpu_pipeline.py
├── unit/
│ ├── test_header.py
│ ├── test_dtypes.py
│ ├── test_geotags.py
│ ├── test_safe_xml.py
│ └── test_compression.py
└── golden_corpus/ (unchanged; already well structured)
xrspatial/tests/test_geotiff_streaming_bigtiff_threshold_1785.py folds into geotiff/tests/read/test_streaming.py.
Conventions
- No issue numbers in filenames or test names. Git history is the trail.
- One file per concern. Each file owns a slice of the surface area and is the only place that slice is tested.
- Parametrize over backends × fixtures × variants with descriptive id= strings (id="rotated_no_crs[float32-deflate]", not id="bug_2126").
- make_minimal_tiff lives in _helpers/tiff_builders.py. conftest.py only holds fixtures and registers markers.
- Markers in one place: _helpers/markers.py (requires_gpu, requires_loopback, requires_integration). The pytest_collection_modifyitems socketserver hack in the current conftest.py is replaced by an explicit @requires_loopback marker on the affected tests.
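A minimal sketch of the probes that could sit behind the shared markers in _helpers/markers.py. The function names loopback_available and gpu_available are hypothetical, and treating an importable cupy as "GPU tests can run" is an assumption; the existing suite may detect both conditions differently. Only the standard-library probes are shown as code, and the pytest wiring is left in the closing comment.

```python
import importlib.util
import socket


def loopback_available(host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can bind to the loopback interface."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((host, 0))  # port 0 lets the OS pick a free port
        return True
    except OSError:
        return False


def gpu_available() -> bool:
    """Assumption: an importable cupy means GPU tests can run."""
    return importlib.util.find_spec("cupy") is not None


# In _helpers/markers.py these probes would back the shared markers, e.g.:
#   requires_loopback = pytest.mark.skipif(
#       not loopback_available(), reason="loopback interface unavailable")
#   requires_gpu = pytest.mark.skipif(
#       not gpu_available(), reason="cupy not installed")
```

Tests that need a local server would then carry @requires_loopback directly, which keeps the skip condition visible at the test rather than hidden in a collection hook.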
Rollout
Each PR is self-contained. It creates the directories it needs, lands the consolidated tests, runs the audit (below), and deletes the superseded files in the same commit. The order starts with the smallest, lowest-risk cluster and moves on to the larger ones.
PR 1 — Foundation + VRT missing-sources
- Adds _helpers/ (tiff_builders, tiff_surgery, markers) and the vrt/ directory.
- Collapses test_vrt_missing_sources_policy_1799.py + test_vrt_missing_sources_policy_2367.py into vrt/test_missing_sources.py.
- Smallest cluster; exercises the new layout end to end.
PR 2 — VRT validation
- test_vrt_validation_2321.py + test_vrt_capability_validator_2371.py + test_vrt_unsupported_2370.py → vrt/test_validation.py.
PR 3 — Rotated / dropped CRS
- test_allow_rotated_geotiff_2115.py + test_allow_rotated_crs_drop_2126.py + test_allow_rotated_no_crs_2122.py → read/test_crs.py, parametrized over (has_crs, has_rotation, drop_crs).
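The (has_crs, has_rotation, drop_crs) matrix could be generated along these lines. This is a sketch, not the planned test: build_cases and case_id are hypothetical helpers, and the rule that drop_crs only makes sense when the file has a CRS is an assumption about the three source files.

```python
import itertools


def case_id(has_crs: bool, has_rotation: bool, drop_crs: bool) -> str:
    """Build a descriptive test id from the three flags (no issue numbers)."""
    parts = [
        "rotated" if has_rotation else "axis_aligned",
        "with_crs" if has_crs else "no_crs",
    ]
    if drop_crs:
        parts.append("drop_crs")
    return "-".join(parts)


def build_cases():
    """Return (cases, ids) for the CRS matrix, skipping impossible combos."""
    cases, ids = [], []
    for has_crs, has_rotation, drop_crs in itertools.product([True, False], repeat=3):
        if drop_crs and not has_crs:
            continue  # assumption: nothing to drop when the file has no CRS
        cases.append((has_crs, has_rotation, drop_crs))
        ids.append(case_id(has_crs, has_rotation, drop_crs))
    return cases, ids


CASES, IDS = build_cases()

# With pytest, the two lists plug straight into the decorator:
#   @pytest.mark.parametrize("has_crs,has_rotation,drop_crs", CASES, ids=IDS)
#   def test_rotated_read(has_crs, has_rotation, drop_crs): ...
```

Building the ids from the flags keeps every collected test name readable in -v output and makes the audit mapping from old test functions to parametrized cases easier to write.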
PR 4 — Backend parity
- test_backend_parity_matrix.py + test_backend_full_parity_2211.py + test_backend_pixel_parity_matrix_1813.py + test_attrs_kwarg_parity_1561.py → parity/test_backend_matrix.py + parity/test_pixel_equality.py.
- Largest cluster by lines; biggest reviewer load.
PR 5 — Attrs contract
- The four test_attrs_*_1984.py files (canonical, aliases, passthrough, version) → attrs/test_contract.py.
PR 6 — VRT metadata / window / dtype
- The ~120 remaining VRT-prefixed files split across vrt/test_metadata.py, vrt/test_window.py, vrt/test_dtype_conversion.py.
- May split into two sub-PRs if the review surface is too large.
PR 7 — Writer / COG / BigTIFF
- ~20 files → write/test_basic.py, write/test_cog.py, write/test_bigtiff.py, write/test_overview.py.
PR 8 — Reader paths
- Remaining read-side files → read/*.py (basic, dtypes, compression, tiling, endianness, nodata, coords, streaming).
- Folds in xrspatial/tests/test_geotiff_streaming_bigtiff_threshold_1785.py.
PR 9 — Accessor / integration
- ~20 files → integration/test_http_sources.py, integration/test_dask_pipeline.py, integration/test_gpu_pipeline.py.
PR 10 — Release-gate registry
- Every @pytest.mark.release_gate test pulled into release_gates/test_stable_features.py.
PR 11 — Unit-level cleanup
- Header / dtypes / geotags / safe_xml / compression unit tests → unit/*.py.
- Slim conftest.py to fixtures only. Drop the pytest_collection_modifyitems socketserver hack. Final lint pass.
PR title pattern: geotiff tests: consolidate <cluster> cluster. PR body links the superseded issue numbers so traceability stays in the GitHub record.
Audit method (per PR)
Each cluster PR adds a temporary CLUSTER_AUDIT.md that maps old test functions to their new home. It is deleted before merge.
| Old file::test | New file::test_id | Notes |
| --- | --- | --- |
| test_allow_rotated_geotiff_2115.py::test_rotated_read_float32 | read/test_crs.py::test_rotated_read[float32-with_crs] | parametrized |
| test_allow_rotated_no_crs_2122.py::test_no_crs_warns | read/test_crs.py::test_rotated_read[float32-no_crs] | warning assertion preserved |
A row may legitimately collapse: multiple old tests can map to one parametrized case. Dropping a row requires a justification in the Notes column. The audit table is the gate, rather than a coverage diff or trust in the parametrize design.
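Because the audit table is the gate, it helps to check it mechanically. The sketch below parses a pipe-delimited table with the three columns described above (old test, new test id, notes) and reports rows that have neither a new test id nor a note. parse_audit_rows and unjustified_rows are hypothetical helpers, and the exact layout of CLUSTER_AUDIT.md is an assumption.

```python
def parse_audit_rows(markdown: str) -> list[tuple[str, str, str]]:
    """Extract (old, new, notes) tuples from a pipe-delimited table."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 3:
            continue
        if set(cells[0]) <= set("-: "):  # separator row such as | --- |
            continue
        if cells[0].lower().startswith("old "):  # header row
            continue
        rows.append((cells[0], cells[1], cells[2]))
    return rows


def unjustified_rows(markdown: str) -> list[str]:
    """Return old tests that are neither mapped nor explained in Notes."""
    return [
        old
        for old, new, notes in parse_audit_rows(markdown)
        if not new and not notes
    ]
```

A reviewer could run unjustified_rows over the file contents and expect an empty list before approving the cluster PR. The check only covers the table's shape; whether a mapping is actually equivalent still needs a human read.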
Reusing existing code
- make_minimal_tiff() (currently in conftest.py) already covers compression, tiling, geotransform, byte order, and BigTIFF. Relocate, do not rewrite.
- _tiff_surgery.py helpers move under _helpers/ unchanged.
- Golden corpus oracle (golden_corpus/_oracle.py) is already the right pattern; leave alone.
- Existing fixtures (simple_float32_tiff, simple_uint16_tiff, geo_tiff_data, tiled_tiff_data) stay in conftest.py.
- Markers keep their existing names (requires_gpu, requires_loopback, requires_integration).
Verification (per PR)
- pytest xrspatial/geotiff/tests/ -x -q passes on a clean checkout.
- pytest xrspatial/geotiff/tests/<new_cluster_dir>/ -v lists the expected parametrized IDs.
- CLUSTER_AUDIT.md is filled in: every old file::test row is mapped or justified as dropped.
- git diff --stat shows old cluster files deleted, not left behind.
- find xrspatial/geotiff/tests -name 'test_*.py' | wc -l drops, trending toward ~50 files over the full epic.
- CI green across numpy / cupy / dask+numpy / dask+cupy backends, including release-gate jobs.
Definition of done
After PR 11:
- find xrspatial/geotiff/tests -name 'test_*.py' | wc -l is around 50.
- No filename matches test_*_[0-9]{4,}.py.
- pytest xrspatial/ -q is green across all backends.
- Coverage on xrspatial/geotiff/ is at or above the baseline captured before PR 1 lands.
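The first two checks can be scripted. This is a sketch that reads the pattern test_*_[0-9]{4,}.py as "a test file whose name ends in an underscore followed by four or more digits", which is an assumption about intent; audit_test_tree is a hypothetical helper.

```python
import re
from pathlib import Path

# Regex reading of the pattern test_*_[0-9]{4,}.py (an assumption about intent).
ISSUE_SUFFIX = re.compile(r"^test_.*_[0-9]{4,}\.py$")


def audit_test_tree(root: Path) -> tuple[int, list[str]]:
    """Return (number of test files, sorted names that still carry an issue number)."""
    files = sorted(root.rglob("test_*.py"))
    offenders = sorted(p.name for p in files if ISSUE_SUFFIX.match(p.name))
    return len(files), offenders
```

Calling audit_test_tree(Path("xrspatial/geotiff/tests")) after PR 11 should give a count near 50 and an empty offender list.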
Out of scope
- Changes to xrspatial/geotiff/ source modules.
- The golden_corpus/ subdirectory.
- Test suites outside xrspatial/geotiff/tests/, except the one BigTIFF threshold file noted above.