geotiff: add fsspec backend to golden corpus matrix (#1930) #2076

Merged: brendancol merged 2 commits into main from 1930-fsspec-coverage on May 18, 2026

@brendancol
Contributor

Summary

Closes the last gap in #1930's backend matrix. Every other backend already has a test_golden_corpus_<backend>_1930.py module that runs the manifest fixtures through compare_to_oracle; fsspec was the one left over from the proposal text ("fsspec where applicable").

How it works: each fixture's bytes get written into fsspec's in-process memory:// filesystem (no credentials, no localstack needed), then read back through open_geotiff via _CloudSource and compared against the on-disk rasterio oracle. Same module shape as the eager / dask / GPU modules: _PARITY_GAPS, _FSSPEC_SKIPS, _INTENTIONAL_SKIPS tables, _build_param plumbing, fast_slow_marks_for from golden_corpus/_marks.py, and a test_taxonomy_ids_are_in_manifest guard.
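
The write-then-read round trip described above can be sketched with fsspec alone. This is a minimal illustration, not the test module's code: the path corpus/fixture.tif and the placeholder payload are made up, and the read goes through plain fsspec.open rather than open_geotiff / _CloudSource, whose signatures are not shown in this PR text.

```python
import fsspec

# Placeholder payload: a little-endian TIFF magic header plus padding.
# A real test would read the fixture's .tif bytes from disk instead.
payload = b"II*\x00" + bytes(12)

# The "memory" protocol is an in-process filesystem: no network, no credentials.
fs = fsspec.filesystem("memory")
fs.pipe("/corpus/fixture.tif", payload)  # hypothetical path

# Read the bytes back through a memory:// URL, as a URL-based reader would.
with fsspec.open("memory://corpus/fixture.tif", "rb") as f:
    round_tripped = f.read()

assert round_tripped == payload

# Remove the file so this sketch leaves nothing behind in the shared store.
fs.rm("/corpus/fixture.tif")
```

Because the bytes never leave the process, the same pattern runs in CI without localstack or cloud credentials.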

Two extras worth calling out: test_fsspec_candidate_is_actually_numpy pins the contract that the cloud eager path returns numpy (not dask), and a per-test fixture calls fs.store.clear() so the process-global memory filesystem cannot leak state between tests.
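
The state-leak concern can be shown in a few lines. The sketch below is an assumption about the general pattern, not the PR's fixture: it uses a plain context manager named clean_memory_fs (a hypothetical name) so it runs without pytest; in a test module the same body would sit around the yield of a pytest fixture.

```python
from contextlib import contextmanager

from fsspec.implementations.memory import MemoryFileSystem


@contextmanager
def clean_memory_fs():
    """Yield the memory filesystem with an empty store, and empty it again on exit."""
    fs = MemoryFileSystem()
    fs.store.clear()  # drop anything an earlier test left behind
    try:
        yield fs
    finally:
        fs.store.clear()  # do not leak this test's files to the next one


# The store is shared across the process, so a file written here...
with clean_memory_fs() as fs:
    fs.pipe("/leak/a.tif", b"II*\x00")  # hypothetical path
    seen_inside = fs.exists("/leak/a.tif")

# ...would still be visible from a fresh handle if the store were not cleared.
seen_after = MemoryFileSystem().exists("/leak/a.tif")
```

Clearing on both entry and exit means a test starts clean even if an earlier one was interrupted before its teardown ran.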

Pass / skip counts

Module alone: 32 passed, 2 skipped, 0 xfails.

The two skips are the same ones every other backend module hits:

  • nodata_miniswhite_uint8: intentional, MinIsWhite inversion is asserted in test_miniswhite_backend_parity_1797.py.
  • example_tiled_uint16_deflate_pred2: schema-only manifest entry, no .tif on disk.
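
One way to picture the skip tables and the test_taxonomy_ids_are_in_manifest guard is the sketch below. The table shape (fixture id mapped to a reason string), the manifest set, the id example_fixture_a, and the helper unknown_taxonomy_ids are all assumptions for illustration; the real module may structure them differently.

```python
# Hypothetical manifest IDs; the real manifest lists many more fixtures.
MANIFEST_IDS = {
    "nodata_miniswhite_uint8",
    "example_tiled_uint16_deflate_pred2",
    "example_fixture_a",  # made-up id for illustration
}

# Assumed table shape: fixture id -> reason string.
_PARITY_GAPS = {}   # known mismatches against the oracle (empty in this PR)
_FSSPEC_SKIPS = {}  # fsspec-specific skips (empty in this PR)
_INTENTIONAL_SKIPS = {
    "nodata_miniswhite_uint8": "MinIsWhite inversion is asserted in a dedicated test",
}


def unknown_taxonomy_ids(manifest_ids, *tables):
    """Return table keys that do not name a fixture in the manifest."""
    listed = set().union(*(table.keys() for table in tables))
    return sorted(listed - set(manifest_ids))


stale = unknown_taxonomy_ids(
    MANIFEST_IDS, _PARITY_GAPS, _FSSPEC_SKIPS, _INTENTIONAL_SKIPS
)
assert stale == []
```

A guard of this kind catches a renamed or deleted fixture whose id lingers in a skip table, where it would otherwise silently stop applying.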

_PARITY_GAPS and _FSSPEC_SKIPS both came out empty. The shared codec/attrs gaps that closed for the eager backend (integer nodata masking, RGB band axis order, citation-only CRS) close here too because the fsspec eager path runs through the same decode primitives once _CloudSource.read_all() lands the bytes in memory.

Full corpus matrix after this PR: 350 passed, 12 skipped.

Test plan

  • pytest xrspatial/geotiff/tests/test_golden_corpus_fsspec_1930.py -v (32 passed, 2 skipped)
  • pytest xrspatial/geotiff/tests/test_golden_corpus_*_1930.py xrspatial/geotiff/tests/golden_corpus/ -q (350 passed, 12 skipped, no failures)

Final corpus-coverage backend per issue #1930's proposal text
("fsspec where applicable"). Pushes each fixture's bytes into
fsspec's in-process memory filesystem and reads them back through
open_geotiff via _CloudSource, so the cloud read path exercises
the same fixtures the eager / dask / GPU / VRT / HTTP modules do
without needing real credentials or a localstack-style service.

The skip / xfail taxonomy starts empty: every shared codec/attrs
gap has been closed at the oracle layer, and the fsspec eager
path piggybacks on the eager decode primitives. Only the
intentional MinIsWhite skip and the schema-only fixture (no
.tif on disk) come up as skips.

github-actions bot added the "performance" label (PR touches performance-sensitive code) on May 18, 2026

Review follow-up on #2076:

* Drop the leading underscore on the ``memory_fs_clean`` pytest
  fixture. Pytest fixtures are public test API; the underscore
  convention is for module-private helpers, not fixtures. Existing
  modules in the repo don't use it.
* Rename the overview-factory lambda parameter ``lvl`` to
  ``level`` so the variable matches the kwarg it forwards to and
  stays consistent with the other backend modules' factories.

The third nit (repeated ``open(path, 'rb')`` reads) was dismissed
as low-value and left as-is.

brendancol merged commit 1b0189f into main on May 18, 2026
4 of 5 checks passed