Skip to content

geotiff: make _FileSource a context manager (#2449)#2479

Merged
brendancol merged 2 commits into
mainfrom
issue-2449
May 27, 2026
Merged

geotiff: make _FileSource a context manager (#2449)#2479
brendancol merged 2 commits into
mainfrom
issue-2449

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Adds __enter__ / __exit__ to _FileSource so callers can write with _FileSource(path) as src: instead of wrapping every open in a try / finally.

Sweeps the mechanical call sites in _vrt_validation.py, _vrt.py, _backends/gpu.py, and the four test modules that exercised the helper directly. The mmap-cache refcount semantics are unchanged; __exit__ just calls the existing close().

Two production sites are left as-is. _reader._read_to_array and _backends/gpu._read_geotiff_gpu_eager both wrap a several-hundred-line function body in a try / finally and mix _FileSource with _CloudSource. Converting those would mean either teaching _CloudSource the protocol or restructuring large blocks, which is outside the mechanical scope flagged in #2449.

Test plan

  • New tests xrspatial/geotiff/tests/test_file_source_context_2449.py cover __enter__ returning self, refcount release on normal exit, release on exception, double-close safety, and nested-with refcount accounting.
  • pytest xrspatial/geotiff/tests/ (non-gpu / non-integration): 5096 passed, 65 skipped, 2 xfailed.
  • pytest xrspatial/geotiff/tests/gpu/: 384 passed, 3 skipped.
  • pytest xrspatial/geotiff/tests/integration/: 474 passed.

Closes #2449.

Add __enter__/__exit__ to _FileSource so callers can use a `with`
block instead of try/finally around close(). Sweep the mechanical
call sites in _vrt_validation, _vrt, _backends/gpu, and the four
test modules that exercised the helper.

Two production sites are intentionally left alone: _reader._read_to_array
and _backends/gpu._read_geotiff_gpu_eager both wrap a several-hundred-line
function body in the try/finally and mix _FileSource with _CloudSource,
which is outside the mechanical scope flagged by #2449.

Adds xrspatial/geotiff/tests/test_file_source_context_2449.py covering
__enter__ returning self, refcount release on normal exit, release on
exception, double-close safety, and nested-with refcount accounting.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 27, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: geotiff: make _FileSource a context manager (#2449)

Blockers

None.

Suggestions

None. The refactor is mechanical and the behaviour is preserved at every call site.

Nits

  • xrspatial/geotiff/tests/test_file_source_context_2449.py:21 uses __import__('os').path.realpath(path) instead of a top-level import os. The inline import keeps the helper self-contained but is unusual; a top-level import os reads more naturally.
  • xrspatial/geotiff/tests/test_file_source_context_2449.py:50 has a del src line annotated "unreachable" after a call that always raises. Dead code; can drop the line and the trailing comment.
  • xrspatial/geotiff/_sources.py:332 returns False from __exit__ explicitly. That matches the implicit None return; either is fine, but dropping the return keeps the body to a single line.

What looks good

  • __enter__ returns self, __exit__ calls the existing close(). Idiomatic, and the mmap-cache refcount semantics are preserved.
  • _vrt_validation.py:559 flattens the nested try / except / finally into try / with / except. The outer except catches OSError both from the constructor (matching the old inner try) and from the read/parse calls, so the observable behaviour is the same with fewer indentation levels.
  • _vrt.py:1958 is a latent fix in addition to the cleanup. The old loop only called src.close() on the success path, so a parse_header exception would leak a cache refcount. The with block closes on both paths.
  • New tests cover __enter__ identity, refcount release on normal exit, refcount release on exception, double-close safety, and nested-with refcount accounting.
  • The PR body is explicit about the two intentionally-skipped production sites (_reader._read_to_array, _backends/gpu._read_geotiff_gpu_eager). Both wrap several hundred lines and mix _FileSource with _CloudSource, so converting them is outside the mechanical scope #2449 was scoped to.

Checklist

  • Algorithm matches reference/paper: N/A (helper refactor)
  • All implemented backends produce consistent results: no backend-specific behaviour changes
  • NaN handling is correct: N/A
  • Edge cases are covered by tests: exception path, double-close, nesting all covered
  • Dask chunk boundaries handled correctly: N/A
  • No premature materialization or unnecessary copies
  • Benchmark exists or is not needed: not needed
  • README feature matrix updated (if applicable): not applicable
  • Docstrings present and accurate: __enter__ / __exit__ are self-evident protocol methods; the class docstring already covers the source

Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review: nits addressed

Original nits, all fixed in a44d532:

  • _sources.py __exit__ no longer returns False; falls through to implicit None.
  • test_file_source_context_2449.py now uses a top-level import os instead of __import__('os').
  • test_exit_releases_on_exception drops the unreachable del src line and the trailing # always raises comment.

Re-ran pytest xrspatial/geotiff/tests/test_file_source_context_2449.py: 5 passed.

No further findings.

@brendancol brendancol merged commit bb13d56 into main May 27, 2026
6 of 7 checks passed
brendancol added a commit that referenced this pull request May 27, 2026
…2440)

Two follow-ups after the consolidation surfaced fresh issues:

1. CI fast-lane failure (ImportError: lerc) in
   ``test_dask_numpy_parity[compression_lerc_float32]``. The original
   top-level ``test_golden_corpus_dask_numpy_1930.py`` was never
   reached by the ``pytest xrspatial/geotiff/tests/golden_corpus/``
   workflow target, so a latent bug in its ``_build_param`` (no call
   to ``fast_slow_marks_for``) was invisible. The Sub-PR B relocation
   moved the file into ``golden_corpus/`` per the issue spec, and the
   fast-lane CI lane started running it without the ``slow`` filter
   applying. Wire ``fast_slow_marks_for`` into the three dask /
   GPU backend modules (``test_dask_numpy``, ``test_dask_gpu``,
   ``test_gpu``) so the manifest's tagged-slow ``compression_*``
   fixtures (including the optional-dep ``lerc`` cell) drop out of
   ``pytest -m "not slow"`` the same way they do in the eager
   backend module. Nightly / push-to-main lanes still exercise the
   full set.

2. The merge from origin/main pulled in
   ``test_file_source_context_2449.py`` (PR #2479's coverage for the
   new ``_FileSource`` context-manager protocol). That landed at the
   top level with a numeric suffix, pushing the Definition-of-Done
   counts off-target (top-level=6, suffixed=1). Folded into
   ``read/test_basic.py`` alongside the other input-source variants
   so the counts return to 5 top-level / 0 suffixed.

Verified locally:
- ``pytest xrspatial/geotiff/tests/golden_corpus/ -m "not slow" -q``
  green with 30 deselected (the slow compression cells, including
  ``lerc``).
- ``pytest xrspatial/geotiff/tests/golden_corpus/ -q`` green (full
  lane: 339 passed, 25 skipped).
- ``pytest xrspatial/geotiff/tests/ -q`` green (5941 passed, 81
  skipped, 2 xfailed).
brendancol added a commit that referenced this pull request May 27, 2026
…pus / tail (#2440) (#2480)

* geotiff tests: consolidate closing cluster 16 — features / golden-corpus / tail (#2440)

Final cluster of long-tail epic #2424. Folds 26 top-level test files
into their natural homes and lands the final audit pass.

Sub-PR A — features / tiers (4 files) → release_gates/test_features.py:
- test_features.py (relocated + retained as the spine of the merged file)
- test_supported_features_shape_2348.py (section header)
- test_supported_features_tiers_2137.py (section header)
- test_unsupported_features_2349.py (section header)
- test_vrt_stable_only_2443.py (release-contract gate, folded too)

Sub-PR B — golden-corpus relocation (13 files) → golden_corpus/:
Moved with rename (drop _1930 suffix). Rewires the relative
._helpers import in test_http.py to ..\_helpers after the move.

Sub-PR C — uncategorised tail (11 files) → individual homes per the
issue's placements. test_remaining_fail_closed_1987.py lands in
unit/test_metadata.py alongside the existing _1987 framework tests.

Out-of-scope cleanup: five numbered files left at the top level by
prior clusters folded in (test_inconsistent_geokeys_2417,
test_overview_kernels_2413, test_invalid_int_nodata_rejection_2441,
test_compound_crs_reject_2418), plus the three remaining numbered
top-level files renamed to drop the suffix (test_fuzz_hypothesis_1661,
test_namespace_no_leak_1708, test_polish_1488). The two near-duplicate
round-trip modules consolidated into one test_round_trip.py.

Release-gate citations in docs/source/reference/release_gate_geotiff.rst,
docs/source/reference/geotiff.rst, golden_corpus_baselines.rst, the
package docstring, and a stray code comment in _header.py updated to
the new paths. The release-gate citation meta-gate now passes.

Definition-of-done audit (epic #2424):
- find xrspatial/geotiff/tests -name 'test_*.py' | wc -l → 72 (target 60–80)
- find xrspatial/geotiff/tests -maxdepth 1 -name 'test_*.py' | wc -l → 5 (target ≤5)
- filenames matching test_*_[0-9]{4,}.py → 0
- pytest xrspatial/geotiff/tests/ -q → green

Tests-only.

Closes #2425
Closes #2434
Closes #2438
Closes #2440
Closes #2424

* Address review: rename property-test helpers, fix stale citation (#2440)

- test_round_trip.py: rename second-section _assert_fixed_point /
  _LOCKED_ATTRS to _assert_property_fixed_point / _PROPERTY_LOCKED_ATTRS
  so the invariant tests at lines 149-682 keep their original
  byte-equivalent tolerance (np.testing.assert_array_equal +
  pytest.approx 1e-15) and the Hypothesis property tests use their
  intended math.isclose tolerance. Both helpers now coexist instead of
  one shadowing the other on module load.
- test_round_trip.py: drop the duplicate numpy/pytest/xarray imports
  in the Hypothesis section since they're already imported at the top.
- read/test_georef.py: update stale citation pointing at the deleted
  test_remaining_fail_closed_1987.py to point at the new home in
  unit/test_metadata.py.
- release_gates/test_features.py: add a docstring note that the
  release_gate markers are localised to the contract sections so
  pytest -m release_gate selects the right subset.

* Delete CLUSTER_AUDIT_CLOSE.md (epic #2424 hard gate for cluster 16 / #2440)

Final pre-merge gate. The cluster-16 audit doc was a working artefact
used to capture per-section moves, helper renames, and citation
rewires while building the consolidation. Every prior cluster issue
deleted its audit doc in a dedicated final commit before merge
(#2474, #2476). Closing cluster follows the same convention.

* Fix CI: gate slow corpus fixtures in dask backends + fold #2449 test (#2440)

Two follow-ups after the consolidation surfaced fresh issues:

1. CI fast-lane failure (ImportError: lerc) in
   ``test_dask_numpy_parity[compression_lerc_float32]``. The original
   top-level ``test_golden_corpus_dask_numpy_1930.py`` was never
   reached by the ``pytest xrspatial/geotiff/tests/golden_corpus/``
   workflow target, so a latent bug in its ``_build_param`` (no call
   to ``fast_slow_marks_for``) was invisible. The Sub-PR B relocation
   moved the file into ``golden_corpus/`` per the issue spec, and the
   fast-lane CI lane started running it without the ``slow`` filter
   applying. Wire ``fast_slow_marks_for`` into the three dask /
   GPU backend modules (``test_dask_numpy``, ``test_dask_gpu``,
   ``test_gpu``) so the manifest's tagged-slow ``compression_*``
   fixtures (including the optional-dep ``lerc`` cell) drop out of
   ``pytest -m "not slow"`` the same way they do in the eager
   backend module. Nightly / push-to-main lanes still exercise the
   full set.

2. The merge from origin/main pulled in
   ``test_file_source_context_2449.py`` (PR #2479's coverage for the
   new ``_FileSource`` context-manager protocol). That landed at the
   top level with a numeric suffix, pushing the Definition-of-Done
   counts off-target (top-level=6, suffixed=1). Folded into
   ``read/test_basic.py`` alongside the other input-source variants
   so the counts return to 5 top-level / 0 suffixed.

Verified locally:
- ``pytest xrspatial/geotiff/tests/golden_corpus/ -m "not slow" -q``
  green with 30 deselected (the slow compression cells, including
  ``lerc``).
- ``pytest xrspatial/geotiff/tests/golden_corpus/ -q`` green (full
  lane: 339 passed, 25 skipped).
- ``pytest xrspatial/geotiff/tests/ -q`` green (5941 passed, 81
  skipped, 2 xfailed).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: make _FileSource a context manager (cleanup follow-up to #2444)

1 participant