Skip to content

Wrap eager source lifetime around read_all() (#2322)#2325

Merged
brendancol merged 1 commit into
mainfrom
issue-2322
May 23, 2026
Merged

Wrap eager source lifetime around read_all() (#2322)#2325
brendancol merged 1 commit into
mainfrom
issue-2322

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2322.

Summary

  • _read_to_array used to call src.read_all() BEFORE entering the
    try/finally block that calls src.close(). Any exception from
    read_all() (fsspec network failure, transient S3 error, local I/O
    failure) would skip the cleanup.
  • The fix moves the try to start right after src is constructed
    and pulls src.read_all() inside the protected region. Mirrors the
    close-on-error contract _read_cog_http already enforces (geotiff: _read_cog_http skips source.close() when tile fetch raises #1816).
  • _CloudSource.close() is a no-op today, so this is a structural
    guard rather than a fix for an observable leak. The point is to keep
    the cleanup intact before any future resource-holding source
    (pinned credentials, persistent fsspec sessions, cached file
    handles) gets added.

Backend coverage

N/A. Pure correctness fix in the eager-read entry point. No backend
dispatch involved.

Test plan

  • New regression tests in
    xrspatial/geotiff/tests/test_eager_source_close_on_error_2322.py
    cover the _FileSource, _BytesIOSource, and _CloudSource
    branches by injecting fake sources whose read_all() raises
    and asserting close() was still called.
  • Confirmed the new tests fail without the fix and pass with it.
  • Existing reader/source tests still pass
    (test_bytesio_source, test_cog_http_close_on_error_1816,
    test_cloud_read_byte_limit_1928,
    test_open_geotiff_missing_sources_1810 — 40 tests).

``_read_to_array`` constructed a source (``_FileSource``,
``_BytesIOSource``, or ``_CloudSource``) and immediately called
``src.read_all()`` BEFORE entering the ``try/finally`` block that
calls ``src.close()``. If ``read_all()`` raised mid-read, the
exception propagated up and ``src.close()`` was never called.

Move the ``try`` to start right after ``src`` is constructed and
pull ``src.read_all()`` inside the protected region so cleanup runs
even when the eager read fails.

``_CloudSource.close()`` is a no-op today, so this is a structural
guard rather than a fix for an observable leak. It mirrors the
close-on-error contract that ``_read_cog_http`` already enforces
(issue #1816), and prevents a future resource-holding source from
leaking state on the failure path.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 23, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Wrap eager source lifetime around read_all() (#2322)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None.

Nits (optional improvements)

  • xrspatial/geotiff/_reader.py:160_CloudSource(source) itself can
    raise (e.g. an fsspec failure when fetching size), and that raise
    still happens outside the new try/finally. Today the partially
    constructed _CloudSource has no state to clean up, but the same
    reasoning that motivates this PR applies: a future stateful
    constructor would leak. Out of scope for this PR (the constructor
    guard is a separate concern), but worth noting in a follow-up.

What looks good

  • The fix is the minimal correct one: try now starts immediately
    after src is constructed, with src.read_all() inside.
    finally: src.close() already existed and now covers read_all()
    too.
  • Comment block above the try explains the why, links to issue
    #1816 (the analogous _read_cog_http fix), and references #2322.
  • The pre-existing CloudSizeLimitError paths at lines 167 and 175
    still call src.close() before raising, outside the new try.
    That's fine — they already had their own cleanup. No double-close
    risk because they raise immediately after.
  • Tests cover all three source types (_FileSource, _BytesIOSource,
    _CloudSource) by patching the constructor to return a fake whose
    read_all() raises. The test file follows the same pattern as
    test_cog_http_close_on_error_1816.py.
  • I verified the tests fail without the production change and pass
    with it, so they actually pin the contract.
  • Temp filenames include the issue number (tmp_2322_cleanup_file.tif).

Checklist

  • Algorithm matches reference/paper — N/A (cleanup correctness fix)
  • All implemented backends produce consistent results — N/A
  • NaN handling is correct — N/A
  • Edge cases are covered by tests — all three source-type branches
  • Dask chunk boundaries handled correctly — N/A
  • No premature materialization or unnecessary copies — N/A
  • Benchmark exists or is not needed — not needed
  • README feature matrix updated — N/A
  • Docstrings present and accurate — module docstring on test file

@brendancol brendancol merged commit edcaf8a into main May 23, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Eager source cleanup is not exception-safe around read_all() in geotiff reader

1 participant