Skip to content

Use sidecar header for remote .ovr decode byte order (#2314)#2319

Merged
brendancol merged 2 commits into
mainfrom
issue-2314
May 22, 2026
Merged

Use sidecar header for remote .ovr decode byte order (#2314)#2319
brendancol merged 2 commits into
mainfrom
issue-2314

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2314.

Summary

  • _parse_cog_http_meta now returns the sidecar's TIFFHeader when the selected overview IFD came from the .tif.ovr. The eager HTTP path and the dask HTTP path already feed the unpacked header.byte_order into _decode_strip_or_tile, so the swap is enough to make a big-endian sidecar paired with a little-endian base (or the reverse) decode against the right endianness.
  • The local sidecar path in _reader.py:223 already did this; the remote paths now reach parity.
  • Adds test_remote_sidecar_byte_order_2314.py covering the eager HTTP, dask HTTP, fsspec memory, and direct _parse_cog_http_meta contracts for both byte-order directions. Eight of ten test cases fail on the unfixed branch and pass after the change.

Backend coverage

  • numpy HTTP eager (_read_cog_http)
  • dask+numpy HTTP (_backends/dask.py)
  • dask+numpy fsspec (same dask path, exercised via memory://)
  • cupy / dask+cupy paths inherit the fix because they share the same metadata parse

Test plan

  • pytest xrspatial/geotiff/tests/test_remote_sidecar_byte_order_2314.py (10 tests pass after fix, 8 fail before)
  • pytest xrspatial/geotiff/tests/ (5224 pass, 68 skipped)
  • Existing test_remote_sidecar_chunked_2239.py still passes (parity with same-endian sidecar fixture)

When the selected overview IFD lives in an external .tif.ovr sidecar,
_parse_cog_http_meta now returns the sidecar's TIFFHeader instead of
the base file's. The eager HTTP path and the dask HTTP path already
unpack this header and feed header.byte_order into
_decode_strip_or_tile, so the swap is enough to make a big-endian
.ovr paired with a little-endian base (or the reverse) decode against
the right endianness. Mirrors the local sidecar path in
_reader.py:223 which already handled this case.

Adds test_remote_sidecar_byte_order_2314.py covering the eager HTTP,
dask HTTP, fsspec memory, and direct _parse_cog_http_meta contracts
for both byte-order directions. Eight of the ten test cases fail on
the unfixed branch.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 22, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Use sidecar header for remote .ovr decode byte order (#2314)

Blockers

None.

Suggestions

None.

Nits

  • xrspatial/geotiff/tests/test_remote_sidecar_byte_order_2314.py: the fixture builder relies on tifffile's default layout, which is stripped (untiled). The eager and dask HTTP paths both flow through _fetch_decode_cog_http_strips for these fixtures, so the tiled decode path's header.byte_order plumbing is exercised only transitively. The fix lives at the metadata layer so the result is the same, but a second parametrized fixture with tile=... would close the last coverage gap. Optional.
  • xrspatial/geotiff/_cog_http.py:265: return_header = sidecar.header if used_sidecar else header is safe because used_sidecar can only be True when sidecar is not None (sidecar_ifd_ids stays empty otherwise). A defensive comment tying those two invariants together would make the relationship obvious to a future reader. Optional.

What looks good

  • The fix is minimum-surface: one line in _parse_cog_http_meta returns the sidecar's TIFFHeader when used_sidecar is True. Downstream code in _read_cog_http and _backends/dask.py already reads header.byte_order off the unpacked header and picks up the right value with no further plumbing.
  • Docstring updated to describe the new contract.
  • Comment in dask.py updated so the code matches what the comment claims.
  • Tests fail on 8 of 10 cases without the fix (parametrized LE/BE and BE/LE across eager HTTP, dask HTTP, fsspec memory, and a direct contract test on _parse_cog_http_meta). The two passing-before cases are the local sidecar path, included as a parity sanity check.
  • Test fixtures use uuid-suffixed paths so parallel runs do not collide.
  • No public API change; no docs / README update needed.

Checklist

  • Algorithm matches reference behaviour (parity with _reader.py:223 local sidecar swap)
  • All implemented backends produce consistent results (numpy and dask HTTP both exercised)
  • NaN handling is correct (untouched; uint16 fixtures have no NaN)
  • Edge cases are covered by tests (both byte-order directions)
  • Dask chunk boundaries handled correctly (chunks=16 forces multi-chunk read)
  • No premature materialization or unnecessary copies
  • Benchmark exists or is not needed (pure correctness fix)
  • README feature matrix updated (not applicable)
  • Docstrings present and accurate

* Add a parametrized tiled-layout HTTP eager test so the
  _fetch_decode_cog_http_tiles call site is exercised directly, not
  just through the stripped path.
* Document the used_sidecar => sidecar is not None invariant in the
  comment above the sidecar.header read so future readers can see
  why the conditional access is safe.
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review (follow-up after c1dbfb9)

Both nits from the previous review have been addressed:

  • _cog_http.py: invariant comment added above the sidecar.header read, explaining why used_sidecar=True implies sidecar is not None.
  • test_remote_sidecar_byte_order_2314.py: new parametrized test_http_eager_mixed_endian_sidecar_tiled exercises the tiled HTTP decode path directly. Test count is now 12 (was 10), all passing.

No remaining blockers, suggestions, or nits. Ready for CI to land.

@brendancol brendancol merged commit 69e60d2 into main May 22, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remote sidecar overview reads decode with base TIFF byte order

1 participant