Fix COG overview block ordering (#2308) #2309
Merged
Reorder pixel-data emission in _assemble_cog_layout so on-disk tile
blocks run smallest-overview first, then progressively larger, with
the main-resolution image last. The IFD chain still walks
[main, ov1, ov2, ...] in the conventional order; only the byte-level
placement of the tile blocks changes.
rio-cogeo cog_validate now returns valid=True on the issue's repro.
Before this fix it reported two block-order errors:
- The offset of the first block of overview of index 0 should be
after the one of the overview of index 1
- The offset of the first block of the main resolution image should
be after the one of the overview of index 1
Adds a layout-invariant test that parses the file directly and
asserts the min tile-offset per IFD runs ov_smallest < ... < ov_largest
< main_resolution, plus a rio-cogeo-gated row using the same skip
semantics as test_cog_writer_compliance.
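The layout invariant described above can be illustrated with a small standalone sketch. This is not the test added in this PR (which uses the project's own _header.parse_all_ifds); it is a minimal classic-TIFF walker written with the standard library only, and the names min_tile_offset_per_ifd, block_order_ok and fake_tiff are hypothetical. It assumes little- or big-endian classic TIFF (not BigTIFF), tiled IFDs, and an IFD chain ordered [main, ov1, ov2, ...].

```python
import struct

TILE_OFFSETS = 324                      # TIFF tag id for TileOffsets
TYPE_FMT = {3: ("H", 2), 4: ("I", 4)}   # SHORT and LONG: struct code, byte size


def min_tile_offset_per_ifd(data: bytes) -> list:
    """Walk the IFD chain of a classic TIFF; return min TileOffsets per IFD."""
    bo = {b"II": "<", b"MM": ">"}[data[:2]]
    magic, ifd_pos = struct.unpack_from(bo + "HI", data, 2)
    if magic != 42:
        raise ValueError("sketch handles classic TIFF only, not BigTIFF")
    mins = []
    while ifd_pos:
        (n_entries,) = struct.unpack_from(bo + "H", data, ifd_pos)
        for i in range(n_entries):
            entry = ifd_pos + 2 + 12 * i          # each IFD entry is 12 bytes
            tag, typ, count = struct.unpack_from(bo + "HHI", data, entry)
            if tag != TILE_OFFSETS:
                continue                          # strip-based IFDs are skipped here
            code, size = TYPE_FMT[typ]
            value_pos = entry + 8                 # inline value field
            if count * size > 4:                  # too large to inline: field is a pointer
                (value_pos,) = struct.unpack_from(bo + "I", data, value_pos)
            offsets = struct.unpack_from(f"{bo}{count}{code}", data, value_pos)
            mins.append(min(offsets))
        (ifd_pos,) = struct.unpack_from(bo + "I", data, ifd_pos + 2 + 12 * n_entries)
    return mins


def block_order_ok(mins_in_ifd_order) -> bool:
    """True when ov_smallest < ... < ov_largest < main, given [main, ov1, ...]."""
    expected = mins_in_ifd_order[:0:-1] + mins_in_ifd_order[:1]
    return all(a < b for a, b in zip(expected, expected[1:]))


def fake_tiff(levels) -> bytes:
    """Build a header-only little-endian TIFF for demonstration.

    levels is a list of TileOffsets lists in IFD order; each list needs at
    least two offsets so the array is stored outside the IFD entry.
    """
    out = bytearray(struct.pack("<2sHI", b"II", 42, 8))
    for i, offsets in enumerate(levels):
        array_pos = len(out) + 18                 # IFD is 2 + 12 + 4 bytes
        next_ifd = array_pos + 4 * len(offsets) if i < len(levels) - 1 else 0
        out += struct.pack("<HHHII", 1, TILE_OFFSETS, 4, len(offsets), array_pos)
        out += struct.pack("<I", next_ifd)
        out += struct.pack(f"<{len(offsets)}I", *offsets)
    return bytes(out)
```

For example, block_order_ok(min_tile_offset_per_ifd(fake_tiff([[900, 950], [500, 520], [100, 110]]))) holds because the smallest overview's blocks come first and the main image's come last, while the same call with the offsets in IFD order (main first) does not.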
brendancol
commented
May 22, 2026
Contributor
Author
PR Review: Fix COG overview block ordering (#2308)
Blockers
None.
Suggestions
None.
Nits
- _write_layout.py:312 -- the n_parts == 1 branch only triggers for callers that bypass the upstream is_cog and len(ifd_specs) > 1 gate. A one-line comment saying the [0] branch exists for direct unit tests would help future readers.
What looks good
- The fix reorders only the byte placement of tile/strip data and leaves the IFD chain order and per-IFD tile-offset patching alone.
- pixel_emission_order is built once and reused by both the offset-precomputation pass and the append pass, so they cannot drift.
- level_pixel_offsets[level_idx] stays indexed by the original IFD position, so the third pass needs no changes.
- _write_streaming has no cog parameter and the GPU writer funnels through the same _assemble_tiff -> _assemble_cog_layout call site, so one change covers every writer path that emits COG.
- The new test parses files with _header.parse_all_ifds instead of reimplementing TIFF parsing, and the rio-cogeo gate reuses the skip pattern from test_cog_writer_compliance.py.
- Existing geotiff tests pass (5187 pass / 68 skip / 1 xpassed locally).
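As a rough illustration of why sharing one order list keeps the two passes in step, here is a minimal sketch. It is not the code in _write_layout.py: the function names and signatures are hypothetical, and only the names pixel_emission_order and level_pixel_offsets are borrowed from the PR. It assumes levels are indexed in IFD order, with index 0 as the main image and higher indices as progressively smaller overviews.

```python
def pixel_emission_order(n_levels: int) -> list:
    """Overview indices in reverse (smallest first), main image (index 0) last."""
    return list(range(n_levels - 1, 0, -1)) + [0]


def precompute_offsets(level_sizes, data_start: int):
    """Pass 1: assign each level its byte offset, walking the emission order.

    The returned offsets list stays indexed by IFD position, so per-IFD
    patching can keep using level_pixel_offsets[level_idx] unchanged.
    """
    order = pixel_emission_order(len(level_sizes))
    level_pixel_offsets = [0] * len(level_sizes)
    cursor = data_start
    for level_idx in order:
        level_pixel_offsets[level_idx] = cursor
        cursor += level_sizes[level_idx]
    return order, level_pixel_offsets


def append_pixels(header: bytes, level_blobs):
    """Pass 2: append pixel data using the same order computed in pass 1."""
    sizes = [len(blob) for blob in level_blobs]
    order, level_pixel_offsets = precompute_offsets(sizes, len(header))
    out = bytearray(header)
    for level_idx in order:
        # Both passes share `order`, so the write position always matches
        # the offset that was precomputed for this level.
        assert len(out) == level_pixel_offsets[level_idx]
        out += level_blobs[level_idx]
    return bytes(out), level_pixel_offsets
```

With three levels the emission order is [2, 1, 0], while the offsets list is still read as [main, ov1, ov2], which is the property the third bullet above relies on.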
Notes
- The xpassed test is test_external_cog_validator, expected per the PR body. The xfail removal is deferred to a follow-up.
- setup.cfg has no xfail_strict flag, so XPASS does not turn the suite red.
Checklist
- Algorithm matches the COG spec block-order requirement
- Backends consistent (CPU eager and GPU funnel through the same helper; streaming does not produce COG)
- NaN handling unchanged (this PR does not touch pixel values, only byte placement)
- Edge cases covered (single-band, 3-band, 2-level, 3-level overviews, rio-cogeo strict=False)
- Dask chunk boundaries n/a (dask + COG materialises eagerly)
- No new copies or materialisations
- No new function, so no README matrix or benchmark needed
- Docstrings on new test helpers
brendancol
commented
May 22, 2026
Follow-up review
The single nit from the prior pass is addressed in 760c7f5: the n_parts == 1 fallback now carries a one-line comment explaining that the upstream gate (is_cog and len(ifd_specs) > 1) keeps real callers off this branch and the entry exists for direct unit tests.
No new findings on a re-read. Tests still pass locally (73 in the cog suite + 1 xpassed).
Closes #2308.
What changed
_assemble_cog_layout in xrspatial/geotiff/_write_layout.py now emits the on-disk pixel data in COG-spec order: smallest overview first, then progressively larger overviews, with the main-resolution image's tile blocks last. The IFD chain still walks [main, ov1, ov2, ...] in the conventional order, so the change is invisible to readers that just enumerate IFDs. Only the byte placement of the tile data moves, which is what strict validators inspect.
The fix is local to _assemble_cog_layout: build a pixel_emission_order list that puts the overview indices in reverse, with index 0 (main) last, then use that order for both the offset-precomputation pass and the final append pass. Per-level offset patching is unchanged because each IFD looks up its tile/strip base from level_pixel_offsets[level_idx].
The streaming writer does not produce COG output (cog=True always routes through the eager _write -> _assemble_tiff -> _assemble_cog_layout path), and the GPU writer funnels through the same helper, so a single change covers all writer paths.
Lines touched in _write_layout.py: roughly 30, all inside the existing function. No new helpers, no signature changes.
Evidence
Before, with the issue's repro:
After:
New test
xrspatial/geotiff/tests/test_overview_block_order_2308.py parses the TIFF directly via the existing _header.parse_all_ifds helpers, extracts the min tile-offset per IFD, and asserts ov_smallest_min < ... < ov_largest_min < main_min. Covers single-band and 3-band cases plus a three-overview-level case. A rio-cogeo-gated row reuses the skip pattern from test_cog_writer_compliance.py so contributor laptops without rio-cogeo see a clean skip while CI runs the strict check.
Test plan
- pytest xrspatial/geotiff/tests/test_overview_block_order_2308.py (5/5 pass locally with rio-cogeo installed)
- pytest xrspatial/geotiff/tests/ (5187 pass, 68 skip, 1 xpassed)
- cog_validate(path, strict=False) returns (True, [], [])
Follow-up
The xfail(strict=False) marker on test_external_cog_validator in test_cog_writer_compliance.py now XPASSES (the gate's reason "writer emits overview tile blocks in wrong order; see #2308" no longer applies). The task scope is to leave that marker for a separate one-line follow-up PR once this lands, per the PR #2304 separation noted in the issue. No CI flag in setup.cfg makes XPASS strict, so this does not break the build.
Scope