
resize each block to custom shape #244

Merged

joshmoore merged 6 commits into ome:master on Mar 17, 2023

Conversation

@will-moore (Member)

Bug-fix: when using ome_zarr.dask_utils.resize() to resize an array that has chunks of different sizes (e.g. at the edge of a large image), these chunks are resized to the same dimensions as the other chunks. This creates a larger image than expected, which is then cropped.

E.g. for an image of 1050 x 1050 with a chunk size of 512 x 512, the edge chunks are 512 x 26 or 26 x 26.
When downsampling the image by a factor of 2, ALL chunks get resized to 256 x 256, so the edge chunks get stretched.
This gives a slightly bigger output image than expected, which is then cropped (see the sketch below).
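A small arithmetic sketch of the numbers above (plain Python for illustration only, not the library code):

```python
import math

size, chunk, factor = 1050, 512, 0.5

# chunk layout along one axis: two full chunks plus a 26-pixel edge chunk
chunks = [chunk] * (size // chunk) + [size % chunk]        # [512, 512, 26]

# buggy behaviour: every chunk resized to the same 256-pixel output,
# so the 26-pixel edge chunk gets stretched and the result must be cropped
uniform_out = [round(chunk * factor)] * len(chunks)        # [256, 256, 256]

# behaviour after this PR: each block resized to its own ceil-ed size
per_block_out = [math.ceil(c * factor) for c in chunks]    # [256, 256, 13]

print(sum(uniform_out), sum(per_block_out), math.ceil(size * factor))  # 768 525 525
```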

E.g. idr image 14257890 at lowest resolution:

[screenshot: original lowest-resolution image]

Resized with ome_zarr.dask_utils.resize() to a half-size array, giving stretched edge tiles:

[screenshot: resized image with stretched edge tiles]

This is fixed by this PR.

@codecov (codecov bot) commented Jan 23, 2023

Codecov Report

Base: 84.46% // Head: 84.54% // Increases project coverage by +0.07% 🎉

Coverage data is based on head (79e4f0c) compared to base (1f47e59).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #244      +/-   ##
==========================================
+ Coverage   84.46%   84.54%   +0.07%     
==========================================
  Files          13       13              
  Lines        1474     1475       +1     
==========================================
+ Hits         1245     1247       +2     
+ Misses        229      228       -1     
Impacted Files            Coverage Δ
ome_zarr/dask_utils.py    72.72% <100.00%> (+1.29%) ⬆️
ome_zarr/scale.py         69.17% <0.00%> (+0.75%) ⬆️

☔ View full report at Codecov.

@joshmoore (Member)

Hmmm.... lots of failures on comparison of the dask arrays.

@will-moore (Member, Author)

Strange that the tests passed on all the py37 builds, but not py38 or py39.
I have Python 3.9.12 and pytest 7.1.1 locally, and all my tests are passing...

(omeroweb) Williams-MacBook-Pro:tests wmoore$ pytest test_writer.py 
===================================================================================== test session starts =====================================================================================
platform darwin -- Python 3.9.12, pytest-7.1.1, pluggy-1.0.0
rootdir: /Users/wmoore/Desktop/ZARR/ome-zarr-py, configfile: pytest.ini
plugins: anyio-3.5.0, napari-plugin-engine-0.2.0, napari-0.4.15
collected 302 items                                                                                                                                                                           

test_writer.py ........................................................................................................................................................................ [ 55%]
......................................................................................................................................                                                  [100%]

==================================================================================== 302 passed in 16.09s =====================================================================================

@will-moore (Member, Author)

Strange, but tests are NOT failing for NGFF v0.1, which suggests that the problem is due to the dimension separator?
I wonder if this is illustrating the bug that is fixed in #243.
I'll try a temp merge of that branch into this one to see if it helps...

@will-moore (Member, Author)

Looks like #243 is the fix needed. So once that's merged I can revert my merge above.
I'll add some tests that test the fix in this PR...

will-moore mentioned this pull request on Feb 1, 2023
@imagesc-bot

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/converting-other-idr-images-into-public-zarr/44025/20

@will-moore (Member, Author)

cc @aeisenbarth
This PR attempts to address issues that I've noticed in the usage of the dask resize() that we borrowed from https://github.com/aeisenbarth/ngff-writer/blob/main/ngff_writer/dask_utils.py
I don't know if you've experienced any issues in your usage of that method, or whether you have any feedback about the changes here?

@aeisenbarth (Contributor)

I haven't observed such issues, probably because we always stayed with the same chunk size of 256.

The explanation makes a lot of sense to me for this example. In general, the problem with block-based scaling is the accumulation of rounding errors. It is interesting that for the opposite case (target size 3338, factor 0.5000749, output block size 501) ceil does not lead to a bigger error.
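To make the accumulation concrete, a quick check of that opposite case using the factor quoted above (the input size 6675 and chunk size 512 are assumptions chosen only to illustrate, not taken from that dataset):

```python
import math

size, chunk, factor = 6675, 512, 0.5000749   # assumed input size and chunking

chunks = [chunk] * (size // chunk) + [size % chunk]   # 13 full chunks + a 19-pixel edge
per_block = [math.ceil(c * factor) for c in chunks]   # 13 x 257 + 10
target = math.ceil(size * factor)                     # 3338

# each block overshoots by less than one pixel, so the stitched array is at
# most len(chunks) pixels too long before the final crop removes the excess
print(sum(per_block), target, sum(per_block) - target)  # 3351 3338 13
```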

@joshmoore (Member)

I'm still failing to get "Build using Zarr development branch / ubuntu-latest 3.8 (pull_request)" green. I'll open a PR to remove that and then we can start trying to resurrect it.

@joshmoore (Member)

@aeisenbarth: is there another test that you think would detect accumulation in the opposite case?

@will-moore (Member, Author)

I think that if there were an accumulation of chunks where the pixel size is rounded up, then the concatenated array would be a few pixels larger than expected, but this would then be cropped/sliced at:

output = da.map_blocks(
    resize_block, image_prepared, dtype=image.dtype, chunks=block_output_shape
)[output_slices]

But using ceil() should avoid problems in most cases.
E.g. if self.downscale is 2, then the width and height of the new_shape will be rounded down (floor division):

new_shape = list(image.shape)
new_shape[-1] = image.shape[-1] // self.downscale  # rounds the X size down

So the scale factor derived from that will be on the low side, e.g. 0.49992509 as in the example in the code comments.
So, using ceil() to calculate the block output sizes should round that back up to a factor of 0.5 (instead of giving us blocks that are too big).
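For reference, a simplified, self-contained sketch of the approach being discussed (not the ome_zarr.dask_utils.resize() implementation; the array sizes, chunking and the skimage call are illustrative assumptions):

```python
# Sketch only: resize each dask block to its own ceil-ed output shape,
# declare those per-block chunks, then crop to the exact target shape.
import math

import dask.array as da
import numpy as np
from skimage.transform import resize as skimage_resize

image = da.random.random((1050, 1050), chunks=(512, 512))   # assumed example input
output_shape = (525, 525)
factors = np.array(output_shape) / np.array(image.shape)

def resize_block(block):
    # per-block output shape, rounded up so the stitched result is never too small
    out = tuple(math.ceil(s * f) for s, f in zip(block.shape, factors))
    return skimage_resize(block, out, preserve_range=True)

# declared output chunks mirror the per-block shapes: ((256, 256, 13), (256, 256, 13))
out_chunks = tuple(
    tuple(math.ceil(c * f) for c in dim_chunks)
    for dim_chunks, f in zip(image.chunks, factors)
)
resized = da.map_blocks(resize_block, image, dtype=image.dtype, chunks=out_chunks)
cropped = resized[tuple(slice(0, d) for d in output_shape)]  # drop any ceil() overshoot
print(cropped.shape)  # (525, 525)
```

With a factor slightly above 0.5, the per-block ceil() overshoots by a pixel or two, which is exactly what the final slice removes.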

@will-moore (Member, Author)

@joshmoore - anything else needed here?

@joshmoore (Member) left a comment


Not particularly, and thanks for the explanation. If anyone sees an edge case that needs covering, feel free to add tests 🙂

joshmoore merged commit 9540832 into ome:master on Mar 17, 2023
will-moore added this to the 0.7.0 milestone on Mar 22, 2023