Missing patches when zarr-chunks is not "full" #5

ClementCaporal · 2024-04-25T13:42:42Z

Related to #3 (Inference Sampler)

The grid created by zds.PatchSampler doesn't take the border of the zarr array into account when zarr.shape is not a multiple of the chunk shape.

Small example:

%load_ext autoreload
%autoreload 2

import zarr
import zarrdataset as zds
from torch.utils.data import DataLoader

filename = r"data.zarr"

# create empty zarr dataset
z = zarr.zeros((1, 1, 6), chunks=(1, 1, 4), dtype='uint8')
zarr.save(filename, z)


patch_size = dict(Z=1, Y=1, X=2)
patch_sampler = zds.PatchSampler(patch_size=patch_size)

my_datasets = zds.ZarrDataset(
    [
        zds.ImagesDatasetSpecs(
            filenames=filename,
            source_axes="ZYX",
            axes="ZYX",
        )
    ],
    patch_sampler=patch_sampler,
    return_positions=True,
    return_worker_id=True,
)

my_dataloader = DataLoader(
    my_datasets,
    num_workers=0,
    worker_init_fn=zds.zarrdataset_worker_init_fn,
    batch_size=1,
)

for i, (wid, pos, sample) in enumerate(my_dataloader):
    print(pos)

result:

tensor([[[0, 1],
         [0, 1],
         [0, 2]]])
tensor([[[0, 1],
         [0, 1],
         [2, 4]]])

Is there a reason it doesn't return the following (or is it a bug)?

tensor([[[0, 1],
         [0, 1],
         [0, 2]]])
tensor([[[0, 1],
         [0, 1],
         [2, 4]]])
tensor([[[0, 1],
         [0, 1],
         [4, 6]]])
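
One quick way to see which indices are skipped is to mark every index covered by the returned X ranges. This is only a minimal sketch in plain Python: `uncovered` is a hypothetical helper, not part of zarrdataset, and it assumes each position is a half-open (start, stop) pair, as the printed tensors suggest.

```python
def uncovered(length, intervals):
    """Return the indices in range(length) that no (start, stop) interval covers.

    Hypothetical helper for illustration; intervals are assumed half-open.
    """
    seen = [False] * length
    for start, stop in intervals:
        # Clamp to the axis length so oversized intervals do not raise.
        for i in range(max(start, 0), min(stop, length)):
            seen[i] = True
    return [i for i, covered in enumerate(seen) if not covered]


# X ranges from the two patches actually returned for shape (1, 1, 6).
print(uncovered(6, [(0, 2), (2, 4)]))
# X ranges from the three patches I expected.
print(uncovered(6, [(0, 2), (2, 4), (4, 6)]))
```

With the two returned patches, indices 4 and 5 along X are never sampled.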


ClementCaporal · 2024-04-25T14:30:24Z

Related to the possible solution of this issue:

In case of

z = zarr.zeros((1, 1, 5), chunks=(1, 1, 4), dtype='uint8')
zarr.save(filename, z)
patch_size = dict(Z=1, Y=1, X=2)

what would you expect as output?

Exact grid

tensor([[[0, 1],
         [0, 1],
         [0, 2]]])
tensor([[[0, 1],
         [0, 1],
         [2, 4]]])
tensor([[[0, 1],
         [0, 1],
         [4, 5]]]) # <--- this one is smaller than the model might expect

Cropped grid

tensor([[[0, 1],
         [0, 1],
         [0, 2]]])
tensor([[[0, 1],
         [0, 1],
         [2, 4]]])
# <--- But the border of this image won't be represented

Adapted grid

tensor([[[0, 1],
         [0, 1],
         [0, 2]]])
tensor([[[0, 1],
         [0, 1],
         [2, 4]]])
tensor([[[0, 1],
         [0, 1],
         [3, 5]]]) # <--- But img[..., 3] will be loaded twice

Intuitively I would choose the adapted grid solution.
This is what I tried to implement here (I opened a draft pull request just to show the differences): #6
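
To make the three options concrete, here is a minimal one-axis sketch in plain Python. The function `grid_1d` and its `mode` argument are hypothetical names used only for illustration; this is not the code in #6 and not zarrdataset's API. Intervals are assumed half-open (start, stop).

```python
def grid_1d(length, patch, mode="adapted"):
    """Return (start, stop) patch intervals along one axis.

    Hypothetical illustration of the three border strategies:
      "exact"   - add a final, smaller patch for the remainder
      "cropped" - drop the remainder
      "adapted" - shift the final patch back so it ends at the border
    """
    if mode not in ("exact", "cropped", "adapted"):
        raise ValueError(f"unknown mode: {mode!r}")

    # Full-size patches that fit entirely inside the axis.
    intervals = [(s, s + patch) for s in range(0, length - patch + 1, patch)]
    covered = intervals[-1][1] if intervals else 0

    if covered < length:
        if mode == "exact":
            intervals.append((covered, length))
        elif mode == "adapted":
            # Keeps the patch size, at the cost of overlapping the previous patch.
            intervals.append((max(length - patch, 0), length))
        # "cropped": the remainder is left out.
    return intervals


for mode in ("exact", "cropped", "adapted"):
    print(mode, grid_1d(5, 2, mode))
```

When the axis length is a multiple of the patch size (for example 6 and 2), all three modes return the same grid, so the choice only matters at the border.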

fercer · 2024-04-29T13:09:37Z

Hi @ClementCaporal, thanks for noticing this issue!

Patches are missing from non-full chunks because patch locations were previously computed from the chunk size instead of the patch size.
I'll review your pull request and iterate there to find a solution, but I think this will probably be solved by #4.

Thanks again!

fercer · 2024-05-07T21:13:06Z

This is solved by PR #4, where the patch size is used as the base to compute the sampleable chunks in the input image.
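
As a rough illustration of the difference (a simplified one-axis sketch, not the actual code from PR #4 or from the earlier sampler), the two hypothetical functions below compute patch start positions from full chunks only versus directly from the patch size. The assumption that the old behavior skipped partial chunks is inferred from the output reported in this issue.

```python
def chunk_based_starts(length, chunk, patch):
    """Patch starts when only full chunks are considered (assumed old behavior)."""
    starts = []
    for c in range(length // chunk):  # partial trailing chunk is ignored
        base = c * chunk
        starts.extend(base + p * patch for p in range(chunk // patch))
    return starts


def patch_based_starts(length, patch):
    """Patch starts when the axis is tiled by the patch size directly."""
    return [p * patch for p in range(length // patch)]


# Shape 6 along X, chunk size 4, patch size 2, as in the example above.
print(chunk_based_starts(6, 4, 2))
print(patch_based_starts(6, 2))
```

For the example in this issue, the chunk-based version yields starts 0 and 2 only, while the patch-based version also yields 4, which corresponds to the missing border patch.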

This was referenced Apr 25, 2024

Add Inference example Sampler #3

Closed

Fix grid missing border patches #6

Closed

fercer mentioned this issue Apr 30, 2024

Use patch_size instead of chunk_size as base shape for sampling #4

Merged

fercer closed this as completed May 7, 2024
