Add MoNuSAC Dataset #158

anwai98 · 2023-10-18T17:10:09Z

(WIP) This hasn't been tested yet, because generate_labeled_array depends on the MoNuSeg PR. I will test it and mark it ready for review once that works.

Update MoNuSAC with MoNuSeg functionality

anwai98 · 2023-10-20T11:59:59Z

An input image (TCGA-5P-A9K0-01Z-00-DX1_3.tif) of shape (185, 497, 4) throws an error when the dataloader requests a patch shape of (512, 512):

Traceback (most recent call last):
  File "/home/nimanwai/torch-em/scripts/datasets/check_monusac.py", line 32, in <module>
    check_monusac()
  File "/home/nimanwai/torch-em/scripts/datasets/check_monusac.py", line 18, in check_monusac
    check_loader(train_loader, 8, instance_labels=True, rgb=True, plt=True, save_path="./monusac_train.png")
  File "/home/nimanwai/torch-em/torch_em/util/debug.py", line 127, in check_loader
    _check_plt(loader, n_samples, instance_labels, save_path=save_path)
  File "/home/nimanwai/torch-em/torch_em/util/debug.py", line 17, in _check_plt
    for ii, (x, y) in enumerate(loader):
  File "/scratch/usr/nimanwai/mambaforge/envs/sam/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 634, in __next__
    data = self._next_data()
  File "/scratch/usr/nimanwai/mambaforge/envs/sam/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 678, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/scratch/usr/nimanwai/mambaforge/envs/sam/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/scratch/usr/nimanwai/mambaforge/envs/sam/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/nimanwai/torch-em/torch_em/data/image_collection_dataset.py", line 167, in __getitem__
    raw, labels = self._get_sample(index)
  File "/home/nimanwai/torch-em/torch_em/data/image_collection_dataset.py", line 134, in _get_sample
    raw, label = self._ensure_patch_shape(raw, label, have_raw_channels, have_label_channels, channel_first)
  File "/home/nimanwai/torch-em/torch_em/data/image_collection_dataset.py", line 106, in _ensure_patch_shape
    raise NotImplementedError("Padding is not implemented for data with channels")
NotImplementedError: Padding is not implemented for data with channels

In contrast, if I request a patch shape of (128, 128), it works just fine.

What do you think about this, @constantinpape? (I think the issue is that the padding implementation for multi-channel inputs is missing.)

anwai98 · 2023-10-20T15:53:20Z

@constantinpape The padding for images with raw and label channels has been taken care of. Let me know how this looks now.
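For readers following the traceback above, here is a minimal sketch of padding data with channels up to a requested patch shape. It assumes numpy arrays and zero padding at the end of each spatial axis. The helper name `pad_to_patch_shape` is hypothetical; this is not the code added to `_ensure_patch_shape` in torch_em, which is not shown in this thread.

```python
import numpy as np


def pad_to_patch_shape(image, patch_shape, channel_first=False):
    """Zero-pad the spatial axes of `image` so that each is at least as large
    as the corresponding entry of `patch_shape`. The channel axis, if present,
    is never padded. (Hypothetical helper, for illustration only.)"""
    n_spatial = len(patch_shape)
    has_channels = image.ndim == n_spatial + 1

    # Work out which axes are spatial.
    if has_channels and channel_first:
        spatial_shape = image.shape[1:]
    elif has_channels:
        spatial_shape = image.shape[:-1]
    else:
        spatial_shape = image.shape

    # Pad only at the end of each spatial axis, and only if it is too small.
    pad_width = [(0, max(0, ps - sh)) for ps, sh in zip(patch_shape, spatial_shape)]

    # Add a no-op entry for the channel axis.
    if has_channels:
        pad_width = [(0, 0)] + pad_width if channel_first else pad_width + [(0, 0)]

    return np.pad(image, pad_width, mode="constant")


# Shapes taken from the report above: an RGBA image smaller than the patch.
raw = np.ones((185, 497, 4), dtype=np.uint8)
label = np.ones((185, 497), dtype=np.uint32)

padded_raw = pad_to_patch_shape(raw, (512, 512))
padded_label = pad_to_patch_shape(label, (512, 512))
print(padded_raw.shape, padded_label.shape)  # (512, 512, 4) (512, 512)
```

Padding raw and label with the same spatial offsets keeps them aligned, which is why both go through the same helper in this sketch.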

constantinpape

Besides the comments I left in the code: this data is saved as RGBA. That doesn't make sense, the alpha channel (A) does not contain any information and we should get rid of it.

torch_em/data/datasets/util.py

torch_em/data/image_collection_dataset.py

torch_em/data/raw_image_collection_dataset.py

anwai98 · 2023-10-22T18:46:38Z

> Besides the comments I left in the code: this data is saved as RGBA. That doesn't make sense, the alpha channel (A) does not contain any information and we should get rid of it.

Okay. What would you recommend for this? (Should I manually remove the alpha channel from the tif images, e.g. with a helper function on top?)

(UPDATE: I implemented functionality that takes care of this in the most recent commit.)
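As an illustration of the idea, dropping an uninformative alpha channel from a channel-last array can be sketched as below. The function name `drop_alpha_channel` is hypothetical and this is only an assumption about one reasonable approach; the conversion that was actually committed in this PR is not shown in the thread.

```python
import numpy as np


def drop_alpha_channel(image):
    """Return the RGB part of a channel-last RGB(A) image.
    (Hypothetical helper, for illustration only.)"""
    if image.ndim != 3 or image.shape[-1] not in (3, 4):
        raise ValueError(f"Expected an image of shape (H, W, 3) or (H, W, 4), got {image.shape}")
    # Slicing keeps the first three channels (R, G, B) and discards alpha.
    return image[..., :3]


# Shape taken from the report above; the alpha channel is constant here.
rgba = np.zeros((185, 497, 4), dtype=np.uint8)
rgba[..., 3] = 255

rgb = drop_alpha_channel(rgba)
print(rgb.shape)  # (185, 497, 3)
```

Slicing leaves the RGB values untouched, which is only appropriate when the alpha channel carries no information, as the review comment states for this data.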

constantinpape

Looks good now, you should just change how the RGBA data is converted.

torch_em/data/datasets/monusac.py

anwai98 and others added 6 commits October 18, 2023 19:02

Add MoNuSAC Dataset

991a20a

Update __init__.py and xml to array fn call

f9b1706

Add monusac check script

11eda33

Add organ split for monusac data

c100d96

Merge branch 'monusac' into main

f70aaaa

Merge pull request #3 from anwai98/main

c20c8f0

Update MoNuSAC with MoNuSeg functionality

Add padding for raw and label channels

944cce9

anwai98 marked this pull request as ready for review October 20, 2023 15:51

constantinpape reviewed Oct 22, 2023

Update padding in image collection dataset

befe24b

Remove alpha channels from monusac inputs

9a28eb4

constantinpape reviewed Oct 23, 2023

torch_em/data/datasets/monusac.py (outdated, resolved)

Fix rgb conversion functionality

a264e5e

anwai98 requested a review from constantinpape October 23, 2023 11:16

constantinpape approved these changes Oct 23, 2023

constantinpape merged commit e31fe76 into constantinpape:main Oct 23, 2023
2 checks passed

anwai98 deleted the monusac branch November 8, 2023 00:00
