Add MoNuSAC Dataset #158

Merged: 10 commits, Oct 23, 2023

anwai98 (Contributor) commented Oct 18, 2023

(WIP) This hasn't been tested yet, because generate_labeled_array depends on the MoNuSeg PR. I will test it and mark it ready for review once this works.

anwai98 (Contributor, Author) commented Oct 20, 2023

An input image (TCGA-5P-A9K0-01Z-00-DX1_3.tif) of shape (185, 497, 4) throws an error when the dataloader requests a patch shape of (512, 512):

Traceback (most recent call last):
  File "/home/nimanwai/torch-em/scripts/datasets/check_monusac.py", line 32, in <module>
    check_monusac()
  File "/home/nimanwai/torch-em/scripts/datasets/check_monusac.py", line 18, in check_monusac
    check_loader(train_loader, 8, instance_labels=True, rgb=True, plt=True, save_path="./monusac_train.png")
  File "/home/nimanwai/torch-em/torch_em/util/debug.py", line 127, in check_loader
    _check_plt(loader, n_samples, instance_labels, save_path=save_path)
  File "/home/nimanwai/torch-em/torch_em/util/debug.py", line 17, in _check_plt
    for ii, (x, y) in enumerate(loader):
  File "/scratch/usr/nimanwai/mambaforge/envs/sam/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 634, in __next__
    data = self._next_data()
  File "/scratch/usr/nimanwai/mambaforge/envs/sam/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 678, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/scratch/usr/nimanwai/mambaforge/envs/sam/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/scratch/usr/nimanwai/mambaforge/envs/sam/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/nimanwai/torch-em/torch_em/data/image_collection_dataset.py", line 167, in __getitem__
    raw, labels = self._get_sample(index)
  File "/home/nimanwai/torch-em/torch_em/data/image_collection_dataset.py", line 134, in _get_sample
    raw, label = self._ensure_patch_shape(raw, label, have_raw_channels, have_label_channels, channel_first)
  File "/home/nimanwai/torch-em/torch_em/data/image_collection_dataset.py", line 106, in _ensure_patch_shape
    raise NotImplementedError("Padding is not implemented for data with channels")
NotImplementedError: Padding is not implemented for data with channels

If I instead request a patch shape of (128, 128), it works just fine.

What do you think about this, @constantinpape? I think the issue is that the padding implementation for multi-channel inputs is missing.
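For illustration, here is a minimal sketch of what channel-aware padding could look like. It is not the torch_em implementation: the helper name `pad_to_patch_shape`, its `channel_first` argument, and the choice of zero padding at the end of each spatial axis are all assumptions made for this example.

```python
import numpy as np


def pad_to_patch_shape(image, patch_shape, channel_first=False):
    """Pad the spatial axes of `image` so each is at least as large as `patch_shape`.

    Hypothetical helper for illustration only, not part of torch_em.
    """
    # An extra axis compared to the patch shape is treated as the channel axis.
    have_channels = image.ndim == len(patch_shape) + 1
    if have_channels:
        spatial_shape = image.shape[1:] if channel_first else image.shape[:-1]
    else:
        spatial_shape = image.shape

    # Pad only at the end of each spatial axis, and only where the image is too small.
    pad_width = [(0, max(0, ps - sh)) for sh, ps in zip(spatial_shape, patch_shape)]
    if have_channels:
        # The channel axis itself is never padded.
        pad_width = [(0, 0)] + pad_width if channel_first else pad_width + [(0, 0)]

    # Zero padding is an assumption; for labels, zero corresponds to background.
    return np.pad(image, pad_width, mode="constant")


# Same shape as the failing image from the traceback above.
raw = np.ones((185, 497, 4), dtype="uint8")
padded = pad_to_patch_shape(raw, (512, 512))
print(padded.shape)  # (512, 512, 4)
```

The same function can be applied to the label image, so that raw data and labels stay aligned after padding.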

anwai98 marked this pull request as ready for review October 20, 2023 15:51

anwai98 (Contributor, Author) commented Oct 20, 2023

@constantinpape Padding for inputs with raw and label channels is taken care of now. Let me know how this looks.

constantinpape (Owner) left a comment

Besides the comments I left in the code: this data is saved as RGBA. That doesn't make sense, the alpha channel (A) does not contain any information and we should get rid of it.

Review comments (resolved) on:
torch_em/data/datasets/util.py (outdated)
torch_em/data/image_collection_dataset.py (outdated)
torch_em/data/raw_image_collection_dataset.py
torch_em/data/raw_image_collection_dataset.py (outdated)

anwai98 (Contributor, Author) commented Oct 22, 2023

> Besides the comments I left in the code: this data is saved as RGBA. That doesn't make sense, the alpha channel (A) does not contain any information and we should get rid of it.

Okay. What would you recommend here? Should I remove the alpha channel from the tif images explicitly, e.g. with an extra function on top?

(UPDATE: I implemented functionality that takes care of this in the most recent commit.)
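As a rough sketch of the idea, dropping the alpha channel from a channel-last RGBA array can be done with plain NumPy slicing. The helper name `drop_alpha_channel` is hypothetical, and this is not necessarily the conversion that ended up in the PR.

```python
import numpy as np


def drop_alpha_channel(image):
    """Return only the RGB channels of a channel-last RGBA image.

    Hypothetical helper for illustration; inputs that are not
    channel-last with 4 channels are returned unchanged.
    """
    if image.ndim == 3 and image.shape[-1] == 4:
        # Keep the first three channels (R, G, B) and discard alpha.
        return image[..., :3]
    return image


# Dummy RGBA image with a fully opaque alpha channel.
rgba = np.zeros((185, 497, 4), dtype="uint8")
rgba[..., 3] = 255
rgb = drop_alpha_channel(rgba)
print(rgb.shape)  # (185, 497, 3)
```

Slicing returns a view rather than a copy, so a `.copy()` may be needed before writing the result back in place.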

constantinpape (Owner) left a comment

Looks good now; you should just change how the RGBA data is converted.

Review comment (resolved) on:
torch_em/data/datasets/monusac.py (outdated)

constantinpape merged commit e31fe76 into constantinpape:main Oct 23, 2023 (2 checks passed)

anwai98 deleted the monusac branch November 8, 2023 00:00