Add SIIM ACR Pneumothorax dataset #256

Merged 6 commits on May 15, 2024

Conversation

anwai98
Contributor

@anwai98 anwai98 commented May 14, 2024

This is a dataset for segmenting pneumothorax in chest X-rays.

@anwai98
Contributor Author

anwai98 commented May 14, 2024

In addition, this PR adds support for downloading files from Kaggle using the Kaggle API.
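For readers unfamiliar with the Kaggle API, here is a minimal sketch of what such a download helper might look like. It is not the code added in this PR: the function names `kaggle_credentials_available` and `download_kaggle_dataset` are hypothetical, and the dataset identifier in the usage note is a placeholder. The sketch assumes the `kaggle` package is installed and that credentials are provided either through `~/.kaggle/kaggle.json` or through the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables.

```python
import os
from pathlib import Path


def kaggle_credentials_available(config_dir=None, env=None):
    """Check whether Kaggle credentials appear to be configured (hypothetical helper)."""
    env = os.environ if env is None else env
    # Credentials can be provided through environment variables ...
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return True
    # ... or through a kaggle.json file in the Kaggle config directory.
    config_dir = Path(config_dir or env.get("KAGGLE_CONFIG_DIR") or Path.home() / ".kaggle")
    return (config_dir / "kaggle.json").is_file()


def download_kaggle_dataset(dataset_name, out_dir):
    """Download all files of a Kaggle dataset to out_dir (hypothetical helper)."""
    if not kaggle_credentials_available():
        raise RuntimeError(
            "No Kaggle credentials found. Create an API token on kaggle.com and "
            "place it at ~/.kaggle/kaggle.json, or set KAGGLE_USERNAME and KAGGLE_KEY."
        )
    # Imported lazily, because the kaggle package may try to authenticate on import.
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()
    os.makedirs(out_dir, exist_ok=True)
    api.dataset_download_files(dataset_name, path=out_dir, quiet=False)
```

A call would then look like `download_kaggle_dataset("<owner>/<dataset-slug>", "./data")`, with the placeholder replaced by the actual dataset identifier on Kaggle.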

@anwai98 anwai98 marked this pull request as ready for review May 15, 2024 06:47
@anwai98
Contributor Author

anwai98 commented May 15, 2024

Hi @constantinpape

This is a nice spot to review a few new features:

  • Downloads from Kaggle are now possible (a few simple setup steps are needed up front to make the API work).
  • ImageCollectionDataset now accepts patch_shape=None, in which case it loads the entire image as is.
  • Added a new transformation called ResizeInputs, which uses skimage.transform.resize to resize the inputs to a desired patch shape.
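
To illustrate the last point, a resize transformation built on skimage.transform.resize could be sketched roughly as follows. This is an assumption-laden illustration, not the ResizeInputs implementation from this PR: the class name `ResizeInputsSketch` and the `is_label` flag are hypothetical, and the choice of nearest-neighbor interpolation for labels is a common convention rather than something stated in this conversation.

```python
import numpy as np
from skimage.transform import resize


class ResizeInputsSketch:
    """Resize an array to a fixed target shape (hypothetical sketch)."""

    def __init__(self, target_shape, is_label=False):
        self.target_shape = tuple(target_shape)
        self.is_label = is_label

    def __call__(self, inputs):
        if self.is_label:
            # Nearest-neighbor interpolation without smoothing, so that
            # no new label values are introduced by the resize.
            out = resize(
                inputs, self.target_shape, order=0,
                anti_aliasing=False, preserve_range=True,
            )
            return out.astype(inputs.dtype)
        # Linear interpolation with anti-aliasing for raw images.
        out = resize(
            inputs, self.target_shape, order=1,
            anti_aliasing=True, preserve_range=True,
        )
        return out.astype("float32")
```

With this sketch, `ResizeInputsSketch((512, 512))(image)` would return a 512 x 512 float32 array, and `ResizeInputsSketch((512, 512), is_label=True)(mask)` would return a mask of the same dtype as the input.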

Let me know if you spot something.

PS. The dataset is around 4GB, in case you want to test it out.

Owner

@constantinpape constantinpape left a comment

  • Some minor things can be improved in the handling of the None patch shape, but it looks good overall.
  • This logic should also be implemented for the SegmentationDataset. Otherwise there will be a disconnect in the functionality.
  • Some changes are needed in the resize transformation.
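
As an illustration of the first two points, the patch-sampling logic with an optional patch shape could be factored into a small helper that both dataset classes call. The following is only a sketch under that assumption: the function `sample_bounding_box` is hypothetical and does not reflect how torch_em actually implements this.

```python
import random


def sample_bounding_box(shape, patch_shape=None, rng=random):
    """Return a tuple of slices selecting a patch from an array of the given shape.

    If patch_shape is None, the slices cover the entire array.
    (Hypothetical helper, not part of torch_em.)
    """
    if patch_shape is None:
        # No patch shape given: load the whole image as is.
        return tuple(slice(0, sh) for sh in shape)

    if len(patch_shape) != len(shape):
        raise ValueError(f"Expected {len(shape)} patch dimensions, got {len(patch_shape)}")
    if any(psh > sh for sh, psh in zip(shape, patch_shape)):
        raise ValueError(f"Patch shape {patch_shape} does not fit into shape {shape}")

    # Sample a random start coordinate per axis so the patch stays in bounds.
    starts = [rng.randint(0, sh - psh) for sh, psh in zip(shape, patch_shape)]
    return tuple(slice(st, st + psh) for st, psh in zip(starts, patch_shape))
```

A dataset would then index its data with the returned slices, for example `raw[sample_bounding_box(raw.shape, self.patch_shape)]`, where `self.patch_shape` is assumed to hold the value passed by the user.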

Review threads:
  • torch_em/data/image_collection_dataset.py (resolved)
  • torch_em/transform/generic.py (outdated, resolved)
@anwai98
Contributor Author

anwai98 commented May 15, 2024

Thanks for the mentions!

Re: This logic should also be implemented for the SegmentationDataset. Otherwise there will be a disconnect in the functionality.

I was thinking of doing it in a separate, relevant PR once this one is merged, where the SegmentationDataset is actually used (ideally for a 3D case like ACDC). Would that work, or should I take care of this in this PR?

@constantinpape
Owner

Would that work, or should I take care of this in this PR?

Yes, that works!

@constantinpape constantinpape merged commit 8f397f8 into constantinpape:main May 15, 2024
2 checks passed
2 participants