Can't add transforms to UnionDataset or IntersectionDataset #867

Closed
pmandiola opened this issue Oct 25, 2022 · 2 comments · Fixed by #870
Labels: datasets (Geospatial or benchmark datasets), transforms (Data augmentation transforms)
Milestone: 0.4.0

Comments

@pmandiola
Contributor

Summary

Neither UnionDataset nor IntersectionDataset has a way to add transforms, and neither calls self.transforms in its __getitem__ method. The transforms of the merged or intersected datasets are called in their own __getitem__ methods, but there are situations where having transforms on the UnionDataset or IntersectionDataset itself is more convenient and useful.

Rationale

In some situations, having transforms on each merged or intersected dataset might be enough, but here are a couple of cases where transforms on the combined dataset are needed:

  • Using IntersectionDataset to add a mask to an image, for transforms that need information from both the mask and the image (e.g. filtering the mask where the image has no data).
  • Using UnionDataset to merge multiple RasterDatasets (or IntersectionDatasets) and applying a single set of transforms to the whole dataset instead of one to each.
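
To make the first case concrete, here is a minimal sketch of a transform that needs both keys of a sample at once. It uses plain Python lists instead of tensors, and the function name, the NODATA value, and the IGNORE label are all assumptions for illustration, not part of any library.

```python
from typing import Any, Dict

NODATA = 0    # assumed no-data value in the image (hypothetical)
IGNORE = 255  # assumed "ignore" label written into the mask (hypothetical)


def mask_out_nodata(sample: Dict[str, Any]) -> Dict[str, Any]:
    """Set mask pixels to IGNORE wherever the image has no data."""
    image, mask = sample["image"], sample["mask"]
    new_mask = [IGNORE if px == NODATA else label for px, label in zip(image, mask)]
    # Return a new dict so the input sample is left untouched.
    return {**sample, "mask": new_mask}


sample = {"image": [12, 0, 7, 0], "mask": [1, 1, 2, 2]}
result = mask_out_nodata(sample)
```

A transform like this cannot be attached to either source dataset alone, because each one only sees its own key.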

Implementation

One possible way would be to just add the call to self.transforms in UnionDataset and IntersectionDataset:

if self.transforms is not None:
    sample = self.transforms(sample)

That would let people assign a transform after the intersection/union with my_dataset.transforms = my_transforms. An add_transforms method on IntersectionDataset and UnionDataset could make this more explicit.
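
A rough sketch of how that could look, using toy stand-in classes rather than the real ones. ToyDataset and ToyIntersection are hypothetical names, and they index by integer instead of by a geospatial query, so this only illustrates where the self.transforms call would sit.

```python
from typing import Any, Callable, Dict, Optional

Sample = Dict[str, Any]


class ToyDataset:
    """Hypothetical dataset that returns one key per sample."""

    def __init__(self, key: str, values: list) -> None:
        self.key = key
        self.values = values

    def __getitem__(self, index: int) -> Sample:
        return {self.key: self.values[index]}

    def __and__(self, other: "ToyDataset") -> "ToyIntersection":
        return ToyIntersection(self, other)


class ToyIntersection:
    """Hypothetical intersection that merges samples, then transforms them."""

    def __init__(self, ds1: ToyDataset, ds2: ToyDataset) -> None:
        self.datasets = (ds1, ds2)
        self.transforms: Optional[Callable[[Sample], Sample]] = None

    def __getitem__(self, index: int) -> Sample:
        sample: Sample = {}
        for ds in self.datasets:
            sample.update(ds[index])
        # The proposed addition: apply dataset-level transforms last.
        if self.transforms is not None:
            sample = self.transforms(sample)
        return sample


images = ToyDataset("image", [10, 20])
masks = ToyDataset("mask", [1, 0])
dataset = images & masks
# Assign the transform after building the intersection with `&`.
dataset.transforms = lambda s: {**s, "image": s["image"] * s["mask"]}
```

With this shape, the & operator keeps working and the transform sees the merged sample.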

Alternatives

One alternative is to add a transforms parameter to IntersectionDataset and UnionDataset and call it in __getitem__. But this forces people to use the constructor directly instead of the & or | operators.
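
As a sketch of the constructor variant, assuming a toy union that simply concatenates two lists of samples by integer index (the real datasets are queried differently, and ToyUnion is a hypothetical name):

```python
from typing import Any, Callable, Dict, List, Optional

Sample = Dict[str, Any]
Transform = Callable[[Sample], Sample]


class ToyUnion:
    """Hypothetical union of two indexable datasets with optional transforms."""

    def __init__(
        self,
        ds1: List[Sample],
        ds2: List[Sample],
        transforms: Optional[Transform] = None,
    ) -> None:
        self.datasets = (ds1, ds2)
        self.transforms = transforms

    def __len__(self) -> int:
        return len(self.datasets[0]) + len(self.datasets[1])

    def __getitem__(self, index: int) -> Sample:
        first, second = self.datasets
        sample = first[index] if index < len(first) else second[index - len(first)]
        if self.transforms is not None:
            sample = self.transforms(sample)
        return sample


def scale(sample: Sample) -> Sample:
    """Example transform: rescale the image value."""
    return {**sample, "image": sample["image"] / 100}


a = [{"image": 100}, {"image": 200}]
b = [{"image": 300}]
# The transform can only be passed through the constructor here, not via `|`.
union = ToyUnion(a, b, transforms=scale)
```

The operator form has no place to pass the extra argument, which is the drawback noted above.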

Another option is to create a TransformableDataset that takes a dataset and transforms as parameters. Its __getitem__ calls the underlying dataset's __getitem__ and then applies the transforms.
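
A minimal sketch of such a wrapper, assuming the wrapped dataset only needs to support __getitem__ and __len__. The class does not exist in the library; the name comes from this proposal and the details are illustrative.

```python
from typing import Any, Callable, Dict

Sample = Dict[str, Any]


class TransformableDataset:
    """Hypothetical wrapper that applies transforms on top of any dataset."""

    def __init__(self, dataset: Any, transforms: Callable[[Sample], Sample]) -> None:
        self.dataset = dataset
        self.transforms = transforms

    def __len__(self) -> int:
        return len(self.dataset)

    def __getitem__(self, query: Any) -> Sample:
        # Fetch from the wrapped dataset first, then transform the result.
        return self.transforms(self.dataset[query])


base = [{"image": 1}, {"image": 2}]
wrapped = TransformableDataset(base, lambda s: {**s, "image": s["image"] * 2})
```

The wrapper leaves the union and intersection classes unchanged, at the cost of one more layer around the dataset.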

Additional information

No response

@adamjstewart
Collaborator

I like this idea. Another reason this is important is for things like random flips/rotations that need to be applied uniformly to both images and target labels.
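
To illustrate the point about uniform application, here is a small sketch of a horizontal flip that makes one random decision per sample and applies it to every key. It assumes each value is a 2D list of rows; the function name and the p and rng parameters are hypothetical.

```python
import random
from typing import Any, Dict, List

Sample = Dict[str, List[List[Any]]]


def random_hflip(sample: Sample, p: float = 0.5, rng: Any = random) -> Sample:
    """Flip every entry of the sample left-to-right with probability p."""
    # A single draw decides for the whole sample, so image and mask stay aligned.
    if rng.random() < p:
        return {key: [row[::-1] for row in value] for key, value in sample.items()}
    return sample


sample = {"image": [[1, 2], [3, 4]], "mask": [[0, 1], [1, 0]]}
flipped = random_hflip(sample, p=1.0)    # always flips
unchanged = random_hflip(sample, p=0.0)  # never flips
```

If the image and mask datasets each drew their own random number, the two could end up flipped differently.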

> One possible way would be to just add the call to self.transforms in UnionDataset and IntersectionDataset... And let people assign a transform after doing the intersection/union by doing my_dataset.transforms = my_transforms

> One alternative is to add a transforms parameter to IntersectionDataset and UnionDataset

I like both of these ideas, and I don't see any reason why we can't do both. Want to submit a PR?

@adamjstewart added the datasets and transforms labels on Oct 25, 2022
@adamjstewart added this to the 0.4.0 milestone on Oct 25, 2022
@pmandiola
Contributor Author

Yes, you are right, it makes sense to have both. Happy to submit a PR!
