Update FAIR1M dataset and datamodule #1275

Merged
isaaccorley merged 10 commits into microsoft:main from datasets/fair1mv2 on Apr 26, 2023

Conversation

isaaccorley
Collaborator

Originally we created the FAIR1M dataset when only train/part1 images and labels were available.

This PR updates the FAIR1M dataset and datamodule to work with the latest train/val/test sets.

@isaaccorley isaaccorley self-assigned this Apr 22, 2023
@github-actions github-actions bot added the datasets Geospatial or benchmark datasets label Apr 22, 2023
@isaaccorley isaaccorley marked this pull request as draft April 22, 2023 18:54
@adamjstewart adamjstewart added this to the 0.5.0 milestone Apr 22, 2023
@github-actions github-actions bot added datamodules PyTorch Lightning datamodules testing Continuous integration testing labels Apr 23, 2023
@isaaccorley isaaccorley marked this pull request as ready for review April 23, 2023 02:47
torchgeo/datamodules/fair1m.py (resolved)
torchgeo/datamodules/fair1m.py (resolved)
torchgeo/datasets/fair1m.py (resolved)
@isaaccorley isaaccorley force-pushed the datasets/fair1mv2 branch 3 times, most recently from 56ffb9c to 6cdd0a0 Compare April 25, 2023 14:38
calebrob6
calebrob6 previously approved these changes Apr 25, 2023
Member

@calebrob6 calebrob6 left a comment


@adamjstewart can comment on the versionadded stuff, else LGTM

@adamjstewart
Collaborator

Yes, it does need versionadded

@isaaccorley isaaccorley merged commit 28615a1 into microsoft:main Apr 26, 2023
17 checks passed
don't match

.. versionchanged:: 0.5
Added *split* and *download* parameters.
Collaborator

Suggested wording: "The *split* and *download* parameters."

Collaborator Author

This seems like more of a nitpick. "Added" is clearer; "The" is just a statement.

Collaborator

It should also be versionadded, not versionchanged. The versionadded template already says "New in version 0.5:"
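As a rough illustration of the suggestion above, the docstring might end up looking like the sketch below. The class name `ExampleDataset`, its defaults, and the argument descriptions are hypothetical and are not the actual torchgeo code; only the `versionadded` directive and the "The *split* and *download* parameters." wording come from this thread.

```python
class ExampleDataset:
    """Hypothetical dataset used only to illustrate the directive."""

    def __init__(
        self, root: str = "data", split: str = "train", download: bool = False
    ) -> None:
        """Initialize the (hypothetical) dataset.

        Args:
            root: root directory where the dataset can be found
            split: one of "train", "val", or "test"
            download: if True, download the dataset

        .. versionadded:: 0.5
           The *split* and *download* parameters.
        """
        self.root = root
        self.split = split
        self.download = download
```

Because the directive itself supplies the "New in version 0.5:" prefix, the body only needs to name what was added.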

from .utils import dataset_split


def collate_fn(batch: list[dict[str, Tensor]]) -> dict[str, Any]:
Collaborator

How is this different from unbind_samples?

Collaborator Author

Because of the torch.stack on line 27

    .. versionadded:: 0.5
    """
    output: dict[str, Any] = {}
    output["image"] = torch.stack([sample["image"] for sample in batch])
Collaborator

Does this line do anything?

Collaborator Author

This was mostly copied from nasa_marine_debris.py. I'm assuming it's there for mypy reasons.

Collaborator

But doesn't this just unstack and restack so that the output is identical?

Collaborator Author

No, this takes the batch (a list of sample dicts) and grabs each image and stacks it into a single tensor along a new batch dimension.
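To make the point concrete, here is a minimal sketch of that stacking step. The function name `collate_fn_sketch`, the tensor shapes, and the handling of a `"boxes"` key are assumptions made for illustration; the only line taken from the diff is the `torch.stack` call on the `"image"` entries.

```python
from typing import Any

import torch
from torch import Tensor


def collate_fn_sketch(batch: list[dict[str, Tensor]]) -> dict[str, Any]:
    """Collate a list of sample dicts into one batch dict (illustrative only)."""
    output: dict[str, Any] = {}
    # Stack N image tensors of shape (C, H, W) into one (N, C, H, W) tensor.
    output["image"] = torch.stack([sample["image"] for sample in batch])
    # Assumption: each image can have a different number of boxes, so they
    # cannot be stacked and are kept as a plain list instead.
    if "boxes" in batch[0]:
        output["boxes"] = [sample["boxes"] for sample in batch]
    return output


# The input is a list of per-sample dicts, not an already-batched tensor.
batch = [
    {"image": torch.zeros(3, 8, 8), "boxes": torch.zeros(2, 4)},
    {"image": torch.ones(3, 8, 8), "boxes": torch.zeros(5, 4)},
]
collated = collate_fn_sketch(batch)
print(collated["image"].shape)  # torch.Size([2, 3, 8, 8])
print(len(collated["boxes"]))  # 2
```

So the input is never a stacked tensor to begin with: the stack creates the batch dimension. `unbind_samples` goes in the opposite direction, turning a collated batch back into a list of per-sample dicts.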

@isaaccorley isaaccorley deleted the datasets/fair1mv2 branch April 26, 2023 17:26
@adamjstewart adamjstewart added the backwards-incompatible Changes that are not backwards compatible label Sep 30, 2023
Labels: backwards-incompatible, datamodules, datasets, testing
4 participants