Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DeepGlobe dataset for land cover #578

Merged
merged 36 commits into from
Jul 2, 2022

Conversation

saumyasinha
Copy link
Contributor

@saumyasinha saumyasinha commented Jun 14, 2022

Here is a sample plot for the dataset. These are with alpha 0, 0.5, and 1 respectively.
sample_alpha0
sample_alpha0 5
sample_alpha1

@ghost
Copy link

ghost commented Jun 14, 2022

CLA assistant check
All CLA requirements met.

@github-actions github-actions bot added datamodules PyTorch Lightning datamodules datasets Geospatial or benchmark datasets labels Jun 14, 2022
@calebrob6
Copy link
Member

We should cite https://openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w4/Demir_DeepGlobe_2018_A_CVPR_2018_paper.pdf in the docstring in addition to the kaggle/codalab links

Copy link
Collaborator

@adamjstewart adamjstewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks! I'm sure @calebrob6 will have more comments to add.

To get the tests to pass, you'll need to run black . on the files and maybe correct a few things that flake8 catches. Will also need tests, see other tests/datasets/* files for guidance.

torchgeo/datasets/deepglobelandcover.py Outdated Show resolved Hide resolved
torchgeo/datasets/deepglobelandcover.py Outdated Show resolved Hide resolved
torchgeo/datasets/deepglobelandcover.py Outdated Show resolved Hide resolved
torchgeo/datasets/deepglobelandcover.py Outdated Show resolved Hide resolved
torchgeo/datasets/deepglobelandcover.py Outdated Show resolved Hide resolved
torchgeo/datasets/deepglobelandcover.py Show resolved Hide resolved
torchgeo/datasets/deepglobelandcover.py Outdated Show resolved Hide resolved
torchgeo/datasets/deepglobelandcover.py Show resolved Hide resolved
@github-actions github-actions bot added the testing Continuous integration testing label Jun 15, 2022
@adamjstewart
Copy link
Collaborator

Can you add a tests/data/deepglobelandcover/data.py script that creates the fake data and zip file? See other tests/data/*/data.py scripts for examples.

@saumyasinha
Copy link
Contributor Author

saumyasinha commented Jun 17, 2022

I made a small change in the draw_semantic_segmentation_masks function inside torchgeo/torchgeo/datasets/utils.py. The classes variable was earlier just using the unique values from the mask, but for segmentation scenarios where the image's mask doesn't include all the given classes such as an image with only "rivers" or "forest land" but not "urban land"...this function doesn't plot the mask right. I think this is because classes should be set to all possible class labels so that class_masks can be appropriately created and colors can be correctly assigned.

I may not have done a great job in describing the issue, but I am happy to talk more about it. This fixed the plots for me, but there might be more elegant ways of doing this.

@calebrob6
Copy link
Member

Can you check to see what other plot methods use draw_semantic_segmentation_masks then test one of them with this change?

@saumyasinha
Copy link
Contributor Author

Can you check to see what other plot methods use draw_semantic_segmentation_masks then test one of them with this change?

I did check for Potsdam and Vaihingen. Vaihingen had a few label masks that did not include all the classes, and I saw incorrect plots with the existing code, but correct ones when I included my change. I could show sample images when we meet next.

@calebrob6
Copy link
Member

Can you post them here?

@adamjstewart
Copy link
Collaborator

Can you rebase your branch so that CI runs the new minimum version tests? There's a button on GitHub or you can do it manually from the command line. Also, can you add this dataset/datamodule to the documentation? You'll need to add a section to docs/api/datasets.rst and docs/api/datamodules.rst under non-geospatial datasets. You'll also need to add a line to the table in docs/api/non_geo_datasets.csv.

@saumyasinha
Copy link
Contributor Author

saumyasinha commented Jun 26, 2022

Can you post them here?

Yeah!
The following plots are a sample from Vaihingen dataset, that has 5 out 6 classes in their segmented outputs. Attaching them here:
The first one is plotted with the existing draw_semantic_segmentation_masks, followed by the one plotted with my suggested change and the third one is the manually downloaded mask image (for reference - we are looking at top_mosaic_09cm_area30.tif).

vahingen_sample10_with_existing_visualization
vahingen_sample10_with_modified_visualization
image

@ashnair1
Copy link
Collaborator

Can you check to see what other plot methods use draw_semantic_segmentation_masks then test one of them with this change?

I did check for Potsdam and Vaihingen. Vaihingen had a few label masks that did not include all the classes, and I saw incorrect plots with the existing code, but correct ones when I included my change. I could show sample images when we meet next.

OSCD and XView2 also use draw_semantic_segmentation_masks

@calebrob6 calebrob6 closed this Jun 27, 2022
@calebrob6 calebrob6 reopened this Jun 27, 2022
@calebrob6
Copy link
Member

So this PR uncovered some unexpected behavior in torchgeo.datasets.utils.draw_semantic_segmentation_masks. When we added this function with the xView2 dataset it looks like we added a little hack to ignore the "0" background class when generating the masks to pass to torchvision's draw_segmentation_masks so that the background would be transparent. E.g. xview2.py has 3 colors in the colormap while having 4 classes -- https://github.com/microsoft/torchgeo/blob/main/torchgeo/datasets/xview.py#L63.

Generally, I think a good solution would be to force each dataset to have a color for each class (even background classes) then add an "ignore" class as an argument to draw_semantic_segmentation_masks. We will need to update OSCD and XView2 to implement this. @adamjstewart, thoughts?

@calebrob6
Copy link
Member

This should be good to go now!

Copy link
Collaborator

@adamjstewart adamjstewart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there! Just a few remaining questions

docs/api/non_geo_datasets.csv Show resolved Hide resolved
tests/datamodules/test_deepglobelandcover.py Outdated Show resolved Hide resolved
torchgeo/datamodules/deepglobelandcover.py Show resolved Hide resolved
torchgeo/datamodules/deepglobelandcover.py Outdated Show resolved Hide resolved
adamjstewart
adamjstewart previously approved these changes Jul 2, 2022
calebrob6 and others added 2 commits July 2, 2022 12:34
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
@adamjstewart adamjstewart enabled auto-merge (squash) July 2, 2022 19:40
@adamjstewart adamjstewart merged commit d98b58a into microsoft:main Jul 2, 2022
@adamjstewart
Copy link
Collaborator

Running the tests results additional files being created during zip file extraction. We should add these to the repo so that running the tests doesn't result in new files in git status.

@adamjstewart
Copy link
Collaborator

#651

@adamjstewart adamjstewart mentioned this pull request Jul 11, 2022
@adamjstewart adamjstewart added this to the 0.3.0 milestone Jul 11, 2022
yichiac pushed a commit to yichiac/torchgeo that referenced this pull request Apr 29, 2023
* add class for Deep Globe Land Cover dataset

* add Lightning data module implementation for deepglobe land cover

* fix formatting errors

* fix urls, formats and add link for paper

* add tests for deepglobe dataset and datamodule

* fix a test case and a few more formatting error

* add data.py and modify error match for data download

* modify draw_semantic_segmentation_masks for cases when mask is a subset of all class labels

* fix mypy error

* add to docs for documentation

* add deepglobe to the dataset lists csv

* fix error in building docs

* Update datamodules.rst

* Update datasets.rst

* Update data.py

* Update utils.py

* change file permissions of non_geo_datasets.csv

* Add versionadded

* Update torchgeo/datasets/deepglobelandcover.py

Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>

* Change end of line sequence

* Update tests/data/deepglobelandcover/data.py

Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>

* exist_ok

* Update tests/datasets/test_deepglobelandcover.py

Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>

* Remove datamodule tests

* Remove split monkeypatch

* Running black

* Add val percent to test conf

* Sort filelist so indices are the same across platforms

* Simplified the file and mask fns

* Re-adding datamodule tests for coverage

* Add sub-configs to test val_split_pct in the datamodule

* Lets try it

* Update tests/conf/deepglobelandcover_0.yaml

Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>

* nulllllllll

* ingore_zeros -> ignore_index

Co-authored-by: saumyasinha <t-sasinha@microsoft.com>
Co-authored-by: Caleb Robinson <calebrob6@gmail.com>
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datamodules PyTorch Lightning datamodules datasets Geospatial or benchmark datasets documentation Improvements or additions to documentation testing Continuous integration testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants