ETCI2021 Img/Label pairing #231

Closed
calebrob6 opened this issue Nov 9, 2021 · 9 comments

@calebrob6
Member

I think there might be a bug based on visualizing some of the training patches. Shown below are VV, VH, Water mask, Flood mask for some sample:

[image: VV, VH, water mask, and flood mask panels for one training sample]

Run the following a few times to reproduce:

from torchgeo.datasets import ETCI2021

import numpy as np
import matplotlib.pyplot as plt

ds = ETCI2021("data/", split="train", download=False)

for i in np.random.choice(len(ds), size=10, replace=False):
    # Load the sample once so all four panels come from the same item
    sample = ds[i]
    img1 = np.rollaxis(sample["image"][:3].numpy(), 0, 3)  # VV, channels last
    img2 = np.rollaxis(sample["image"][3:].numpy(), 0, 3)  # VH, channels last
    mask1 = sample["mask"][0].numpy()  # water mask
    mask2 = sample["mask"][1].numpy()  # flood mask

    fig, axs = plt.subplots(1, 4, figsize=(12, 3))
    axs[0].imshow(img1)
    axs[0].axis("off")
    axs[1].imshow(img2)
    axs[1].axis("off")
    axs[2].imshow(mask1)
    axs[2].axis("off")
    axs[3].imshow(mask2)
    axs[3].axis("off")
    plt.tight_layout()
    plt.show()
    plt.close()

@adamjstewart -- this is another example of a dataset that will need a custom trainer. You can interpret the imagery as a 6-band input; however, there are two label masks (you want to predict the flood mask, and you are always given the water mask).

@adamjstewart
Collaborator

this is another example of a dataset that will need a custom trainer.

I'm not sure I understand. To me this sounds like a 7 channel input (image + water mask) and a 1 channel target (flood mask). Can this not be handled by the Dataset/DataModule?

@calebrob6
Member Author

I'm not sure I understand. To me this sounds like a 7 channel input (image + water mask) and a 1 channel target (flood mask). Can this not be handled by the Dataset/DataModule?

The water mask is currently passed as part of the mask; you could pass it as an input instead.
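A minimal sketch of the 7-channel idea discussed above, using NumPy arrays for illustration (the same slicing and concatenation would apply to tensors). It assumes each sample has a 6-band "image" and a 2-channel "mask" ordered as water then flood, matching the reproduction script; the helper name `to_flood_sample` is hypothetical and not part of torchgeo.

```python
import numpy as np


def to_flood_sample(sample):
    """Move the water mask from the target into the input channels."""
    image = sample["image"]  # assumed shape (6, H, W)
    mask = sample["mask"]    # assumed shape (2, H, W), ordered [water, flood]
    water = mask[0:1].astype(image.dtype)
    flood = mask[1:2]
    return {
        "image": np.concatenate([image, water], axis=0),  # (7, H, W)
        "mask": flood,                                    # (1, H, W)
    }


# Synthetic stand-in for one dataset sample (random values, not real data).
rng = np.random.default_rng(0)
sample = {
    "image": rng.random((6, 8, 8), dtype=np.float32),
    "mask": rng.integers(0, 2, size=(2, 8, 8)),
}
out = to_flood_sample(sample)
print(out["image"].shape, out["mask"].shape)
```

A function like this could probably be applied as a dataset transform, which would keep the trainer itself unchanged.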

@calebrob6
Member Author

The problem above is caused by glob.glob not returning files in sorted order and is fixed by putting sorted(...) everywhere. I'm going to commit this fix directly to main.
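To illustrate the kind of fix described here, the sketch below pairs image and label files by index after sorting both listings. The directory names `images` and `labels` and the helper names are hypothetical, chosen only for this example; they are not the actual ETCI2021 layout or the code from the commit.

```python
import glob
import os
import tempfile


def list_pairs(root):
    """Pair image and label files by position after sorting both listings."""
    # glob.glob returns names in arbitrary, filesystem-dependent order,
    # so sort both listings before pairing them by index.
    images = sorted(glob.glob(os.path.join(root, "images", "*.png")))
    labels = sorted(glob.glob(os.path.join(root, "labels", "*.png")))
    if len(images) != len(labels):
        raise ValueError("image and label counts differ")
    return list(zip(images, labels))


def make_demo_tree(root, names):
    """Create empty placeholder files (hypothetical layout, for illustration)."""
    for sub in ("images", "labels"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
        for name in names:
            open(os.path.join(root, sub, name), "w").close()


with tempfile.TemporaryDirectory() as tmp:
    make_demo_tree(tmp, ["tile_2.png", "tile_0.png", "tile_1.png"])
    pairs = list_pairs(tmp)
    # Every pair should share the same file name once both lists are sorted.
    mismatched = [
        p for p in pairs if os.path.basename(p[0]) != os.path.basename(p[1])
    ]
    print(len(pairs), "pairs,", len(mismatched), "mismatched")
```

Sorting only guarantees correct pairing when the file names in both directories sort the same way, so a file-name comparison like the one at the end is a cheap extra check.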

@calebrob6
Member Author

Example of correct pairing:

[image: sample with correctly paired imagery and masks]

@adamjstewart
Collaborator

Adding reference to commit that fixed this: 2e4b203

We should backport this to 0.1.1 when we create that release.

@calebrob6
Member Author

calebrob6 commented Nov 9, 2021

Moving forward, let's visualize what comes out of a dataset before we merge them. These semantic bugs are the real killers. Like, this dataset is fully tested but takes me several hours of debugging to actually use -- not a great experience for any users.

@adamjstewart
Collaborator

Can you add a plot method to the dataset? I think that's something we should consider requiring for all future datasets.
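As a rough idea of what such a plot helper might look like, here is a standalone sketch. It assumes a sample dict with a 6-band "image" (VV in the first three channels, VH in the last three) and a 2-channel "mask" (water, then flood), as in the reproduction script; `plot_sample` is a hypothetical name, not an existing torchgeo method, and the demo data is random.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_sample(sample):
    """Show both image triplets and both masks of one sample side by side."""
    image, mask = sample["image"], sample["mask"]
    panels = [
        ("VV", np.moveaxis(image[:3], 0, -1)),
        ("VH", np.moveaxis(image[3:], 0, -1)),
        ("Water mask", mask[0]),
        ("Flood mask", mask[1]),
    ]
    fig, axs = plt.subplots(1, len(panels), figsize=(12, 3))
    for ax, (title, data) in zip(axs, panels):
        ax.imshow(data)
        ax.set_title(title)
        ax.axis("off")
    fig.tight_layout()
    return fig


# Random stand-in data; a real call would pass the arrays from ds[i].
rng = np.random.default_rng(0)
demo = {
    "image": rng.random((6, 16, 16)),
    "mask": rng.integers(0, 2, size=(2, 16, 16)),
}
fig = plot_sample(demo)
plt.close(fig)
```

Returning the figure rather than calling `plt.show()` leaves it to the caller to display or save it.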

@isaaccorley
Collaborator

This one is my fault for not checking via plotting. Thanks @calebrob6 for the fix. Will verify with plots in the future.

@adamjstewart adamjstewart added this to the 0.1.1 milestone Nov 20, 2021
@isaaccorley
Collaborator

Hi,

Sorry to bother you. But I do have a training problem when using this dataset. Thanks so much for your time.

#745 (comment)

@liecn Kindly only leave comments on closed issues if they are relevant.
