1/3 of all the images in the data set are blank #6

Closed
rsokl opened this issue May 18, 2020 · 2 comments

Comments

rsokl commented May 18, 2020

I was looking through this dataset and noticed that many of the images are completely blank (all pixels are 0).
Using your mybatch_generator_prediction, I loaded all the images and was surprised to find that roughly 1/3 of them – train and test alike – consist only of 0s.

[image]

This is almost certainly inflating the Kaggle statistics and is likely diluting the training process. Do you have any insight into what might be going on here?

For context: I downloaded the data from Kaggle, in case that is relevant.
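
The check described in this comment can be sketched roughly as follows. This is a minimal illustration, not the code that was actually run: the helper names `is_blank` and `blank_fraction` are hypothetical, the patches are synthetic stand-ins rather than real 38-Cloud data, and the 4-band 384×384 shape is an assumption.

```python
import numpy as np


def is_blank(patch: np.ndarray) -> bool:
    """Return True when every pixel in the patch is zero."""
    return not np.any(patch)


def blank_fraction(patches) -> float:
    """Fraction of patches that consist only of zeros."""
    patches = list(patches)
    if not patches:
        return 0.0
    return sum(is_blank(p) for p in patches) / len(patches)


# Synthetic stand-ins for loaded patches (assumed shape: 384 x 384 x 4 bands).
rng = np.random.default_rng(0)
patches = [
    np.zeros((384, 384, 4), dtype=np.uint16),                      # blank patch
    rng.integers(1, 1000, size=(384, 384, 4), dtype=np.uint16),    # informative
    rng.integers(1, 1000, size=(384, 384, 4), dtype=np.uint16),    # informative
]
print(f"blank fraction: {blank_fraction(patches):.2f}")
```

Running the same kind of count over the train and test patches separately would show whether the blank share is similar in both splits.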

SorourMo (Owner) commented

Hi @rsokl,
That is correct (it has been mentioned here). It happens because a complete Landsat 8 image has a roughly parallelogram-shaped footprint (one example is shown here), which leaves four empty triangles (pixel values = 0) around the informative (nonzero) area of the image. Since small 384×384 patches are extracted from the complete Landsat 8 images, some of those patches are completely empty.
The names of the informative patches (those with more than 80% nonzero pixels) in the 38-Cloud and 95-Cloud datasets are stored in the attached CSV files. Feel free to use only these informative patches in your training.
38-cloud_95-cloud_training_informative_patches.zip
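
For readers who want to reproduce the "more than 80% nonzero pixels" criterion themselves, here is a hedged sketch. The function names, the patch names in the demo, and the choice to count a pixel as nonzero when any of its bands is nonzero are all assumptions; the attached CSV files remain the authoritative list.

```python
import numpy as np


def nonzero_fraction(patch: np.ndarray) -> float:
    """Share of pixel positions that are nonzero.

    Assumption: for a multi-band patch (H x W x bands), a pixel counts as
    nonzero when at least one of its bands is nonzero.
    """
    if patch.ndim == 3:
        nonzero = np.any(patch != 0, axis=-1)
    else:
        nonzero = patch != 0
    return float(np.mean(nonzero))


def select_informative(named_patches: dict, threshold: float = 0.8) -> list:
    """Names of patches whose nonzero fraction is strictly above the threshold."""
    return [
        name
        for name, patch in named_patches.items()
        if nonzero_fraction(patch) > threshold
    ]


# Tiny synthetic patches with hypothetical names (not real dataset entries).
full = np.ones((10, 10), dtype=np.uint16)
half = np.ones((10, 10), dtype=np.uint16)
half[:, :5] = 0                      # 50% nonzero
mostly = np.ones((10, 10), dtype=np.uint16)
mostly[0, :] = 0                     # 90% nonzero
empty = np.zeros((10, 10), dtype=np.uint16)

demo = {"patch_a": full, "patch_b": half, "patch_c": empty, "patch_d": mostly}
print(select_informative(demo))
```

The threshold is left as a parameter so that a stricter or looser cut-off can be tried without changing the function.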

Please note that the empty patches are still required to reconstruct a complete Landsat 8 cloud mask for a test image.
Thank you for pointing this out. I have updated the 38-Cloud dataset to include the CSV file of informative patches.
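
To illustrate why the empty patches matter at test time, the sketch below stitches per-patch masks back into one scene-sized mask. It is only an illustration under stated assumptions: `stitch_patches` is a hypothetical helper, patches are assumed to be non-overlapping and ordered row by row, and any padding or cropping used by the real pipeline is ignored.

```python
import numpy as np


def stitch_patches(patches, n_rows: int, n_cols: int, patch_size: int = 384) -> np.ndarray:
    """Place row-major 2-D patch masks back onto a single scene-sized mask."""
    if len(patches) != n_rows * n_cols:
        # A missing patch would shift every later patch to the wrong position.
        raise ValueError("every grid cell needs a patch, including the empty ones")
    mask = np.zeros((n_rows * patch_size, n_cols * patch_size), dtype=patches[0].dtype)
    for index, patch in enumerate(patches):
        row, col = divmod(index, n_cols)
        mask[
            row * patch_size:(row + 1) * patch_size,
            col * patch_size:(col + 1) * patch_size,
        ] = patch
    return mask


# 2 x 2 grid of tiny 2 x 2 patches; the corner patches are empty.
empty = np.zeros((2, 2), dtype=np.uint8)
cloudy = np.ones((2, 2), dtype=np.uint8)
scene = stitch_patches([empty, cloudy, cloudy, empty], n_rows=2, n_cols=2, patch_size=2)
print(scene)
```

Dropping the empty patches before stitching would leave the grid incomplete, which is why the helper raises an error instead of guessing where the remaining patches belong.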

rsokl (Author) commented May 20, 2020

Great! Thank you for clarifying and for the useful zip file.

Congrats on your CloudNet+ model; it's a nice piece of work! :)

rsokl closed this as completed May 20, 2020