
checkerboard #59

Closed
SuzannaLin opened this issue May 17, 2022 · 7 comments

@SuzannaLin
Contributor

Hi Yassine,
I am using the CCT model to train on a satellite dataset. The images are 128×128. For some reason the predictions show a clear checkerboard pattern, as in this example (left: prediction, right: ground truth).
[image: prediction with checkerboard artifacts next to the ground truth mask]
Do you have any idea what causes this and how to avoid it?

@yassouali
Owner

yassouali commented May 19, 2022

@SuzannaLin

Hi, yes that pattern is quite weird, I propose the following things to test (although I am not sure which will work):

  • Maybe try removing the aux. decoders that add noise (VAT & F-Noise) and see if that helps.
  • Maybe try training with larger input sizes.
  • Use another type of decoder, e.g. one that uses deconvolution; the rest of the model does not need to change.
  • A post-processing step with a CRF should solve this, I think (something similar to this step).

Hope this helps

@SuzannaLin
Contributor Author

Hi Yassine!
Thanks for all the ideas. I started with the easiest one and resized my images in the dataloader. That did the trick! My results have improved a lot! During inference, I just have to remember to resize back to the original size.
I think the main problem was the stride in the deconvolution, but I could not find this in the script. If I want to adjust this, do I need to do this for all aux decoders as well?
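For reference, a minimal sketch of the resize-and-restore round trip described above (the sizes and the two-class stand-in for the model output are assumptions; adjust them to your dataset):

```python
import torch
import torch.nn.functional as F

# Assumed sizes: original tiles are 128x128, training resolution is 320x320.
images = torch.randn(2, 3, 128, 128)                  # batch of input images

# Upsample in the dataloader / transform before feeding the model
resized = F.interpolate(images, size=(320, 320),
                        mode="bilinear", align_corners=False)

# ... logits = model(resized), shape (2, n_classes, 320, 320) ...
logits = torch.randn(2, 2, 320, 320)                  # stand-in for model output

# At inference, resize the logits back to the original size before the argmax
restored = F.interpolate(logits, size=(128, 128),
                         mode="bilinear", align_corners=False)
pred = restored.argmax(dim=1)                          # (2, 128, 128) label map
```

Resizing the logits (rather than the hard label map) keeps the interpolation smooth before the final argmax.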

@yassouali
Owner

the decoders use pixel shuffle instead of deconvolution, but I think replacing it with a deconv decoder might help

@SuzannaLin
Contributor Author

Hi @yassouali
How would I change the pixel shuffle layer to a deconv layer?
I assume I will need to change this line:
layers.append(PixelShuffle(out_channels, scale=2))
but with what?

@yassouali
Owner

yes, that is correct: you'll need to replace the pixel shuffle with an nn.ConvTranspose2d layer

@SuzannaLin
Contributor Author

The PixelShuffle layer is wrapped in a for-loop after a conv layer.
Do I still need the nn.Conv2d layer?
I get errors such as this one:

RuntimeError: input and target batch or spatial sizes don't match: target [2, 320, 320], input [2, 2, 48, 48]
Batch size = 4, image size = 320 × 320

What would be the correct parameters for nn.ConvTranspose2d(in_channels=?, out_channels=?, kernel_size=?)?
Does this layer go in the for loop? How do I deal with the in and out channels?

Thank you for your help!
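The size mismatch in the error above can be reasoned about with the ConvTranspose2d output-size formula; a small helper (an illustration, not part of the repo) makes it easy to check candidate parameters:

```python
def deconv_out_size(size, kernel_size, stride, padding=0, output_padding=0):
    # PyTorch's formula for the spatial output size of nn.ConvTranspose2d
    return (size - 1) * stride - 2 * padding + kernel_size + output_padding

# With kernel_size=3, stride=2, padding=1, output_padding=1, each layer
# exactly doubles the input: 48 -> 96 -> 192 -> 384.
s = 48
for _ in range(3):
    s = deconv_out_size(s, kernel_size=3, stride=2, padding=1, output_padding=1)
print(s)  # 384: three doublings overshoot a 320x320 target, so a final
          # F.interpolate to the exact size would still be needed
```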

@yassouali
Owner

the for loop upsamples by a factor of x2 each time, and the conv is part of the pixel shuffle method. You can try replacing the conv2d with a conv transpose and removing the pixel shuffle.

As for an example, you might need to try different variations to find the best one, but
torch.nn.ConvTranspose2d(n_channels, n_channels, kernel_size=3, stride=2) is a good start
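As a quick sanity check on that starting point (using the spatial size from the error message above; `n_channels` here is just a placeholder):

```python
import torch
import torch.nn as nn

n_channels = 2  # placeholder, matching the error's input shape [2, 2, 48, 48]
up = nn.ConvTranspose2d(n_channels, n_channels, kernel_size=3, stride=2)
x = torch.randn(2, n_channels, 48, 48)
y = up(x)
print(y.shape)  # torch.Size([2, 2, 97, 97]): (48 - 1) * 2 + 3 = 97, so add
                # padding=1, output_padding=1 if you want an exact x2 (96)
```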
