
checkerboard #59

Closed
SuzannaLin opened this issue May 17, 2022 · 7 comments

@SuzannaLin
Contributor

Hi Yassine,
I am using the CCT model to train on a satellite dataset. The images are 128×128. For some reason the predictions show a clear checkerboard pattern, as in this example (left: prediction, right: ground truth).
[image: prediction with checkerboard artifacts next to the ground truth mask]
Do you have any idea what causes this and how to avoid it?

@yassouali
Owner

yassouali commented May 19, 2022

@SuzannaLin

Hi, yes that pattern is quite weird, I propose the following things to test (although I am not sure which will work):

  • Maybe try removing the aux. decoders that add noise (VAT & F-Noise) and see if that helps.
  • Maybe try training with larger input sizes.
  • Use another type of decoder, e.g. one that uses deconvolution; the rest of the model does not need to change.
  • A post-processing step with a CRF should solve this, I think (something similar to this step).

Hope this helps

@SuzannaLin
Contributor Author

Hi Yassine!
Thanks for all the ideas. I started with the easiest one and resized my images in the dataloader. That did the trick! My results have improved a lot! During inference, I just have to remember to resize back to the original size.
I think the main problem was the stride in the deconvolution, but I could not find this in the script. If I want to adjust this, do I need to do this for all aux decoders as well?
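For reference, a minimal sketch of the resize-and-restore round trip described above (the sizes and the two-class stand-in for the model output are assumptions; adjust them to your dataset):

```python
import torch
import torch.nn.functional as F

# Assumed sizes: original tiles are 128x128, training resolution is 320x320.
images = torch.randn(2, 3, 128, 128)                  # batch of input images

# Upsample in the dataloader / transform before feeding the model
resized = F.interpolate(images, size=(320, 320),
                        mode="bilinear", align_corners=False)

# ... logits = model(resized), shape (2, n_classes, 320, 320) ...
logits = torch.randn(2, 2, 320, 320)                  # stand-in for model output

# At inference, resize the logits back to the original size before the argmax
restored = F.interpolate(logits, size=(128, 128),
                         mode="bilinear", align_corners=False)
pred = restored.argmax(dim=1)                          # (2, 128, 128) label map
```

Resizing the logits (rather than the hard label map) keeps the interpolation smooth before the final argmax.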

@yassouali
Owner

the decoders use pixel shuffle instead of deconvolution, but I think replacing it with a deconv decoder might help

@SuzannaLin
Contributor Author

Hi @yassouali
How would I change the pixel shuffle layer to a deconv layer?
I assume I will need to change this line:
layers.append(PixelShuffle(out_channels, scale=2))
but with what?

@yassouali
Owner

yes, that is correct: you'll need to replace the pixel shuffle with an nn.ConvTranspose2d layer

@SuzannaLin
Contributor Author

The PixelShuffle layer is wrapped in a for-loop after a conv layer.
Do I still need the nn.Conv2d layer?
I get errors such as this one:

RuntimeError: input and target batch or spatial sizes don't match: target [2, 320, 320], input [2, 2, 48, 48]
Batch size = 4, image size = 320 × 320

What would be the correct parameters for nn.ConvTranspose2d(in_channels=?, out_channels=?, kernel_size=?)?
Does this layer go in the for loop? How do I deal with the in and out channels?

Thank you for your help!
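The size mismatch in the error above can be reasoned about with the ConvTranspose2d output-size formula; a small helper (an illustration, not part of the repo) makes it easy to check candidate parameters:

```python
def deconv_out_size(size, kernel_size, stride, padding=0, output_padding=0):
    # PyTorch's formula for the spatial output size of nn.ConvTranspose2d
    return (size - 1) * stride - 2 * padding + kernel_size + output_padding

# With kernel_size=3, stride=2, padding=1, output_padding=1, each layer
# exactly doubles the input: 48 -> 96 -> 192 -> 384.
s = 48
for _ in range(3):
    s = deconv_out_size(s, kernel_size=3, stride=2, padding=1, output_padding=1)
print(s)  # 384: three doublings overshoot a 320x320 target, so a final
          # F.interpolate to the exact size would still be needed
```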

@yassouali
Owner

the for loop upsamples by a factor of x2 each time, and the conv is part of the pixel shuffle method. You can try replacing the conv2d with a conv transpose and removing the pixel shuffle.

As for an example, you might need to try different variations to find the best one, but
torch.nn.ConvTranspose2d(n_channels, n_channels, kernel_size=3, stride=2) is a good start
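As a quick sanity check on that starting point (using the spatial size from the error message above; `n_channels` here is just a placeholder):

```python
import torch
import torch.nn as nn

n_channels = 2  # placeholder, matching the error's input shape [2, 2, 48, 48]
up = nn.ConvTranspose2d(n_channels, n_channels, kernel_size=3, stride=2)
x = torch.randn(2, n_channels, 48, 48)
y = up(x)
print(y.shape)  # torch.Size([2, 2, 97, 97]): (48 - 1) * 2 + 3 = 97, so add
                # padding=1, output_padding=1 if you want an exact x2 (96)
```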
