inference with jpg files #5

Closed
suke27 opened this issue Oct 8, 2018 · 5 comments

Comments

suke27 commented Oct 8, 2018

Hello. As far as I know, GAN-based algorithms have problems with JPG images.
Have you tried your algorithm on JPG files? Does any abnormal noise texture appear in the output?

rahidz commented Oct 8, 2018

For some JPG files I try, the output is very "blocky", as if small but very noticeable square blocks cover the entire image. I assume this is simply ESRGAN "enhancing" the JPG artifacts already present, since in my experience it doesn't noticeably occur for all JPG files.

I've noticed a slight improvement by converting the JPG to a PNG before I start, but it's still not great.

Owner

xinntao commented Oct 8, 2018

@suke27 @rahidz
Yes, if the input images are JPEG images, meaning they were compressed by the JPEG algorithm, the output is "blocky" with checkerboard artifacts.

During training, ESRGAN never saw these blocky JPEG artifacts and was not trained with the objective of removing them. Therefore, at test time, ESRGAN ends up "enhancing" the JPG artifacts already present, as @rahidz said.

Converting the image to PNG format cannot remove the checkerboard blocks (8 pixels by 8 pixels) left by JPEG compression. After 4x upsampling, the checkerboard artifacts become visible (8 * 4 = 32 pixels, so you will see 32 by 32 pixel checkerboard artifacts).
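The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the helper name `block_boundaries` is hypothetical and not part of ESRGAN, and it assumes the JPEG block grid starts at pixel 0 and that the image is upscaled by an integer factor.

```python
JPEG_BLOCK = 8  # JPEG compresses the image in 8x8 pixel blocks


def block_boundaries(length, scale=4, block=JPEG_BLOCK):
    """Return the offsets (in upscaled pixels) of the interior JPEG block
    boundaries along one axis of an input that is `length` pixels long."""
    period = block * scale  # 8 * 4 = 32 pixels for a 4x model
    return list(range(period, length * scale, period))


# A 32-pixel-wide JPEG input upscaled 4x to 128 pixels:
print(block_boundaries(32))  # [32, 64, 96]
```

The block seams in the output sit 32 pixels apart, which matches the 32 by 32 checkerboard described above.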

This is a limitation of ESRGAN (and of other SR algorithms that use the bicubic downsampling kernel): it cannot handle these problems. Actually, ESRGAN (like many other SR algorithms) is limited by a very strong assumption: a perfect bicubic downsampling kernel. However, in the real world, input images have diverse downsampling kernels and also contain blur, JPEG compression, and noise.

So these algorithms are not satisfactory for real-world images. These problems are also research topics that the community wants to solve.

You can try to improve it by 1) removing the JPEG artifacts first with another algorithm such as ARCNN; 2) finetuning the network on datasets with JPEG artifacts, so that ESRGAN can handle them.

Author

suke27 commented Oct 8, 2018

@xinntao Regarding suggestion 2), finetuning the network on datasets with JPEG artifacts so that ESRGAN can handle them:
I tried finetuning SRGAN on a JPG dataset, but it did not work well; blocks and noise still appear in the output.
The same training method works well for EDSR, though.

Owner

xinntao commented Oct 8, 2018

@suke27 You can try first pre-training the network with an L2/L1 loss on JPEG-compressed images, and then use the pre-trained model to train a GAN-based model.

I think the L2/L1 loss is a clear and effective loss for guiding the network to remove JPEG compression artifacts (which is why finetuning on a JPG dataset works well for EDSR). For SRGAN, however, the VGG + adversarial loss may not be enough to remove the JPEG artifacts. So I think that first pre-training the network on JPEG-compressed images, so that it learns to remove JPEG artifacts, will work.
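As a rough sketch of the two stages, the objectives could be written as below. The notation is an assumption made for illustration: $G$ is the generator, $x_{\text{jpeg}}$ a JPEG-compressed low-resolution input, $y$ the clean high-resolution target, and $\lambda$, $\eta$ weighting coefficients. The stage 2 form follows the general shape of the generator loss in the ESRGAN paper rather than anything stated in this thread.

```latex
% Stage 1: pixel-wise pre-training on JPEG-compressed inputs (L1 shown; L2 is analogous)
\mathcal{L}_{1} = \mathbb{E}_{x_{\text{jpeg}},\, y}\,\bigl\lVert G(x_{\text{jpeg}}) - y \bigr\rVert_{1}

% Stage 2: GAN-based training, initialized from the stage 1 weights
\mathcal{L}_{G} = \mathcal{L}_{\text{percep}} + \lambda\, \mathcal{L}_{\text{adv}} + \eta\, \mathcal{L}_{1}
```

In stage 1 every pixel of the output is compared directly with the artifact-free target, so block artifacts are penalized explicitly. In stage 2 the perceptual (VGG) and adversarial terms dominate.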

@xinntao xinntao closed this as completed Oct 10, 2018
BlueAmulet pushed a commit to BlueAmulet/ESRGAN that referenced this issue Jul 10, 2020
@neilthefrobot

I was able to get around this by taking my HR training set, downsampling it 4x, then converting it to .jpg at a low quality (high compression) and using that as my LR set. This way the network sees JPG artifacts in its inputs and an artifact-free version as the target, and it learns to convert between them. It actually worked very well, too.
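A minimal sketch of that data preparation step, assuming Pillow is installed. The function name `make_lr_jpeg`, the bicubic resampling filter, and `quality=30` are illustrative assumptions; the comment above does not say which resampling filter or quality setting was used.

```python
import io

from PIL import Image


def make_lr_jpeg(hr, scale=4, quality=30):
    """Downsample an HR image by `scale`, then round-trip it through JPEG
    so the LR training input carries real compression artifacts."""
    width, height = hr.size
    lr = hr.convert("RGB").resize((width // scale, height // scale), Image.BICUBIC)

    # Encode to JPEG in memory at a low quality (assumed value), then decode.
    buffer = io.BytesIO()
    lr.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    degraded = Image.open(buffer)
    degraded.load()  # force decoding while the buffer is still available
    return degraded


# Hypothetical usage with a synthetic 64x48 image standing in for an HR sample:
hr_image = Image.new("RGB", (64, 48), (120, 30, 200))
lr_image = make_lr_jpeg(hr_image)
print(lr_image.size)  # (16, 12)
```

The untouched HR image stays as the training target, while the degraded image returned here is the network input.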
