[ICCV 2023] On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement

Xin Luo, Yunan Zhu, Shunxin Xu, Dong Liu

[Paper] [Video] [BibTeX] ⚡ 🚀 🔥

📌 Overview

Several recent studies advocate the use of spectral discriminators, which evaluate the Fourier spectra of images for generative modeling. However, the effectiveness of spectral discriminators is not yet well understood. We tackle this issue by examining spectral discriminators in the context of perceptual image super-resolution (i.e., GAN-based SR), as SR image quality is susceptible to spectral changes. Our analyses reveal that the spectral discriminator indeed performs better than the ordinary (a.k.a. spatial) discriminator in identifying differences in the high-frequency range; however, the spatial discriminator holds an advantage in the low-frequency range. Thus, we suggest that the spectral and spatial discriminators should be used together. Moreover, we improve the spectral discriminator by first calculating the patchwise Fourier spectrum and then aggregating the spectra with a Transformer. We verify the effectiveness of the proposed method in two ways. On the one hand, thanks to the additional spectral discriminator, our SR images have spectra better aligned with those of real images, which leads to a better perception-distortion (PD) tradeoff. On the other hand, our ensembled discriminator predicts perceptual quality more accurately, as evidenced in the no-reference image quality assessment task.
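To make the "patchwise Fourier spectrum" step concrete, here is a minimal NumPy sketch. It is not the code used in this repository: the function name `patchwise_log_spectrum`, the grayscale input, the non-overlapping patches, and the log scaling are all assumptions made for illustration, and the Transformer that aggregates the resulting spectra is not shown.

```python
import numpy as np

def patchwise_log_spectrum(image, patch_size=64):
    """Split a grayscale image (H, W) into non-overlapping patches and return
    the centered log-magnitude spectrum of each patch, shape (N, P, P).

    Illustrative helper (hypothetical name), not part of the repository.
    """
    h, w = image.shape
    # Crop so that both sides are multiples of the patch size.
    h, w = h - h % patch_size, w - w % patch_size
    img = image[:h, :w]
    # (rows, P, cols, P) -> (rows, cols, P, P) -> (N, P, P)
    patches = (
        img.reshape(h // patch_size, patch_size, w // patch_size, patch_size)
        .transpose(0, 2, 1, 3)
        .reshape(-1, patch_size, patch_size)
    )
    # 2D FFT over the last two axes, with the zero frequency moved to the center.
    spectra = np.fft.fftshift(np.fft.fft2(patches), axes=(-2, -1))
    return np.log1p(np.abs(spectra))

rng = np.random.default_rng(0)
tokens = patchwise_log_spectrum(rng.random((130, 200)), patch_size=64)
print(tokens.shape)  # (6, 64, 64): 2 x 3 patches after cropping to 128 x 192
```

Each of the `N` spectra could then be treated as one token of a sequence model; how the actual discriminator embeds and aggregates them is defined in the paper and code, not here.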

⭐ News

Sept. 28, 2023: Training code is released!
July 19, 2023: We release our test code and models; training and analysis code will be released at the end of September.

🌻 Main Results

| Method | PSNR | SSIM | LPIPS |
| --- | --- | --- | --- |
| ESRGAN | 28.0465 | 0.7669 | 0.1597 |
| SPSR | 28.3978 | 0.7821 | 0.1069 |
| ESRGAN+LDL | 28.2440 | 0.7758 | 0.1133 |
| ESRGAN+DualFormer (Ours) | 29.3049 | 0.8023 | 0.1030 |

Installation

This implementation is based on BasicSR; please refer to it for more information on usage.

# Create a virtual environment [recommended but optional]
conda create -n dual_former python=3.9
conda activate dual_former

# Install dependencies (run from the DualFormer/ root)
pip install --user -e .

🚀 Usage

Download our pretrained models (for both SR and IQA) and place the contents in experiments/pretrained_models/. You will need to create this directory first, e.g., with mkdir -p experiments/pretrained_models from the project root.
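If you prefer to do this check from Python, the following small sketch creates the directory and lists the checkpoints found in it. The helper name `list_checkpoints` is hypothetical, and the `.pth` extension is an assumption based on the Real-ESRGAN checkpoint names used later in this README.

```python
from pathlib import Path

def list_checkpoints(root="experiments/pretrained_models"):
    """Create the checkpoint directory if needed and return the sorted
    names of the .pth files found anywhere below it."""
    folder = Path(root)
    # Same effect as `mkdir -p`: create parents, ignore an existing directory.
    folder.mkdir(parents=True, exist_ok=True)
    return sorted(p.name for p in folder.rglob("*.pth"))
```

Calling `list_checkpoints()` from the project root returns an empty list until the downloaded models have been copied into place.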

x4 Super Resolution (Bicubic degradation)

Download the DIV2K, BSD100, and Urban100 test datasets and place them in datasets/.

Evaluate models.

python basicsr/test.py -opt options/test/test_esrgan_x4_dual_former.yml

x4 Super Resolution (Hard gated degradation model)

Download the test dataset from here and place it in datasets/.
[Optional] You may also generate the test dataset yourself using the provided method (note that the resulting dataset may differ slightly from what was used in the paper due to randomness in the degradation synthesis process):
```
python scripts/generate_hgd_dataset.py \
--input datasets/DIV2K/DIV2K_valid_HR \
--hr_folder datasets/DIV2K/HGD/HR/X4 \
--lr_folder datasets/DIV2K/HGD/LR/X4 \
--scale 4
```

Evaluate models.

# ESRGAN version
python basicsr/test.py -opt options/test/test_esrgan_x4_hgd_dual_former.yml
# BebyGAN version
python basicsr/test.py -opt options/test/test_bebygan_x4_hgd_dual_former.yml

Opinion Unaware No-Reference IQA

Download the IQA datasets (KonIQ-10k, LIVE-itW, and PIPAL) and place them in datasets/.

Start testing.

bash scripts/test/test_iqa_vgg_specformer.sh
bash scripts/test/test_iqa_dual_former.sh

We also provide our analysis code to facilitate further research.

Calculate magnitude RMSE in frequency range for DIV2K validation set

Download the DIV2K dataset and place it under datasets/ (or, on Linux, create a symbolic link with ln -s).
Download the officially pretrained Real-ESRNet/Real-ESRGAN models and place them in experiments/pretrained_models.
Execute the code below to reproduce Tab. 1 in our paper. The three ranges are $[0, \frac{3}{10})$, $[\frac{3}{10}, \frac{8}{10})$, and $[\frac{8}{10}, 1]$, corresponding roughly to the divisions in Fig. 1a of the paper.

# For Real-ESRNet
python scripts/estimate_difference_in_frequency_range.py --model_path experiments/pretrained_models/RealESRNet_x4plus.pth

# For Real-ESRGAN
# Note: the resulting numbers may differ slightly from those in Tab. 1 of our paper, as we used our own reproduced model in the paper.
python scripts/estimate_difference_in_frequency_range.py --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth

# Test on another dataset
# Please refer to options/DIV2K_valid.yml to see how to write a proper dataset configuration.
python scripts/estimate_difference_in_frequency_range.py --dataset_opt other_dataset.yml
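The idea of measuring magnitude RMSE within the three frequency ranges quoted above can be sketched in a few lines of NumPy. This is only an illustration, not a reimplementation of scripts/estimate_difference_in_frequency_range.py: the function names are hypothetical, the inputs are assumed to be grayscale arrays of equal size, and the radial frequency is assumed to be normalized by its largest value in the spectrum, which may differ from the normalization used in the paper.

```python
import numpy as np

BANDS = ((0.0, 0.3), (0.3, 0.8), (0.8, 1.0))  # the three ranges quoted above

def normalized_radius(h, w):
    """Radial frequency of every FFT bin, scaled so the largest value is 1."""
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    r = np.sqrt(fy ** 2 + fx ** 2)
    return r / r.max()

def band_magnitude_rmse(hr, sr, bands=BANDS):
    """RMSE between the FFT magnitudes of two grayscale images, per band."""
    if hr.shape != sr.shape:
        raise ValueError("hr and sr must have the same shape")
    mag_hr = np.abs(np.fft.fft2(hr))
    mag_sr = np.abs(np.fft.fft2(sr))
    r = normalized_radius(*hr.shape)
    result = []
    for i, (lo, hi) in enumerate(bands):
        # Half-open bands, except the last one, which includes its upper edge.
        upper = (r <= hi) if i == len(bands) - 1 else (r < hi)
        mask = (r >= lo) & upper
        diff = mag_hr[mask] - mag_sr[mask]
        result.append(float(np.sqrt(np.mean(diff ** 2))))
    return result

rng = np.random.default_rng(0)
hr = rng.random((32, 32))
yy, xx = np.mgrid[:32, :32]
sr = hr + 0.1 * (-1.0) ** (yy + xx)  # add a checkerboard at the highest frequency
print(band_magnitude_rmse(hr, sr))   # only the last band is clearly non-zero
```

In this toy example the perturbation lives entirely in the highest-frequency bin, so the first two bands report an error that is zero up to floating-point rounding.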

Plot the spectral profile of a model on a dataset

[Optional] Generate datasets (required for the following example)

# Generate LR images for the DIV2K validation set using the second-order degradation model (note that the resulting dataset will not exactly match the one we used, since the random seed was not fixed beforehand).
bash scripts/generate_realesrgan_dataset.sh # modify the file to change the path

Estimate statistics

# Estimate statistics of HR images
python scripts/estimate_spectral_statistics.py --input datasets/DIV2K/RealESRGAN/HR/X4 --experiment_name spectral_analysis_G_DIV2K_train_HR_patch_size_256 --mode 1 --patch_size 64

# Estimate statistics of Real-ESRNet's outputs
python scripts/estimate_spectral_statistics.py --input datasets/DIV2K/RealESRGAN/LR/X4 --experiment_name spectral_analysis_G_realesrnet_DIV2K_x4_patch_size_256 --mode 0 --model_path experiments/pretrained_models/RealESRNet_x4plus.pth --patch_size 64

# Estimate statistics of Real-ESRGAN's outputs
python scripts/estimate_spectral_statistics.py --input datasets/DIV2K/RealESRGAN/LR/X4 --experiment_name spectral_analysis_G_realesrgan_DIV2K_x4_patch_size_256 --mode 0 --model_path experiments/pretrained_models/RealESRGAN_x4plus.pth --patch_size 64

Plot the spectral profile

# Modify the script to suit your needs
python scripts/plot_spectral_profile.py
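A spectral profile is commonly computed as the azimuthal average of the power spectrum, i.e., the mean power over all frequency bins at a similar distance from the zero frequency. The sketch below assumes that definition, a grayscale input, and a hypothetical function name `spectral_profile`; the statistics stored by scripts/estimate_spectral_statistics.py and plotted by scripts/plot_spectral_profile.py may be defined differently.

```python
import numpy as np

def spectral_profile(image, num_bins=32):
    """Azimuthally averaged power spectrum of a grayscale image.

    Returns (bin_centers, mean_power), with radial frequency normalized to [0, 1].
    """
    h, w = image.shape
    power = np.abs(np.fft.fft2(image)) ** 2
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    r = np.sqrt(fy ** 2 + fx ** 2)
    r = r / r.max()
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    # Bin index of every frequency; r == 1 is clipped into the last bin.
    idx = np.clip(np.digitize(r.ravel(), edges) - 1, 0, num_bins - 1)
    total = np.bincount(idx, weights=power.ravel(), minlength=num_bins)
    count = np.bincount(idx, minlength=num_bins)
    profile = total / np.maximum(count, 1)  # empty bins stay at zero
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, profile

centers, profile = spectral_profile(np.random.default_rng(0).random((64, 64)))
print(centers.shape, profile.shape)  # (32,) (32,)
```

Averaging such profiles over a dataset, once for HR images and once for model outputs, gives two curves that can be compared on a log scale.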

Evaluate the robustness of a discriminator under frequency masking and noise

⛵ Train

x4 Super Resolution (Bicubic degradation)

Prepare the DF2K dataset following the guideline (ignore the OST part), and organize the data according to the datasets item in options/train/train_esrgan_x4_dual_former.yml.
Download the pretrained ESRNet and place it in experiments/pretrained_models/.
Start training.

python basicsr/train.py --auto_resume -opt options/train/train_esrgan_x4_dual_former.yml

Test results.

# Modify pretrain_network_g to your model path
python basicsr/test.py -opt options/test/test_esrgan_x4_dual_former.yml

x4 Super Resolution (Hard gated degradation model)

Prepare the DF2K+OST dataset following the guideline, and organize the data according to the datasets item in options/train/train_esrgan_x4_hgd_dual_former.yml.
[Optional] Train a PSNR-oriented model. By default, the GAN training in the next step uses our pretrained PSNR-oriented model; modify the option file to use your own.

CUDA_VISIBLE_DEVICES=0,1,2,3 \
scripts/dist_train_autoresume.sh 4 options/train/train_esrnet_x4_hgd.yml # requires 4 GPUs

Start your training.

CUDA_VISIBLE_DEVICES=0,1,2,3 \
scripts/dist_train_autoresume.sh 4 options/train/train_esrgan_x4_hgd_dual_former.yml # requires 4 GPUs

Test results.

# Modify pretrain_network_g to your model path
python basicsr/test.py -opt options/test/test_esrgan_x4_hgd_dual_former.yml

Opinion Unaware No-Reference IQA

Prepare the DF2K+OST dataset following the guideline, and organize the data according to the datasets item in options/train/train_esrgan_x4_sgd_dual_former.yml.
[Optional] Train a PSNR-oriented model. By default, the GAN training in the next step uses our pretrained PSNR-oriented model; modify the option file to use your own.

CUDA_VISIBLE_DEVICES=0,1 \
scripts/dist_train_autoresume.sh 2 options/train/train_esrnet_x4_sgd.yml # requires 2 GPUs

Start your training.

CUDA_VISIBLE_DEVICES=0,1 \
scripts/dist_train_autoresume.sh 2 options/train/train_esrgan_x4_sgd_vgg_specformer.yml # requires 2 GPUs

CUDA_VISIBLE_DEVICES=0,1 \
scripts/dist_train_autoresume.sh 2 options/train/train_esrgan_x4_sgd_dual_former.yml # requires 2 GPUs

Test results.

# Modify these two scripts as needed (experiment name, model path, etc.)
bash scripts/test/test_iqa_vgg_specformer.sh
bash scripts/test/test_iqa_dual_former.sh

❤️ Citing Us

If you find this repository or our work useful, please consider giving a star ⭐ and citation 🦖, which would be greatly appreciated:

@inproceedings{luo2023effectiveness,
	title={On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement},
	author={Luo, Xin and Zhu, Yunan and Xu, Shunxin and Liu, Dong},
	booktitle={ICCV},
	year={2023}
}

📧 Contact

If you have any questions, please open an issue (the recommended way) or contact us via

xinluo@mail.ustc.edu.cn

License

This work is licensed under the MIT license. See LICENSE for details.

Acknowledgement

Our repository builds upon the excellent framework provided by BasicSR.
