Official PyTorch implementation for the paper:
Scale-Arbitrary Invertible Image Downscaling
IEEE Transactions on Image Processing (TIP) 2023
Jinbo Xing*, Wenbo Hu*, Menghan Xia, Tien-Tsin Wong (*joint first authors)
We present a scale-arbitrary invertible image downscaling network (AIDN) that natively downscales HR images with arbitrary scale factors. Meanwhile, the HR images can be restored with AIDN whenever necessary.
Use case of our AIDN. (a) shows the conventional pipeline for distributing HR images over social media platforms. (b) shows the distribution pipeline with our proposed AIDN. H and W denote the height and width of images; s1, ..., sn are scale factors; and N stands for the upper-limit resolution of the various social media platforms. AIDN lets users bypass the resolution upper limit of social media platforms by preventing auto-downscaling, so receivers can obtain HR images with more details.
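To make the pipeline in (b) concrete, below is a minimal sketch of picking the smallest scale factor that fits a platform's resolution cap N. The `AIDN` class and its `downscale`/`restore` methods are hypothetical stand-ins for the actual API in this repo:

```python
# Sketch of the distribution pipeline in (b); `AIDN`, `downscale`, and
# `restore` are hypothetical placeholders, not the repo's real interface.
import torch

def fit_scale(h: int, w: int, n: int) -> float:
    """Smallest downscaling factor s >= 1 such that max(h, w) / s <= n."""
    return max(1.0, max(h, w) / n)

hr = torch.rand(1, 3, 2048, 1536)                  # a (1, 3, H, W) HR image in [0, 1]
s = fit_scale(hr.shape[-2], hr.shape[-1], n=1080)  # here s = 2048 / 1080 ≈ 1.896

# model = AIDN(...)                      # hypothetical
# lr = model.downscale(hr, scale=s)      # LR fits under the cap, survives upload
# hr_restored = model.restore(lr, scale=s)
```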
- 2023.09.07 Fix a bug that potentially caused inconsistent quantitative results between the released code and the paper.
- 2023.07.30 Release an interactive inspection demo.
- 2023.07.17 Release code and model weights!
conda create -n AIDN python=3.6.2
conda activate AIDN
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0 -c pytorch
pip install -r requirements.txt
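A quick sanity check that the environment matches the pinned versions above (these are standard PyTorch calls):

```python
# Verify the installed versions and CUDA availability.
import torch, torchvision

print(torch.__version__)          # expect 1.1.0
print(torchvision.__version__)    # expect 0.3.0
print(torch.cuda.is_available())  # True if the CUDA 10.0 toolkit is usable
```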
The training and testing datasets can be downloaded here.
For training, download and unzip the DIV2K dataset, and put DIV2K_train_HR/ and DIV2K_valid_HR/ into Data/. Fill in the path in dataset/prepare_div2k.py and execute the script to split the images into patches; a minimal sketch of this step is shown below.
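This is a sketch of what a patch-splitting step like dataset/prepare_div2k.py might do; the patch size, stride, and naming scheme here are assumptions, not values taken from the repo:

```python
# Hypothetical patch splitter; the real script's parameters may differ.
from pathlib import Path
from PIL import Image

PATCH, STRIDE = 480, 240  # assumed patch size and stride

def split_into_patches(src_dir: str, dst_dir: str) -> None:
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for img_path in sorted(Path(src_dir).glob("*.png")):
        img = Image.open(img_path)
        w, h = img.size
        idx = 0
        for top in range(0, h - PATCH + 1, STRIDE):
            for left in range(0, w - PATCH + 1, STRIDE):
                idx += 1
                patch = img.crop((left, top, left + PATCH, top + PATCH))
                patch.save(Path(dst_dir) / f"{img_path.stem}_{idx:03d}.png")

# split_into_patches("Data/DIV2K/DIV2K_valid_HR", "Data/DIV2K/DIV2K_valid_HR_patch")
```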
The processed datasets should be as below:
Data/
└── DIV2K/
    ├── DIV2K_valid_HR/
    ├── DIV2K_train_HR_patch/
    ├── DIV2K_valid_HR_patch/
    │   ├── 0801_001.png
    │   ├── ...
    │   └── 0900_021.png
    └── list/
        ├── train.txt
        ├── val.txt
        └── test.txt
We crop the images in the testing datasets so that their height and width are divisible by 12 (a minimal sketch of this cropping follows the layout description below). The datasets should be organized as below:
Data/
├── Set5/
│   └── GTmod12/
│       ├── xxx.png
│       ├── ...
│       └── xxx.png
├── Set14/
├── urban100/
├── BSDS100/
├── DIV2K/
└── list/
    ├── Set5_val.txt
    ├── ...
    ├── BSDS100_val.txt
    └── DIV2K_val.txt
where *.txt are the data lists; each row is <dataset_name>/GTmod12/<img_filename>, e.g. in BSDS100_val.txt:
BSDS100/GTmod12/101085.png
...
BSDS100/GTmod12/97033.png
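A minimal sketch of the mod-12 cropping mentioned above, i.e. how a GTmod12/ folder could be produced (the paths and naming here are illustrative, not the repo's actual preprocessing script):

```python
# Crop each test image so its height and width are divisible by 12.
from pathlib import Path
from PIL import Image

def crop_mod(img: Image.Image, mod: int = 12) -> Image.Image:
    w, h = img.size
    w, h = w - w % mod, h - h % mod
    return img.crop((0, 0, w, h))  # top-left crop to the largest mod-12 size

def make_gtmod12(src_dir: str, dst_dir: str) -> None:
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for p in sorted(Path(src_dir).glob("*.png")):
        crop_mod(Image.open(p)).save(Path(dst_dir) / p.name)

# make_gtmod12("Data/Set5/original", "Data/Set5/GTmod12")  # hypothetical paths
```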
sh scripts/train.sh <exp_name> <config_path>
e.g.: sh scripts/train.sh AIDN_exp01 config/DIV2K/AIDN.yaml
Note that we first train the model with a fixed scale factor and then fine-tune it with arbitrary scale factors; a sketch of such a two-stage scale schedule is shown below.
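This sketch shows how per-batch scale factors could be drawn in the two training stages; the exact fixed factor, range, and granularity are assumptions, not values taken from the config files:

```python
# Hypothetical two-stage scale sampling; see config/DIV2K/AIDN.yaml for the
# actual training settings.
import random

def sample_scale(stage: str) -> float:
    if stage == "fixed":   # stage 1: a single scale factor (assumed x4 here)
        return 4.0
    # stage 2: arbitrary scale factors, e.g. uniform over [1, 4] in steps of 0.1
    return round(random.uniform(1.0, 4.0), 1)

print(sample_scale("fixed"), sample_scale("arbitrary"))
```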
After training, the log and model weights will be saved in LOG/DIV2K/<exp_name>.
sh scripts/benchmark.sh <exp_name> <config_path>
e.g.: sh scripts/benchmark.sh AIDN_exp01 config/DIV2K/AIDN.yaml
Please download the pre-trained weights of AIDN, place them in the LOG/DIV2K/pre-train/ folder, and then run the benchmarking script:
sh scripts/AIDN_benchmark.sh config/DIV2K/AIDN_benchmark.yaml
Note that we only provide the small Set5 dataset here for reproduction purposes (we do not hold the rights to redistribute the other datasets); you can modify config/DIV2K/AIDN_benchmark.yaml to benchmark on more downloaded datasets.
The stdout of running this script should be:
=> Dataset 'Set5' (x1.5)
==>res_lr:
PSNR: 42.42
SSIM: 0.9870
PSNR-Y: 48.56
SSIM-Y: 0.9962
==>res_sr:
PSNR: 45.26
SSIM: 0.9854
PSNR-Y: 50.61
SSIM-Y: 0.9961
=> Dataset 'Set5' (x2.5)
==>res_lr:
PSNR: 39.42
SSIM: 0.9851
PSNR-Y: 46.04
SSIM-Y: 0.9960
==>res_sr:
PSNR: 37.43
SSIM: 0.9550
PSNR-Y: 40.77
SSIM-Y: 0.9750
=> Dataset 'Set5' (x3.5)
==>res_lr:
PSNR: 37.89
SSIM: 0.9853
PSNR-Y: 44.21
SSIM-Y: 0.9960
==>res_sr:
PSNR: 34.35
SSIM: 0.9267
PSNR-Y: 37.25
SSIM-Y: 0.9538
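For reference, here is a sketch of how PSNR and its Y-channel variant are commonly computed in the super-resolution literature; the repo's exact evaluation code may differ in details such as border cropping or SSIM settings. Reading the log above, res_lr and res_sr appear to report metrics for the downscaled LR result and the restored HR result, respectively (an inference, not stated in the source):

```python
# PSNR on RGB and on the BT.601 luma (Y) channel, a common SR convention.
import numpy as np

def psnr(a: np.ndarray, b: np.ndarray, peak: float = 255.0) -> float:
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

def rgb_to_y(img: np.ndarray) -> np.ndarray:
    """BT.601 luma from an (H, W, 3) uint8 RGB image, in [16, 235]."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return (65.481 * r + 128.553 * g + 24.966 * b) / 255.0 + 16.0

# psnr(pred, gt) gives PSNR; psnr(rgb_to_y(pred), rgb_to_y(gt)) gives PSNR-Y.
```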
The JPEG-robust version of AIDN (i.e., AIDN+) and the weights of AIDN pre-trained with a fixed scale factor are also provided.
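A minimal sketch of checking the JPEG robustness that AIDN+ targets: pass the downscaled image through real JPEG compression before restoration. As before, `model.downscale`/`model.restore` are hypothetical stand-ins for the real API:

```python
# Simulate a platform recompressing the uploaded LR image as JPEG.
import io
import numpy as np
from PIL import Image

def jpeg_roundtrip(lr: np.ndarray, quality: int = 90) -> np.ndarray:
    """Encode an (H, W, 3) uint8 image as JPEG and decode it back."""
    buf = io.BytesIO()
    Image.fromarray(lr).save(buf, format="JPEG", quality=quality)
    return np.asarray(Image.open(io.BytesIO(buf.getvalue())))

# lr = model.downscale(hr, scale=s)       # hypothetical
# lr_jpeg = jpeg_roundtrip(lr_as_uint8)   # degraded LR after recompression
# hr_restored = model.restore(lr_jpeg, scale=s)
```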
If you find the code useful for your work, please star this repo and consider citing:
@article{xing2023scale,
  title={Scale-arbitrary invertible image downscaling},
  author={Xing, Jinbo and Hu, Wenbo and Xia, Menghan and Wong, Tien-Tsin},
  journal={IEEE Transactions on Image Processing},
  year={2023},
  publisher={IEEE}
}
The code is partially borrowed from EDSR, ArbSR and DiffJPEG. We thank the authors for sharing their code.