chaofengc/ITER

PyTorch codes for "Iterative Token Evaluation and Refinement for Real-World Super-Resolution", AAAI 2024
[framework figure]

Pipeline of ITER. The input I_l first passes through a distortion removal network E_l to obtain the initially restored tokens S_l, which are composed of indexes of the quantized features in the codebook of VQGAN. Then, a reverse discrete diffusion process, conditioned on S_l, is used to generate textures. The process starts from completely masked tokens S_T. The refinement network (also called the de-masking network) φ_r generates refined outputs S_{T-1} with S_l as a condition. Then, φ_e evaluates S_{T-1} to obtain the evaluation mask m_{T-1}, which determines the tokens to keep and to refine for step T-1 through a masked sampling process. Repeating this process T times yields the de-masked outputs S_0, from which the restored image I_SR is reconstructed with the VQGAN decoder D_H. We found that T ≤ 8 is enough to get good results with ITER, which is much more efficient than other diffusion-based approaches.
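The iterative de-masking process described above can be sketched as a toy loop. All names below are stand-ins: the real φ_r and φ_e are learned networks, the tokens index a VQGAN codebook, and the decoder D_H maps S_0 back to an image.

```python
# Toy sketch of ITER's iterative de-masking loop. demask_step and
# evaluation_mask are stand-ins for the learned networks phi_r / phi_e.
import random

MASK = -1  # sentinel id for a masked token position

def demask_step(tokens, cond):
    """Stand-in for phi_r: predict every masked slot, conditioned on S_l."""
    return [cond[i] if t == MASK else t for i, t in enumerate(tokens)]

def evaluation_mask(tokens, step, total):
    """Stand-in for phi_e: re-mask fewer tokens as the process advances."""
    n_remask = len(tokens) * (step - 1) // total
    flags = [True] * n_remask + [False] * (len(tokens) - n_remask)
    random.shuffle(flags)          # masked sampling: random subset to refine
    return flags                   # True -> refine again, False -> keep

def iterative_refine(s_l, T=8):
    tokens = [MASK] * len(s_l)     # S_T: start from fully masked tokens
    for step in range(T, 0, -1):
        tokens = demask_step(tokens, s_l)          # phi_r(. | S_l)
        remask = evaluation_mask(tokens, step, T)  # m_{step-1}
        tokens = [MASK if m else t for t, m in zip(tokens, remask)]
    return tokens                  # S_0: no MASK left after the last step
```

Because the number of re-masked tokens shrinks to zero at the final step, the output is always fully de-masked after T iterations.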

🔧 Dependencies and Installation

# git clone this repository
git clone https://github.com/chaofengc/ITER.git
cd ITER 

# create new anaconda env
conda create -n iter python=3.8
conda activate iter 

# install python dependencies
pip3 install -r requirements.txt
python setup.py develop
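As a quick post-install sanity check (a hypothetical helper, not part of the repo), you can verify that the key dependencies resolved in the `iter` environment:

```python
# Hypothetical post-install check (not part of the ITER repo): report which
# of the expected packages failed to install into the `iter` environment.
import importlib.util

def missing_packages(pkgs=("torch", "torchvision", "basicsr")):
    return [p for p in pkgs if importlib.util.find_spec(p) is None]

print(missing_packages())  # an empty list means the imports should work
```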

⚡ Quick Inference

python inference_iter.py -s 2 -i ./testset/lrx4/frog.jpg
python inference_iter.py -s 4 -i ./testset/lrx4/frog.jpg

Example results


Left: real images | Right: super-resolved images with scale factor 4

👨‍💻 Train the Model

⏬ Download Datasets

The training datasets can be downloaded from 🤗 Hugging Face. You may also refer to FeMaSR to prepare your own training data.

🔁 Training

Below are brief examples for training the model. Please modify the corresponding configuration files to suit your needs. Note that the code has been rewritten and the models retrained from scratch, so the results may differ slightly from the paper.

Stage I: Train the Swin-VQGAN

accelerate launch --multi_gpu --num_processes=8 --mixed_precision=bf16 basicsr/train.py -opt options/train_ITER_HQ_stage.yml

Stage II & III: Train the LQ encoder and the refinement network

accelerate launch --main_process_port=29600 --multi_gpu --num_processes=8 --mixed_precision=bf16 basicsr/train.py -opt options/train_ITER_LQ_stage_X2.yml

accelerate launch --main_process_port=29600 --multi_gpu --num_processes=8 --mixed_precision=bf16 basicsr/train.py -opt options/train_ITER_LQ_stage_X4.yml
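One practical detail when editing the YAML options: with `--num_processes=8`, the effective global batch size is the per-GPU batch size in the config multiplied by the process count (and by any gradient-accumulation steps, if set). A one-liner makes the arithmetic explicit (illustrative only):

```python
# Illustrative only: effective global batch size under accelerate launch
# with --num_processes workers and optional gradient accumulation.
def global_batch_size(per_gpu_batch, num_processes=8, grad_accum=1):
    return per_gpu_batch * num_processes * grad_accum

print(global_batch_size(4))  # 4 per GPU x 8 processes -> 32
```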

📝 Citation

If you find this code useful for your research, please cite our paper:

@inproceedings{chen2024iter,
  title={Iterative Token Evaluation and Refinement for Real-World Super-Resolution},
  author={Chaofeng Chen and Shangchen Zhou and Liang Liao and Haoning Wu and Wenxiu Sun and Qiong Yan and Weisi Lin},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2024},
}

⚖️ License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License and NTU S-Lab License 1.0.

❤️ Acknowledgement

This project is based on BasicSR.
