BiMaCoSR: Binary One-Step Diffusion Model Leveraging Flexible Matrix Compression for Real Super-Resolution
Kai Liu, Kaicheng Yang, Zheng Chen, Zhiteng Li, Yong Guo, Wenbo Li, Linghe Kong, and Yulun Zhang.
"BiMaCoSR: Binary One-Step Diffusion Model Leveraging Flexible Matrix Compression for Real Super-Resolution", arXiv, 2025
[arXiv] [supplementary material] [visual results] [pretrained models]
- 2025-07-10: Basic pipeline & checkpoint released.
- 2025-02-04: This arXiv version and supplementary material are released.
- 2025-02-01: This repo is released.
Abstract: While super-resolution (SR) methods based on diffusion models (DM) have demonstrated inspiring performance, their deployment is impeded by the heavy demands on memory and computation. Recent researchers apply two kinds of methods to compress or accelerate the DM. One compresses the DM to 1-bit, aka binarization, alleviating the storage and computation pressure. The other distills the multi-step DM into a single step, significantly speeding up the inference process. Nonetheless, it remains impossible to deploy DMs on resource-limited edge devices. To address this problem, we propose BiMaCoSR, which combines binarization and one-step distillation to obtain extreme compression and acceleration. To prevent the catastrophic collapse of the model caused by binarization, we propose a sparse matrix branch (SMB) and a low-rank matrix branch (LRMB). Both auxiliary branches pass full-precision (FP) information, but in different ways. SMB absorbs the extreme values and its output is high-rank, carrying abundant FP information. In contrast, LRMB is inspired by LoRA and is initialized with the top-r SVD components, outputting a low-rank representation. The computation and storage overhead of our proposed branches can be safely ignored. Comprehensive comparison experiments show that BiMaCoSR outperforms current state-of-the-art binarization methods and achieves competitive performance compared with the FP one-step model. BiMaCoSR achieves a 23.8x compression ratio and a 27.4x speedup ratio compared to its FP counterpart.
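As a rough illustration of the three-branch design described in the abstract (this is a sketch, not the authors' implementation; the function names, sparsity ratio, and rank are hypothetical), the weight of a layer can be split into a 1-bit branch, a sparse branch holding the extreme values (SMB), and a LoRA-style low-rank branch initialized from the top-r SVD components (LRMB):

```python
import numpy as np

def decompose_weight(w, rank=4, sparse_ratio=0.01):
    """Split an FP weight matrix into binary + sparse + low-rank parts.

    Illustrative only: the real BiMaCoSR branches are trained; this just
    shows how the three components can be initialized from the FP weight.
    """
    # 1-bit branch: sign of the weight with a per-layer FP scale.
    scale = np.abs(w).mean()
    w_bin = scale * np.sign(w)
    # SMB: keep only the top `sparse_ratio` fraction of extreme values.
    k = max(1, int(sparse_ratio * w.size))
    thresh = np.sort(np.abs(w).ravel())[-k]
    w_sparse = np.where(np.abs(w) >= thresh, w, 0.0)
    # LRMB: top-r SVD factors of the FP weight (LoRA-style A @ B).
    U, S, Vh = np.linalg.svd(w, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # (out_features, rank)
    B = Vh[:rank, :]             # (rank, in_features)
    return w_bin, w_sparse, A, B

def forward(x, w_bin, w_sparse, A, B):
    """Sum of the three branches; approximates x @ w.T of the FP layer."""
    return x @ w_bin.T + x @ w_sparse.T + (x @ B.T) @ A.T
```

The key property is that the sparse branch is high-rank (it carries the FP outliers), while the low-rank branch is cheap both to store and to apply, so the overhead on top of the binary weights stays negligible.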
| Image | HR | SinSR (FP) | XNOR | ReSTE | BiMaCoSR (ours) |
|---|---|---|---|---|---|
| ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
- Release datasets.
- Release training and testing code.
- Release pre-trained BiMaCoSR.
- Provide WebUI.
- Provide HuggingFace🤗 demo.
To set up the environment for this project, follow these steps:

1. Clone the repository:

   ```shell
   git clone https://github.com/Kai-Liu001/BiMaCoSR.git
   cd BiMaCoSR
   ```

2. Create a conda environment and install the dependencies:

   ```shell
   conda create -n BiMaCoSR python=3.10
   conda activate BiMaCoSR
   pip install -r requirements.txt
   ```

By following these steps, you should be able to set up the environment and run the code successfully.
To test the model, follow these steps:

1. Download the VQ-GAN and pretrained weights:
   - Download the VQ-GAN weights and put them into `BiMaCoSR/weights`.
   - Download the pretrained model weights and config, and put this pair into `BiMaCoSR/weights` and `BiMaCoSR/configs`; they are referenced by absolute path later.

2. Run the testing script:

   ```shell
   CUDA_VISIBLE_DEVICES=0 python inference_quant.py --config your_config_path --ckpt your_checkpoint_path --in_path LR_dir --out_path result_dir
   ```
To train the model, follow these steps:

1. Download the ResShift (teacher model) and SinSR (initial) weights:
   - Download the ResShift weights and put them into `BiMaCoSR/weights`.
   - Download the SinSR weights and put them into `BiMaCoSR/weights`.

2. Download the datasets:
   - Download the cropped ImageNet for training and put it into `BiMaCoSR/data/train`.
   - Download any validation dataset you like and put it into `BiMaCoSR/data/test`.

   The expected data layout:

   ```
   data/
   ├── train/train/
   │   ├── image1.png
   │   ├── image2.png
   │   └── ...
   ├── val/
   │   ├── LR/
   │   │   ├── image1.png
   │   │   ├── image2.png
   │   │   └── ...
   │   └── HR/
   │       ├── image1.png
   │       ├── image2.png
   │       └── ...
   ```
3. Run the training script:

   ```shell
   CUDA_VISIBLE_DEVICES=0 torchrun --standalone --nproc_per_node=1 --nnodes=1 main_distill.py --cfg_path your_config_path --save_dir logs/your_experiment_name
   ```
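If it helps, the empty skeleton of the data layout above can be created up front (a sketch; the directory names follow the tree shown, and your actual image files must still be copied in):

```shell
# Create the directory skeleton expected for training and validation.
mkdir -p data/train/train
mkdir -p data/val/LR data/val/HR
```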
We achieve state-of-the-art performance. Detailed results can be found in the paper. All visual results of BiMaCoSR will be provided soon.
- results in Table 1 of the main paper
- visual comparison (x4) in the main paper
- visual comparison (x4) in the supplementary material
To evaluate the performance of BiMaCoSR, follow these steps:

1. Run the evaluation script:

   ```shell
   CUDA_VISIBLE_DEVICES=0 python metric.py --inp_imgs path/to/sr_dir --gt_imgs path/to/gt_dir --log path/to/log_dir
   ```

2. Pre-computed results: for reference, we provide our pre-computed results.

The evaluation will generate detailed metrics comparing BiMaCoSR with other state-of-the-art methods, as shown in our paper.
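The internals of `metric.py` are not shown here, but as a rough illustration of one standard SR metric (the function below is a hypothetical sketch, not the repository's implementation), PSNR between an SR output and its ground truth can be computed as:

```python
import numpy as np

def psnr(sr: np.ndarray, gt: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two same-shaped images.

    Illustrative only: assumes pixel values in [0, max_val].
    """
    mse = np.mean((sr.astype(np.float64) - gt.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Published SR results typically also report SSIM and perceptual metrics such as LPIPS, which require dedicated implementations.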
This code is built on SinSR.