Naifu Xue, Zhaoyang Jia, Jiahao Li, Bin Li, Yuan Zhang, Yan Lu
While recent diffusion-based generative image codecs have shown impressive performance, their iterative sampling process introduces unpleasant latency. In this work, we revisit the design of a diffusion-based codec and argue that multi-step sampling is not necessary for generative compression. Based on this insight, we propose OneDC, a One-step Diffusion-based generative image Codec that integrates a latent compression module with a one-step diffusion generator. Recognizing the critical role of semantic guidance in one-step diffusion, we propose using the hyperprior as a semantic signal, overcoming the limitations of text prompts in representing complex visual content. To further enhance the semantic capability of the hyperprior, we introduce a semantic distillation mechanism that transfers knowledge from a pretrained generative tokenizer to the hyperprior codec. Additionally, we adopt a hybrid pixel- and latent-domain optimization to jointly enhance both reconstruction fidelity and perceptual realism. Extensive experiments demonstrate that OneDC achieves SOTA perceptual quality even with one-step generation, offering over 39% bitrate reduction and 20× faster decoding compared to prior multi-step diffusion-based codecs. Project: https://onedc-codec.github.io/
⬆️ (Top) Multi-step sampling is not essential for image compression; one step is enough. (Bottom) Visual examples.
⬆️ OneDC can compress images to text-level size (a 768×768 image in 0.24 KB), yet the reconstruction still retains strong semantic consistency and the original spatial details.
(a) Text prompts (from GPT-4o) struggle to capture complex visual semantics, and existing text-to-image models have limited generation fidelity. (b) Hyperprior guidance yields more faithful reconstructions. (c) Semantic distillation further improves object-level accuracy.
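At a high level, decoding runs the generator exactly once: the decoded latent goes through the one-step diffusion generator, conditioned on a semantic signal derived from the hyperprior. The sketch below is a minimal conceptual illustration with hypothetical module names (`OneStepDecoder`, `semantic`, `generator`), not the actual OneDC architecture or API:

```python
# Minimal conceptual sketch of one-step decoding (hypothetical names,
# heavily simplified; the real generator is a distilled diffusion UNet).
import torch
import torch.nn as nn
import torch.nn.functional as F

class OneStepDecoder(nn.Module):
    def __init__(self, latent_ch=4, hyper_ch=8, cond_dim=64):
        super().__init__()
        # Hyperprior branch: the decoded z_hat doubles as semantic guidance.
        self.semantic = nn.Conv2d(hyper_ch, cond_dim, 3, padding=1)
        # One-step generator: a single forward pass replaces iterative sampling.
        self.generator = nn.Conv2d(latent_ch + cond_dim, 3, 3, padding=1)

    def forward(self, y_hat, z_hat):
        cond = F.interpolate(self.semantic(z_hat), size=y_hat.shape[-2:])
        return self.generator(torch.cat([y_hat, cond], dim=1))

# y_hat: decoded image latent; z_hat: decoded hyperprior (coarser resolution).
x_hat = OneStepDecoder()(torch.randn(1, 4, 96, 96), torch.randn(1, 8, 24, 24))
print(x_hat.shape)  # torch.Size([1, 3, 96, 96])
```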
1. Create environment & install dependencies
conda create -n onedc python=3.10
conda activate onedc

Then, install PyTorch 2.5.0 manually according to your CUDA version. Finally, install the required packages:
pip install -r requirements.txt

Notes:
- You may need to downgrade pip to v24.0.
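To verify the environment, a quick sanity check (not part of the repo):

```python
# Confirm the expected PyTorch version and that CUDA is visible.
import torch
print(torch.__version__)          # expect 2.5.0 (built for your CUDA version)
print(torch.cuda.is_available())  # should be True on a GPU machine
```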
2. Build the entropy coder
sudo apt-get install cmake g++
cd src
mkdir build
cd build
conda activate onedc
cmake ../cpp -DCMAKE_BUILD_TYPE=Release  # or Debug
make -j

Running the coding process:
- Download our model weights (the lambda 12.2~0.6 models)
- Run inference script:
cd src
python inference.py \
--config_path config_inference.yaml \
--checkpoint_path [checkpoint path] \
--eval_image_path [your image folder] \
--output_path [output folder]

- For the 0.0034 bpp model (exlow_bpp0034), use the inference script in src/models/sd15_onedc_codec_z_only:
cd src
python models/sd15_onedc_codec_z_only/inference.py \
--config_path config_inference.yaml \
--checkpoint_path [checkpoint path] \
--eval_image_path [your image folder] \
--output_path [output folder]

Evaluating quality:
cd src
python test_quality.py \
--ref [your image folder] \
--recon [recon image folder] \
--fid_patch_size 256 \
--fid_patch_num 2 \
--output_path [result output folder] \
--output_name [result output name]
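test_quality.py covers the perceptual metrics; if you also want to report bitrate, bits-per-pixel is just the bitstream size over the pixel count. A small helper (hypothetical, not included in the repo):

```python
# Hypothetical helper (not part of test_quality.py): bits-per-pixel from a
# compressed bitstream and the corresponding source image resolution.
import os
from PIL import Image

def bits_per_pixel(bitstream_path: str, image_path: str) -> float:
    num_bits = os.path.getsize(bitstream_path) * 8
    w, h = Image.open(image_path).size
    return num_bits / (w * h)

# Example: a 768x768 image compressed to 0.24 KB is roughly
# 0.24 * 1024 * 8 / (768 * 768) ≈ 0.0033 bpp, i.e., close to the
# extreme-low-rate operating point of the exlow_bpp0034 model.
```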
Training:

- Prepare pretrained models: the pretrained one-step generator and the VQGAN Tokenizer.
- Prepare the dataset:
  - Download commoncatalog-cc-by-nd with the provided script:
cd src
python get_cc_dataset.py -d [your download folder]

Notes:
- You may need to use "export HF_DATASETS_CACHE=" to change the HF cache dir, depending on your disk space.
The folder [your download folder]/common-canvas___commoncatalog-cc-by-nd/default/0.0.0/3991ff88ebf48e0435ec8d044d2f4b159f4f716e contains the downloaded data files, e.g., "commoncatalog-cc-by-nd-train-00000-of-10447.arrow" and "dataset_info.json".
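To quickly check that a downloaded shard is readable, a sketch using the Hugging Face datasets API (substitute your own download folder for the placeholder):

```python
# Open one downloaded Arrow shard directly and inspect it.
from datasets import Dataset

shard = Dataset.from_file(
    "[your download folder]/common-canvas___commoncatalog-cc-by-nd/default/0.0.0/"
    "3991ff88ebf48e0435ec8d044d2f4b159f4f716e/"
    "commoncatalog-cc-by-nd-train-00000-of-10447.arrow"
)
print(len(shard), shard.column_names)
```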
  - Prepare your validation dataset. We use the COCO 2017 dataset; you need to prepare annotations/captions_val2017.json and the val2017 image set.
- Update the training config with the pretrained model and dataset paths.
- Stage I training
cd src
accelerate launch --config_file ddp_configs/ddp_4A100.yaml \
    models/sd15_onedc_codec_stage1/train_sd15_codec_stage1.py \
    --config_path models/sd15_onedc_codec_stage1/configs/config_sd15_onedc_lmbda4.6_stage1_lr5e-5.yaml

- Stage II training

cd src
accelerate launch --config_file ddp_configs/ddp_4A100.yaml \
    models/sd15_onedc_codec_stage2/train_sd15_codec_stage2.py \
    --config_path models/sd15_onedc_codec_stage2/configs/config_sd15_onedc_lmbda4.6_stage2_lr1e-6.yaml
Notes:
- Adjust the lambda strategy in config to reach your target bitrate.
- For better results, you may need to manually decrease the learning rate and continue training.
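For intuition on the lambda strategy: learned codecs of this kind are typically trained with a rate-distortion Lagrangian, where a larger lambda weights distortion more heavily and thus lands at a higher bitrate. A conceptual sketch only; the actual training objective in this repo also includes latent-domain and perceptual terms:

```python
# Conceptual rate-distortion trade-off behind the lambda setting (sketch only;
# not the full training loss used in this repo).
def rd_loss(rate_bpp: float, distortion: float, lmbda: float) -> float:
    return rate_bpp + lmbda * distortion

# Larger lambda (e.g., 12.2) penalizes distortion more -> higher-rate model;
# smaller lambda (e.g., 0.6) favors rate -> lower-bitrate model.
print(rd_loss(0.05, 0.01, 12.2))
print(rd_loss(0.01, 0.04, 0.6))
```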
We sincerely thank the following outstanding works, which greatly inspired and supported our research:
- Improved Distribution Matching Distillation for Fast Image Synthesis
- MaskGIT: Masked Generative Image Transformer
- DCVC family
If you find our work inspiring, please cite:
@inproceedings{xue2025one,
title={One-Step Diffusion-Based Image Compression with Semantic Distillation},
author={Naifu Xue and Zhaoyang Jia and Jiahao Li and Bin Li and Yuan Zhang and Yan Lu},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
}

This project as a whole is licensed under CC BY-NC-SA 4.0.
It incorporates third-party components:
- DMD2: CC BY-NC-SA 4.0 (folder: src/modules/dmd)
- VQGAN: Apache License 2.0 (folder: src/modules/vqgan)
- DCVC: MIT License (file: src/modules/dcvc.py; folder: src/modules/entropy)

