Lossy Image Compression with Quantized Hierarchical VAEs

QRes-VAE (Quantized ResNet VAE) is a neural network model for lossy image compression. It is based on the ResNet VAE architecture.

Paper: Lossy Image Compression with Quantized Hierarchical VAEs, WACV 2023 Best Paper Award (Algorithms track)
Arxiv: https://arxiv.org/abs/2208.13056

Features

Progressive coding: the QRes-VAE model learns a hierarchy of features. It compresses/decompresses images in a coarse-to-fine fashion.
Note: images below are from the CelebA dataset and COCO dataset, respectively.

Lossy compression efficiency: the QRes-VAE model has competitive rate-distortion performance, especially at higher bit rates.

Install

Requirements:

Python, pytorch>=1.9, tqdm, compressai, timm>=0.5.4.
Code has been tested in all of the following environments:
- Both Windows and Linux, with Intel CPUs and Nvidia GPUs
- Python 3.9
- pytorch=1.9, 1.10, 1.11 with CUDA 11.3
- pytorch=1.12 with CUDA 11.6. This setup is recommended. Models run faster (both training and testing) in this setup than in previous ones.
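
Before running the notebooks, it can help to confirm that the installed packages meet the minimum versions listed above. The following is a minimal sketch using only the Python standard library. It is not part of this repository: the helper names (`parse_version`, `check_requirements`) are hypothetical, and it assumes the packages are installed under their PyPI distribution names (`torch`, `tqdm`, `compressai`, `timm`).

```python
from importlib import metadata
from itertools import takewhile

# Minimum versions from the requirements list; None means "any version".
# The distribution names are assumed to be the PyPI names.
REQUIREMENTS = {
    "torch": (1, 9),
    "tqdm": None,
    "compressai": None,
    "timm": (0, 5, 4),
}


def parse_version(text):
    """Turn a version string such as '1.12.1+cu116' into (1, 12, 1)."""
    parts = []
    # Drop a local suffix like '+cu116', then read the leading digits of each piece.
    for piece in text.split("+")[0].split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(requirements=REQUIREMENTS):
    """Return a {package: status} report for each required package."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum is not None and parse_version(installed) < minimum:
            report[name] = f"too old ({installed})"
        else:
            report[name] = f"ok ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print(f"{package}: {status}")
```

This only checks package versions; it does not verify the CUDA version or GPU availability.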

Download:

Download the repository;
Download the pre-trained model checkpoints and put them in the checkpoints folder. See checkpoints/README.md for expected folder structure.

Pre-trained models

QRes-VAE (34M) [Google Drive]: our main model for natural image compression.
QRes-VAE (17M) [Google Drive]: a smaller model trained on CelebA dataset for ablation study.
QRes-VAE (34M, lossless) [Google Drive]: a lossless compression model. Better than PNG but not as good as WebP.

The lmb in the folder names is the multiplier for MSE during training, i.e., loss = rate + lmb * mse. A larger lmb produces a higher bit rate but lower distortion.
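
To illustrate why a larger lmb leads to a higher bit rate, here is a small sketch of the objective above. The two operating points are made-up numbers chosen for illustration, not results from this repository, and the units of rate and MSE are assumptions. The function names (`rd_loss`, `preferred`) are hypothetical.

```python
def rd_loss(rate, mse, lmb):
    """Rate-distortion objective: loss = rate + lmb * mse."""
    return rate + lmb * mse


# Two hypothetical operating points (illustrative values only).
low_rate = {"rate": 0.2, "mse": 0.004}   # fewer bits, more distortion
high_rate = {"rate": 1.0, "mse": 0.001}  # more bits, less distortion


def preferred(lmb):
    """Return the operating point with the lower loss for a given lmb."""
    candidates = {"low_rate": low_rate, "high_rate": high_rate}
    return min(candidates, key=lambda name: rd_loss(lmb=lmb, **candidates[name]))


if __name__ == "__main__":
    for lmb in (16, 2048):
        print(lmb, preferred(lmb))
```

With a small lmb the rate term dominates, so the low-rate point has the lower loss. With a large lmb the MSE term dominates, so the objective favors the high-rate, low-distortion point.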

Usage

Image compression

Compression and decompression (lossy): demo.ipynb
Compression and decompression (lossless): experiments/demo-lossless.ipynb

As a VAE generative model

Progressive decoding: experiments/progressive-decoding.ipynb
Sampling: experiments/uncond-sampling.ipynb
Latent space interpolation: experiments/latent-interpolation.ipynb
Inpainting: experiments/inpainting.ipynb

Evaluate lossy compression efficiency

Rate-distortion: python evaluate.py --root /path/to/dataset
BD-rate: experiments/bd-rate.ipynb
Estimate end-to-end FLOPs: experiments/estimate-flops.ipynb
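
For readers new to rate-distortion evaluation, the sketch below shows how the two quantities are commonly computed in the image compression literature: rate as bits per pixel (bpp) and distortion as PSNR derived from MSE. This is an assumption about the metrics, not a description of what evaluate.py does internally. The function names and the example numbers are hypothetical.

```python
import math


def bits_per_pixel(num_bytes, height, width):
    """Rate: compressed size in bits divided by the number of pixels."""
    return 8 * num_bytes / (height * width)


def psnr(mse, max_value=255.0):
    """Distortion: peak signal-to-noise ratio in dB for a given MSE."""
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Hypothetical example: a 768x512 image compressed to 24576 bytes.
    print(bits_per_pixel(24576, height=512, width=768))
    # Hypothetical MSE on the 0-255 pixel scale.
    print(round(psnr(65.025), 2))
```

Averaging these two values over a dataset at several lmb settings gives the points of a rate-distortion curve.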

Training

We provide training instructions for QRes-VAE in our new project repository: https://github.com/duanzhiihao/lossy-vae/tree/main/lvae/models/qresvae

License

The code has a non-commercial license, as found in the LICENSE file.

Citation

@article{duan2023qres,
    title={Lossy Image Compression with Quantized Hierarchical VAEs},
    author={Duan, Zhihao and Lu, Ming and Ma, Zhan and Zhu, Fengqing},
    journal={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
    pages={198--207},
    year={2023},
    month=Jan
}
