The Devil Is in the Details: Window-based Attention for Image Compression

PyTorch implementation of the paper "The Devil Is in the Details: Window-based Attention for Image Compression" (CVPR 2022). This repository is based on CompressAI. We kept the scripts for training and evaluation and removed the other components. The major changes are in compressai/models. For the official CompressAI code release, see the CompressAI repository.

About

This repo defines the CNN-based models and Transformer-based models for learned image compression in "The Devil Is in the Details: Window-based Attention for Image Compression".

cnn_arch

The architecture of the CNN-based model.

stf_arch

The architecture of the Transformer-based model (STF).

Installation

Install CompressAI and the packages required for development.

conda create -n compress python=3.7
conda activate compress
pip install compressai
pip install pybind11
git clone https://github.com/Googolxx/STF stf
cd stf
pip install -e .
pip install -e '.[dev]'

Note: wheels are available for Linux and macOS.

Usage

Training

An example training script with a rate-distortion loss is provided in train.py.
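As a rough illustration of what a rate-distortion loss computes, the sketch below combines an estimated bit rate with a weighted distortion term. It assumes the form used by CompressAI's example training script (lambda * 255^2 * MSE + bpp); whether train.py in this repository uses exactly this form is an assumption, and the function name `rate_distortion_loss` and its arguments are hypothetical, written in plain Python rather than PyTorch.

```python
import math


def rate_distortion_loss(mse, likelihoods, num_pixels, lmbda=0.0035):
    """Return (loss, bpp) for one image.

    mse:         mean squared error between input and reconstruction,
                 with pixel values scaled to [0, 1]
    likelihoods: probabilities the entropy model assigns to the coded symbols
    num_pixels:  height * width of the input image
    lmbda:       trade-off weight (the value passed as --lambda)

    Assumed form: loss = lmbda * 255^2 * mse + bpp.
    """
    # Rate term: estimated bits per pixel, -sum(log2 p) / num_pixels.
    bpp = sum(-math.log2(p) for p in likelihoods) / num_pixels
    # Distortion term: MSE rescaled to the 8-bit range, weighted by lambda.
    distortion = lmbda * (255 ** 2) * mse
    return distortion + bpp, bpp


# Toy numbers, chosen only to make the arithmetic easy to follow.
loss, bpp = rate_distortion_loss(
    mse=0.001, likelihoods=[0.5, 0.25, 0.125, 0.125], num_pixels=4
)
print(f"bpp={bpp:.4f} loss={loss:.4f}")
```

A larger lambda puts more weight on distortion, so the trained model tends to spend more bits for higher reconstruction quality.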

Training a CNN-based model:

CUDA_VISIBLE_DEVICES=0,1 python train.py -d /path/to/image/dataset/ -e 1000 --batch-size 16 --save --save_path /path/to/save/ -m cnn --cuda --lambda 0.0035
e.g., CUDA_VISIBLE_DEVICES=0,1 python train.py -d openimages -e 1000 --batch-size 16 --save --save_path ckpt/cnn_0035.pth.tar -m cnn --cuda --lambda 0.0035

Training a Transformer-based model (STF):

CUDA_VISIBLE_DEVICES=0,1 python train.py -d /path/to/image/dataset/ -e 1000 --batch-size 16 --save --save_path /path/to/save/ -m stf --cuda --lambda 0.0035

Evaluation

To evaluate a trained model on your own dataset, run the evaluation script:

CUDA_VISIBLE_DEVICES=0 python -m compressai.utils.eval_model -d /path/to/image/folder/ -r /path/to/reconstruction/folder/ -a stf -p /path/to/checkpoint/ --cuda
CUDA_VISIBLE_DEVICES=0 python -m compressai.utils.eval_model -d /path/to/image/folder/ -r /path/to/reconstruction/folder/ -a cnn -p /path/to/checkpoint/ --cuda

Dataset

The script for downloading OpenImages is provided in downloader_openimages.py. Please install fiftyone first.
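Once images are downloaded, they may need to be arranged for training. The sketch below assumes that train.py follows the CompressAI example script and reads images from `train/` and `test/` subfolders of the path given with `-d`; that layout is an assumption to check against train.py. The helper `split_dataset` and its parameters are hypothetical and use only the standard library.

```python
import random
import shutil
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def split_dataset(src_dir, dst_dir, test_fraction=0.1, seed=0):
    """Copy images from a flat folder into dst_dir/train and dst_dir/test.

    Returns a dict with the number of files copied into each split.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    # Keep only image files; sort first so the shuffle is reproducible.
    images = sorted(
        p for p in src.iterdir() if p.suffix.lower() in IMAGE_EXTENSIONS
    )
    random.Random(seed).shuffle(images)
    # Hold out at least one image for testing when any images exist.
    n_test = max(1, int(len(images) * test_fraction)) if images else 0
    splits = {"test": images[:n_test], "train": images[n_test:]}
    for name, files in splits.items():
        (dst / name).mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy2(f, dst / name / f.name)
    return {name: len(files) for name, files in splits.items()}


# Example call (paths are placeholders):
# split_dataset("/path/to/downloaded/images", "openimages")
```

The destination folder can then be passed to train.py with `-d`.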

Results

Visualization

visualization01

Visualization of the reconstructed image kodim01.png.

visualization07

Visualization of the reconstructed image kodim07.png.

RD curves

kodak_rd

RD curves on Kodak.

clic_rd

RD curves on the CLIC Professional Validation dataset.

Codec Efficiency on Kodak

| Method | Enc (s) | Dec (s) | PSNR (dB) | bpp   |
|--------|---------|---------|-----------|-------|
| CNN    | 0.12    | 0.12    | 35.91     | 0.650 |
| STF    | 0.15    | 0.15    | 35.82     | 0.651 |
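For readers unfamiliar with the PSNR and bpp columns, the sketch below shows the standard definitions of both metrics for 8-bit images. The helper names `psnr_from_mse` and `bits_per_pixel` are hypothetical, and the exact computation inside the evaluation script is not reproduced here.

```python
import math


def psnr_from_mse(mse, max_value=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def bits_per_pixel(num_bytes, height, width):
    """Bits per pixel of a compressed bitstream for a height x width image."""
    return num_bytes * 8 / (height * width)


# Kodak images are 768x512 pixels, so a 49152-byte bitstream is exactly 1 bpp.
print(bits_per_pixel(49152, 512, 768))
# An MSE of 1.0 on the 8-bit scale corresponds to roughly 48.13 dB.
print(round(psnr_from_mse(1.0), 2))
```

Both metrics are computed per image; how the per-image values are aggregated into a dataset-level figure depends on the evaluation script.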

Citation

@inproceedings{zou2022the,
  title={The Devil Is in the Details: Window-based Attention for Image Compression},
  author={Zou, Renjie and Song, Chunfeng and Zhang, Zhaoxiang},
  booktitle={CVPR},
  year={2022}
}
