This is a TensorFlow model for additional lossless compression of bitstreams generated by neural-net-based image encoders, as described in https://arxiv.org/abs/1703.10114.
More specifically, the entropy coder aims at further compressing binary codes that have a 3D tensor structure, with:
- the first two dimensions of the tensors corresponding to the height and the width of the binary codes,
- the last dimension being the depth of the codes. The last dimension can be sliced into N groups of K bits, where each additional group is used by the image decoder to add more detail to the reconstructed image.
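For intuition, here is a minimal NumPy sketch of that layout. The shapes, the number of groups, and the 0/1 bit convention below are illustrative assumptions, not values used by this model:

```python
import numpy as np

# Illustrative shapes only: height x width x (N * K) binary codes.
height, width = 32, 48
N, K = 16, 32   # N groups (layers) of K bits each along the depth dimension

codes = np.random.randint(0, 2, size=(height, width, N * K), dtype=np.int8)

# Slicing the depth into N groups of K bits: group i is what the image
# decoder uses to add more detail on top of groups 0..i-1.
groups = [codes[:, :, i * K:(i + 1) * K] for i in range(N)]
print(len(groups), groups[0].shape)  # 16 (32, 48, 32)
```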
The code in this directory only contains the underlying code probability model but does not perform the actual compression using arithmetic coding. The code probability model is enough to compute the theoretical compression ratio.
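As a hedged sketch of how such a theoretical ratio can be computed (this is the standard ideal arithmetic-coding bound, not code from this directory): the probability model assigns a probability to every bit, the ideal code length is the negative log2-likelihood of the actual bits under those probabilities, and the ratio compares the raw bit count to that length.

```python
import numpy as np

def theoretical_compression_ratio(bits, p_one, eps=1e-12):
    """Ideal arithmetic-coding cost of `bits` (0/1 array) under a model that
    assigns probability `p_one` to each bit being 1 (same shape as `bits`)."""
    p_one = np.clip(p_one, eps, 1.0 - eps)
    nll_bits = -(bits * np.log2(p_one) + (1.0 - bits) * np.log2(1.0 - p_one))
    coded_bits = nll_bits.sum()
    raw_bits = bits.size  # 1 bit per binary symbol before entropy coding
    return raw_bits / coded_bits

# Toy check: bits that are 1 about 90% of the time, and a model that knows it.
bits = (np.random.rand(32, 48, 64) < 0.9).astype(np.float32)
print(theoretical_compression_ratio(bits, np.full(bits.shape, 0.9)))
```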
The only software requirement for running the encoder and decoder is having TensorFlow installed.
You will also need to add the top-level source directory of the entropy coder to your PYTHONPATH, for example:
export PYTHONPATH=${PYTHONPATH}:/tmp/models/compression
If you do not have a training dataset, there is a simple generative code model that you can use to generate a dataset and play with the entropy coder. The generative model is located under dataset/gen_synthetic_dataset.py. Note that this simple generative model will not give good results on real images, since it is not meant to match the statistics of the binary representation of encoded images. Consider it a toy dataset, no more, no less.
To generate a synthetic dataset with 20000 samples:
mkdir -p /tmp/dataset
python ./dataset/gen_synthetic_dataset.py --dataset_dir=/tmp/dataset/ --count=20000
Note that the generator has not been optimized at all, so generating the synthetic dataset is currently pretty slow.
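If you want a quick sanity check that records were written, a minimal sketch along these lines can count them. It assumes the generator writes standard TFRecord files directly into the output directory (the file naming scheme is not specified here) and uses the TensorFlow 1.x API:

```python
import glob
import tensorflow as tf

count = 0
for path in glob.glob('/tmp/dataset/*'):
    try:
        # tf.python_io.tf_record_iterator is the TF 1.x way to stream records.
        for _ in tf.python_io.tf_record_iterator(path):
            count += 1
    except tf.errors.DataLossError:
        # Skip any file in the directory that is not a TFRecord file.
        pass
print('records found:', count)
```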
If you just want to play with the entropy coder trainer, here is the command line that can be used to train the entropy coder on the synthetic dataset:
mkdir -p /tmp/entropy_coder_train
python ./core/entropy_coder_train.py --task=0 --train_dir=/tmp/entropy_coder_train/ --model=progressive --model_config=./configs/synthetic/model_config.json --train_config=./configs/synthetic/train_config.json --input_config=./configs/synthetic/input_config.json
Training is configured using 3 JSON-formatted files:
- One file is used to configure the underlying entropy coder model.
Currently, only the progressive model is supported.
This model takes 2 mandatory parameters and an optional one (see the sketch after this list):
  - layer_depth: the number of bits per layer (a.k.a. iteration). Background: the image decoder uses each layer to add more detail to the image.
  - layer_count: the maximum number of layers that should be supported by the model. This should be equal to or greater than the maximum number of layers in the input binary codes.
  - coded_layer_count: this can be used to consider only partial codes, keeping only the first coded_layer_count layers and ignoring the remaining layers. If left empty, the binary codes are left unchanged.
- One file to configure the training, including the learning rate, etc. The meaning of the parameters is pretty straightforward. Note that this file is only used during training and is not needed during inference.
- One file to specify the input dataset to use during training. The dataset is formatted using tf.RecordIO.
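To make the model configuration concrete, here is a hypothetical sketch of writing a model_config.json with the parameters described above. The values are made up for illustration and are not the ones shipped in configs/synthetic/model_config.json:

```python
import json

# Hypothetical values, for illustration only.
model_config = {
    'layer_depth': 32,   # bits per layer (a.k.a. iteration)
    'layer_count': 16,   # maximum number of layers supported by the model
    # 'coded_layer_count' is optional; omitted here to keep all layers.
}

with open('/tmp/model_config.json', 'w') as f:
    json.dump(model_config, f, indent=2)
```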
Here is the command line to generate a single synthetic sample, formatted in the same way as the output of the image encoder:
python ./dataset/gen_synthetic_single.py --sample_filename=/tmp/dataset/sample_0000.npz
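The sample is written as a NumPy .npz archive. A minimal way to peek at it, without assuming anything about the array names chosen by gen_synthetic_single.py:

```python
import numpy as np

sample = np.load('/tmp/dataset/sample_0000.npz')
# List whatever arrays the generator stored, with their shapes and dtypes.
for name in sample.files:
    print(name, sample[name].shape, sample[name].dtype)
```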
To actually compute the additional compression ratio using the entropy coder trained in the previous step:
python ./core/entropy_coder_single.py --model=progressive --model_config=./configs/synthetic/model_config.json --input_codes=/tmp/dataset/sample_0000.npz --checkpoint=/tmp/entropy_coder_train/model.ckpt-209078
where the checkpoint number should be adjusted accordingly.
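If you do not want to look up the step number by hand, one option (assuming the standard TensorFlow checkpoint layout written by the trainer) is to query the most recent checkpoint programmatically and paste the result into the command above:

```python
import tensorflow as tf

# Prints e.g. '/tmp/entropy_coder_train/model.ckpt-<step>', or None if no
# checkpoint has been written yet.
print(tf.train.latest_checkpoint('/tmp/entropy_coder_train/'))
```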