Code for the paper Practical Lossless Compression with Latent Variables using Bits Back Coding, appearing at ICLR 2019.
Overview of the code
The low-level rANS encoding and decoding functions are in rans.py. Higher-level functions for encoding and decoding according to various distributions, including BB-ANS coding and the more specialised BB-ANS VAE coding, are in util.py.
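To illustrate the kind of coder rans.py implements, below is a minimal, self-contained rANS sketch. It is not the repo's actual API (the function names and the use of Python big integers instead of streaming renormalisation are simplifications for illustration); it shows the core stack-like property BB-ANS relies on: symbols are encoded in reverse and decoded back in order.

```python
# Minimal rANS sketch (illustrative only; not the API of rans.py).
# Symbol frequencies; their sum M plays the role of the precision.
freqs = {'a': 5, 'b': 2, 'c': 1}
M = sum(freqs.values())

# Cumulative counts: start of each symbol's sub-interval in [0, M).
cum, acc = {}, 0
for s in freqs:
    cum[s] = acc
    acc += freqs[s]

def encode(symbols, x=1):
    # rANS behaves like a stack: encode in reverse so that
    # decoding pops symbols back in their original order.
    for s in reversed(symbols):
        x = (x // freqs[s]) * M + cum[s] + (x % freqs[s])
    return x

def decode(x, n):
    out = []
    for _ in range(n):
        r = x % M
        # Find the symbol whose sub-interval contains r.
        s = next(t for t in freqs if cum[t] <= r < cum[t] + freqs[t])
        out.append(s)
        x = freqs[s] * (x // M) + r - cum[s]
    return out, x

msg = list('abacab')
x = encode(msg)
decoded, x0 = decode(x, len(msg))
assert decoded == msg and x0 == 1  # exact round trip
```

Because encoding is a bijection on the integer state, decoding recovers both the message and the initial state exactly; this invertibility is what lets BB-ANS "get bits back" when decoding latents.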
Scripts relating specifically to VAE learning and compression are in the torch_vae subdirectory. We have implemented two models, one for binarized MNIST digits and one for raw (non-binarized) digits. The script torch_vae/torch_bin_mnist_compress.py compresses the binarized MNIST dataset using the learned VAE, and torch_vae/torch_mnist_compress.py does the same for raw MNIST. Pre-learned parameters for both models are in torch_vae/saved_params. The parameters were learned using the torch_vae/tvae_binary.py and torch_vae/tvae_beta_binomial.py scripts respectively.
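As the filename tvae_beta_binomial.py suggests, the raw-MNIST model uses a beta-binomial observation distribution over pixel intensities. A quick way to sanity-check such a likelihood with Scipy (a core dependency) is sketched below; the parameter values are illustrative only, not the trained model's.

```python
import numpy as np
from scipy.stats import betabinom

# Beta-binomial over pixel intensities 0..255 (n = 255 trials),
# parameterised by shape parameters (a, b). Illustrative values.
n, a, b = 255, 2.0, 5.0
pmf = betabinom.pmf(np.arange(n + 1), n, a, b)
assert np.isclose(pmf.sum(), 1.0)  # a valid distribution over {0, ..., 255}

# Entropy in bits: the expected message length for a pixel
# drawn from this distribution under an ideal entropy coder.
bits = -(pmf * np.log2(pmf)).sum()
assert 0.0 < bits < 8.0  # must beat the 8 bits of a raw byte
```

Per-pixel code lengths like this are what the compression scripts aggregate to report an overall rate in bits per dimension.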
The benchmark_compressors.py script measures the compression rate of various commonly used lossless compression algorithms, including those quoted in our paper. In order to run the benchmarks for the ImageNet 64x64 dataset, you will need to download the dataset and unzip it into the directory data/imagenet.
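A benchmark of this kind boils down to measuring compressed size per pixel. Here is a small sketch of how one might measure the PNG rate with Pillow (the benchmark dependency); the function name is ours, not the script's, and the random test image is a stand-in for real data.

```python
import io
import numpy as np
from PIL import Image

def png_bits_per_pixel(arr):
    """Compress a greyscale uint8 array to PNG in memory and
    return the compressed size in bits per pixel."""
    img = Image.fromarray(arr)
    buf = io.BytesIO()
    img.save(buf, format='PNG')
    n_pixels = arr.shape[0] * arr.shape[1]
    return 8 * buf.tell() / n_pixels

# Illustrative input: a random 28x28 binary image (MNIST-sized).
rng = np.random.default_rng(0)
image = rng.integers(0, 2, size=(28, 28), dtype=np.uint8) * 255
bpp = png_bits_per_pixel(image)
assert bpp > 0
```

Measuring in-memory via io.BytesIO avoids filesystem overhead and keeps the size attributable to the codec alone (plus the PNG container's fixed header cost, which matters for small images).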
To run the tests, run pytest from the root dir. Scripts in the torch_vae directory must be run from the root dir using

    python -m torch_vae.[name_of_module]  # Without .py at the end.
Dependencies
Core: Python 3, Numpy, Scipy
VAE compression: Pytorch
Compression benchmarks: Pillow
Please notify us if we've missed something.