Skip to content


Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

Video Compression through Image Interpolation

Chao-Yuan Wu, Nayan Singhal, Philipp Krähenbühl.
In ECCV, 2018. [Project Page] [Paper]


PyTorch implementation of deep video compression codec.

Currently supported:

  • Training interpolation models with different offsets.
  • Evaluation on single model (PSNR/MS-SSIM).
  • Some ablation study options.

Coming soon:

  • Entropy coding.
  • Evaluation on combined model.


We conducted experiments in the following environment:

  • Linux
  • Python 3.6.3
  • PyTorch 0.3.0.post4
  • TITAN X GPUs with CuDNN.

Similar environments (e.g. with OSX, Python 2) might work with small modification, but not tested.

Getting started

We provide a demo training script which trains on 7 clips for 100 iterations, and evaluates on a hold-out clip. To run the demo, please download the demo data, and run 2 (the argument (0, 1, or 2) specifies the level of hierarchy). This will take about 3 minutes.

Expected output:

Creating loader for data/train...
448 images loaded.
Loader for 448 images (28 batches) created.
	Encoder fuse level: 1
	Decoder fuse level: 1
[TRAIN] Iter[1]; LR: 0.00025; Loss: 0.260715; Backprop: 0.2985 sec; Batch: 2.8358 sec
[TRAIN] Iter[2]; LR: 0.00025; Loss: 0.237539; Backprop: 0.2371 sec; Batch: 1.5466 sec
[TRAIN] Iter[3]; LR: 0.00025; Loss: 0.241159; Backprop: 0.3445 sec; Batch: 1.4208 sec
[TRAIN] Iter[4]; LR: 0.00025; Loss: 0.193481; Backprop: 0.2328 sec; Batch: 1.3091 sec
[TRAIN] Iter[5]; LR: 0.00025; Loss: 0.181479; Backprop: 0.2336 sec; Batch: 1.2742 sec
[TRAIN] Iter[99]; LR: 0.00025; Loss: 0.090678; Backprop: 0.2444 sec; Batch: 1.3960 sec
[TRAIN] Iter[100]; LR: 0.00025; Loss: 0.082984; Backprop: 0.2431 sec; Batch: 1.3988 sec
Loss at each step:
0.1758 0.0982 0.0620 0.0574 0.0579 0.0597 0.0653 0.0742 0.0846 0.0949
Start evaluation...
Creating loader for data/eval...
8 images loaded.
Loader for 8 images (8 batches) created.
Evaluation @iter 100 done in 24 secs
TVL Loss   : 0.12207	0.06993	0.06193	0.06525	0.06742	0.07027	0.07543	0.08148	0.08650	0.09003
TVL MS-SSIM: 0.61841	0.81475	0.85905	0.87109	0.87745	0.88022	0.87903	0.87486	0.86880	0.86132
TVL PSNR   : 28.02937	28.63096	28.87302	28.87184	28.77673	28.64452	28.44989	28.28644	28.19869	28.15354

Output images with different number of progression compression iterations are stored in a directory called output. It starts with the blurry output with 1 iteration:


to the better output with 10 iterations:


Since we've only trained the model for 3 minutes, the results don't look great yet, but we can see that it roughly starts to reconstruct the frames.

The final result using full training set will look like:


Please see our Project Page for more examples.

To train or evaluate on additional datasets, please see for details and instructions.

Model weights and logs

Model weights are available here. The associated logs are available here.

Kinetics video IDs

The list of Kinetics videos we used for train/val/test is available here.


If you find this model useful for your research, please use the following BibTeX entry.

  title={Video Compression through Image Interpolation},
  author={Wu, Chao-Yuan and Nayan Singhal and Kr{\"a}henb{\"u}hl, Philipp},


This implementation largely borrows from pytorch-image-comp-rnn by Biao Zhang (1zb). U-net implementation borrows from Pytorch-UNet by Milesi Alexandre. Thank you, Biao and Milesi Alexandre!


Video Compression through Image Interpolation (ECCV'18) [PyTorch]







No releases published


No packages published