Description • Pipeline • How it works • Required • Results • License
What's in a frame? Around 450 B.C. the ancient Greek philosopher Zeno contemplated the nature of time and its infinite divisibility; is motion any different, he wondered? Like motion, videos persist through time, which means that on top of the three dimensions needed to describe any image, (HEIGHT, WIDTH, CHANNELS), videos require a fourth: (TIME, HEIGHT, WIDTH, CHANNELS). The aim of deepzipper is to exploit redundancy in color, spatial, and temporal information to compress video data as far as possible while preserving image definition.
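The extra time axis is easy to see with toy NumPy arrays (the shapes here are arbitrary examples, not sizes the project uses):

```python
import numpy as np

# A single RGB image: (HEIGHT, WIDTH, CHANNELS)
image = np.zeros((128, 128, 3), dtype=np.uint8)

# A video stacks frames along a leading time axis:
# (TIME, HEIGHT, WIDTH, CHANNELS)
video = np.zeros((30, 128, 128, 3), dtype=np.uint8)

print(image.ndim, video.ndim)  # 3 4
```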
Video compression consists of two sub-tasks:

- Frame interpolation: temporal compression, dropping frames and later reconstructing them
- Colorization: color compression, storing grayscale frames and later restoring their color
The model works as follows:
Compression:

- Decompose the video into a series of frames with FFmpeg
- Convert the frames to grayscale (compression by a factor of 3)
- Encode the sequence of grayscale images with Frame interpolation model.encoder (compression by a factor of 8)

Decompression:
- Decode the sequence of encoded images with Frame interpolation model.decoder (decompression by a factor of 8)
- Colorize the images with the Colorization model (decompression by a factor of 3)
- Compose the series of frames into a video with FFmpeg
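The combined compression ratio of the steps above can be sketched with toy tensors. The "encoder" below is only a stand-in that halves the time, height, and width axes (2 × 2 × 2 = a factor of 8); the real encoder is a learned model:

```python
import numpy as np

# Toy video: 8 RGB frames of 64x64
video = np.random.rand(8, 64, 64, 3)

# Step 1: drop color (compression by a factor of 3)
gray = video.mean(axis=-1, keepdims=True)   # (8, 64, 64, 1)

# Step 2: stand-in "encoder" halving time, height, and width
# (compression by a factor of 2 * 2 * 2 = 8)
encoded = gray[::2, ::2, ::2, :]            # (4, 32, 32, 1)

ratio = video.size / encoded.size
print(ratio)  # 24.0, i.e. 3 * 8
```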
There are two different methods of frame interpolation implemented in FrameInterp.py:
- Reconstruction, which attempts to construct the original image from scratch
- Residual, which attempts to refine the scaled encoded image to achieve decompression

The first reconstruction model I experimented with is the convolutional LSTM. The basic idea is to transfer hidden states both forward and backward in time in order to inform compression and decompression.
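A bidirectional convolutional LSTM encoder could look roughly like the sketch below. This is illustrative only, the layer sizes and names are assumptions and the actual architecture in FrameInterp.py may differ; the `Bidirectional` wrapper is what carries hidden state both forward and backward in time:

```python
import tensorflow as tf

def build_convlstm_encoder(frames=8, size=32):
    # Grayscale input: (TIME, HEIGHT, WIDTH, 1)
    inputs = tf.keras.Input(shape=(frames, size, size, 1))
    # Bidirectional ConvLSTM: hidden states flow both forward
    # and backward along the time axis
    x = tf.keras.layers.Bidirectional(
        tf.keras.layers.ConvLSTM2D(8, 3, padding="same",
                                   return_sequences=True))(inputs)
    # Halve time, height, and width to realize the 8x reduction
    x = tf.keras.layers.MaxPool3D(pool_size=(2, 2, 2))(x)
    return tf.keras.Model(inputs, x)

model = build_convlstm_encoder()
print(model.output_shape)  # (None, 4, 16, 16, 16)
```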
## Required

- TensorFlow 2.0-beta0
- TensorFlow Probability 0.6
- Pandas 0.24.2
- Scikit-Learn 0.21.2
- Matplotlib 3.1.0
- FFmpeg
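Assuming the Python dependencies are installed from PyPI under their usual package names, the pinned versions above can be installed roughly as follows (a sketch; check the repository for an official requirements file):

```shell
pip install tensorflow==2.0.0b0 tensorflow-probability==0.6.0 \
    pandas==0.24.2 scikit-learn==0.21.2 matplotlib==3.1.0

# FFmpeg is a system binary, not a pip package, e.g. on Debian/Ubuntu:
# sudo apt-get install ffmpeg
```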
## Results

Here are some of the outputs obtained with the models described above:
## License

MIT