AIVC is a neural-based video codec offering competitive performance and a lot of flexibility.
More details are available at https://orange-opensource.github.io/AIVC/
The codec is described in the paper "AIVC: Artificial Intelligence based Video Codec", Ladune et al.
Some large files are stored using Git LFS, which must be installed before cloning the repository:
$ sudo apt-get install git-lfs
$ git lfs install
Clone the repository from GitHub:
$ git clone https://github.com/Orange-OpenSource/AIVC.git
The best way to ensure reproducibility is to run the code within a Docker container built from the following Dockerfile:
FROM pytorch/pytorch:1.7.0-cuda11.0-cudnn8-devel
RUN apt-get update && \
pip install scipy && \
pip install torchac # Thanks to Fabian Mentzer for the package! https://github.com/fab-jul/torchac
Create the Docker image by executing the following command within the folder containing the Dockerfile:
$ docker build -t aivc .
Then, launch an interactive container from the aivc Docker image:
$ docker run -it -v <path_to_aivc>:<path_to_aivc> aivc bash # <path_to_aivc> is the path where the repo is cloned
Finally, launch the following script to ensure that everything is working properly:
$ cd AIVC/src
$ ./sanity_script.sh
This script encodes a video, decodes it, and measures the size and quality of the compressed video. It should return:
PSNR [dB]: 26.72133
MS-SSIM : 0.93531
MS-SSIM [dB]: 11.89147
Size [bytes]: 28429
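The two MS-SSIM lines above are related by the usual decibel conversion, which makes values close to 1 easier to compare. The sketch below assumes the common definitions `-10 * log10(1 - MS-SSIM)` and `10 * log10(MAX^2 / MSE)`; the function names are hypothetical and are not taken from `aivc.py`, whose exact averaging over frames may differ slightly.

```python
import math


def ms_ssim_to_db(ms_ssim: float) -> float:
    """Convert an MS-SSIM value in [0, 1) to decibels.

    Assumed definition: -10 * log10(1 - MS-SSIM).
    """
    if not 0.0 <= ms_ssim < 1.0:
        raise ValueError("MS-SSIM must be in [0, 1)")
    return -10.0 * math.log10(1.0 - ms_ssim)


def psnr_db(mse: float, max_value: float = 255.0) -> float:
    """Standard PSNR in decibels from a mean squared error (8-bit by default)."""
    if mse <= 0.0:
        raise ValueError("MSE must be strictly positive")
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # 0.93531 maps to roughly 11.89 dB, in line with the sanity output above.
    print(f"{ms_ssim_to_db(0.93531):.2f} dB")
```

Converting the MS-SSIM of the whole sequence gives a value very close to, but not necessarily identical to, the one printed by the script, since the script may average the metric differently.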
The script aivc.py performs 3 tasks:
- It encodes a .yuv video into a bitstream
- It decodes a bitstream into a .yuv video
- It measures the size of the bitstream and the quality (MS-SSIM and PSNR) of the compressed video. (The quality measurement is derived from the CLIC video track.)
The input and output format is YUV 420. To be processed by the model and to measure the quality, each frame is transformed into a triplet of PNGs, one for each color channel.
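To make the YUV 420 layout concrete, here is a minimal sketch that reads one frame from a raw .yuv file and splits it into its three planes. It assumes 8-bit planar data stored in Y, U, V order with even width and height; the helper names are hypothetical and this is not how `aivc.py` performs the conversion to PNG triplets.

```python
from typing import Tuple


def yuv420_plane_sizes(width: int, height: int) -> Tuple[int, int, int]:
    """Byte sizes of the Y, U and V planes for 8-bit planar 4:2:0 (assumed layout)."""
    if width % 2 or height % 2:
        raise ValueError("width and height must be even for 4:2:0 subsampling")
    luma = width * height
    chroma = (width // 2) * (height // 2)  # chroma is halved in both dimensions
    return luma, chroma, chroma


def read_yuv420_frame(path: str, width: int, height: int,
                      frame_index: int) -> Tuple[bytes, bytes, bytes]:
    """Read frame `frame_index` (0-based) and return its (Y, U, V) planes as bytes."""
    y_size, u_size, v_size = yuv420_plane_sizes(width, height)
    frame_size = y_size + u_size + v_size
    with open(path, "rb") as f:
        f.seek(frame_index * frame_size)
        data = f.read(frame_size)
    if len(data) != frame_size:
        raise EOFError(f"frame {frame_index} is missing or truncated in {path}")
    y = data[:y_size]
    u = data[y_size:y_size + u_size]
    v = data[y_size + u_size:]
    return y, u, v
```

Under these assumptions, a 416x240 frame occupies 99840 bytes of luma plus two chroma planes of 24960 bytes each.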
The sanity_script.sh provides an example of how to compress a video.
python aivc.py \
-i ../raw_videos/BlowingBubbles_416x240_50_420 \
-o ../compressed.yuv \
--bitstream_out ../bitstream.bin \
--start_frame 0 \
--end_frame 100 \
--coding_config RA \
--gop_size 16 \
--intra_period 32 \
--model ms_ssim-2021cc-6
Option | Description | Usage | Example |
---|---|---|---|
-i | Path of the input video. | Either a .yuv file or a folder containing the already extracted PNG triplets | -i ../raw_videos/BlowingBubbles_416x240_50_420.yuv or -i ../raw_videos/BlowingBubbles_416x240_50_420 |
-o | Path of the compressed video | A .yuv file (the PNG triplets are generated alongside the .yuv in a dedicated folder) | -o ../compressed.yuv |
--bitstream_out | Path of the bitstream | A .bin file | --bitstream_out ../bitstream.bin |
--start_frame | Index of the first frame to compress | An integer, 0 corresponds to the very first frame | --start_frame 0 |
--end_frame | Index of the last frame to compress | An integer, the last frame is included. Use -1 to compress until the last frame. | --end_frame 100 |
--coding_config | Desired coding configuration | RA for Random Access (I, P and B-frames); LDP for Low-delay P (I and P-frames); AI for All Intra (I-frames) | --coding_config RA |
--gop_size | Number of frames within a hierarchical GOP (RA only) | Must be a power of two. Min: 2, Max: 65535. This is different from the intra period! See example below. | --gop_size 16 |
--intra_period | Number of inter-frames between two intra (RA and LDP only) | Must be a multiple of the GOP size (RA). Min: 2, Max: 65535. | --intra_period 32 |
--model | Model used to perform encoding and decoding. | ms_ssim-2021cc-X where X in [1, 7]. 1 is the highest rate, 7 the lowest rate. | --model ms_ssim-2021cc-6 |
--cpu | Run on CPU | | --cpu |
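The constraints listed in the table can be checked before launching a long encoding. The following is a minimal sketch of such a check; `check_coding_parameters` is a hypothetical helper written from the table alone and is not part of `aivc.py`, which may validate its arguments differently.

```python
from typing import List, Optional

MIN_VALUE, MAX_VALUE = 2, 65535  # bounds given in the option table


def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


def check_coding_parameters(coding_config: str,
                            gop_size: Optional[int] = None,
                            intra_period: Optional[int] = None) -> List[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    if coding_config not in ("RA", "LDP", "AI"):
        return [f"unknown coding configuration: {coding_config!r}"]

    errors = []
    gop_ok = False
    if coding_config == "RA":  # the GOP size only applies to RA
        if gop_size is None or not MIN_VALUE <= gop_size <= MAX_VALUE:
            errors.append("gop_size must be in [2, 65535]")
        elif not is_power_of_two(gop_size):
            errors.append("gop_size must be a power of two")
        else:
            gop_ok = True

    if coding_config in ("RA", "LDP"):  # the intra period applies to RA and LDP
        if intra_period is None or not MIN_VALUE <= intra_period <= MAX_VALUE:
            errors.append("intra_period must be in [2, 65535]")
        elif gop_ok and intra_period % gop_size != 0:
            errors.append("intra_period must be a multiple of gop_size")
    return errors


if __name__ == "__main__":
    print(check_coding_parameters("RA", gop_size=16, intra_period=32))  # no problem
    print(check_coding_parameters("RA", gop_size=16, intra_period=24))  # not a multiple
```

Returning a list of messages rather than raising on the first problem lets a caller report every invalid option at once.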
This is a random access coding structure
- Intra period: 16
- GOP size: 8
This is a low-delay P coding structure
- Intra period: 16
Plain image coding for all the frames
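The three structures above can be illustrated by listing the frame type at each index. The sketch below rests on assumptions that are not confirmed by this document: an intra frame every `intra_period` frames (counted as the distance between two intra frames), each RA GOP closed by a P-frame, and B-frames everywhere else. The `frame_types` helper is hypothetical; the actual codec may handle some cases, such as the end of a sequence, differently.

```python
def frame_types(n_frames: int, coding_config: str,
                gop_size: int = 8, intra_period: int = 16) -> str:
    """Return one letter (I, P or B) per frame, under the assumptions stated above."""
    if coding_config not in ("RA", "LDP", "AI"):
        raise ValueError(f"unknown coding configuration: {coding_config!r}")
    types = []
    for idx in range(n_frames):
        if coding_config == "AI" or idx % intra_period == 0:
            types.append("I")  # all intra, or start of a new intra period
        elif coding_config == "LDP" or idx % gop_size == 0:
            types.append("P")  # low-delay P, or last frame of an RA GOP (assumed)
        else:
            types.append("B")  # frames inside a hierarchical RA GOP
    return "".join(types)


if __name__ == "__main__":
    for config in ("RA", "LDP", "AI"):
        print(config, frame_types(17, config))
```

With the default values (intra period 16, GOP size 8), the RA case yields an I-frame at indices 0 and 16 and a P-frame at index 8.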
Questions, remarks, bug reports can be posted on the AIVC google group.
- February 2022
- Fix memory leak
- Allow for more coding configurations
- September 2021
- Initial release of the code
- Models
- Personal page
- E-mail: theo.ladune@gmail.com
Copyright 2021 Orange
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.