This repository contains code and data relating to the paper "Efficient Bitrate Ladder Construction using Transfer Learning and Spatio-Temporal Features", presented at IEEE MVIP 2024.
The paper is available on IEEE Xplore. A preprint is available here.
To install the prerequisites on Ubuntu 20.04 using Miniconda3 23.10.0-1, run the following:
apt update
apt install gcc g++ libgl1-mesa-glx libsm6 libxext6
conda config --set channel_priority strict
conda config --add channels conda-forge
conda env create -f env_part01.yml
conda env update -n torch_env -f env_part02.yml --prune
conda activate torch_env
You might need to change the prefix: entry in both env_part01.yml and env_part02.yml to match the installation directory of conda.
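The prefix change described above can also be scripted. The following is a minimal sketch, assuming each YAML file contains a single top-level prefix: line, as written by conda env export. The function name set_conda_prefix and the example paths are hypothetical and not part of this repository.

```python
import re


def set_conda_prefix(yml_text: str, new_prefix: str) -> str:
    """Return yml_text with its top-level `prefix:` line set to new_prefix.

    Assumption: the text has at most one top-level `prefix:` line.
    If there is none, the text is returned unchanged.
    """
    # A function replacement keeps backslashes in new_prefix literal.
    return re.sub(
        r"^prefix:.*$",
        lambda _match: f"prefix: {new_prefix}",
        yml_text,
        flags=re.MULTILINE,
    )


# Hypothetical file content and target path, for illustration only.
example = (
    "name: torch_env\n"
    "dependencies:\n"
    "  - python\n"
    "prefix: /opt/conda/envs/torch_env\n"
)
updated = set_conda_prefix(example, "/home/user/miniconda3/envs/torch_env")
print(updated)
```

To apply it to the real files, read each of env_part01.yml and env_part02.yml, pass the text through the function, and write the result back.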
To perform training you need to download the following files from here and put them into the repository:
- Download the Slowfast model weights (SLOWFAST_8x8_R50.pkl) and store it under data/checkpoints/Kinetics.
- Download the video (videos_dataframe.csv) and encode (encodes_dataframe.csv) information tables and store them under data/dataframes.
- Download and extract the DNN features (features.tar.gz) and store them under data/features.
The data folder should look like this:
data
├── checkpoints
│   └── Kinetics
│       └── SLOWFAST_8x8_R50.pkl
├── config
│   └── SLOWFAST_8x8_R50.yaml
├── dataframes
│   ├── encodes_dataframe.csv
│   └── videos_dataframe.csv
└── features
    ├── deep_features
    │   ├── spatial_features
    │   │   ├── inception_v3
    │   │   │   └── Mixed_7c.cat_2
    │   │   │       └── mean_std
    │   │   ├── resnet50
    │   │   │   └── layer4.2.relu_2
    │   │   │       └── mean_std
    │   │   └── vgg16
    │   │       └── features.29
    │   │           └── mean_std
    │   └── temporal_features
    │       └── slowfast
    └── fused_features
        ├── inception_v3_Mixed_7c.cat_2_mean_std__resnet50_layer4.2.relu_2_mean_std__slowfast
        ├── inception_v3_Mixed_7c.cat_2_mean_std__resnet50_layer4.2.relu_2_mean_std__vgg16_features.29_mean_std__slowfast
        ├── inception_v3_Mixed_7c.cat_2_mean_std__slowfast
        ├── inception_v3_Mixed_7c.cat_2_mean_std__vgg16_features.29_mean_std__slowfast
        ├── resnet50_layer4.2.relu_2_mean_std
        ├── resnet50_layer4.2.relu_2_mean_std__slowfast
        ├── resnet50_layer4.2.relu_2_mean_std__vgg16_features.29_mean_std__slowfast
        ├── slowfast
        └── vgg16_features.29_mean_std__slowfast
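Before training, it can help to confirm that the downloaded files ended up in the right places. The following is a minimal sketch that checks a few of the paths from the layout above. The function name find_missing is hypothetical, and the check only covers top-level files and folders, not every feature subfolder.

```python
from pathlib import Path
from typing import List

# Paths taken from the layout above, relative to the repository root.
EXPECTED_PATHS = [
    "data/checkpoints/Kinetics/SLOWFAST_8x8_R50.pkl",
    "data/config/SLOWFAST_8x8_R50.yaml",
    "data/dataframes/encodes_dataframe.csv",
    "data/dataframes/videos_dataframe.csv",
    "data/features/deep_features",
    "data/features/fused_features",
]


def find_missing(repo_root: str = ".") -> List[str]:
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing paths:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected paths are present.")
```

Run it from the repository root so that the relative paths resolve correctly.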
After downloading the data, you can run the following command to train the model on the extracted spatial and temporal features:
python3 src/SME_main.py
The training outputs will be stored in data/results/train.
After training, you can use the following command to run inference:
python3 src/SME_main.py -inference
The inference outputs will be stored in data/results/inference.
Finally, you can use the following command to construct the actual and predicted bitrate ladders:
python3 src/bitrate_ladder_constructor.py
The final result tables will be stored in data/results/final.
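The three steps above (training, inference, ladder construction) can be chained so that a failure in one step stops the rest. The following is a minimal sketch that uses only the commands listed in this README. The names PIPELINE and run_pipeline are hypothetical, and the runner parameter exists only so the function can be exercised without launching the real scripts.

```python
import subprocess

# The three commands from this README, in the order they are meant to run.
PIPELINE = [
    ["python3", "src/SME_main.py"],                    # training
    ["python3", "src/SME_main.py", "-inference"],      # inference
    ["python3", "src/bitrate_ladder_constructor.py"],  # ladder construction
]


def run_pipeline(commands=PIPELINE, runner=subprocess.run):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        print("Running:", " ".join(cmd))
        # check=True makes subprocess.run raise CalledProcessError
        # when the command exits with a non-zero status.
        runner(cmd, check=True)
```

Calling run_pipeline() from the repository root, with torch_env activated, runs all three steps in sequence.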
This repository is associated with Work Package 2 (WP2) of the project FALCON. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 101022466.