
Keyframes-GAN (IEEE TMM 2023)


Table of Contents

Overview

Prerequisites and Installation

Usage

Citation

Acknowledgments

Overview

This is the official repo of the paper Perceptual Quality Improvement in Videoconferencing using Keyframes-based GAN.

In this work we propose a novel GAN architecture for compression artifact reduction in videoconferencing. In this context, the speaker is typically in front of the camera and remains the same for the entire duration of the transmission. Under this assumption, we can maintain a set of reference keyframes of the person, taken from the higher-quality I-frames transmitted within the video stream. First, we extract multi-scale features from the compressed and reference frames. Then, these features are combined progressively with Adaptive Spatial Feature Fusion (ASFF) blocks, guided by facial landmarks, and with Spatial Feature Transform (SFT) blocks. This makes it possible to restore the high-frequency details lost after video compression.
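As a rough illustration of the feature-modulation idea, here is a minimal PyTorch sketch of an SFT-style block: the condition features (e.g., extracted from a reference keyframe) predict a per-pixel scale and shift that modulate the compressed-frame features. This is only a sketch of the general SFT mechanism, not the authors' exact implementation; all layer and channel sizes are illustrative.

    import torch
    import torch.nn as nn

    class SFTBlock(nn.Module):
        """Spatial Feature Transform: modulates input features with a
        per-pixel scale (gamma) and shift (beta) predicted from condition
        features. Sketch only; channel sizes are illustrative."""

        def __init__(self, feat_channels: int, cond_channels: int):
            super().__init__()
            self.gamma = nn.Sequential(
                nn.Conv2d(cond_channels, feat_channels, 3, padding=1),
                nn.LeakyReLU(0.1, inplace=True),
                nn.Conv2d(feat_channels, feat_channels, 3, padding=1),
            )
            self.beta = nn.Sequential(
                nn.Conv2d(cond_channels, feat_channels, 3, padding=1),
                nn.LeakyReLU(0.1, inplace=True),
                nn.Conv2d(feat_channels, feat_channels, 3, padding=1),
            )

        def forward(self, feat: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
            # Affine modulation: feat * (1 + gamma) + beta
            return feat * (1 + self.gamma(cond)) + self.beta(cond)

    # Toy usage: 64-channel compressed-frame features modulated by
    # 64-channel reference-keyframe features at the same resolution
    sft = SFTBlock(feat_channels=64, cond_channels=64)
    out = sft(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))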

Architecture

Prerequisites and Installation

  1. Clone the repo

git clone https://github.com/LorenzoAgnolucci/Keyframes-GAN.git

  2. Create a virtual env and install all the dependencies with

pip install -r requirements.txt

  3. Even though it is not required, we strongly recommend installing dlib with GPU support (a quick way to check your build is shown after this list)

  4. For metrics computation, run

pip install -e pybrisque/

  5. Download the pretrained models and move them inside the pretrained_models folder
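To verify whether your dlib build actually has GPU support, you can check the DLIB_USE_CUDA flag exposed by dlib's Python bindings:

    import dlib

    # True only if dlib was compiled with CUDA support
    print("dlib built with CUDA:", dlib.DLIB_USE_CUDA)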

Usage

For testing, you need one or more HQ mp4 videos. These videos will be compressed with a given CRF. The face in each frame will be cropped and aligned, and then restored with our model by exploiting the HQ keyframes.
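To give an idea of what the cropping and alignment step involves, here is a minimal dlib-based sketch that detects a face and its landmarks in a frame. The repo's preprocessing.py implements its own pipeline; the shape_predictor_68_face_landmarks.dat path is an assumption of this sketch (it is a standard dlib model, downloadable separately).

    import cv2
    import dlib

    detector = dlib.get_frontal_face_detector()
    # Standard 68-landmark dlib model; path is an assumption for this sketch
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    frame = cv2.imread("frame.png")
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    for face in detector(gray):
        landmarks = predictor(gray, face)
        points = [(landmarks.part(i).x, landmarks.part(i).y) for i in range(68)]
        # A real pipeline would use these points to warp the face to a
        # canonical (aligned) crop before restoration
        print(f"Detected face with {len(points)} landmarks")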

Testing

  1. Move the HQ videos under a directory named {BASE_PATH}/original/

  2. Run

python preprocessing.py --base_path {BASE_PATH} --crf 42

where crf is the desired Constant Rate Factor (default 42)
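For intuition, CRF compression of the kind preprocessing.py applies can be reproduced with ffmpeg. A minimal sketch, assuming an H.264 encode with libx264 (the script's actual encoder settings may differ):

    import subprocess

    def compress_video(src: str, dst: str, crf: int = 42) -> None:
        """Re-encode a video at a given CRF with libx264 (higher CRF =
        stronger compression). Sketch only; preprocessing.py may use
        different encoder settings."""
        subprocess.run(
            ["ffmpeg", "-y", "-i", src, "-c:v", "libx264", "-crf", str(crf), dst],
            check=True,
        )

    compress_video("original/video.mp4", "compressed/video.mp4", crf=42)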

  3. Run

python video_inference.py --base_path {BASE_PATH} --crf 42 --max_keyframes 5

where crf must be equal to the one used in step 2 and max_keyframes is the maximum cardinality of the set of keyframes (default 5)
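The inference output path used in the next step ends in LFU, which suggests that the keyframe set is managed with a Least Frequently Used replacement policy once it reaches max_keyframes elements. A toy sketch of such a capped buffer, with hypothetical names (the actual policy and interface in video_inference.py may differ):

    from collections import Counter

    class LFUKeyframeBuffer:
        """Keeps at most `capacity` keyframes; when full, the least
        frequently used keyframe is evicted. Toy sketch with hypothetical
        names; the policy in video_inference.py may differ."""

        def __init__(self, capacity: int = 5):
            self.capacity = capacity
            self.keyframes = {}    # frame_id -> frame data
            self.uses = Counter()  # frame_id -> times used as reference

        def add(self, frame_id, frame):
            if len(self.keyframes) >= self.capacity:
                evict_id, _ = self.uses.most_common()[-1]  # least used
                del self.keyframes[evict_id]
                del self.uses[evict_id]
            self.keyframes[frame_id] = frame
            self.uses[frame_id] = 0

        def mark_used(self, frame_id):
            self.uses[frame_id] += 1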

  4. If needed, run

python compute_metrics.py --gt_path {BASE_PATH}/original --inference_path inference/DMSASFFNet/max_keyframes_5/LFU

where gt_path is the directory that contains the HQ videos and inference_path is the directory that contains the restored frames
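The actual metrics come from BasicSR and pybrisque. As a self-contained illustration of what a full-reference metric computes, here is a minimal PSNR implementation comparing a ground-truth frame with a restored one (sketch only; compute_metrics.py uses BasicSR's implementations):

    import numpy as np

    def psnr(gt: np.ndarray, restored: np.ndarray, max_val: float = 255.0) -> float:
        """Peak Signal-to-Noise Ratio in dB between two images of the
        same shape (higher is better)."""
        mse = np.mean((gt.astype(np.float64) - restored.astype(np.float64)) ** 2)
        if mse == 0:
            return float("inf")
        return 10 * np.log10(max_val ** 2 / mse)

    # Toy usage with 8-bit images
    gt = np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8)
    restored = gt.copy()
    print(psnr(gt, restored))  # inf: identical images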

Citation

@article{agnolucci2023perceptual,
  author={Agnolucci, Lorenzo and Galteri, Leonardo and Bertini, Marco and Bimbo, Alberto Del},
  journal={IEEE Transactions on Multimedia}, 
  title={Perceptual Quality Improvement in Videoconferencing using Keyframes-based GAN}, 
  year={2023},
  volume={},
  number={},
  pages={1-14},
  doi={10.1109/TMM.2023.3264882}}

Acknowledgments

We rely on BasicSR for the implementation of our model and for metrics computation.
