Code repository for TUM Course in Advanced Deep Learning for Computer Vision

Megatvini/DeepFaceForgeryDetection

DeepFaceForgery Detection

This repository contains code for deep face forgery detection in video frames. It is a student project from the Advanced Deep Learning for Computer Vision course at TUM. The publication is available on arXiv.

FaceForensics Benchmark

Using transfer learning, we achieved a new state-of-the-art result on the FaceForensics benchmark.

Figure: State-of-the-art results on public benchmark

Dataset and technologies

To detect forged video frames, we use the FaceForensics++ dataset of pristine and manipulated videos. As a preprocessing step, we extract faces from all frames using MTCNN. The full dataset is ~507 GB and contains 7 million frames. The code for downloading the dataset and extracting frames is located in the dataset directory. For model training, we use the split from the FaceForensics++ repository.

Main technologies:

  1. Python as main programming language
  2. PyTorch as deep learning library
  3. pip for dependency management

Training/Evaluation

All training and evaluation code, together with the various models, is stored in the src directory. All scripts are self-documenting: run them with the --help option. They automatically use a GPU when one is available and also work on a CPU, although very slowly.
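The GPU-with-CPU-fallback behaviour described above is commonly written in PyTorch as follows. This is a minimal sketch, not code taken from the repository's scripts; the helper name `pick_device` is hypothetical.

```python
import torch


def pick_device():
    """Return a CUDA device when one is available, otherwise the CPU."""
    # torch.cuda.is_available() is False on machines without a usable GPU,
    # so the same script runs everywhere, only more slowly on the CPU.
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


device = pick_device()
# Tensors (and models) are moved to the chosen device before use.
batch = torch.zeros(4, 3, 160, 160).to(device)
print(f"running on {device.type}")
```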

Single frame model

We obtained the best single-frame classification accuracy with a version of the Inception Resnet V1 model pretrained on the VGGFace2 face recognition dataset.

Figure: Inception Resnet V1 diagram

Window frame models

We also evaluated how performance improves when temporal information is incorporated. In this case the task changes from single-frame classification to frame-sequence classification. We used two different models for this approach: 3D convolutional and Bi-LSTM.
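To make the change from single frames to frame sequences concrete, the sketch below groups an ordered list of frames into fixed-length windows. It is an illustration only: the function name `sliding_windows` and the `stride` parameter are assumptions, and the repository may build its windows differently.

```python
def sliding_windows(frames, window_size, stride=1):
    """Return consecutive windows of `window_size` frames.

    `frames` is any ordered sequence (e.g. frame file paths of one video).
    Videos shorter than `window_size` yield no windows.
    """
    if window_size < 1 or stride < 1:
        raise ValueError("window_size and stride must be positive")
    last_start = len(frames) - window_size
    return [
        list(frames[start:start + window_size])
        for start in range(0, last_start + 1, stride)
    ]


# Each window becomes one training sample for a sequence model.
frame_ids = list(range(5))
windows = sliding_windows(frame_ids, window_size=3)
```

With a larger `stride`, neighbouring windows overlap less, which reduces the number of samples per video.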

3D convolutional model

Figure: 3D convolutional model diagram

The temporal feature locality assumption built into the 3D convolutional model seems reasonable in this case, but the model is very slow to train for large window sizes.
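A rough way to see why larger windows are expensive is to count the multiply-accumulate operations of a single 3D convolution layer. The sketch below assumes stride 1 and "same" padding. The layer sizes are made-up illustration values, not the configuration used in this project.

```python
def conv3d_macs(window, height, width, c_in, c_out, kernel=(3, 3, 3)):
    """Multiply-accumulates for one 3D convolution (stride 1, 'same' padding)."""
    k_t, k_h, k_w = kernel
    # With 'same' padding and stride 1 there is one output position
    # per input position, for every output channel.
    output_positions = window * height * width
    return output_positions * c_out * c_in * k_t * k_h * k_w


# Cost of a hypothetical first layer for several window sizes.
for window in (1, 5, 10, 20):
    macs = conv3d_macs(window, 112, 112, c_in=3, c_out=64)
    print(f"window={window:2d}  MACs={macs:,}")
```

Under these assumptions the cost of each layer grows linearly with the window size, in addition to the factor `k_t` that a temporal kernel adds over a 2D convolution. Each kernel only sees `k_t` neighbouring frames at a time, which is the locality assumption mentioned above.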

LSTM with 2D CNN encoder

Figure: LSTM with 2D CNN encoder diagram
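The general shape of such a model can be sketched in PyTorch as below: a 2D CNN encodes each frame independently, and a bidirectional LSTM then reads the per-frame features in order. This is a minimal illustration under stated assumptions. The class name, the tiny stand-in encoder, the layer sizes, and the mean pooling over time are all hypothetical and are not taken from the repository.

```python
import torch
from torch import nn


class FrameSequenceClassifier(nn.Module):
    """Hypothetical 2D CNN encoder followed by a bidirectional LSTM."""

    def __init__(self, feature_dim=32, hidden_dim=16):
        super().__init__()
        # Tiny stand-in encoder; a real model would use a much deeper CNN.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feature_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.lstm = nn.LSTM(
            feature_dim, hidden_dim, batch_first=True, bidirectional=True
        )
        # Forward and backward states are concatenated, hence 2 * hidden_dim.
        self.head = nn.Linear(2 * hidden_dim, 1)

    def forward(self, frames):
        # frames: (batch, window, channels, height, width)
        b, t, c, h, w = frames.shape
        # Fold the time axis into the batch so the 2D encoder sees single frames.
        feats = self.encoder(frames.reshape(b * t, c, h, w))
        feats = feats.reshape(b, t, -1)
        out, _ = self.lstm(feats)  # (batch, window, 2 * hidden_dim)
        # Average over time, then produce one forgery logit per sequence.
        return self.head(out.mean(dim=1)).squeeze(-1)


model = FrameSequenceClassifier()
logits = model(torch.zeros(2, 5, 3, 32, 32))  # 2 sequences of 5 frames each
```

Folding the window into the batch dimension lets the same encoder weights be reused for every frame, so only the LSTM has to model how features change over time.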

Citation

@misc{dogonadze2020deep,
    title={Deep Face Forgery Detection},
    author={Nika Dogonadze and Jana Obernosterer and Ji Hou},
    year={2020},
    eprint={2004.11804},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Model Weights

Weights for the various models are available under the "models" link.
