<h1>DeepFake Starter Kit</h1>

<a id="0"><h2>Content</h2></a>  

* <a href="#1">Introduction</a>  
* <a href="#2">Preliminary Data Exploration</a>  
 * Load the Packages  
 * Import Utility Scripts  
 * Load the Data  
 * Check Files Type
* <a href="#3">Metadata Exploration</a>    
 * Missing Data  
 * Unique Values  
 * Most Frequent Originals  
* <a href="#4">Video Data Exploration</a>    
 * Missing Video data or Metadata  
 * Few Fake Videos  
 * Few Real Videos  
 * Videos with Same Original
 * Test Video Files  
* <a href="#5">Face Detection</a>   
 * Haar Cascades  
 * MTCNN 
* <a href="#6">Play Video Files</a>      
* <a href="#7">References</a>      

# <a id="1">Introduction</a>  

The term DeepFake combines "Deep Learning" and "Fake". It refers to taking a person in an image or video and replacing them with someone else's likeness using technology such as deep artificial neural networks [1]. Large companies like Google invest heavily in fighting DeepFakes, including by releasing large datasets to help train models that counter this threat [2]. The phenomenon is rapidly spreading into the film industry and threatens to compromise news agencies. Large digital companies, including content providers and social platforms, are at the forefront of the fight against DeepFakes. The GANs that generate DeepFakes get better every day and, of course, if a new GAN model incorporates everything learned so far about how to combat the various existing models, the result is a model that cannot be beaten by the existing ones.

In the **Data Exploration** sections we perform a (partial) Exploratory Data Analysis (EDA) on the training and testing data. After checking the file types, we focus first on the **metadata** file, which we explore in detail after loading it into a dataframe. Then we move on to the video files, looking first at a sample of fake videos and then at real videos. After that, we also explore a few videos that share the same original. We visualize one frame extracted from each video, for both real and fake videos, and we also play a few videos.
Next, we extract faces (and other features of the persons in the videos). More precisely, we use OpenCV Haar Cascade resources to identify the frontal face, eyes, smile and profile face in still images taken from the videos.

**Important note**: The data we analyze here is just a very small sample of the data. The competition specifies that the training data is provided as archived chunks. Models should be trained offline using the data provided by Kaggle as archives, then loaded (max 1GB memory) into a Kernel, where inference is performed (a sample submission file is provided) and the predictions are written to an output file from the Kernel.

In the References section I provide a short list of resources on GANs and DeepFakes, including blog posts, Kaggle Kernels and GitHub repos.

---
<div style="float: right;">
        <a href="#0" class="button btn-info btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover" title="Go to Top">Go to Top</a>
</div>

# <a id="2">Preliminary Data Exploration</a>    

## Load Packages

In [None]:
import numpy as np
import pandas as pd
import os
from tqdm import tqdm_notebook
import cv2 as cv

## Import Utility Scripts

In [None]:
from data_quality_stats import missing_data, unique_values, most_frequent_values
from plot_style_utils import set_color_map, plot_count
from video_utils import display_image_from_video, display_images_from_video_list, play_video
from face_object_detection import CascadeObjectDetector, FaceObjectDetector
from face_detection_mtcnn import MTCNNFaceDetector

## Load Data

In [None]:
DATA_FOLDER = '../input/deepfake-detection-challenge'
TRAIN_SAMPLE_FOLDER = 'train_sample_videos'
TEST_FOLDER = 'test_videos'

print(f"Train samples: {len(os.listdir(os.path.join(DATA_FOLDER, TRAIN_SAMPLE_FOLDER)))}")
print(f"Test samples: {len(os.listdir(os.path.join(DATA_FOLDER, TEST_FOLDER)))}")

We also add face detection resources.

In [None]:
FACE_DETECTION_FOLDER = '../input/haar-cascades-for-face-detection'
print(f"Face detection resources: {os.listdir(FACE_DETECTION_FOLDER)}")

---
<div style="float: right;">
        <a href="#0" class="button btn-info btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover" title="Go to Top">Go to Top</a>
</div>

## Check Files Type  

Here we check the extensions of the training data files. Most of the files appear to have the `mp4` extension; let's check whether there are other extensions as well.

In [None]:
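train_list = list(os.listdir(os.path.join(DATA_FOLDER, TRAIN_SAMPLE_FOLDER)))
ext_dict = []  # distinct extensions, in order of first appearance
for file in train_list:
    # splitext copes with file names that contain several dots or no extension
    file_ext = os.path.splitext(file)[1].lstrip('.')
    if file_ext not in ext_dict:
        ext_dict.append(file_ext)
print(f"Extensions: {ext_dict}")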

Let's count how many files there are with each extension.

In [None]:
for file_ext in ext_dict:
    print(f"Files with extension `{file_ext}`: {len([file for file in train_list if  file.endswith(file_ext)])}")
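
The same per-extension count can also be obtained in a single pass. The sketch below is only an illustration: the file names are made up and `count_extensions` is a hypothetical helper, not part of the utility scripts.

```python
import os
from collections import Counter

def count_extensions(file_names):
    """Count files per extension (lower-cased, without the leading dot)."""
    return Counter(
        os.path.splitext(name)[1].lstrip('.').lower() for name in file_names
    )

# Hypothetical file names, for illustration only
sample_files = ['aaa.mp4', 'bbb.mp4', 'metadata.json']
extension_counts = count_extensions(sample_files)
print(extension_counts)
```

In the notebook, `count_extensions(train_list)` would presumably give the same numbers as the loop above.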

The train sample folder also contains a file with the `json` extension. Let's find it.

In [None]:
json_file = [file for file in train_list if  file.endswith('json')][0]
print(f"JSON file: {json_file}")

Apparently this is a metadata file. Let's explore this JSON file.

In [None]:
def get_meta_from_json(path):
    df = pd.read_json(os.path.join(DATA_FOLDER, path, json_file))
    df = df.T
    return df

meta_train_df = get_meta_from_json(TRAIN_SAMPLE_FOLDER)
meta_train_df.head()
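
To see why the dataframe is transposed, here is a minimal sketch on toy data. It assumes the metadata file is a JSON object keyed by video file name, each entry holding `label`, `split` and `original`; the file names below are made up.

```python
import io
import pandas as pd

# Toy metadata with made-up file names: one top-level key per video
toy_json = '''
{
  "fake_clip.mp4": {"label": "FAKE", "split": "train", "original": "real_clip.mp4"},
  "real_clip.mp4": {"label": "REAL", "split": "train", "original": null}
}
'''

raw_df = pd.read_json(io.StringIO(toy_json))  # columns are the video names
toy_meta_df = raw_df.T                        # one row per video
print(toy_meta_df)
```

After the transposition the index holds the video names and the columns hold the fields, which is the layout the rest of the notebook relies on.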

---
<div style="float: right;">
        <a href="#0" class="button btn-info btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover" title="Go to Top">Go to Top</a>
</div>

# <a id="3">Metadata Exploration</a>  

Let's now explore the metadata in the train sample.

## Missing data  

In [None]:
missing_data(meta_train_df)

Indeed, all the missing `original` values are the ones associated with the `REAL` label.
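
The `missing_data` helper lives in a utility script that is not shown here. A minimal sketch of what such a summary might compute, on toy data, could look like this; `missing_data_sketch` and the toy values are assumptions, not the actual utility code.

```python
import pandas as pd

def missing_data_sketch(df):
    """Per column: number and percentage of missing values, plus the dtype."""
    total = df.isnull().sum()
    percent = 100 * total / len(df)
    return pd.DataFrame({
        'Total': total,
        'Percent': percent,
        'Type': df.dtypes.astype(str),
    })

# Toy metadata, for illustration only
toy_df = pd.DataFrame({
    'label': ['FAKE', 'FAKE', 'REAL', 'FAKE'],
    'original': ['a.mp4', 'b.mp4', None, 'a.mp4'],
})
summary = missing_data_sketch(toy_df)
print(summary)

# Check that every row with a missing `original` carries the REAL label
missing_are_real = (toy_df.loc[toy_df['original'].isnull(), 'label'] == 'REAL').all()
print(missing_are_real)
```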

## Unique data  

In [None]:
unique_values(meta_train_df)

We observe that the `original` column follows the same pattern in the unique values summary: 77 of its values are missing (which is why its total is only 323), and it has 209 unique values.
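
As a rough illustration of the totals and unique counts discussed above, the sketch below builds a similar summary on toy data. `unique_values_sketch` is a hypothetical stand-in for the utility function, whose code is not shown in this notebook.

```python
import pandas as pd

def unique_values_sketch(df):
    """Per column: number of non-null values and number of distinct non-null values."""
    return pd.DataFrame({'Total': df.count(), 'Uniques': df.nunique()})

# Toy metadata, for illustration only
toy_df = pd.DataFrame({
    'label': ['FAKE', 'FAKE', 'REAL', 'FAKE'],
    'original': ['a.mp4', 'b.mp4', None, 'a.mp4'],
})
uniques = unique_values_sketch(toy_df)
print(uniques)
```

Both `count` and `nunique` ignore missing values by default, which is why the total for `original` is smaller than the number of rows.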

## Most frequent originals  

In [None]:
most_frequent_values(meta_train_df)

We see that the most frequent label is `FAKE` (80.75%) and that `meawmsgiti.mp4` is the most frequent original (6 samples).
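
A most-frequent-value summary of this kind can be sketched with `value_counts`. The function name, the output column names and the toy data below are assumptions made for illustration; the actual utility script may differ.

```python
import pandas as pd

def most_frequent_values_sketch(df):
    """Per column: the most frequent non-null value, its count and its share."""
    rows = {}
    for column in df.columns:
        counts = df[column].value_counts()  # missing values are excluded
        if counts.empty:
            continue  # nothing to report for an all-missing column
        total = int(df[column].count())
        top_count = int(counts.iloc[0])
        rows[column] = {
            'Total': total,
            'Most frequent item': counts.index[0],
            'Frequency': top_count,
            'Percent from total': round(100.0 * top_count / total, 2),
        }
    return pd.DataFrame.from_dict(rows, orient='index')

# Toy metadata, for illustration only
toy_df = pd.DataFrame({
    'label': ['FAKE', 'FAKE', 'REAL', 'FAKE'],
    'original': ['a.mp4', 'b.mp4', None, 'a.mp4'],
})
frequent = most_frequent_values_sketch(toy_df)
print(frequent)
```

Note that the percentage is taken over the non-null values of each column, so for `original` it is relative to the rows that actually have an original.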

Let's now visualize some data distributions.

In [None]:
color_list = ['#4166AA', '#06BDDD', '#83CEEC', '#EDE8E4', '#C2AFA8']
cmap_custom = set_color_map(color_list)

In [None]:
plot_count(meta_train_df, 'split', 'split (train)', color_list, size=1)

In [None]:
plot_count(meta_train_df, 'label', 'label (train)', color_list, size=2)

As we can see, `REAL` videos make up only 19.25% of the train sample videos, with `FAKE` videos accounting for 80.75% of the samples.

---
<div style="float: right;">
        <a href="#0" class="button btn-info btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover" title="Go to Top">Go to Top</a>
</div>

# <a id="4"> Video Data Exploration</a>

In the following we will explore some of the video data.

## Missing video (or meta) data  

We check first if the list of files in the meta info and the list from the folder are the same. 

In [None]:
meta = np.array(list(meta_train_df.index))
storage = np.array([file for file in train_list if  file.endswith('mp4')])
print(f"Metadata: {meta.shape[0]}, Folder: {storage.shape[0]}")
print(f"Files in metadata and not in folder: {np.setdiff1d(meta,storage,assume_unique=False).shape[0]}")
print(f"Files in folder and not in metadata: {np.setdiff1d(storage,meta,assume_unique=False).shape[0]}")
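
`np.setdiff1d` is not symmetric, which is why the check is run in both directions. A small example on made-up file names:

```python
import numpy as np

meta_names = np.array(['a.mp4', 'b.mp4', 'c.mp4'])    # hypothetical metadata index
folder_names = np.array(['b.mp4', 'c.mp4', 'd.mp4'])  # hypothetical folder listing

# Elements of the first array that do not appear in the second one
only_in_meta = np.setdiff1d(meta_names, folder_names)
only_in_folder = np.setdiff1d(folder_names, meta_names)
print(only_in_meta, only_in_folder)
```

Both differences are empty only when the two lists contain exactly the same file names.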

## Few fake videos

Let's now visualize the data. We first select a list of fake videos.

In [None]:
fake_train_sample_video = list(meta_train_df.loc[meta_train_df.label=='FAKE'].sample(3).index)
fake_train_sample_video

We use a function from the `video_utils` utility script to display a selected frame from a video.

In [None]:
for video_file in fake_train_sample_video:
    display_image_from_video(os.path.join(DATA_FOLDER, TRAIN_SAMPLE_FOLDER, video_file))

Let's now try the same for a few of the real videos.

## Few real videos

In [None]:
real_train_sample_video = list(meta_train_df.loc[meta_train_df.label=='REAL'].sample(3).index)
real_train_sample_video

In [None]:
for video_file in real_train_sample_video:
    display_image_from_video(os.path.join(DATA_FOLDER, TRAIN_SAMPLE_FOLDER, video_file))

## Videos with same original  

Let's now look at a set of samples with the same original.

In [None]:
meta_train_df['original'].value_counts()[0:5]

We pick one of the originals with the largest number of samples.

We also use a variant of our visualization function that works with multiple videos.

In [None]:
same_original_fake_train_sample_video = \
        list(meta_train_df.loc[meta_train_df.original=='meawmsgiti.mp4'].index)

display_images_from_video_list(video_path_list=same_original_fake_train_sample_video,
                               data_folder=DATA_FOLDER,
                               video_folder=TRAIN_SAMPLE_FOLDER)

Let's now look at a different selection of videos with the same original.

In [None]:
same_original_fake_train_sample_video = \
    list(meta_train_df.loc[meta_train_df.original=='atvmxvwyns.mp4'].index)

display_images_from_video_list(video_path_list=same_original_fake_train_sample_video,
                               data_folder=DATA_FOLDER,
                               video_folder=TRAIN_SAMPLE_FOLDER)

In [None]:
same_original_fake_train_sample_video = \
    list(meta_train_df.loc[meta_train_df.original=='qeumxirsme.mp4'].index)

display_images_from_video_list(video_path_list=same_original_fake_train_sample_video,
                               data_folder=DATA_FOLDER,
                               video_folder=TRAIN_SAMPLE_FOLDER)

In [None]:
same_original_fake_train_sample_video = \
    list(meta_train_df.loc[meta_train_df.original=='kgbkktcjxf.mp4'].index)  

display_images_from_video_list(video_path_list=same_original_fake_train_sample_video,
                               data_folder=DATA_FOLDER,
                               video_folder=TRAIN_SAMPLE_FOLDER)

## Test video files  

Let's also look at a few of the test data files.

In [None]:
test_videos = pd.DataFrame(list(os.listdir(os.path.join(DATA_FOLDER, TEST_FOLDER))), columns=['video'])

In [None]:
test_videos.head()

Let's now visualize one of the videos.

In [None]:
display_image_from_video(os.path.join(DATA_FOLDER, TEST_FOLDER, test_videos.iloc[0].video))

Let's look at some more videos from the test set.

In [None]:
display_images_from_video_list(test_videos.sample(6).video, DATA_FOLDER, TEST_FOLDER)

---
<div style="float: right;">
        <a href="#0" class="button btn-info btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover" title="Go to Top">Go to Top</a>
</div>

#  <a id="5">Face detection</a>  

For face detection we will use two different approaches. In the first one, we will use Haar cascades and in the second one we will use MTCNN models.

## Haar Cascades

In the first approach to face detection we use the `FaceObjectDetector` class from the `face_object_detection` utility script. It is adapted from [5] (Face Detection with OpenCV) by @serkanpeldek; we slightly modified the functions to extract the face, profile face, eyes and smile.

The `CascadeObjectDetector` class initializes the cascade classifier (using the imported resource). Its `detect` function uses a method of `CascadeClassifier` to detect objects in images, in this case the face, eye, smile or profile face.

We load the resources for frontal face, eye, smile and profile face into an object of type `FaceObjectDetector`, which initializes specialized `CascadeObjectDetector` objects.  


In [None]:
face_object_detector = FaceObjectDetector(FACE_DETECTION_FOLDER)

We also defined the `detect` method of the `FaceObjectDetector` object. For each object to extract we use a different shape and color, as follows:

* Frontal face: green rectangle;  
* Eye: red circle;  
* Smile: red rectangle;  
* Profile face: blue rectangle. 

**Note**: due to a large number of false positives, we deactivate the smile detector for now.

The function `extract_image_objects`, also defined as a member function of `FaceObjectDetector` in the `face_object_detection` utility script, extracts an image from a video, then calls the function that extracts the face rectangle from the image and displays the rectangle over the image.

We apply the function for face detection for a selection of images from train sample videos.

In [None]:
same_original_fake_train_sample_video = \
    list(meta_train_df.loc[meta_train_df.original=='kgbkktcjxf.mp4'].index)

for video_file in same_original_fake_train_sample_video[1:4]:
    print(video_file)
    face_object_detector.extract_image_objects(video_file=video_file,
                          data_folder=DATA_FOLDER,
                          video_set_folder=TRAIN_SAMPLE_FOLDER,
                          show_smile=False                          
                          )

Let's do the same by enabling smile detection as well.

In [None]:
for video_file in same_original_fake_train_sample_video[1:2]:
    print(video_file)
    face_object_detector.extract_image_objects(video_file=video_file,
                          data_folder=DATA_FOLDER,
                          video_set_folder=TRAIN_SAMPLE_FOLDER,
                          show_smile=True                          
                          )

Indeed, the smile detection gives too many false positives.

In [None]:
train_subsample_video = list(meta_train_df.sample(3).index)
for video_file in train_subsample_video:
    print(video_file)
    face_object_detector.extract_image_objects(video_file=video_file,
                          data_folder=DATA_FOLDER,
                          video_set_folder=TRAIN_SAMPLE_FOLDER,
                          show_smile=False                          
                          )

Let's look at a small collection of samples from the test videos.

In [None]:
subsample_test_videos = list(test_videos.sample(3).video)
for video_file in subsample_test_videos:
    print(video_file)
    face_object_detector.extract_image_objects(video_file=video_file,
                          data_folder=DATA_FOLDER,
                          video_set_folder=TEST_FOLDER,
                          show_smile=False                          
                          )

We can observe that in some cases, when the subject is not facing the camera or when the lighting is low, the face detection algorithm does not detect the face or eyes correctly. Due to the large number of false positives, we have deactivated the smile detector for now.

Let's now retry with a different algorithm, the MTCNN model.

## MTCNN Model

First we `pip install` the `mtcnn` library.

In [None]:
!pip install mtcnn

Then we import the model from the newly installed library.

In [None]:
from mtcnn.mtcnn import MTCNN
mtcnn_model = MTCNN()

With the instantiated `mtcnn_model` object we initialize an object of type `MTCNNFaceDetector` from the `face_detection_mtcnn` utility script.

In [None]:
from face_detection_mtcnn import MTCNNFaceDetector
mtcnn_face_detector = MTCNNFaceDetector(mtcnn_model)

We prepare a path for one video to perform face detection. 

In [None]:
video_path = os.path.join(DATA_FOLDER, TRAIN_SAMPLE_FOLDER, fake_train_sample_video[1])

We run the `detect` function of `MTCNNFaceDetector`.

In [None]:
mtcnn_face_detector.detect(video_path)

Let's repeat this face extraction for a few more images.

In [None]:
video_path = os.path.join(DATA_FOLDER, TRAIN_SAMPLE_FOLDER, fake_train_sample_video[0])
mtcnn_face_detector.detect(video_path)

In [None]:
video_path = os.path.join(DATA_FOLDER, TRAIN_SAMPLE_FOLDER, fake_train_sample_video[2])
mtcnn_face_detector.detect(video_path)

We observe that this method is more robust. It correctly detects the face and the facial features even when the image is poorly lit and the subject is not facing the camera.

In our implementation of `MTCNNFaceDetector` we display the following elements in the image:
* the bounding box for the face (in red)
* the position of the keypoints (as green points), as follows:  
    * left eye
    * right eye
    * nose
    * mouth left 
    * mouth right
* the confidence score (in magenta text, above the face bounding box). The score is shown rounded to 4 decimal places.

Besides the elements shown in the image, we also print the entire detection JSON.
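
To make the printed detection easier to read, the sketch below takes one detection in the format returned by the `mtcnn` library's `detect_faces` method (a dictionary with `box`, `confidence` and `keypoints`) and derives the elements listed above. The numeric values are made up and `summarize_detection` is a hypothetical helper, not part of `MTCNNFaceDetector`.

```python
# Hypothetical detection with made-up values, in the mtcnn output format
sample_detection = {
    'box': [120, 60, 80, 100],  # x, y, width, height
    'confidence': 0.99987654,
    'keypoints': {
        'left_eye': (145, 95),
        'right_eye': (178, 96),
        'nose': (160, 118),
        'mouth_left': (148, 138),
        'mouth_right': (175, 139),
    },
}

def summarize_detection(detection):
    """Derive the rectangle corners and the confidence label for one detection."""
    x, y, width, height = detection['box']
    return {
        'top_left': (x, y),
        'bottom_right': (x + width, y + height),
        'label': f"{detection['confidence']:.4f}",  # 4 decimal places
        'keypoints': sorted(detection['keypoints']),
    }

detection_summary = summarize_detection(sample_detection)
print(detection_summary)
```

The two corners are what a rectangle-drawing call needs, since the box is given as a top-left corner plus width and height.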

Let's also look at some of the test videos.

In [None]:
for i in range(0, 3):
    video_path = os.path.join(DATA_FOLDER, TEST_FOLDER, subsample_test_videos[i])
    mtcnn_face_detector.detect(video_path)

---
<div style="float: right;">
        <a href="#0" class="button btn-info btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover" title="Go to Top">Go to Top</a>
</div>

# <a id="6"> Play video files</a>  

From the [Play video and processing Kernel](https://www.kaggle.com/code/hamditarek/deepfake-detection-challenge-kaggle?scriptVersionId=28503498) by @hamditarek we learned how to play video files in a Kaggle Kernel. We also included the function to play videos in the `video_utils` utility script.


Let's look at a few fake videos. 

In [None]:
fake_videos = list(meta_train_df.loc[meta_train_df.label=='FAKE'].index)

In [None]:
play_video(fake_videos[0], DATA_FOLDER, TRAIN_SAMPLE_FOLDER)    

In [None]:
play_video(fake_videos[1], DATA_FOLDER, TRAIN_SAMPLE_FOLDER) 

In [None]:
play_video(fake_videos[2], DATA_FOLDER, TRAIN_SAMPLE_FOLDER) 

In [None]:
play_video(fake_videos[3], DATA_FOLDER, TRAIN_SAMPLE_FOLDER) 

In [None]:
play_video(fake_videos[4], DATA_FOLDER, TRAIN_SAMPLE_FOLDER) 

From visual inspection of these fake videos, in some cases it is very easy to spot the anomalies created when engineering the deep fake, while in other cases it is more difficult.

---
<div style="float: right;">
        <a href="#0" class="button btn-info btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover" title="Go to Top">Go to Top</a>
</div>

#  <a id="7">References</a>

[1] Deepfake, Wikipedia, https://en.wikipedia.org/wiki/Deepfake  
[2] Google DeepFake Database, Engadget, https://www.engadget.com/2019/09/25/google-deepfake-database/  
[3] A quick look at the first frame of each video, https://www.kaggle.com/brassmonkey381/a-quick-look-at-the-first-frame-of-each-video  
[4] Basic EDA Face Detection, split video, ROI, https://www.kaggle.com/marcovasquez/basic-eda-face-detection-split-video-roi  
[5] Face Detection with OpenCV, https://www.kaggle.com/serkanpeldek/face-detection-with-opencv  
[6] Face Detection using MTCNN — a guide for face extraction with a focus on speed, https://towardsdatascience.com/face-detection-using-mtcnn-a-guide-for-face-extraction-with-a-focus-on-speed-c6d59f82d49  
[7] Play video and processing, https://www.kaggle.com/hamditarek/play-video-and-processing/  

---
<div style="float: right;">
        <a href="#0" class="button btn-info btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover" title="Go to Top">Go to Top</a>
</div>
