## Problem

**Task [[kaggle](https://www.kaggle.com/c/reface-fake-detection)]:** recognize fake videos. Train a binary classifier to distinguish real videos from fake ones (the provided fake data was generated with technology developed at Reface).

****

### What should I get?

To complete this stage, you must meet one of the two conditions below:
+ either build a solution with a target metric value of at least 0.92475
+ or finish in the top 30 of all competitors.

****

### Evaluation

The evaluation metric for this competition is F1-Score, average='micro'. The F1 score, commonly used in information retrieval, measures accuracy using the statistics precision p and recall r. Precision is the ratio of true positives (tp) to all predicted positives (tp + fp). Recall is the ratio of true positives to all actual positives (tp + fn).

The F1 metric weights recall and precision equally, and a good retrieval algorithm will maximize both precision and recall simultaneously. Thus, moderately good performance on both will be favored over extremely good performance on one and poor performance on the other.

More information is available in the sklearn docs:
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html
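
For a single-label problem such as this one, micro-averaging pools the true positives, false positives, and false negatives of both classes before computing F1. Every misclassified sample is then counted once as a false positive (for the predicted class) and once as a false negative (for the true class), so the micro-averaged score coincides with plain accuracy. The sketch below illustrates this in pure Python; it is not the scikit-learn implementation, and `micro_f1` and `accuracy` are hypothetical helper names.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool tp/fp/fn over every class, then compute F1 once."""
    tp = fp = fn = 0
    for cls in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == cls and t == cls:
                tp += 1
            elif p == cls and t != cls:
                fp += 1
            elif p != cls and t == cls:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(micro_f1(y_true, y_pred), accuracy(y_true, y_pred))
```

In practice this means that tuning for the competition metric is equivalent to tuning for accuracy on the binary labels.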

****

### Submission

For each filename in the test set, you must predict whether the file is a fake video (label 1) or a real video (label 0). The submission file must contain a header and have the following format:

```
filename,label
004582.mp4,1
003603.mp4,0
```
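
As a minimal sketch of producing a file in this format, the snippet below writes the header and one row per video using only the standard library. The helper name `write_submission` is hypothetical, and the two rows are simply the sample rows shown above.

```python
import csv
import io


def write_submission(rows, stream):
    """Write (filename, label) pairs as CSV with the required header."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["filename", "label"])
    for filename, label in rows:
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label!r}")
        writer.writerow([filename, int(label)])


# An in-memory buffer stands in for the real submission file.
buffer = io.StringIO()
write_submission([("004582.mp4", 1), ("003603.mp4", 0)], buffer)
print(buffer.getvalue())
```

The notebook itself builds the same structure with pandas and `DataFrame.to_csv(..., index=False)`.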

## Install external modules and load our data

In [3]:
!pip install -qq av

In [4]:
!pip install -qq torchsummary
!pip install -qq linformer
!pip install -qq vit_pytorch

In [5]:
!pip install facenet-pytorch > /dev/null 2>&1
!apt install zip > /dev/null 2>&1

In [6]:
# from google.colab import drive
# drive.mount('/content/drive')

In [94]:
!nvidia-smi

In [8]:
# !rm -rf reface-fake-detection tmp

In [9]:
# !cp -R /content/drive/MyDrive/dl-creator-school/reface-fake-detection reface-fake-detection

In [10]:
# !rsync -r --info=progress2 /content/drive/MyDrive/dl-creator-school/reface-fake-detection reface-fake-detection

## Importing modules

In [11]:
import os
import glob
import json
import cv2
import multiprocessing as mp

import pandas as pd
import numpy as np

import torch
import torch.nn.functional as F
import torchvision

from torch import nn, optim
from torch.utils.data import sampler, DataLoader, Dataset
from torch.optim.lr_scheduler import MultiStepLR, CosineAnnealingLR, ReduceLROnPlateau, StepLR
from torch.utils import data
from torchvision import transforms, models
from torchvision.models import resnet101
from torchsummary import summary

from albumentations import Normalize, Compose, Resize, CenterCrop, HorizontalFlip, Rotate, VerticalFlip, RandomCrop, Downscale, RandomBrightnessContrast, GaussianBlur, HueSaturationValue
from albumentations.pytorch import ToTensorV2

from facenet_pytorch import MTCNN, InceptionResnetV1, fixed_image_standardization, training

from linformer import Linformer
from vit_pytorch.efficient import ViT

from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

from PIL import Image
from tqdm.notebook import tqdm

from typing import List, Dict, Tuple, Union, Optional
from pathlib import Path

In [12]:
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
plt.rcParams['figure.dpi'] = 150

## Settings

In [13]:
if torch.cuda.is_available():
    device = 'cuda:0'
#     torch.set_default_tensor_type('torch.cuda.FloatTensor')
#     torch.multiprocessing.set_start_method('spawn')
else:
    device = 'cpu'
print(f'Running on device: {device}')

In [14]:
# !ls ../input/reface-fake-det-faces

In [15]:
PATH2PROJECT = Path('../input')
# PATH2DRIVE = Path('/content/drive/MyDrive/dl-creator-school/')

PATH2DATA = PATH2PROJECT / 'reface-fake-det-faces'
PATH2TRAIN = PATH2DATA / 'train'
PATH2TEST = PATH2DATA / 'test'
PATH2SUBMISSIONS = Path('../working') / 'submissions'
PATH2CHECKOUTS = Path('../working') / 'checkouts'

In [16]:
# Create the output directories if they do not exist yet.
PATH2SUBMISSIONS.mkdir(parents=True, exist_ok=True)
PATH2CHECKOUTS.mkdir(parents=True, exist_ok=True)

In [17]:
SEED = 42
VAL_SIZE = 0.2

In [18]:
N_FACES = 4

BATCH_SIZE = 32
NUM_WORKERS = mp.cpu_count()

WARM_UP_EPOCHS = 5
WARM_UP_LR = 3e-3
FINE_TUNE_EPOCHS = 20
FINE_TUNE_LR = 5e-4

H, W = 112, 112 #224, 224
DELTA = 10
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

THRESHOLD = 0.5
EPSILON = 1e-7

## Training metadata

In [19]:
meta_df = pd.read_csv(PATH2PROJECT / 'trainreface' / 'train.csv')
meta_df.shape

In [20]:
meta_df.label.value_counts(normalize=True)

In [21]:
meta_df['path'] = meta_df['filename'].apply(lambda x: str(PATH2TRAIN / x.split('.')[0]))

In [22]:
meta_df.sample(n=5, random_state=SEED)

## Clean data

### Remove corrupt videos and videos in which no faces were detected

In [23]:
meta_df = meta_df[meta_df['path'].map(lambda x: os.path.exists(x))]
meta_df.shape

### Remove videos that do not have enough faces

In [24]:
# try:
#     valid_meta_df = pd.read_csv(PATH2PROJECT / 'trainreface' / 'valid_meta_df.csv')
# except:
# Keep only the videos with at least N_FACES extracted face crops.
r = []
for row_idx in tqdm(meta_df.index):
    row = meta_df.loc[row_idx]
    img_dir = row['path']
    face_paths = glob.glob(f'{img_dir}/*.png')

    if len(face_paths) >= N_FACES: # Satisfy the minimum requirement for the number of faces
        r.append(row)

# DataFrame.append was removed in pandas 2.0, so build the frame from the collected rows.
valid_meta_df = pd.DataFrame(r, columns=['filename', 'label', 'path']).reset_index(drop=True)
# valid_meta_df.to_csv(PATH2PROJECT / 'trainreface' / 'valid_meta_df.csv', index=False)
valid_meta_df.shape

In [25]:
valid_meta_df.head()

In [26]:
# valid_meta_df.path = valid_meta_df.path.apply(lambda x: '../input/reface-fake-det-faces/'+x.split('reface-fake-detection/')[-1])

In [27]:
folders = os.listdir(PATH2TEST)
X_test = pd.DataFrame({'path': [str(PATH2TEST/folder) for folder in folders], 'filename': folders})
len(X_test)

In [28]:
submission = pd.read_csv(PATH2PROJECT / 'trainreface' / 'sample_submission.csv')
submission.shape

In [29]:
submission['path'] = submission['filename'].apply(lambda x: str(PATH2TEST/x.split('.')[0]))

## Stratified split into training and validation sets

In [30]:
X_train, X_val, y_train, y_val = train_test_split(
    valid_meta_df['path'].to_numpy(),
    valid_meta_df['label'].to_numpy(),
    test_size=VAL_SIZE,
    random_state=SEED, 
    stratify=valid_meta_df['label']
)

In [31]:
np.mean(y_train), np.mean(y_val)

In [32]:
assert not set(X_train.tolist()) & set(X_val.tolist()), 'intersection is not empty'
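
To illustrate what `stratify` buys here, the following pure-Python sketch splits each class separately so that both parts keep roughly the same share of fake videos. It is an illustration of the idea, not scikit-learn's algorithm; `stratified_split` is a hypothetical helper and the 30/70 label mix is made-up toy data.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_size, seed):
    """Return (train_idx, val_idx) with roughly the same class ratio in both parts."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction of every class for validation.
        n_val = round(len(indices) * val_size)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


labels = [1] * 30 + [0] * 70  # toy data: 30% fake, 70% real
train_idx, val_idx = stratified_split(labels, val_size=0.2, seed=42)
train_ratio = sum(labels[i] for i in train_idx) / len(train_idx)
val_ratio = sum(labels[i] for i in val_idx) / len(val_idx)
print(train_ratio, val_ratio)
```

Comparing the two ratios is the same sanity check that the `np.mean(y_train), np.mean(y_val)` cell performs on the real split.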

## Helper functions

In [33]:
def calculate_f1(preds, labels):
    '''
    Parameters:
        preds: The predictions.
        labels: The labels.

    Returns:
        f1 score
    '''
    return f1_score(labels, (np.array(preds) >= THRESHOLD).astype(np.uint8), average='micro')


def train_the_model(
    model,
    criterion,
    optimizer,
    scheduler,
    epochs,
    train_dataloader,
    val_dataloader,
    best_val_loss=1e7,
):
    '''
    Parameters:
        model: The model to be trained.
        criterion: Loss function.
        optimizer: The optimizer.
        scheduler: Learning-rate scheduler; stepped once per epoch with the validation loss.
        epochs: The number of epochs.
        train_dataloader: The dataloader used to generate training samples.
        val_dataloader: The dataloader used to generate validation samples.
        best_val_loss: The initial value of the best val loss (default: 1e7).

    Returns:
        losses: Mean training loss per epoch.
        val_losses: Mean validation loss per epoch.
        f1_scores: Training F1 score per epoch.
        val_f1_scores: Validation F1 score per epoch.
        best_val_loss: New value of the best val loss.
        best_model_state_dict: The state_dict of the best model (None if no epoch improved on best_val_loss).
        best_optimizer_state_dict: The state_dict of the optimizer that corresponds to the best model.
    '''

    losses = np.zeros(epochs)
    val_losses = np.zeros(epochs)
    f1_scores = np.zeros(epochs)
    val_f1_scores = np.zeros(epochs)
    best_model_state_dict = None
    best_optimizer_state_dict = None

    for i in tqdm(range(epochs)):
        batch_losses = []
        train_pbar = tqdm(train_dataloader)
        train_pbar.desc = f'Epoch {i+1}'
        model.train()

        all_labels = []
        all_preds = []

        for i_batch, sample_batched in enumerate(train_pbar):
            # Zero gradients
            optimizer.zero_grad()
            
            # Make prediction.
#             y_pred = classifier(sample_batched['faces'])

#             all_labels.extend(sample_batched['label'].squeeze(dim=-1).detach().cpu().numpy().tolist())
#             all_preds.extend(y_pred.squeeze(dim=-1).detach().cpu().numpy().tolist())

#             # Compute loss.
#             loss = criterion(y_pred, torch.tensor(sample_batched['label'],dtype=torch.float))
#             batch_losses.append(loss.item())
            y_pred = model(sample_batched['faces'].to(device))

            all_labels.extend(sample_batched['label'].numpy().tolist())
            all_preds.extend(y_pred.squeeze(dim=-1).detach().cpu().numpy().tolist())

            # Compute loss.
            loss = criterion(y_pred, sample_batched['label'].to(device))
            batch_losses.append(loss.item())

            # Perform a backward pass, and update the weights.
            loss.backward()
            optimizer.step()

            # Display some information in progress-bar.
            train_pbar.set_postfix({
                'loss': batch_losses[-1]
            })

        # Compute scores.
        f1_scores[i] = calculate_f1(all_preds, all_labels)

        # Compute batch loss (average).
        losses[i] = np.array(batch_losses).mean()


        # Compute val loss
        val_batch_losses = []
        val_pbar = tqdm(val_dataloader)
        val_pbar.desc = 'Validating'
        model.eval()

        all_labels = []
        all_preds = []

        with torch.no_grad():  # gradients are not needed during validation
            for i_batch, sample_batched in enumerate(val_pbar):
                # Make prediction.
                y_pred = model(sample_batched['faces'].to(device))

                all_labels.extend(sample_batched['label'].numpy().tolist())
                all_preds.extend(y_pred.squeeze(dim=-1).detach().cpu().numpy().tolist())

                # Compute val loss.
                val_loss = criterion(y_pred, sample_batched['label'].to(device))
                val_batch_losses.append(val_loss.item())

                # Display some information in progress-bar.
                val_pbar.set_postfix({
                    'val_loss': val_batch_losses[-1]
                })

        # Compute val scores.
        val_f1_scores[i] = calculate_f1(all_preds, all_labels)

        val_losses[i] = np.array(val_batch_losses).mean()
        print(f'loss: {losses[i]} | val loss: {val_losses[i]} | f1: {f1_scores[i]} | val f1: {val_f1_scores[i]}')
        
        # step of lr scheduler
        scheduler.step(val_losses[i])
        
        # Update the best values
        if val_losses[i] < best_val_loss:
            best_val_loss = val_losses[i]
            
            print('Found a better checkpoint!')
            # state_dict() returns references to the live tensors, so clone them
            # to keep a snapshot that later epochs cannot overwrite.
            best_model_state_dict = {k: v.detach().clone() for k, v in model.state_dict().items()}
            best_optimizer_state_dict = optimizer.state_dict()
            state = {
                'state_dict': best_model_state_dict,
                'warmup_optimizer': best_optimizer_state_dict,
                'best_val_loss': best_val_loss,
            }
            torch.save(state, 'best-checkout.pth')
            
    return losses, val_losses, f1_scores, val_f1_scores, best_val_loss, best_model_state_dict, best_optimizer_state_dict


def visualize_results(
    losses,
    val_losses,
    f1_scores,
    val_f1_scores
):
    '''
    Parameters:
        losses: A list of losses.
        val_losses: A list of val losses.
        f1_scores: A list of f1 scores.
        val_f1_scores: A list of val f1 scores.
    '''

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_axes([0, 0, 1, 1])

    ax.plot(np.arange(1, len(losses) + 1), losses)
    ax.plot(np.arange(1, len(val_losses) + 1), val_losses)
    ax.set_xlabel('epoch', fontsize='xx-large')
    ax.set_ylabel('loss', fontsize='xx-large')
    ax.legend(
        ['loss', 'val loss'],
        loc='upper right',
        fontsize='xx-large',
        shadow=True
    )
    plt.show()

    fig = plt.figure(figsize=(16, 8))
    ax = fig.add_axes([0, 0, 1, 1])

    ax.plot(np.arange(1, len(f1_scores) + 1), f1_scores)
    ax.plot(np.arange(1, len(val_f1_scores) + 1), val_f1_scores)
    ax.set_xlabel('epoch', fontsize='xx-large')
    ax.set_ylabel('f1 score', fontsize='xx-large')
    ax.legend(
        ['f1', 'val f1'],
        loc='upper left',
        fontsize='xx-large',
        shadow=True
    )
    plt.show()

## Dataset and Dataloaders

In [59]:
class FaceDataset(Dataset):
    def __init__(self, img_dirs, labels, n_faces=1, preprocess=None):
        self.img_dirs = img_dirs
        self.labels = labels
        self.n_faces = n_faces
        self.preprocess = preprocess

    def __len__(self):
        return len(self.img_dirs)
    
    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()

        img_dir = self.img_dirs[idx]
        label = self.labels[idx]
        face_paths = glob.glob(f'{img_dir}/*.png')

        if len(face_paths) >= self.n_faces:
            sample = sorted(np.random.choice(face_paths, self.n_faces, replace=False))
        else:
            sample = sorted(np.random.choice(face_paths, self.n_faces, replace=True))
            
        faces = []
        for face_path in sample:
            face = cv2.imread(face_path, 1)
            face = cv2.cvtColor(face, cv2.COLOR_BGR2RGB)
            faces.append(face)
            
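        if self.preprocess is not None:
            # 'image' plus the additional targets receive the same augmentation parameters.
            d = {'image': faces[0]}
            d.update({f'image{i-1}': faces[i] for i in range(1, self.n_faces)})
            out = self.preprocess(**d)
            # Read the results back by key so the frames stay in their sorted order.
            faces = [out['image']] + [out[f'image{i-1}'] for i in range(1, self.n_faces)]

        # Stack to (T, C, H, W), then permute to (C, T, H, W), the layout expected by the 3D encoder.
        return {'faces': torch.stack(faces).permute(1, 0, 2, 3), 'label': torch.tensor([label], dtype=torch.float)}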
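
The frame-sampling rule inside `__getitem__` can be looked at in isolation. The sketch below mirrors it with the standard library instead of NumPy: frames are drawn without replacement when a folder holds enough crops and with replacement otherwise. `sample_frames` is a hypothetical helper, and the assumption that sorted file names follow frame order (for example, zero-padded indices) is mine, not something the dataset guarantees.

```python
import random


def sample_frames(face_paths, n_faces, rng=random):
    """Pick n_faces paths; repeat frames only if the folder has too few."""
    if not face_paths:
        raise ValueError("no face crops found")
    if len(face_paths) >= n_faces:
        chosen = rng.sample(face_paths, n_faces)     # without replacement
    else:
        chosen = rng.choices(face_paths, k=n_faces)  # with replacement
    # Assumption: lexicographic order of the file names matches frame order.
    return sorted(chosen)


paths = [f"frame_{i:03d}.png" for i in range(10)]
picked = sample_frames(paths, 4, random.Random(0))
padded = sample_frames(paths[:2], 4, random.Random(0))
print(picked)
print(padded)
```

Because sampling is random on every call, each epoch sees a different subset of frames per video, which acts as a mild form of augmentation during training but also makes validation scores slightly noisy.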

In [58]:
torch.tensor([1], dtype=torch.float)

In [60]:
train_transforms = Compose([
    Resize(H+DELTA, W+DELTA),
    Downscale(scale_min=0.5, scale_max=0.9, p=0.3),
    RandomCrop(H, W),
    HorizontalFlip(p=0.5),
    RandomBrightnessContrast(brightness_limit=0, contrast_limit=0.2, p=0.3),
    HueSaturationValue(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, p=0.3),
    GaussianBlur(blur_limit=(3, 7), p=0.3),
    Normalize(mean=MEAN, std=STD, p=1),
    ToTensorV2()
], additional_targets={f'image{i}': 'image' for i in range(0, N_FACES-1)})

val_transforms = Compose([
    Resize(H+DELTA, W+DELTA),
    CenterCrop(H, W),
    Normalize(mean=MEAN, std=STD, p=1),
    ToTensorV2()
], additional_targets={f'image{i}': 'image' for i in range(0, N_FACES-1)})

test_transforms = Compose([
    Resize(H+DELTA, W+DELTA),
    CenterCrop(H, W),
    Normalize(mean=MEAN, std=STD, p=1),
    ToTensorV2()
], additional_targets={f'image{i}': 'image' for i in range(0, N_FACES-1)})

In [73]:
train_dataset = FaceDataset(
    img_dirs=X_train,
    labels=y_train,
    n_faces=N_FACES,
    preprocess=train_transforms
)
val_dataset = FaceDataset(
    img_dirs=X_val,
    labels=y_val,
    n_faces=N_FACES,
    preprocess=val_transforms
)
test_dataset = FaceDataset(
    img_dirs=X_test['path'].values,
    labels=[0]*len(X_test['path']),
    n_faces=N_FACES,
    preprocess=test_transforms
)

train_dataloader = DataLoader(
    train_dataset,
    batch_size=BATCH_SIZE,
    shuffle=True,
    num_workers=NUM_WORKERS,
#     generator=torch.Generator(device='cuda'),
#     num_workers=0,
#     pin_memory=False,
)
val_dataloader = DataLoader(
    val_dataset,
    batch_size=BATCH_SIZE,
    shuffle=False,
    num_workers=NUM_WORKERS,
#     generator=torch.Generator(device='cuda'),
#     num_workers=0,
#     pin_memory=False,
)
test_dataloader = DataLoader(
    test_dataset,
    batch_size=BATCH_SIZE,
    shuffle=False,
    num_workers=NUM_WORKERS,
#     generator=torch.Generator(device='cuda'),
#     num_workers=0,
#     pin_memory=False,
)

In [74]:
plt.figure(figsize=(10,10))
for ii,img in enumerate(next(iter(train_dataloader))['faces'][0].permute(1, 0, 2, 3)):
    plt.subplot(2,2,ii+1)
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = img.numpy().transpose((1, 2, 0))
    inp = std * inp + mean
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)

In [63]:
next(iter(train_dataloader))['faces'].shape

In [64]:
# next(iter(train_dataloader))['faces'].permute(0, 2, 1, 3, 4).shape

## Models

In [65]:
class DeepfakeClassifierResnet(nn.Module):
    def __init__(self, encoder, in_channels=3, out_channels=64, kernel_size=7, stride=2, padding=3, bias=False, linear_size=2048, num_classes=1):
        super(DeepfakeClassifierResnet, self).__init__()
        self.encoder = encoder
        
        # Modify input layer.
        self.encoder.conv1 = nn.Conv2d(
            in_channels,
            out_channels,
            kernel_size=kernel_size,
            stride=stride,
            padding=padding,
            bias=bias,
        )
        
        # Modify output layer.
#         self.encoder.fc = nn.Sequential(
#             nn.BatchNorm1d(linear_size), 
#             nn.Dropout(p=0.25), # 
#             nn.Linear(in_features=linear_size, out_features=2048),
#             nn.ReLU(),
#             nn.BatchNorm1d(2048, eps=1e-05, momentum=0.1),
#             nn.Dropout(p=0.5),
#             nn.Linear(in_features=2048, out_features=num_classes)
#         )
        self.encoder.fc = nn.Linear(linear_size * 1, num_classes)

    def forward(self, x):
        return torch.sigmoid(self.encoder(x))
    
    def freeze_all_layers(self):
        for param in self.encoder.parameters():
            param.requires_grad = False

    def freeze_middle_layers(self):
        self.freeze_all_layers()
        
        for param in self.encoder.conv1.parameters():
            param.requires_grad = True
            
        for param in self.encoder.fc.parameters():
            param.requires_grad = True

    def unfreeze_all_layers(self):
        for param in self.encoder.parameters():
            param.requires_grad = True

In [66]:
class DeepfakeClassifierInception(nn.Module):
    def __init__(self, encoder, in_channels=3, out_channels=32, kernel_size=3, stride=2, padding=3, bias=False, linear_size=512, num_classes=1):
        super(DeepfakeClassifierInception, self).__init__()
        self.encoder = encoder
        
        # Modify input layer.
        self.encoder.conv2d_1a.conv = nn.Conv2d(
            in_channels,
            out_channels,
            kernel_size=kernel_size,
            stride=stride,
            padding=padding,
            bias=bias,
        )
        
        # Modify output layer.
#         self.encoder.logits = nn.Sequential(
#             nn.BatchNorm1d(linear_size), 
#             nn.Dropout(p=0.25), # 
#             nn.Linear(in_features=linear_size, out_features=2048),
#             nn.ReLU(),
#             nn.BatchNorm1d(2048, eps=1e-05, momentum=0.1),
#             nn.Dropout(p=0.5),
#             nn.Linear(in_features=2048, out_features=num_classes)
#         )
        self.encoder.logits = nn.Linear(linear_size * 1, num_classes)

    def forward(self, x):
        return torch.sigmoid(self.encoder(x))
    
    def freeze_all_layers(self):
        for param in self.encoder.parameters():
            param.requires_grad = False

    def freeze_middle_layers(self):
        self.freeze_all_layers()
        
        for param in self.encoder.conv2d_1a.conv.parameters():
            param.requires_grad = True
            
        for param in self.encoder.logits.parameters():
            param.requires_grad = True

    def unfreeze_all_layers(self):
        for param in self.encoder.parameters():
            param.requires_grad = True

In [67]:
class DeepfakeClassifierR3D18(nn.Module):
    def __init__(self, encoder, linear_size=512, num_classes=1):
        super(DeepfakeClassifierR3D18, self).__init__()
        self.encoder = encoder
        
        # Modify output layer.
        num_features = self.encoder.fc.in_features
        self.encoder.fc = nn.Linear(num_features, num_classes)
#         self.encoder.fc = nn.Sequential(
#             nn.BatchNorm1d(linear_size), 
#             nn.Dropout(p=0.25), # 
#             nn.Linear(in_features=linear_size, out_features=2048),
#             nn.ReLU(),
#             nn.BatchNorm1d(2048, eps=1e-05, momentum=0.1),
#             nn.Dropout(p=0.5),
#             nn.Linear(in_features=2048, out_features=num_classes)
#         )

#         self.encoder.logits = nn.Linear(linear_size * 1, num_classes)

    def forward(self, x):
        return torch.sigmoid(self.encoder(x))
    
    def freeze_all_layers(self):
        for param in self.encoder.parameters():
            param.requires_grad = False

    def freeze_middle_layers(self):
        self.freeze_all_layers()
            
        for param in self.encoder.fc.parameters():
            param.requires_grad = True

    def unfreeze_all_layers(self):
        for param in self.encoder.parameters():
            param.requires_grad = True

In [68]:
class FocalLoss(nn.Module):
    def __init__(self, gamma=2, sample_weight=None):
        super().__init__()
        self.gamma = gamma
        self.sample_weight = sample_weight

    def forward(self, logit, target):
        target = target.float()
        max_val = (-logit).clamp(min=0)
        loss = logit - logit * target + max_val + \
               ((-max_val).exp() + (-logit - max_val).exp()).log()

        invprobs = F.logsigmoid(-logit * (target * 2.0 - 1.0))
        loss = (invprobs * self.gamma).exp() * loss
        if len(loss.size())==2:
            loss = loss.sum(dim=1)
        if self.sample_weight is not None:
            loss = loss * self.sample_weight
        return loss.mean()
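
The `FocalLoss` class above works on raw logits: it computes the numerically stable binary cross-entropy and scales it by `(1 - p_t) ** gamma`, where `p_t` is the probability assigned to the true class. Note that the classifiers in this notebook apply `torch.sigmoid` in `forward`, so they would need to return logits before this loss could be swapped in for `nn.BCELoss`. The scalar sketch below restates the formula in pure Python for a single example; `bce_with_logit` and `focal_loss` are hypothetical helper names.

```python
import math


def bce_with_logit(logit, target):
    """Binary cross-entropy for one example, computed from the raw logit."""
    # max(x, 0) - x * t + log(1 + exp(-|x|)) is the numerically stable form.
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def focal_loss(logit, target, gamma=2.0):
    """Focal loss for one example: (1 - p_t) ** gamma * BCE."""
    p = 1.0 / (1.0 + math.exp(-logit))
    p_t = p if target == 1 else 1.0 - p  # probability of the true class
    return (1.0 - p_t) ** gamma * bce_with_logit(logit, target)


easy = focal_loss(4.0, 1)    # confident and correct
hard = focal_loss(-4.0, 1)   # confident and wrong
print(easy, hard)
```

With `gamma=0` the modulating factor is 1 and the loss reduces to ordinary cross-entropy; larger values shrink the contribution of examples the model already classifies well.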

In [69]:
# efficient_transformer = Linformer(
#     dim=128,
#     seq_len=49+1,  # 7x7 patches + 1 cls-token
#     depth=12,
#     heads=8,
#     k=64
# )

# classifier = ViT(
#     dim=128,
#     image_size=224,
#     patch_size=32,
#     num_classes=1,
#     transformer=efficient_transformer,
#     channels=3,
# ).to(device)
# classifier.train()

In [45]:
encoder_r3d_18 = models.video.r3d_18(
    pretrained=True,
)

classifier = DeepfakeClassifierR3D18(encoder=encoder_r3d_18, linear_size=512, num_classes=1)

classifier.to(device);
classifier.train();

In [46]:
# x = torch.zeros(1, 3, N_FACES, H, W)
# y= classifier(x)
# print(y.shape)

In [47]:
# encoder_facenet = InceptionResnetV1(
#     classify=True,
#     pretrained='casia-webface',
#     num_classes=1
# )

# classifier = DeepfakeClassifierInception(encoder=encoder_facenet, in_channels=3*N_FACES, num_classes=1)

# classifier.to(device);
# classifier.train();

In [48]:
# encoder_resnet = resnet101(pretrained=True)

# classifier = DeepfakeClassifierResnet(encoder=encoder_resnet, in_channels=3*N_FACES, num_classes=1)

# classifier.to(device);
# classifier.train();

In [49]:
criterion = nn.BCELoss()#FocalLoss()

In [50]:
losses = np.zeros(WARM_UP_EPOCHS + FINE_TUNE_EPOCHS)
val_losses = np.zeros(WARM_UP_EPOCHS + FINE_TUNE_EPOCHS)
f1_scores = np.zeros(WARM_UP_EPOCHS + FINE_TUNE_EPOCHS)
val_f1_scores = np.zeros(WARM_UP_EPOCHS + FINE_TUNE_EPOCHS)

best_val_loss = 1e7

## Define training hyperparameters

In [51]:
classifier.freeze_middle_layers()
warmup_optimizer = optim.Adam(filter(lambda p: p.requires_grad, classifier.parameters()), lr=WARM_UP_LR)
warmup_scheduler = ReduceLROnPlateau(warmup_optimizer, mode='min', factor=0.1, patience=5, threshold=0.0001, threshold_mode='abs', verbose=True)
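
The plateau rule configured above can be sketched in a few lines. This is a simplified stand-in for `ReduceLROnPlateau` in `mode='min'` with `threshold_mode='abs'`; it ignores cooldown, `min_lr`, and `eps`, and the class name `PlateauScheduler` is hypothetical. Under this rule the learning rate drops only after more than `patience` consecutive epochs without improvement, so with `patience=5` a five-epoch warm-up is too short to trigger a reduction.

```python
class PlateauScheduler:
    """Simplified reduce-on-plateau rule (no cooldown, min_lr, or eps handling)."""

    def __init__(self, lr, factor=0.1, patience=5, threshold=1e-4):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.threshold = threshold
        self.best = float("inf")
        self.num_bad_epochs = 0

    def step(self, val_loss):
        # 'abs' threshold mode: an epoch counts as an improvement only if the
        # loss drops below best - threshold.
        if val_loss < self.best - self.threshold:
            self.best = val_loss
            self.num_bad_epochs = 0
        else:
            self.num_bad_epochs += 1
        # Reduce only after more than `patience` consecutive bad epochs.
        if self.num_bad_epochs > self.patience:
            self.lr *= self.factor
            self.num_bad_epochs = 0
        return self.lr


scheduler_sketch = PlateauScheduler(lr=3e-3)
# One improving epoch followed by seven epochs without improvement.
history = [scheduler_sketch.step(loss) for loss in [0.60] + [0.65] * 7]
print(history)
```

If the scheduler is meant to act during warm-up, a smaller `patience` would be needed; otherwise it mainly matters for the longer fine-tuning phase.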

In [100]:
summary(classifier, input_size=(3, N_FACES, H, W))

In [52]:
# WARM_UP_LR

## Training

In [75]:
losses[:WARM_UP_EPOCHS], val_losses[:WARM_UP_EPOCHS], \
f1_scores[:WARM_UP_EPOCHS], val_f1_scores[:WARM_UP_EPOCHS], \
best_val_loss, \
best_model_state_dict, best_optimizer_state_dict \
= train_the_model(
    model=classifier,
    criterion=criterion,
    optimizer=warmup_optimizer,
    scheduler=warmup_scheduler,
    epochs=WARM_UP_EPOCHS,
    train_dataloader=train_dataloader,
    val_dataloader=val_dataloader,
    best_val_loss=best_val_loss,
)

# Save the best checkpoint.
if best_model_state_dict is not None:
    state = {
        'state_dict': best_model_state_dict,
        'warmup_optimizer': best_optimizer_state_dict,
        'best_val_loss': best_val_loss,
    }
    torch.save(state, 'best-checkout-warmup.pth')

In [None]:
# visualize_results(
#     losses=losses[:WARM_UP_EPOCHS],
#     val_losses=val_losses[:WARM_UP_EPOCHS],
#     f1_scores=f1_scores[:WARM_UP_EPOCHS],
#     val_f1_scores=val_f1_scores[:WARM_UP_EPOCHS]
# )

In [77]:
# state = torch.load(PATH2PROJECT / 'trainreface' / 'temp-best-checkout-resnet101.pth', map_location=lambda storage, loc: storage)
# state = torch.load('best-checkout-finetune.pth', map_location=lambda storage, loc: storage)
state = torch.load('best-checkout.pth', map_location=lambda storage, loc: storage)
best_val_loss = state['best_val_loss']
classifier.load_state_dict(state['state_dict'])
# classifier.to(device)

In [78]:
classifier.unfreeze_all_layers()
finetune_optimizer = optim.Adam(filter(lambda p: p.requires_grad, classifier.parameters()), lr=FINE_TUNE_LR)
finetune_scheduler = ReduceLROnPlateau(finetune_optimizer, mode='min', factor=0.1, patience=5, threshold=0.0001, threshold_mode='abs', verbose=True)

In [None]:
# FINE_TUNE_LR

In [84]:
# losses[WARM_UP_EPOCHS:WARM_UP_EPOCHS+FINE_TUNE_EPOCHS], val_losses[WARM_UP_EPOCHS:WARM_UP_EPOCHS+FINE_TUNE_EPOCHS], \
# f1_scores[WARM_UP_EPOCHS:WARM_UP_EPOCHS+FINE_TUNE_EPOCHS], val_f1_scores[WARM_UP_EPOCHS:WARM_UP_EPOCHS+FINE_TUNE_EPOCHS], \
# best_val_loss, best_val_logloss, \
# best_model_state_dict, best_optimizer_state_dict \
_, _, \
_, _, \
best_val_loss, \
best_model_state_dict, best_optimizer_state_dict \
= train_the_model(
    model=classifier,
    criterion=criterion,
    optimizer=finetune_optimizer,
    scheduler=finetune_scheduler,
    epochs=FINE_TUNE_EPOCHS,
    train_dataloader=train_dataloader,
    val_dataloader=val_dataloader,
    best_val_loss=best_val_loss,
)

# Save the best checkpoint.
if best_model_state_dict is not None:
    state = {
        'state_dict': best_model_state_dict,
        'finetune_optimizer': best_optimizer_state_dict,
        'best_val_loss': best_val_loss,
    }

    torch.save(state, 'best-checkout-finetune.pth')

In [85]:
# state = torch.load(PATH2PROJECT / 'trainreface' / 'temp-best-checkout-resnet101.pth', map_location=lambda storage, loc: storage)
# state = torch.load('best-checkout-finetune.pth', map_location=lambda storage, loc: storage)
state = torch.load('best-checkout.pth', map_location=lambda storage, loc: storage)
best_val_loss = state['best_val_loss']
classifier.load_state_dict(state['state_dict'])
# classifier.to(device)

In [81]:
from IPython.display import FileLink
FileLink('best-checkout.pth')

In [None]:
# visualize_results(
#     losses=losses,
#     val_losses=val_losses,
#     f1_scores=f1_scores,
#     val_f1_scores=val_f1_scores
# )

## Inference and submission

In [86]:
def inference(classifier, test_dataloader):
    classifier.eval()
    
    all_preds = []
    all_labels = []
    with torch.no_grad():
        for _, batch in enumerate(tqdm(test_dataloader, total=len(test_dataloader))):
            # Make prediction.
            y_pred = classifier(batch['faces'].to(device))

            all_preds.extend(y_pred.squeeze(dim=-1).detach().cpu().numpy().tolist())
            all_labels.extend(batch['label'].squeeze(dim=-1).numpy().tolist())
    return all_preds, all_labels

In [87]:
test_prediction, _ = inference(classifier, test_dataloader)
len(test_prediction)

In [88]:
val_prediction, val_labels = inference(classifier, val_dataloader)
len(val_prediction)

In [89]:
X_test['score'] = test_prediction

In [90]:
thresholds = np.linspace(0, 1, len(np.unique(val_prediction)))
f1_scores = [f1_score(val_labels, (np.array(val_prediction) > t).astype(np.uint8), average='micro') for t in tqdm(thresholds)]
t_best = thresholds[np.argmax(f1_scores)]
print('Best threshold: ', t_best)
print('Best F1-Score: ', np.max(f1_scores))
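
The threshold search above can be reduced to the following pure-Python sketch. It scores each candidate threshold by accuracy, which equals micro-averaged F1 for single-label predictions because every error counts once as a false positive and once as a false negative. `best_threshold` is a hypothetical helper and the scores and labels are made-up toy values.

```python
def best_threshold(scores, labels, thresholds):
    """Return (threshold, accuracy) maximizing the accuracy of `score > threshold`."""
    best_t, best_acc = None, -1.0
    for t in thresholds:
        preds = [int(s > t) for s in scores]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best_acc:  # strict '>' keeps the lowest of several tied thresholds
            best_t, best_acc = t, acc
    return best_t, best_acc


toy_scores = [0.10, 0.35, 0.40, 0.62, 0.80, 0.95]
toy_labels = [0, 0, 1, 0, 1, 1]
toy_thresholds = [i / 20 for i in range(21)]
toy_t, toy_acc = best_threshold(toy_scores, toy_labels, toy_thresholds)
print(toy_t, toy_acc)
```

Like `np.argmax` in the cell above, the strict comparison returns the first threshold that reaches the maximum when several thresholds tie.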

In [91]:
from IPython.display import FileLink

In [92]:
X_test['label'] = (X_test['score'] > t_best).astype(int)
submission_result = submission[['filename', 'path']].merge(X_test[['path', 'label']], on='path', how='left').fillna(0)[['filename', 'label']]
submission_result['label'] = submission_result['label'].astype(int)

assert submission_result.shape[0] == submission.shape[0]

submission_result.to_csv('submission_result_th.csv', index=False)

FileLink('submission_result_th.csv')

In [93]:
X_test['label'] = (X_test['score'] > 0.5).astype(int)
submission_result = submission[['filename', 'path']].merge(X_test[['path', 'label']], on='path', how='left').fillna(0)[['filename', 'label']]
submission_result['label'] = submission_result['label'].astype(int)

assert submission_result.shape[0] == submission.shape[0]

submission_result.to_csv('submission_result_05.csv', index=False)

FileLink('submission_result_05.csv')

## Combine

In [None]:
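# The 2D classifiers expect the N_FACES frames stacked along the channel axis (3 * N_FACES channels).
encoder_facenet = InceptionResnetV1(classify=True, pretrained='casia-webface', num_classes=1)
classifier1 = DeepfakeClassifierInception(encoder=encoder_facenet, in_channels=3*N_FACES, num_classes=1)
classifier1.to(device);

encoder_resnet = resnet101(pretrained=True)
classifier2 = DeepfakeClassifierResnet(encoder=encoder_resnet, in_channels=3*N_FACES, num_classes=1)
classifier2.to(device);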

In [None]:
state = torch.load(PATH2PROJECT / 'trainreface' / 'best-checkout-inceptionv1.pth', map_location=lambda storage, loc: storage)
best_val_loss = state['best_val_loss']
classifier1.load_state_dict(state['state_dict'])

In [None]:
state = torch.load(PATH2PROJECT / 'trainreface' / 'best-checkout-resnet101.pth', map_location=lambda storage, loc: storage)
best_val_loss = state['best_val_loss']
classifier2.load_state_dict(state['state_dict'])

In [None]:
test_prediction1, _ = inference(classifier1, test_dataloader)
len(test_prediction1)

In [None]:
test_prediction2, _ = inference(classifier2, test_dataloader)
len(test_prediction2)

In [None]:
val_prediction1, val_labels1 = inference(classifier1, val_dataloader)
len(val_prediction1)

In [None]:
val_prediction2, val_labels2 = inference(classifier2, val_dataloader)
len(val_prediction2)

In [None]:
assert val_labels1 == val_labels2

In [None]:
alphas = np.linspace(0, 1, 100)
thresholds = np.linspace(0, 1, len(np.unique(val_prediction1)))

# Start below any reachable F1 score so that the first candidate is always accepted.
global_best_t = 0.5
global_best_f_score = -1.0
global_best_a = 0.5
for a in tqdm(alphas):
    blended = a*np.array(val_prediction1) + (1-a)*np.array(val_prediction2)
    f1_scores = [f1_score(val_labels1, (blended > t).astype(np.uint8), average='micro') for t in thresholds]
    t_best = thresholds[np.argmax(f1_scores)]
    f_score = np.max(f1_scores)
    
    if f_score > global_best_f_score:
        global_best_t = t_best
        global_best_f_score = f_score
        global_best_a = a
print('Best alpha: ', global_best_a)
print('Best threshold: ', global_best_t)
print('Best F1-Score: ', global_best_f_score)
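
The blending search is a two-dimensional grid search over the mixing weight and the decision threshold. The pure-Python sketch below shows the pattern on made-up toy scores in which each model alone makes one ranking mistake while a blend separates the classes; `accuracy_at` and `search_blend` are hypothetical helpers. The running best starts below any reachable score so that the first candidate is always accepted.

```python
def accuracy_at(scores, labels, threshold):
    """Accuracy of the rule `score > threshold`."""
    return sum(int(s > threshold) == y for s, y in zip(scores, labels)) / len(labels)


def search_blend(scores_a, scores_b, labels, alphas, thresholds):
    """Grid-search the blend weight and threshold; returns (alpha, threshold, accuracy)."""
    best = (None, None, -1.0)  # start below any reachable score
    for alpha in alphas:
        blended = [alpha * a + (1 - alpha) * b for a, b in zip(scores_a, scores_b)]
        for t in thresholds:
            acc = accuracy_at(blended, labels, t)
            if acc > best[2]:
                best = (alpha, t, acc)
    return best


toy_labels = [0, 0, 1, 1]
toy_a = [0.2, 0.6, 0.7, 0.4]  # model A scores the 2nd sample (real) above the 4th (fake)
toy_b = [0.3, 0.1, 0.2, 0.9]  # model B scores the 1st sample (real) above the 3rd (fake)
grid_alphas = [0.0, 0.25, 0.5, 0.75, 1.0]
grid_thresholds = [i / 20 for i in range(21)]

best_alpha, best_t, best_acc = search_blend(toy_a, toy_b, toy_labels, grid_alphas, grid_thresholds)
print(best_alpha, best_t, best_acc)
```

Restricting `alphas` to `[0.0]` or `[1.0]` evaluates each model on its own, which gives a quick check of whether blending helps at all.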

In [None]:
X_test['label'] = ((global_best_a*np.array(test_prediction1) + (1-global_best_a)*np.array(test_prediction2)) > global_best_t).astype(np.uint8)
submission_result = submission[['filename', 'path']].merge(X_test[['path', 'label']], on='path', how='left').fillna(0)[['filename', 'label']]
submission_result['label'] = submission_result['label'].astype(int)

In [None]:
assert submission_result.shape[0] == submission.shape[0]

In [None]:
submission_result.to_csv('submission_result__combine_th.csv', index=False)

In [None]:
from IPython.display import FileLink
FileLink('submission_result__combine_th.csv')

In [None]:
X_test['label'] = ((global_best_a*np.array(test_prediction1) + (1-global_best_a)*np.array(test_prediction2)) > 0.5).astype(np.uint8)
submission_result = submission[['filename', 'path']].merge(X_test[['path', 'label']], on='path', how='left').fillna(0)[['filename', 'label']]
submission_result['label'] = submission_result['label'].astype(int)

submission_result.to_csv('submission_result__combine_05.csv', index=False)

FileLink('submission_result__combine_05.csv')