# Deepfake Submission

This notebook is intended to be a submission kernel for the competition. To use it, you need to add two datasets: the public dataset [facenet_pytorch](https://www.kaggle.com/timesler/facenet-pytorch-vggface2) and a dataset of your own that contains:

1. *kernel_module.py*, containing all the definitions of your own functions and classes.
2. A *.pth* file for loading your trained model into fastai's `Learner`. If the file is called *mesonet_stage1.pth*, you need to assign `'mesonet_stage1'` to `FNAME_LEARNER` in this notebook.

If your own dataset is called *realfake*, you need to assign `'realfake'` to `DIR_MYCODE` in this notebook.

In [None]:
DIR_MYCODE = 'realfake'
FNAME_LEARNER = 'mesonet_stage1'

In [None]:
! ls ../input/{DIR_MYCODE}/

In [None]:
!pip install /kaggle/input/facenet-pytorch-vggface2/facenet_pytorch-2.0.0-py3-none-any.whl
from facenet_pytorch.models.inception_resnet_v1 import get_torch_home
torch_home = get_torch_home()
# Copy model checkpoints to torch cache so they are loaded automatically by the package
!mkdir -p $torch_home/checkpoints/
!cp /kaggle/input/facenet-pytorch-vggface2/20180402-114759-vggface2-logits.pth $torch_home/checkpoints/vggface2_DG3kwML46X.pt
!cp /kaggle/input/facenet-pytorch-vggface2/20180402-114759-vggface2-features.pth $torch_home/checkpoints/vggface2_G5aNV2VSMn.pt
! cp ../input/{DIR_MYCODE}/kernel_module.py ../working/.
from kernel_module import *

### Data

In [None]:
SOURCE = Path('../input/deepfake-detection-challenge/train_sample_videos/')

In [None]:
f = get_files(SOURCE, extensions=['.json'])[0]
annots = pd.read_json(f).T
annots.reset_index(inplace=True)
annots.rename({'index':'fname'}, axis=1, inplace=True)
annots.head()
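
For reference, the competition's *metadata.json* maps each video filename to a record with `label`, `split`, and `original` fields, which is why the frame is transposed right after loading. The sketch below reproduces the same steps on a made-up two-video sample; the filenames and the `annots_demo` name are hypothetical.

```python
import io

import pandas as pd

# Hypothetical two-video sample in the same shape as metadata.json.
sample = io.StringIO('''{
  "aaa.mp4": {"label": "FAKE", "split": "train", "original": "bbb.mp4"},
  "bbb.mp4": {"label": "REAL", "split": "train", "original": null}
}''')

raw = pd.read_json(sample)   # columns are filenames, rows are the fields
annots_demo = raw.T          # transpose: one row per video
annots_demo = annots_demo.reset_index().rename({'index': 'fname'}, axis=1)
print(annots_demo)
```

After the transpose the filenames sit in the index, so `reset_index` plus `rename` turns them into an ordinary `fname` column that `from_df(..., cols='fname')` can consume.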

#### Get face detector

In [None]:
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'

In [None]:
detector = MTCNN(device=device, post_process=False)

#### Remove videos in which no faces are detected

In [None]:
fnames = [SOURCE/o for o in annots.fname]

In [None]:
hasface = get_has_face(fnames, detector)

In [None]:
annots_hasface = annots[np.array(hasface)]

#### Create `DataBunch`

In [None]:
src = (VideoFaceList
       .from_df(df=annots_hasface, path=SOURCE, cols='fname', detector=detector)
       .split_by_rand_pct())

In [None]:
bs, sz = 32, 128

In [None]:
data = (src.label_from_df('label').transform(get_transforms(), size=sz)
        .databunch(bs=bs, num_workers=0).normalize(imagenet_stats))

### Model

In [None]:
model = MesoNet()

### Learner

In [None]:
learn = Learner(data, model, metrics=accuracy, path=f'../input/{DIR_MYCODE}/', model_dir='')

In [None]:
learn.load(FNAME_LEARNER);
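
If `learn.load` fails, the usual cause is a mismatch between the dataset name, the file name, and the `path`/`model_dir` arguments. In fastai v1, `Learner.load(name)` looks for `path/model_dir/<name>.pth`. The sketch below only rebuilds that path with `pathlib` so it can be checked before loading; `expected_weights_path` is a hypothetical helper, not part of fastai.

```python
from pathlib import Path

DIR_MYCODE = 'realfake'
FNAME_LEARNER = 'mesonet_stage1'


def expected_weights_path(path, model_dir, name):
    """Mirror the lookup rule: weights live at path/model_dir/'<name>.pth'."""
    return Path(path) / model_dir / f'{name}.pth'


# Same arguments as the Learner above: path=f'../input/{DIR_MYCODE}/', model_dir=''
weights = expected_weights_path(f'../input/{DIR_MYCODE}/', '', FNAME_LEARNER)
print(weights, '(exists)' if weights.exists() else '(missing)')
```

Passing `model_dir=''` is what lets the weights sit directly in the dataset root instead of a *models/* subfolder.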

### Inference

In [None]:
SOURCE_TEST = Path('../input/deepfake-detection-challenge/test_videos/')

In [None]:
fnames = get_files(SOURCE_TEST, extensions=['.mp4'])
fnames[:3]

Again, because the pipeline can't handle videos in which no face is detected, we first need to separate these out.

In [None]:
hasface_tst = get_has_face(fnames, detector)

In [None]:
fnames_tst_hasface = [f for f, b in zip(fnames, hasface_tst) if b]
len(fnames_tst_hasface)

Infer on videos in which a face can be detected.

In [None]:
vlist = VideoFaceList(sorted(fnames_tst_hasface), detector=detector)

In [None]:
df_hasface = infer_on_videolist(learn, vlist)

Then, fill in dummy labels for the videos in which a face *cannot* be detected.

In [None]:
def insert_noface_entries(df, fnames, hasface):
    label_fill = 0  # Fill in 'FAKE'.
    assert len(fnames) == len(hasface)
    fnames_noface = [f for f, b in zip(fnames, hasface) if not b]
    if fnames_noface:
        # Build all filler rows at once; DataFrame.append was removed in pandas 2.0.
        df_noface = pd.DataFrame([(o.name, label_fill) for o in fnames_noface],
                                 columns=df.columns)
        df = pd.concat([df, df_noface], ignore_index=True)
    # Sort by filename and renumber the index.
    return df.sort_values('filename', axis=0).reset_index(drop=True)

In [None]:
df = insert_noface_entries(df_hasface, fnames, hasface_tst)

Write out the *submission.csv* file.

**To write out a trivial *submission.csv* instead** (label 0 for every test video), uncomment the line below before running the final cell.

In [None]:
#df = pd.DataFrame([(o.name, 0) for o in fnames], columns=['filename', 'label'])

In [None]:
df.to_csv('submission.csv', index=False)
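
Before submitting, it can help to sanity-check the file that was just written. The sketch below assumes the expected format is a `filename,label` header, one row per test video, and labels that are probabilities in [0, 1]. The column names come from the code above; the range check and the `check_submission` helper are assumptions for illustration.

```python
import csv
import io


def check_submission(csv_text, expected_names):
    """Return a list of problems found in a submission CSV (empty means none)."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != ['filename', 'label']:
        problems.append(f'unexpected header: {reader.fieldnames}')
        return problems
    rows = list(reader)
    names = [r['filename'] for r in rows]
    if len(set(names)) != len(names):
        problems.append('duplicate filenames')
    if set(names) != set(expected_names):
        problems.append('filenames do not match the test videos')
    for r in rows:
        try:
            ok = 0.0 <= float(r['label']) <= 1.0
        except ValueError:
            ok = False
        if not ok:
            problems.append(f"label out of range for {r['filename']}")
    return problems


# Tiny made-up examples: one valid file, one with three separate problems.
good = 'filename,label\na.mp4,0.5\nb.mp4,0\n'
bad = 'filename,label\na.mp4,1.5\na.mp4,0.2\n'
print(check_submission(good, ['a.mp4', 'b.mp4']))
print(check_submission(bad, ['a.mp4', 'b.mp4']))
```

In the notebook, this could be called as `check_submission(open('submission.csv').read(), [f.name for f in fnames])`.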

# - fin