# Deepfake Submission

This notebook is a submission kernel for the competition. To use it, add the public dataset [facenet_pytorch](https://www.kaggle.com/timesler/facenet-pytorch-vggface2) and a dataset of your own that contains:

1. *kernel_module.py*, containing all the definitions of your own functions and classes.
2. A *.pth* file for loading your trained model into fastai's `Learner`.  If the file is called *mesonet_stage1.pth*, you need to assign `'mesonet_stage1'` to `FNAME_LEARNER` in this notebook.

If your own dataset is called *realfake*, you need to assign `'realfake'` to `DIR_MYCODE` in this notebook.
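
As a quick sanity check, the two settings can be resolved into the files the later cells rely on. This is only a minimal sketch: `expected_inputs` is a hypothetical helper that is not part of *kernel_module.py*, and the `../input` root is assumed from the `ls` call in this notebook.

```python
from pathlib import Path

def expected_inputs(dir_mycode, fname_learner, root='../input'):
    """Return the files this notebook expects in the user's dataset (hypothetical helper)."""
    base = Path(root) / dir_mycode
    return {
        'module': base / 'kernel_module.py',
        'weights': base / f'{fname_learner}.pth',
    }

paths = expected_inputs('realfake', 'mesonet_stage1')
# Report anything that is not where the notebook will look for it.
missing = [str(p) for p in paths.values() if not p.exists()]
if missing:
    print('Missing inputs:', missing)
```

Running this before the install cell surfaces a misspelled dataset or checkpoint name early, rather than at `learn.load`.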

In [5]:
DIR_MYCODE = 'realfake'
FNAME_LEARNER = 'mesonet_stage1'

In [6]:
! ls ../input/{DIR_MYCODE}/

kernel_module.py  mesonet_stage1.pkl  mesonet_stage1.pth


In [7]:
!pip install /kaggle/input/facenet-pytorch-vggface2/facenet_pytorch-2.0.0-py3-none-any.whl
from facenet_pytorch.models.inception_resnet_v1 import get_torch_home
torch_home = get_torch_home()
# Copy model checkpoints to torch cache so they are loaded automatically by the package
!mkdir -p $torch_home/checkpoints/
!cp /kaggle/input/facenet-pytorch-vggface2/20180402-114759-vggface2-logits.pth $torch_home/checkpoints/vggface2_DG3kwML46X.pt
!cp /kaggle/input/facenet-pytorch-vggface2/20180402-114759-vggface2-features.pth $torch_home/checkpoints/vggface2_G5aNV2VSMn.pt
! cp ../input/{DIR_MYCODE}/kernel_module.py ../working/.
from kernel_module import *

Processing /kaggle/input/facenet-pytorch-vggface2/facenet_pytorch-2.0.0-py3-none-any.whl
Installing collected packages: facenet-pytorch
Successfully installed facenet-pytorch-2.0.0


### Data

In [8]:
SOURCE = Path('../input/deepfake-detection-challenge/train_sample_videos/')

In [9]:
f = get_files(SOURCE, extensions=['.json'])[0]
annots = pd.read_json(f).T
annots.reset_index(inplace=True)
annots.rename({'index':'fname'}, axis=1, inplace=True)
annots.head()

| | fname | label | split | original |
|---|---|---|---|---|
| 0 | aagfhgtpmv.mp4 | FAKE | train | vudstovrck.mp4 |
| 1 | aapnvogymq.mp4 | FAKE | train | jdubbvfswz.mp4 |
| 2 | abarnvbtwb.mp4 | REAL | train | |
| 3 | abofeumbvv.mp4 | FAKE | train | atvmxvwyns.mp4 |
| 4 | abqwwspghj.mp4 | FAKE | train | qzimuostzz.mp4 |
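
The transpose, `reset_index` and `rename` steps above suggest that the metadata file maps each filename to a small record of fields. Assuming that layout, the same reshaping can be sketched in plain Python; the JSON literal and `to_records` are illustrative only and are not taken from the competition data loader.

```python
import json

# Toy metadata in the assumed layout: one entry per video, keyed by filename.
raw = '''{
  "aagfhgtpmv.mp4": {"label": "FAKE", "split": "train", "original": "vudstovrck.mp4"},
  "abarnvbtwb.mp4": {"label": "REAL", "split": "train", "original": null}
}'''

def to_records(metadata):
    """Flatten {fname: {field: value}} into rows with an explicit 'fname' field."""
    return [{'fname': fname, **fields} for fname, fields in metadata.items()]

records = to_records(json.loads(raw))
for row in records:
    print(row)
```

Each row then carries its filename as an ordinary field, which is what the `fname` column provides in `annots`.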


#### Get face detector

In [10]:
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'

In [11]:
detector = MTCNN(device=device, post_process=False)

#### Remove videos in which no faces are detected

In [12]:
fnames = [SOURCE/o for o in annots.fname]

In [13]:
hasface = get_has_face(fnames, detector)

In [14]:
annots_hasface = annots[np.array(hasface)]
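
`get_has_face` lives in *kernel_module.py* and is not shown here. The sketch below only illustrates the general pattern of computing one boolean per video and filtering with it; `has_face_mask`, `keep_where` and the stub detector are hypothetical names, and the real function may work differently.

```python
def has_face_mask(fnames, detect):
    """Return one boolean per file; `detect` is any callable mapping a filename to True/False."""
    return [bool(detect(f)) for f in fnames]

def keep_where(items, mask):
    """Keep the items whose mask entry is True, preserving order."""
    assert len(items) == len(mask)
    return [item for item, keep in zip(items, mask) if keep]

# Stand-in detector for illustration only: pretend 'b.mp4' contains no face.
def stub_detect(fname):
    return fname != 'b.mp4'

videos = ['a.mp4', 'b.mp4', 'c.mp4']
mask = has_face_mask(videos, stub_detect)
print(keep_where(videos, mask))
```

Keeping the mask alongside the file list, rather than filtering in place, makes it possible to recover the rejected files later.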

#### Create `DataBunch`

In [15]:
src = (VideoFaceList
       .from_df(df=annots_hasface, path=SOURCE, cols='fname', detector=detector)
       .split_by_rand_pct())

In [16]:
bs, sz = 32, 128

In [17]:
data = (src.label_from_df('label').transform(get_transforms(), size=sz)
        .databunch(bs=bs, num_workers=0).normalize(imagenet_stats))

### Model

In [18]:
model = MesoNet()

### Learner

In [19]:
learn = Learner(data, model, metrics=accuracy, path=f'../input/{DIR_MYCODE}/', model_dir='')

In [20]:
learn.load(FNAME_LEARNER);

### Inference

In [21]:
SOURCE_TEST = Path('../input/deepfake-detection-challenge/test_videos/')

In [22]:
fnames = get_files(SOURCE_TEST, extensions=['.mp4'])
fnames[:3]

[PosixPath('../input/deepfake-detection-challenge/test_videos/iorbtaarte.mp4'),
 PosixPath('../input/deepfake-detection-challenge/test_videos/vnlzxqwthl.mp4'),
 PosixPath('../input/deepfake-detection-challenge/test_videos/gqnaxievjx.mp4')]

As before, the pipeline cannot handle videos with no detected face, so we first separate them out.

In [23]:
hasface_tst = get_has_face(fnames, detector)

In [24]:
fnames_tst_hasface = [f for f, b in zip(fnames, hasface_tst) if b]
len(fnames_tst_hasface)

396

Run inference on the videos in which a face was detected.

In [25]:
vlist = VideoFaceList(sorted(fnames_tst_hasface), detector=detector)

In [50]:
df = infer_on_videolist(learn, vlist)

Then, fill in dummy labels for those in which a face *cannot* be detected. 

In [62]:
def insert_noface_entries(df, fnames, hasface):
    label_fill = 0  # Fill in 'FAKE'.
    assert len(fnames) == len(hasface)
    fnames_noface = [f for f, b in zip(fnames, hasface) if not b]
    # Build the dummy rows in one go; DataFrame.append was removed in pandas 2.0.
    rows = pd.DataFrame([[o.name, label_fill] for o in fnames_noface], columns=df.columns)
    df = pd.concat([df, rows], ignore_index=True).sort_values('filename')
    # drop=True keeps the old index from becoming an extra column in submission.csv.
    return df.reset_index(drop=True)

In [63]:
df = insert_noface_entries(df, fnames, hasface_tst)
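
The choice of fill value matters if, as assumed here, the `label` column is read as the predicted probability that a video is FAKE and the submission is scored with log loss. The sketch below compares a few candidate fill values for a single video; the clipping constant `eps` is an assumption, not the competition's documented value.

```python
import math

def log_loss_single(p_fake, is_fake, eps=1e-15):
    """Binary log loss for one video, with the prediction clipped away from 0 and 1."""
    p = min(max(p_fake, eps), 1 - eps)
    return -math.log(p) if is_fake else -math.log(1 - p)

# Penalty for each candidate fill value when the video is fake and when it is real.
for fill in (0.0, 0.5, 1.0):
    loss_if_fake = log_loss_single(fill, True)
    loss_if_real = log_loss_single(fill, False)
    print(fill, round(loss_if_fake, 3), round(loss_if_real, 3))
```

Under these assumptions an extreme fill value is very costly whenever it is wrong, while 0.5 costs ln 2 per video regardless of the true label.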

Write out the *submission.csv* file.

In [67]:
df.to_csv('submission.csv', index=False)
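
Before submitting, it can help to re-read the file and check its shape. The following is a minimal sketch using only the standard library; `check_submission` is a hypothetical helper, and the expected header `filename,label` is an assumption based on the `filename` column sorted on above.

```python
import csv
import tempfile
from pathlib import Path

def check_submission(csv_path, expected_names):
    """Return a list of problems found in a submission file (empty means none found)."""
    with open(csv_path, newline='') as fh:
        reader = csv.DictReader(fh)
        header = reader.fieldnames
        rows = list(reader)
    if header != ['filename', 'label']:
        return [f'unexpected header: {header}']
    problems = []
    names = [r['filename'] for r in rows]
    if sorted(names) != sorted(expected_names):
        problems.append('filenames do not match the test videos one-to-one')
    if any(not 0.0 <= float(r['label']) <= 1.0 for r in rows):
        problems.append('label outside [0, 1]')
    return problems

# Toy file standing in for submission.csv.
tmp = Path(tempfile.mkdtemp()) / 'submission.csv'
tmp.write_text('filename,label\na.mp4,0.5\nb.mp4,0\n')
print(check_submission(tmp, ['a.mp4', 'b.mp4']))
```

In the notebook itself the call would be along the lines of `check_submission('submission.csv', [f.name for f in fnames])`.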

# - fin