# XceptionNet Deepfake Detector
FIT3183 2020 S2 Assignment
<br/>By Team Dark.HAIYA
<br/>Team members:
- Kee Pei Jiin
- Chin Wen Yuan

In this Colab, we train a deepfake detector which uses the XceptionNet CNN architecture. We mainly refer to [this Github](https://github.com/otenim/Xception-with-Your-Own-Dataset) to create the detector.


# Download training datasets
The training dataset has 1600 images, which is made up of 800 cropped CelebA images and 800 fake faces downloaded from [here](https://github.com/cc-hpc-itwm/DeepFakeDetection/blob/master/Experiments_CelebA/dataset_celebA.7z).

In [None]:
# Download the training dataset
import gdown
!gdown https://drive.google.com/uc?id=1tZ1pQHuz94TCjzo9mdKWuKlRgOnHfog9

Downloading...
From: https://drive.google.com/uc?id=1tZ1pQHuz94TCjzo9mdKWuKlRgOnHfog9
To: /content/training_images.zip
6.00MB [00:00, 11.6MB/s]


In [None]:
!unzip -q /content/training_images.zip
!rm -r /content/training_images.zip

In [None]:
!mv /content/content/training_images /content/

In [None]:
!rm -r /content/content

# Import Libraries & Variables Declaration

In [1]:
import math
import os
import matplotlib
import imghdr
import pickle as pkl
import numpy as np
import matplotlib.pyplot as plt
from keras.applications.xception import Xception, preprocess_input
from keras.optimizers import Adam
#from keras.preprocessing import image
import keras.utils as image
from keras.losses import categorical_crossentropy
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.utils import to_categorical
from keras.callbacks import ModelCheckpoint
from keras.models import load_model
import PIL

In [2]:
matplotlib.use('Agg')
#dataset_root = "E://roop//"

dataset_root = 'D:\\DeepFakeRepos\\Datasets\\RoopData'
result_root = "D:\\DeepFakeRepos\\AllModels\\DeepFakeDetectionModels\\XceptionNet\\"
classes = ["real", "forget"]
num_classes = 2

epochs_pre = 20
epochs_fine = 10
batch_size_pre = 32
batch_size_fine = 16
lr_pre = 1e-3
lr_fine = 1e-4
snapshot_period_pre = 5
snapshot_period_fine = 1
split = 0.7

# Load training data
We load the training images and create their one-hot-categorical label.

Then, we further split the training datasets into smaller datasets for training & validation purposes.
  - 70% will be used for training
  - 30% will be used for validation

In [3]:
# make input_paths and labels
input_paths, labels = [], []
for class_name in os.listdir(dataset_root):
    class_root = os.path.join(dataset_root, class_name)
    class_id = classes.index(class_name)
    for path in os.listdir(class_root):
        path = os.path.join(class_root, path)
        if imghdr.what(path) is None:
            # this is not an image file
            continue
        input_paths.append(path)
        labels.append(class_id)

# convert to one-hot-vector format
labels = to_categorical(labels, num_classes=num_classes)

# convert to numpy array
input_paths = np.array(input_paths)

In [7]:
print(len(labels), len(input_paths))

200 200


In [4]:
# split dataset for training and validation purposes
border = int(len(input_paths) * split)
train_labels = labels[:border]
val_labels = labels[border:]
train_input_paths = input_paths[:border]
val_input_paths = input_paths[border:]
print("Training on %d images and labels" % (len(train_input_paths)))
print("Validation on %d images and labels" % (len(val_input_paths)))

if os.path.exists(result_root) is False:
    os.makedirs(result_root)

Training on 140 images and labels
Validation on 60 images and labels


# Model Training using Transfer Learning Technique

Since out training dataset is quite small, we apply [transfer learning](https://machinelearningmastery.com/transfer-learning-for-deep-learning/#:~:text=Transfer%20learning%20is%20a%20machine,model%20on%20a%20second%20task.) technique in creating the detector.

We use the Keras pre-trained XceptionNet model as our base model. This pre-trained model was trained on Imagenet datasets and is able to classify images into around 1000 different classes. We fine-tune this model so that it recognises real & fake human faces as well.

In [5]:
def generate_from_paths_and_labels(input_paths, labels, batch_size, input_size=(1024, 1024)):
    num_samples = len(input_paths)
    while 1:
        perm = np.random.permutation(num_samples)
        input_paths = input_paths[perm]
        labels = labels[perm]
        for i in range(0, num_samples, batch_size):
            inputs = list(map(
                lambda x: image.load_img(x, target_size=input_size),
                input_paths[i:i+batch_size]
            ))
            inputs = np.array(list(map(
                lambda x: image.img_to_array(x),
                inputs
            )))
            inputs = preprocess_input(inputs)
            yield (inputs, labels[i:i+batch_size])

In [16]:
# base model used is the pre-trained XceptionNet model on imageNet dataset
# do not include imageNet classfier at the top
base_model = Xception(include_top=False,
                    weights='imagenet',
                    input_shape=(256, 256, 3))

In [17]:
# create a custom top classifier
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(num_classes, activation='softmax')(x)
model = Model(inputs=base_model.inputs, outputs=predictions)

In [19]:
# train the top classifier layer

# freeze the base_model body layers
for layer in base_model.layers:
    layer.trainable = False

# compile model
model.compile(loss=categorical_crossentropy,
              optimizer=Adam(lr=lr_pre),
              metrics=['accuracy']
)

# train
hist_pre = model.fit_generator(
    generator=generate_from_paths_and_labels(input_paths=train_input_paths,
                                              labels=train_labels,
                                              batch_size=batch_size_pre),

    steps_per_epoch=math.ceil(len(train_input_paths) / batch_size_pre),

    epochs=epochs_pre,

    validation_data=generate_from_paths_and_labels(input_paths=val_input_paths,
                                                  labels=val_labels,
                                                  batch_size=batch_size_pre),

    validation_steps=math.ceil(len(val_input_paths) / batch_size_pre),

    verbose=1,

    callbacks=[ModelCheckpoint(
                filepath=os.path.join(result_root,
                                'model_pre_ep{epoch}_valloss{val_loss:.3f}.h5'),
                period=snapshot_period_pre,),
    ],
)

model.save(os.path.join(result_root, 'model_pre_final.h5'))



  hist_pre = model.fit_generator(


Epoch 1/20


ResourceExhaustedError: Graph execution error:

Detected at node 'model_1/block1_conv2/Conv2D' defined at (most recent call last):
    File "c:\Apps\envs\tf_venv\lib\runpy.py", line 197, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "c:\Apps\envs\tf_venv\lib\runpy.py", line 87, in _run_code
      exec(code, run_globals)
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel_launcher.py", line 18, in <module>
      app.launch_new_instance()
    File "c:\Apps\envs\tf_venv\lib\site-packages\traitlets\config\application.py", line 1075, in launch_instance
      app.start()
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\kernelapp.py", line 739, in start
      self.io_loop.start()
    File "c:\Apps\envs\tf_venv\lib\site-packages\tornado\platform\asyncio.py", line 205, in start
      self.asyncio_loop.run_forever()
    File "c:\Apps\envs\tf_venv\lib\asyncio\base_events.py", line 601, in run_forever
      self._run_once()
    File "c:\Apps\envs\tf_venv\lib\asyncio\base_events.py", line 1905, in _run_once
      handle._run()
    File "c:\Apps\envs\tf_venv\lib\asyncio\events.py", line 80, in _run
      self._context.run(self._callback, *self._args)
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\kernelbase.py", line 545, in dispatch_queue
      await self.process_one()
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\kernelbase.py", line 534, in process_one
      await dispatch(*args)
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\kernelbase.py", line 437, in dispatch_shell
      await result
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\ipkernel.py", line 359, in execute_request
      await super().execute_request(stream, ident, parent)
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\kernelbase.py", line 778, in execute_request
      reply_content = await reply_content
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\ipkernel.py", line 446, in do_execute
      res = shell.run_cell(
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\zmqshell.py", line 549, in run_cell
      return super().run_cell(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\interactiveshell.py", line 3048, in run_cell
      result = self._run_cell(
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\interactiveshell.py", line 3103, in _run_cell
      result = runner(coro)
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\async_helpers.py", line 129, in _pseudo_sync_runner
      coro.send(None)
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\interactiveshell.py", line 3308, in run_cell_async
      has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\interactiveshell.py", line 3490, in run_ast_nodes
      if await self.run_code(code, result, async_=asy):
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\interactiveshell.py", line 3550, in run_code
      exec(code_obj, self.user_global_ns, self.user_ns)
    File "C:\Users\Павел\AppData\Local\Temp\ipykernel_23192\2170180317.py", line 14, in <module>
      hist_pre = model.fit_generator(
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 2507, in fit_generator
      return self.fit(
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 1564, in fit
      tmp_logs = self.train_function(iterator)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 1160, in train_function
      return step_function(self, iterator)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 1146, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 1135, in run_step
      outputs = model.train_step(data)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 993, in train_step
      y_pred = self(x, training=True)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 557, in __call__
      return super().__call__(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\functional.py", line 510, in call
      return self._run_internal_graph(inputs, training=training, mask=mask)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\functional.py", line 667, in _run_internal_graph
      outputs = node.layer(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\layers\convolutional\base_conv.py", line 283, in call
      outputs = self.convolution_op(inputs, self.kernel)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\layers\convolutional\base_conv.py", line 255, in convolution_op
      return tf.nn.convolution(
Node: 'model_1/block1_conv2/Conv2D'
OOM when allocating tensor with shape[32,509,509,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node model_1/block1_conv2/Conv2D}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
 [Op:__inference_train_function_37375]

In [8]:
#import tensorflow as tf
from tensorflow.python.client import device_lib
#print("GPU is", "available" if tf.config.list_physical_devices('GPU') else "NOT AVAILABLE")
#from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 9801412947511294421
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 3640655872
locality {
  bus_id: 1
  links {
  }
}
incarnation: 7444525079590856375
physical_device_desc: "device: 0, name: NVIDIA GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6"
xla_global_id: 416903419
]


In [1]:
import tensorflow as tf
#tf.test.gpu_device_name()
tf.test.is_built_with_cuda()

True

In [21]:
import tensorflow as tf
tf.config.list_physical_devices('GPU')

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

In [13]:
# Fine-tune model
# set all the layers to be trainable
for layer in model.layers:
    layer.trainable = True

# recompile
model.compile(optimizer=Adam(lr=lr_fine),
              loss=categorical_crossentropy,
              metrics=['accuracy'])

# train
hist_fine = model.fit_generator(
    generator=generate_from_paths_and_labels(input_paths=train_input_paths,
                                            labels=train_labels,
                                            batch_size=batch_size_fine),

  steps_per_epoch=math.ceil(len(train_input_paths) / batch_size_fine),

  epochs=epochs_fine,

  validation_data=generate_from_paths_and_labels(input_paths=val_input_paths,
                                                labels=val_labels,
                                                batch_size=batch_size_fine),

  validation_steps=math.ceil(len(val_input_paths) / batch_size_fine),

  verbose=1,

  callbacks=[ModelCheckpoint(
          filepath=os.path.join(result_root,
                                'model_fine_ep{epoch}_valloss_new{val_loss:.3f}.h5'),
          period=snapshot_period_fine,),
  ],
)

model.save(os.path.join(result_root, 'model_ROOP_Final_3.h5'))



  hist_fine = model.fit_generator(


Epoch 1/10


ResourceExhaustedError: Graph execution error:

Detected at node 'model/block1_conv2_bn/FusedBatchNormV3' defined at (most recent call last):
    File "c:\Apps\envs\tf_venv\lib\runpy.py", line 197, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "c:\Apps\envs\tf_venv\lib\runpy.py", line 87, in _run_code
      exec(code, run_globals)
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel_launcher.py", line 18, in <module>
      app.launch_new_instance()
    File "c:\Apps\envs\tf_venv\lib\site-packages\traitlets\config\application.py", line 1075, in launch_instance
      app.start()
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\kernelapp.py", line 739, in start
      self.io_loop.start()
    File "c:\Apps\envs\tf_venv\lib\site-packages\tornado\platform\asyncio.py", line 205, in start
      self.asyncio_loop.run_forever()
    File "c:\Apps\envs\tf_venv\lib\asyncio\base_events.py", line 601, in run_forever
      self._run_once()
    File "c:\Apps\envs\tf_venv\lib\asyncio\base_events.py", line 1905, in _run_once
      handle._run()
    File "c:\Apps\envs\tf_venv\lib\asyncio\events.py", line 80, in _run
      self._context.run(self._callback, *self._args)
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\kernelbase.py", line 545, in dispatch_queue
      await self.process_one()
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\kernelbase.py", line 534, in process_one
      await dispatch(*args)
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\kernelbase.py", line 437, in dispatch_shell
      await result
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\ipkernel.py", line 359, in execute_request
      await super().execute_request(stream, ident, parent)
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\kernelbase.py", line 778, in execute_request
      reply_content = await reply_content
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\ipkernel.py", line 446, in do_execute
      res = shell.run_cell(
    File "c:\Apps\envs\tf_venv\lib\site-packages\ipykernel\zmqshell.py", line 549, in run_cell
      return super().run_cell(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\interactiveshell.py", line 3048, in run_cell
      result = self._run_cell(
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\interactiveshell.py", line 3103, in _run_cell
      result = runner(coro)
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\async_helpers.py", line 129, in _pseudo_sync_runner
      coro.send(None)
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\interactiveshell.py", line 3308, in run_cell_async
      has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\interactiveshell.py", line 3490, in run_ast_nodes
      if await self.run_code(code, result, async_=asy):
    File "c:\Apps\envs\tf_venv\lib\site-packages\IPython\core\interactiveshell.py", line 3550, in run_code
      exec(code_obj, self.user_global_ns, self.user_ns)
    File "C:\Users\Павел\AppData\Local\Temp\ipykernel_23192\839117123.py", line 12, in <module>
      hist_fine = model.fit_generator(
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 2507, in fit_generator
      return self.fit(
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 1564, in fit
      tmp_logs = self.train_function(iterator)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 1160, in train_function
      return step_function(self, iterator)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 1146, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 1135, in run_step
      outputs = model.train_step(data)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 993, in train_step
      y_pred = self(x, training=True)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\training.py", line 557, in __call__
      return super().__call__(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\functional.py", line 510, in call
      return self._run_internal_graph(inputs, training=training, mask=mask)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\functional.py", line 667, in _run_internal_graph
      outputs = node.layer(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\layers\normalization\batch_normalization.py", line 850, in call
      outputs = self._fused_batch_norm(inputs, training=training)
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\layers\normalization\batch_normalization.py", line 660, in _fused_batch_norm
      output, mean, variance = control_flow_util.smart_cond(
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\utils\control_flow_util.py", line 108, in smart_cond
      return tf.__internal__.smart_cond.smart_cond(
    File "c:\Apps\envs\tf_venv\lib\site-packages\keras\layers\normalization\batch_normalization.py", line 634, in _fused_batch_norm_training
      return tf.compat.v1.nn.fused_batch_norm(
Node: 'model/block1_conv2_bn/FusedBatchNormV3'
OOM when allocating tensor with shape[16,509,509,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node model/block1_conv2_bn/FusedBatchNormV3}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
 [Op:__inference_train_function_25852]

In [None]:
# performance of the final fine-tuned model
acc = hist_fine.history["accuracy"][-1]
val_acc = hist_fine.history["val_accuracy"][-1]
loss = hist_fine.history['loss'][-1]
val_loss = hist_fine.history['val_loss'][-1]

print("Accuracy on training data: %.2f" %acc)
print("Loss on training data: %.2f" %loss)
print("Accuracy on validation data: %.2f" %val_acc)
print("Loss on validation data: %.2f" %val_loss)

Accuracy on training data: 0.98
Loss on training data: 0.06
Accuracy on validation data: 0.95
Loss on validation data: 0.16


In [None]:
# download the final model weight files
from google.colab import files
files.download("/content/results/model_fine_final.h5")

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>