I used Google Colab to implement this project. Colab's working directory is the "/content" folder. Clicking the folder icon in the left sidebar shows the available folder structure, and files can be uploaded there from the local system.


In [0]:
import os
import zipfile

local_zip = 'real_and_fake_face_ReduceD.zip'
# Extract the zip file into a folder named 'real_and_fake_face';
# the 'with' block closes the archive even if extraction fails
with zipfile.ZipFile(local_zip, 'r') as zip_ref:
    zip_ref.extractall('real_and_fake_face')


Let's look at a few file names from both directories:

In [0]:
# Directory with our training Real Face pictures
train_Real_dir = os.path.join('real_and_fake_face/real_and_fake_face_ReduceD/Real Face')

# Directory with our training Fake Face pictures
train_Fake_dir = os.path.join('real_and_fake_face/real_and_fake_face_ReduceD/Fake Face')

train_Real_names = os.listdir(train_Real_dir)
print(train_Real_names[:5])  # Print the first 5 file names of the real training images

train_Fake_names = os.listdir(train_Fake_dir)
print(train_Fake_names[:5])  # Print the first 5 file names of the fake training images

We can also get the total number of images in each directory:

In [0]:
print('Total training real images:', len(os.listdir(train_Real_dir)))
print('Total training fake images:', len(os.listdir(train_Fake_dir)))

Now let's look at some of the pictures. First, we need to import the plotting libraries and set a few parameters:

In [0]:
%matplotlib inline

import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# Parameters for our graph; we'll output images in a 3x3 configuration
nrows = 3
ncols = 3

# Index for iterating over images
pic_index = 0

Now, display a batch of 4 Real and 4 Fake pictures. You can rerun the cell to see a fresh batch each time:

In [0]:
# Set up matplotlib fig, and size it to fit 3x3 pics
fig = plt.gcf()  # gcf() returns a reference to the current figure
fig.set_size_inches(ncols * 2, nrows * 2)  # Set the figure size, in inches

pic_index += 4  # Advance by 4 so each rerun shows the next 4 pictures of each class
next_Real_pix = [os.path.join(train_Real_dir, fname) 
                for fname in train_Real_names[pic_index-4:pic_index]]
next_Fake_pix = [os.path.join(train_Fake_dir, fname) 
                for fname in train_Fake_names[pic_index-4:pic_index]]

for i, img_path in enumerate(next_Real_pix+next_Fake_pix):
  # Set up subplot; subplot indices start at 1
  sp = plt.subplot(nrows, ncols, i + 1)
  sp.axis('Off') # Don't show axes (or gridlines)

  img = mpimg.imread(img_path)
  plt.imshow(img)

plt.show()



<h2>Let's build the model now</h2>



Let's start by importing the TensorFlow library:

In [0]:
import tensorflow as tf

<h3>Building the CNN</h3> 
Now we will build the Convolutional Neural Network (CNN): a stack of convolution and pooling layers followed by densely connected layers. TensorFlow resizes the images to the required size as they are loaded, so we don't need to resize the actual files in our local file system. This gives us the liberty to experiment with different input sizes.

In [0]:
model = tf.keras.models.Sequential([
    # Note the input shape is the desired size of the image: 300x300 with 3 color channels (RGB)
    # This is the first convolution
    tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # The third convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # We can add or remove convolution and pooling layers depending on the complexity of the task and/or the image sizes.
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    # Only 1 neuron in the output layer, as we are building a binary classifier. The sigmoid activation outputs a value between 0 and 1,
    # where 0 stands for Fake images and 1 for Real images ('flow_from_directory' numbers the classes in alphanumeric order of the subdirectory names).
    tf.keras.layers.Dense(1, activation='sigmoid') 
])

<h3>'summary()' Method</h3>
The 'summary()' method shows how the image shape changes and shrinks after each layer.

In [0]:
model.summary()
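
The shapes that 'summary()' reports can also be worked out by hand. The sketch below assumes the Keras defaults used in the model above ('valid' padding and stride 1 for Conv2D, a 2x2 pool with stride 2 for MaxPooling2D). The helper names are illustrative and not part of TensorFlow.

```python
def conv_out(size, kernel=3):
    # A 'valid' convolution with stride 1 trims (kernel - 1) pixels per dimension
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping max pooling divides the size and rounds down
    return size // pool

def trace_shapes(size=300, filters=(16, 32, 64)):
    # Follow a square image through each Conv2D + MaxPooling2D pair
    shapes = []
    for f in filters:
        size = conv_out(size)
        shapes.append(('conv', size, size, f))
        size = pool_out(size)
        shapes.append(('pool', size, size, f))
    return shapes

shapes = trace_shapes()
for name, h, w, c in shapes:
    print(name, (h, w, c))

# The Flatten layer turns the last feature map into one long vector
_, h, w, c = shapes[-1]
flattened = h * w * c
print('flatten', flattened)
```

Most of the model's parameters sit in the first Dense layer, because every one of the flattened values connects to each of its 512 neurons.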

<h3>Compiling the model</h3>
As we are designing a binary classifier, we use the 'binary_crossentropy' loss function. Other loss functions can be tried to see how they affect the model's performance.


In [0]:
from tensorflow.keras.optimizers import RMSprop

model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(learning_rate=0.001),  # 'lr' is a deprecated alias in TensorFlow 2.x
              metrics=['accuracy'])
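
For reference, binary cross-entropy scores a predicted probability p against a true label y as -(y*log(p) + (1-y)*log(1-p)), averaged over the batch. Below is a minimal pure-Python sketch of that formula. The function name and the clipping constant 'eps' are illustrative assumptions, and Keras's own implementation differs in detail.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Average the per-sample loss -(y*log(p) + (1-y)*log(1-p))
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # Clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Predictions close to the true labels give a small loss ...
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
# ... while confident but wrong predictions are penalized heavily
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```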

<h3>Data Preprocessing</h3>

1. 'ImageDataGenerator' reads the image files from the source directory and feeds them to the network.

2. The 'flow_from_directory' function of the *ImageDataGenerator* labels the data from the subdirectory names.

3. So we don't need to label every image explicitly ourselves. We just point to the directory that holds the subdirectories, which in turn hold the images of the respective classes.

In [0]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# All pixel values are rescaled by 1./255 to normalize the images before we feed them to the network.
train_datagen = ImageDataGenerator(rescale=1/255)

# Flow training images in batches of 11 using train_datagen generator
train_generator = train_datagen.flow_from_directory(
        'real_and_fake_face/real_and_fake_face_ReduceD/',  # This is the source directory for training images which holds the subdirectories
        target_size=(300, 300),                            # All images will be resized to 300x300
        batch_size=11,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')
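
'flow_from_directory' infers the classes from the subdirectory names and numbers them in alphanumeric order; on a real generator the resulting mapping can be read from 'train_generator.class_indices'. The sketch below reproduces only that naming rule with the standard library on a temporary directory, so it runs without TensorFlow or the dataset. The helper 'infer_class_indices' is hypothetical.

```python
import os
import tempfile

def infer_class_indices(directory):
    # Sort the subdirectory names and number them from 0
    subdirs = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(subdirs)}

# Build an empty directory tree with the same class folder names as the dataset
with tempfile.TemporaryDirectory() as root:
    for name in ('Real Face', 'Fake Face'):
        os.makedirs(os.path.join(root, name))
    class_indices = infer_class_indices(root)

print(class_indices)
```

With this ordering, 'Fake Face' maps to 0 and 'Real Face' maps to 1, so a sigmoid output close to 1 means the model considers the image real.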

<h3>Callback</h3>
This is the callback. During training, it stops the run once the desired accuracy is reached on the training set.

In [0]:
DESIRED_ACCURACY = 0.98

class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):         # Called at the end of each epoch
        accuracy = (logs or {}).get('accuracy')       # None if the metric is not in the logs
        if accuracy is not None and accuracy > DESIRED_ACCURACY:
            self.model.stop_training = True
            print(f'Reached {DESIRED_ACCURACY:.0%} accuracy so cancelling training!')

callbacks = myCallback()                              # Create an instance of 'myCallback'

<h3>Training the Model</h3>
Now that we have built the model, it's time to train it. 'model.fit()' does this for us.

In [0]:
history = model.fit(
      train_generator,
      steps_per_epoch=3,  # Batches drawn per epoch; to use the entire training set, set it to total training samples / batch size
      epochs=10,          # Maximum number of epochs (the callback may stop training earlier)
      verbose=1,          # Progress display: 0 = silent, 1 = progress bar, 2 = one line per epoch
      callbacks=[callbacks])

# Very high accuracy on the training set can be a sign of *overfitting*, so choose the number of epochs, the amount of training data, etc. carefully.
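
The rule steps_per_epoch = total training samples / batch size can be computed rather than hard-coded. In this small sketch the sample counts are made-up figures for illustration; the real number is the total printed by the counting cell earlier in the notebook.

```python
import math

def steps_for_full_pass(num_samples, batch_size):
    # Round up so the last, possibly smaller, batch is included
    return math.ceil(num_samples / batch_size)

# Hypothetical dataset sizes with the batch size of 11 used above
print(steps_for_full_pass(33, 11))
print(steps_for_full_pass(40, 11))
```

On a generator created by 'flow_from_directory', the two inputs should be available as 'train_generator.samples' and 'train_generator.batch_size'.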

<h3>Now Let's see how the model works on Unseen photos</h3>

In [0]:
import numpy as np
from google.colab import files
from tensorflow.keras.preprocessing import image

uploaded = files.upload()

for fn in uploaded.keys():

  # Load each uploaded image at the size the model expects
  path = '/content/' + fn
  img = image.load_img(path, target_size=(300, 300))
  x = image.img_to_array(img)
  x = x / 255.0                  # Apply the same 1/255 rescaling used for the training images
  x = np.expand_dims(x, axis=0)  # Add the batch dimension: (1, 300, 300, 3)

  classes = model.predict(x, batch_size=10)
  print(classes[0])
  plt.imshow(img)
  # Class index 1 is 'Real Face' and 0 is 'Fake Face'
  if classes[0] > 0.5:
    print(fn + " is a REAL image")
  else:
    print(fn + " is a FAKE image")
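
The 0.5 threshold hard-codes which side of the sigmoid output counts as real. A hedged alternative is to decode the score through the class-index mapping, so the label follows the directory names. The mapping below is written out by hand, assuming the subdirectories are named 'Fake Face' and 'Real Face'; 'decode_prediction' is a hypothetical helper, not a Keras function.

```python
def decode_prediction(score, class_indices, threshold=0.5):
    # Invert the name -> index mapping, then pick the index from the score
    index_to_name = {index: name for name, index in class_indices.items()}
    predicted_index = 1 if score > threshold else 0
    return index_to_name[predicted_index]

# Assumed mapping for the two class folders (alphanumeric order)
class_indices = {'Fake Face': 0, 'Real Face': 1}
print(decode_prediction(0.83, class_indices))
print(decode_prediction(0.12, class_indices))
```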