# Simple model

This is the code to train a simple model using keras. This seems to be unefficient, making a very complex and big CNN. However with no no more then 3500 images it converges well to almost 100% accuracy (on evaluation). Still only using, one model per person. Using only one type of attack.

The need of a more robust method would be good. A model that can generilize from multiple people and corectly discriminate live faces from spoofed ones, form different attacks. That would require more and better models of style tranfer.

In [1]:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from keras import callbacks
from keras.regularizers import l2
from keras import backend as Kb
import pickle
import numpy as np
import datetime
import os
import random
now_datetime  = datetime.datetime.now()
NAME = f"#Face_spoofing300x2_{now_datetime.day:02d}{now_datetime.month:02d}{now_datetime.year}_{now_datetime.hour:02d}{now_datetime.minute:02d}"
dir_pickle = "database_serialized"
dir_models_save = "models"
person = "001"
import platform
print(platform.architecture())

('64bit', 'WindowsPE')


Using TensorFlow backend.


## Model format
This is a simple CNN, that is made of a convolutional and a dense layer. below you choose the numbers of filters of each concolutional layer, and the numbers of neurons of each dense layer

In [2]:
#format of the convolutional layer
format_convolutions = [[80,140,320],[80,140,320]]
#format of the dense layer
format_denses = [[400,300,200],[400,300,200]]

NN_formats = (format_convolutions,format_denses)



## The model
The model creaeted here is enought for an almost ideal (acc = 98%) spoof detection model, as expected from a liveness detection methods that include deep learning as shown on many [studies](https://www.emerald.com/insight/content/doi/10.1108/SR-08-2015-0136/full/html) before.

However the artifical creation of spoofed images, may reduce its performance. For 2 different reasons:
1. Non perfect spoofed images. The creating of a model that can generate an spoofed image relayes much on the single image and the trainning data used to create such model. If both are not manage correctly, the creation of bad models may occur. Not to mention that some types of attacks are really hard to imitate e.g. print attacks.
2. Reduced numbers of data. There isn't much data online for this type of problem, and it to the creation of individuals models for each person may prove unresenable, for each person must have minutes (about two) of video of their faces, for a better conversion of the CNN.

For those reasons, it would be needed a deep leraning method that requires less data, to generalize. Something that could be accuired by [this](https://www.sciencedirect.com/science/article/pii/S1047320318301044) method.

In [3]:
def load_data(person):   
    pickle_path =os.path.join(dir_pickle,person)
    pickle_in = open(os.path.join(pickle_path,f"X{person}.pickle"),"rb")
    X = pickle.load(pickle_in)
    pickle_in.close()
    pickle_in = open(os.path.join(pickle_path,f"y{person}.pickle"),"rb")
    y = pickle.load(pickle_in)
    pickle_in.close()
    
    X = X.astype(np.float16)/255.0
    
    return X, y

def create_model(person, pickle_path,format_convolution,format_dense):
    model = Sequential()
    is_first = True
    # Convolutional layers
    for format_c in format_convolution:
        if is_first:
            model.add(layers.Conv2D(format_c,(3,3),input_shape=X.shape[1:]))
            is_first = False
        else:
            model.add(layers.Conv2D(format_c,(3,3)))
        model.add(layers.Activation("relu"))
        model.add(layers.MaxPool2D(pool_size=(3,3)))
    #Flatten the model if needed
    model.add(layers.Flatten())
    
    #Dense layers
    for format_d in format_dense:
        model.add(layers.Dropout(0.12))
        model.add(layers.Dense(format_d,activation="relu",kernel_regularizer=l2(0.002)))
    
    #Output layer
    model.add(layers.Dense(1,activation="sigmoid"))
    
    
    model.compile(loss="binary_crossentropy",
              optimizer = "adam",
              metrics= ["accuracy"])
    
    return model

## Trainning your model
This line trains your model. It is recoomend to use tensorflow-gpu, and to be carefull if your gpu can handle the data, if  you are having *ResourceExhaustedError*, try reducing the **batch size**, or the sizes of your **tesors**.

Keep the eye on your **validation** because that shows your live performance without trainning bias. You may use checkpoints to get the best validation accuracy.

In [4]:
def create_callbacks(person,format_convolution,format_dense):
    if not os.path.isdir(os.path.join(dir_models_save,person)):
        os.mkdir(os.path.join(dir_models_save,person))
    checkpoints = callbacks.ModelCheckpoint(filepath=os.path.join(dir_models_save,person,f"{person}{NAME}{format_convolution}{format_dense}.h5"),
                                                monitor="val_acc",
                                                mode = "max",
                                                verbose = 1,
                                                save_weights_only=False,
                                                save_best_only=True)
    return checkpoints
        
def create_validation(person,val_test_split=0.25):
    pickle_path =os.path.join(dir_pickle,person)
    pickle_in = open(os.path.join(pickle_path,f"X{person}Test.pickle"),"rb")
    Xval = pickle.load(pickle_in)
    pickle_in.close()
    pickle_in = open(os.path.join(pickle_path,f"y{person}Test.pickle"),"rb")
    yval = pickle.load(pickle_in)
    pickle_in.close()
    Xval = Xval.astype(np.float16)
    Xval = Xval/255.0
    yval = yval
    ValSet = [Xval,yval]
    random.shuffle(ValSet)
    n_exs = len(ValSet[0]) * val_test_split
    return ValSet[:int(n_exs)-1][:int(n_exs)-1]
    
    

In [5]:
people = os.listdir(dir_pickle)
for person in people:
    X, y = load_data(person)
    ValSet = create_validation(person)
    for format_convolution, format_dense in NN_formats:
        # Create the model structure for the person
        model = create_model(person,os.path.join(dir_pickle,person),format_convolution,format_dense)
        checkpoint = create_callbacks(person,format_convolution,format_dense)
        #Split the serilized test to validation
        #Train model
        model.fit(X,y,batch_size=6,validation_data=ValSet,epochs=4,shuffle=False, callbacks=[checkpoint])
    del X,y,ValSet,model
    Kb.clear_session()

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Train on 10635 samples, validate on 1156 samples
Epoch 1/7
Epoch 00001: val_acc improved from -inf to 0.99048, saving model to models\001\001#Face_spoofing300x2_12102019_1549[80, 140, 320][80, 140, 320].h5
Epoch 2/7
Epoch 00002: val_acc improved from 0.99048 to 1.00000, saving model to models\001\001#Face_spoofing300x2_12102019_1549[80, 140, 320][80, 140, 320].h5
Epoch 3/7
Epoch 00003: val_acc did not improve from 1.00000
Epoch 4/7
Epoch 00004: val_acc did not improve from 1.00000
Epoch 5/7
Epoch 00005: val_acc did not improve from 1.00000
Epoch 6/7
Epoch 00006: val_acc did not improve from 1.00000
Epoch 7/7
Epoch 00007: val_acc did not improve from 1.00000
Train on 10635 samples, validate on 1156 samples
Epoch 1/7


ResourceExhaustedError: OOM when allocating tensor with shape[6,400,99,99] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[{{node Adam_1/gradients/conv2d_4/Conv2D_grad/Conv2DBackpropInput}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.


In [None]:
#kills the kernel to free-up memory on GPU, also avoiding collisions with othhers scripts 
#comment the line below, if you want to keep the variables and buffers
os._exit(00)