# Implementation of a new rPPG method

## Part 2 : Notebook for the prediction of the 3D-CNN model

This Jupyter notebook complements the "Train_3DCNN_model_BPM.ipynb" notebook. Here we test the model's predictions on real videos and outline the logic of the planned integration into the pyVHR framework. ([Link](https://ieeexplore.ieee.org/document/9272290)) ([GitHub](https://github.com/phuselab/pyVHR))

This notebook is based on the implementation described in the following article:
Frédéric Bousefsaf, Alain Pruski, Choubeila Maaoui, 3D convolutional neural networks for remote pulse rate measurement and mapping from facial video, Applied Sciences, vol. 9, n° 20, 4364 (2019). ([Link](https://www.mdpi.com/2076-3417/9/20/4364)) ([GitHub](https://github.com/frederic-bousefsaf/ippg-3dcnn))

## Importing libraries

Before running this notebook, install the following Python libraries:
* tensorflow
* matplotlib
* scipy
* numpy
* opencv-python
* copy (part of the Python standard library, no installation required)
* pyVHR
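
Before running the import cell, it can help to check which of these libraries are visible to the current interpreter. The sketch below is only an illustration: `find_missing_modules` is a hypothetical helper, and the list uses import names rather than pip names (for example, `opencv-python` is imported as `cv2`).

```python
import importlib.util

# Import names of the libraries listed above (import names can differ from pip names).
REQUIRED_MODULES = ["tensorflow", "matplotlib", "scipy", "numpy", "cv2", "pyVHR"]

def find_missing_modules(module_names):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]

missing = find_missing_modules(REQUIRED_MODULES)
print("Missing modules:", missing if missing else "none")
```

Any name reported as missing has to be installed before the cells below can run.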

In [1]:
import os
#RUN ON CPU
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

#Tensorflow/KERAS
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.models import model_from_json

# Numpy / Matplotlib / OpenCV / Scipy / Copy
import numpy as np
import scipy.io
import scipy.stats as sp
import matplotlib.pyplot as plt
import cv2
from copy import copy

#pyVHR
from pyVHR.signals.video import Video
from pyVHR.datasets.dataset import Dataset
from pyVHR.datasets.dataset import datasetFactory

# Functions for making predictions

## Loading the video & pyVHR processing


In the pyVHR framework, we work on a processed video. The processing detects and extracts a region of interest (ROI), so that the rPPG methods are applied only to relevant data.

In [2]:
# -- Video object
def extractionROI(videoFilename):
    video = Video(videoFilename)
    video.getCroppedFaces(detector='dlib', extractor='skvideo')
    video.setMask(typeROI='skin_adapt',skinThresh_adapt=0.22)
    return video

## Loading the model
Load the model and define the BPM classes.

In [3]:
# Load model
def loadmodel(MODEL_PATH):
    # read the architecture from JSON and close the file afterwards
    with open(f'{MODEL_PATH}/model_conv3D.json') as json_file:
        model = model_from_json(json_file.read())
    model.load_weights(f'{MODEL_PATH}/weights_conv3D.h5')
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    # define the frequencies // output dimension (number of classes used during training)
    freq_BPM = np.linspace(55, 240, num=model.output_shape[1]-1)
    freq_BPM = np.append(freq_BPM, -1)     # noise class
    return model, freq_BPM
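
The class layout built in `loadmodel` can be inspected without loading the network. The sketch below assumes a model with 76 outputs (75 pulse-rate classes plus the noise class). That count is an assumption, chosen because it gives the 2.5 BPM spacing that the validation cell relies on. `build_bpm_classes` is a hypothetical helper, not part of the notebook or of pyVHR.

```python
import numpy as np

def build_bpm_classes(n_outputs, bpm_min=55.0, bpm_max=240.0, noise_label=-1.0):
    """Map each output index of the classifier to a pulse rate in BPM.

    The last index is reserved for the noise class.
    """
    bpm_classes = np.linspace(bpm_min, bpm_max, num=n_outputs - 1)
    return np.append(bpm_classes, noise_label)

# Assumption: 76 outputs (75 pulse-rate classes + 1 noise class).
freq_BPM = build_bpm_classes(76)
print(freq_BPM[:3])                # first pulse-rate classes
print(freq_BPM[-2:])               # last pulse-rate class and the noise label
print(freq_BPM[1] - freq_BPM[0])   # spacing between neighbouring classes
```

With a different number of outputs the spacing changes, so any code that hard-codes the step should be checked against the loaded model.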


## Converting videoframes to a single channel array

Select a single channel to use for the prediction.

In [4]:
# 2. LOAD DATA
def convertVideoToTable(video,model):
    imgs = np.zeros(shape=(model.input_shape[1], video.cropSize[0], video.cropSize[1], 1))

    # channel extraction
    if (video.cropSize[2]<3):
        IMAGE_CHANNELS = 1
    else:
        IMAGE_CHANNELS = video.cropSize[2]

    # load images (imgs contains the whole video)
    for j in range(model.input_shape[1]):

        if (IMAGE_CHANNELS==3):
            temp = video.faces[j]/255
            temp = temp[:,:,1]      # only the G component is currently used
        else:
            temp = video.faces[j] / 255

        imgs[j] = np.expand_dims(temp, 2)
    return imgs
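
The channel selection above can be tried on synthetic data, without pyVHR. The following is a minimal vectorized sketch, assuming frames stored as 8-bit arrays. `to_single_channel` and the random `fake_faces` array are illustrative stand-ins for `convertVideoToTable` and `video.faces`.

```python
import numpy as np

def to_single_channel(frames):
    """Scale 8-bit frames to [0, 1] and keep one channel per frame.

    frames: array of shape (n_frames, height, width, channels)
            or (n_frames, height, width).
    Returns an array of shape (n_frames, height, width, 1).
    """
    frames = np.asarray(frames, dtype=np.float64) / 255
    if frames.ndim == 4 and frames.shape[-1] == 3:
        frames = frames[..., 1]      # index 1 = green channel
    elif frames.ndim == 4:
        frames = frames[..., 0]      # already single-channel
    return frames[..., np.newaxis]

# Synthetic stand-in for video.faces: 60 random colour frames of 32x32 pixels.
rng = np.random.default_rng(0)
fake_faces = rng.integers(0, 256, size=(60, 32, 32, 3), dtype=np.uint8)
imgs = to_single_channel(fake_faces)
print(imgs.shape)
```

Index 1 is the green channel in both RGB and BGR ordering, so the selection does not depend on which of the two the video reader uses.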

## Formatting the video

Build the model input from the video frames: each frame is centered on its mean and resized to the input size expected by the model.

In [53]:
def formatingDataTest(video, model, imgs):
    n_frames, height, width = model.input_shape[1:4]
    xtest = np.zeros(shape=(n_frames, height, width, 1))
    for j in range(n_frames):
        faceCopy = copy(imgs[j])
        # remove the spatial mean of the frame
        faceCopy = faceCopy - np.mean(faceCopy)
        # cv2.resize expects dsize as (width, height)
        faceCopy = cv2.resize(faceCopy, (width, height))
        # restore the channel axis
        xtest[j] = faceCopy.reshape(height, width, 1)

    return xtest
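
The per-frame centering step can also be written without an explicit loop. The sketch below covers only the mean subtraction, not the resizing, and runs on random data. `center_frames` is a hypothetical helper introduced for illustration.

```python
import numpy as np

def center_frames(frames):
    """Subtract from every frame its own spatial mean.

    frames: array of shape (n_frames, height, width, 1).
    """
    frame_means = frames.mean(axis=(1, 2, 3), keepdims=True)
    return frames - frame_means

rng = np.random.default_rng(0)
frames = rng.random((60, 25, 25, 1))
centered = center_frames(frames)
# Largest residual mean over all frames; should be close to zero.
print(np.abs(centered.mean(axis=(1, 2, 3))).max())
```

Centering each frame separately removes the frame's average intensity while leaving the spatial variation within the frame untouched.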


## Making a prediction

Use the model to make a prediction.

In [6]:
def getPrediction(model,freq_BPM, xtest):
    # add the batch dimension expected by the model
    input_tensor = tf.convert_to_tensor(np.expand_dims(xtest, 0))
    # forward pass on the single sample
    h = model(input_tensor).numpy()
    # output of the model for that sample (one score per class)
    return h[0]

## Finding the label associated with the prediction

In [7]:
def getClass(h, freq_BPM):
    # index of the highest score; its label is the estimated BPM
    idx = int(np.argmax(h))

    return freq_BPM[idx]
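
The lookup performed by `getClass` amounts to an argmax over the model output. The sketch below shows this on a made-up four-class example and also returns the winning score, which makes it easy to spot predictions that fall on the noise label (-1). `decode_prediction`, the class array, and the score values are all hypothetical.

```python
import numpy as np

def decode_prediction(scores, bpm_classes):
    """Return (label, score) for the class with the highest score."""
    idx = int(np.argmax(scores))
    return float(bpm_classes[idx]), float(scores[idx])

# Hypothetical example: three pulse rates plus the noise label (-1).
bpm_classes = np.array([55.0, 57.5, 60.0, -1.0])

scores = np.array([0.1, 0.6, 0.2, 0.1])
label, score = decode_prediction(scores, bpm_classes)
print(label, score)

noisy_scores = np.array([0.1, 0.1, 0.1, 0.7])
noisy_label, _ = decode_prediction(noisy_scores, bpm_classes)
print("noise" if noisy_label == -1 else noisy_label)
```

A caller that needs a pulse rate may want to treat the noise label as "no estimate" rather than as a BPM value.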
        

# Make a prediction

Function to make a prediction on real data (only the first 60 frames are used in this example).

In [54]:
videoFilename = "./UBFC/DATASET_2/subject1/vid.avi"  # path of the video to process
modelFilename = "./model"   # path of the model directory

def makePrediction(videoFilename, modelFilename):
    # ROI extraction
    video = extractionROI(videoFilename)
    # display the extracted ROI
    video.showVideo()
    # load the model
    model, freq_BPM = loadmodel(modelFilename)
    # extract the green channel (or the single grayscale channel)
    framesOneChannel = convertVideoToTable(video,model)
    # data preparation
    xtest = formatingDataTest(video, model, framesOneChannel)
    prediction = getPrediction(model,freq_BPM,xtest)
    bpm = getClass(prediction, freq_BPM)
    return bpm

print('BPM frequency estimated = ' + str(makePrediction(videoFilename, modelFilename)))


BPM frequency estimated = 105.0
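
Since this example only looks at the first 60 frames, a natural follow-up is to enumerate further windows of the same length. The sketch below only computes window start indices and does not run the model. `window_starts`, the step values, and the frame count of 200 are assumptions made for illustration.

```python
def window_starts(n_frames, window_size=60, step=60):
    """Start indices of all complete windows that fit in n_frames."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    return list(range(0, n_frames - window_size + 1, step))

# Hypothetical video of 200 frames.
print(window_starts(200))            # non-overlapping windows
print(window_starts(200, step=30))   # windows overlapping by half
```

Each start index `s` would select frames `s` to `s + 59`, which could then go through the same preparation and prediction steps as the first window.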


# Validation test on real data

Test on the first 60 frames of each video.

In [55]:
videoFilenames = ["./UBFC/DATASET_2/subject12/vid.avi", "./UBFC/DATASET_2/subject16/vid.avi"]
GT = ["./UBFC/DATASET_2/subject12/ground_truth.txt","./UBFC/DATASET_2/subject16/ground_truth.txt"]

modelFilename = "./model"
model, freq_BPM = loadmodel(modelFilename)

dataset = datasetFactory("UBFC2")
winSizeGT = 2      

for i in range(0, len(videoFilenames)):
    prediction = makePrediction(videoFilenames[i], modelFilename)
    print("Prediction Video "+ str(i+1) +" : "+ str(prediction))
    
    sigGT = dataset.readSigfile(GT[i])
    bpmGT, timesGT = sigGT.getBPM(winSizeGT)
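    # Map the ground-truth BPM to the nearest class
    # (assumes classes starting at 55 BPM and spaced 2.5 BPM apart)
    bpm = np.round(bpmGT)
    bpm = bpm - 55
    bpm = np.round(bpm / 2.5)
    # use the third ground-truth window as the reference value
    GT_value = freq_BPM[int(bpm[2])]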
    print("GT Video "+ str(i+1) +" : "+str(GT_value))
    
    print("ABS DIFF Video "+ str(i+1) +" : "+str(abs(GT_value-prediction)))
    

Prediction Video 1 : 67.5
GT Video 1 : 67.5
ABS DIFF Video 1 : 0.0


Prediction Video 2 : 90.0
GT Video 2 : 95.0
ABS DIFF Video 2 : 5.0
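
The ground-truth formatting in the validation loop quantizes a BPM value to the nearest pulse-rate class. The arithmetic can be sketched on its own as follows, assuming classes that start at 55 BPM and are spaced 2.5 BPM apart, as in the loop above. `nearest_class_index` and `class_to_bpm` are hypothetical helpers, and the BPM value is made up, not taken from the dataset.

```python
import numpy as np

def nearest_class_index(bpm, bpm_min=55.0, step=2.5):
    """Index of the pulse-rate class closest to a BPM value."""
    return int(np.round((np.round(bpm) - bpm_min) / step))

def class_to_bpm(index, bpm_min=55.0, step=2.5):
    """Pulse rate (in BPM) represented by a class index."""
    return bpm_min + step * index

# Hypothetical ground-truth value.
gt_bpm = 94.3
idx = nearest_class_index(gt_bpm)
print(idx, class_to_bpm(idx))
```

For 94.3 BPM the steps are: round to 94, subtract 55 to get 39, divide by 2.5 to get 15.6, and round to class 16, which stands for 95.0 BPM. Values outside the 55 to 240 BPM range map to indices outside the pulse-rate classes, and this sketch does not clip them.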
