<h1>Basic Demo</h1> 

Basic steps to process a video for HRV estimation through rPPG methods:
<ul>
    <li><a href="#video">Load video and crop faces</a></li>
    <li><a href="#roi">Process ROIs</a></li>
    <li><a href="#method">Apply rPPG methods</a></li>
    <li><a href="#test">Analyze results</a></li>
    <li><a href="#tune">Do fine tuning</a></li>
    <li><a href="#gt">Compare with ground truth (GT)</a></li>
</ul>

In [1]:
# -- Modules and packages to import for demo
from pyVHR.signals.video import Video
from pyVHR.methods.pos import POS
from pyVHR.methods.chrom import CHROM
from pyVHR.analysis.testsuite import TestSuite, TestResult




<a id="video"></a>
<h2>Load video and crop faces</h2> 

Given a video sequence $v(t)$, the process starts by extracting the portion of the image corresponding to the face from each frame.
The pyVHR package includes the dlib and mtcnn face detectors, plus a Kalman filter for face tracking. Thus,
\begin{equation*}
b(t)=
\begin{cases} 
    \text{dlib}(v(t))  \\ 
    \text{mtcnn}(v(t)) \\ 
    \text{kalman}(v(t)).
\end{cases}
\end{equation*}
The signal $b(t)$ has dimensions $w\times h \times 3 \times W_s$, where $w$ and $h$ are the width and the height of the bounding box containing the face, respectively. It has 3 channels because it is coded in the RGB color space, and its depth $W_s$ depends on the time window considered.

<h3>Download video sample</h3> 
To repeat the next experiment:
<ul>
    <li>Download the file <code>ID1.zip</code> (7.57GB) from <a href="https://github.com/partofthestars/LGI-PPGI-DB">github.com/partofthestars/LGI-PPGI-DB</a></li>
    <li>Unpack the file in the pyVHR root directory and rename the folder as <code>sampleDataset</code></li>
    <li>Run the following snippets (cells)</li>
</ul>
        

In [2]:
# -- Video object
videoFilename = ("mesh","mp4")
video = Video(videoFilename)

# -- extract faces
video.getCroppedFaces(detector='mtcnn', extractor='skvideo')

video.printVideoInfo()

print("\nShow video cropped faces, crop size:", video.cropSize)
video.showVideo()



mtcnn



Performing face detection...
Processing: |█████████████████████████████████████████████████-| 99.7% Complete
   * Video filename: ../sampleDataset/mesh.mp4
         Total frames: 300
             Duration: 10.0 (sec)
           Frame rate: 30 (fps)
                Codec: h264
           Num frames: 300
               Height: 480
                Width: 480
             Detector: mtcnn
            Extractor: skvideo

Show video cropped faces, crop size: (266, 194)



<a id="roi"></a>
<h2>Process ROIs</h2> 

For each frame $t=1,\dots,T$, given the ROIs, either rectangular patch-based $R(t)$ or skin-based $S(t)$, we average over all the selected pixels to compute the output  $q(t)$ of this step. More formally, if $N$ denotes the number of rectangular patches, $\left|R^{(i)}(t)\right|$ the number of pixels in the $i$-th patch, and  $\left|S(t) \right |$ the number of detected skin pixels within the face, we have
\begin{equation*}
q(t)=\begin{cases} 
    \text{Patch}(R(t)) \\ \text{Skin}(S(t))
    \end{cases}
\end{equation*}
where
\begin{align*}
    \text{Patch}(R(t))&= 
    \frac{1}{N}\sum_{i=1}^N \frac{1}{\left| R^{(i)}(t) \right|} \sum_{(x,y)\in R^{(i)}(t)} R^{(i)}_{x,y}(t)   \\ 
    \text{Skin}(S(t)) &= 
    \frac{1}{\left| S(t) \right |} \sum_{(x,y)\in S(t)}
    S_{x,y}(t)
\end{align*}
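
The two averages above can be sketched in NumPy for a single frame. This is only an illustration of the formulas, not pyVHR's implementation: the helper names `patch_mean` and `skin_mean` are hypothetical, and the `[x, y, w, h]` rectangle layout is an assumption about how `rectCoords` is interpreted.

```python
import numpy as np

def patch_mean(frame, rects):
    """Patch(R(t)): mean RGB inside each rectangle, then mean over the N rectangles.

    frame: H x W x 3 array for one frame.
    rects: list of [x, y, w, h] rectangles (assumed layout).
    """
    per_patch = []
    for x, y, w, h in rects:
        pixels = frame[y:y + h, x:x + w].reshape(-1, 3)
        per_patch.append(pixels.mean(axis=0))   # average over |R^(i)(t)| pixels
    return np.mean(per_patch, axis=0)           # average over the N patches

def skin_mean(frame, skin_mask):
    """Skin(S(t)): mean RGB over the pixels flagged as skin by a boolean mask."""
    return frame[skin_mask].mean(axis=0)        # average over |S(t)| pixels

# Toy 4x4 frame with two uniformly colored corners
frame = np.zeros((4, 4, 3))
frame[:2, :2] = [10.0, 20.0, 30.0]
frame[2:, 2:] = [30.0, 40.0, 50.0]

q_patch = patch_mean(frame, [[0, 0, 2, 2], [2, 2, 2, 2]])

mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True
q_skin = skin_mean(frame, mask)

print(q_patch, q_skin)
```

Either way, each frame collapses to a single RGB triplet $q(t)$, so a video of $T$ frames yields a $3 \times T$ trace.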


In [3]:
# -- define ROIs: free rectangular regions
video.setMask(typeROI='rect', rectCoords=[[15,20,140,50],[10,120,100,30]])
video.printROIInfo()
video.showVideo()

      ROI type: rect
   Rect coords: [[15, 20, 140, 50], [10, 120, 100, 30]]



In [4]:
# -- define ROIs: standard regions, i.e. 'forehead', 'lcheek', 'rcheek', 'nose'
video.setMask(typeROI='rect', rectRegions=['forehead', 'lcheek', 'rcheek', 'nose'])
video.printROIInfo()
video.showVideo()

      ROI type: rect
   Rect coords: [[38, 26, 116, 31], [29, 143, 29, 29], [135, 143, 29, 29], [67, 133, 58, 21]]



In [5]:
# -- define ROIs: using skin, with adaptive threshold param
video.setMask(typeROI='skin_adapt',skinThresh_adapt=0.2)
video.printROIInfo()
video.showVideo()

      ROI type: skin_adapt
   Skin thresh: 0.2



In [6]:
# -- define ROIs: using skin, with fixed threshold params
video.setMask(typeROI='skin_fix',skinThresh_fix=[20, 50])
video.printROIInfo()
video.showVideo()

      ROI type: skin_fix
   Skin thresh: [20, 50]



<a id="method"></a>
<h2>Apply rPPG methods</h2> 

The temporal RGB trace $x(t) =(x_r(t),x_g(t),x_b(t))^T$ is obtained by preprocessing the raw RGB signal $s(t) =(s_r(t),s_g(t),s_b(t))^T$. The signal $x(t)$ is then split into overlapping subsequences, each representing samples of a finite-length multivariate measurement with $t= 1,2,...,M$, where $M=W_s\cdot f_{ps}$ is the number of frames selected by the sliding window and $f_{ps}$ is the video frame rate. Thus, a method receives as input a chunk of the sequence $x(t)$ and produces as output a univariate temporal sequence $y(t)$, i.e. a real-valued BVP estimate obtained by applying the rPPG model.
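
The windowing step can be sketched as follows. This is a minimal illustration assuming a $3 \times T$ trace, a window length $W_s$ given in seconds, and a one-second stride; the function `sliding_windows` and the `step_sec` parameter are hypothetical and not part of pyVHR.

```python
import numpy as np

def sliding_windows(x, fps, win_sec, step_sec=1.0):
    """Split a 3 x T RGB trace into overlapping chunks of M = win_sec * fps frames."""
    M = int(round(win_sec * fps))        # frames per window
    step = int(round(step_sec * fps))    # frames between window starts (assumed stride)
    T = x.shape[1]
    return [x[:, s:s + M] for s in range(0, T - M + 1, step)]

# Synthetic trace: 10 s at 30 fps, i.e. T = 300 frames
x = np.random.default_rng(0).normal(size=(3, 300))
chunks = sliding_windows(x, fps=30, win_sec=5, step_sec=1)

print(len(chunks), chunks[0].shape)
```

Each chunk would then be passed to an rPPG method, which returns one BVP segment $y(t)$ per window.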

In [7]:
# -- Define a configuration file. 
#    It contains all the information relative to the dataset (e.g. the path), 
#    and the test procedure (e.g. hyperparameters)
cfgFilename = '../pyVHR/analysis/sample.cfg'

# -- apply the pipeline until GT comparison
test = TestSuite(configFilename=cfgFilename)

# -- run the experiment and save the results to a pandas file
#    change verb to see more details as follows:
#       0 - not verbose
#       1 - show the main steps
#       2 - display graphic 
#       3 - display spectra  
#       4 - display errors
#       (use also combinations, e.g. verb=21, verb=321)

result = test.start(outFilename='sampleExp.h5', verb='1')

** Run the test with the following config:
      dataset: LGI_PPGI
      methods: ['POS', 'CHROM']
cms50


<a id="test"></a>
<h2>Test results</h2>

Results from the experiments are stored in a pandas DataFrame, accessible through an object of type TestResult.

In [8]:
# -- pandas file for collecting results
result.dataFrame

<a id="tune"></a>
<h2>Do fine tuning</h2>

This section shows how to tune parameters, visualize intermediate results, and invoke methods on a specific dataset.

In [14]:
# -- Detailed pipeline for refinement steps and tuning

# -- define some params in the form of dict (those in the cfg file) 
params = {"video": video, "verb":2, "ROImask":"rect","rectRegions":['forehead', 'lcheek', 'rcheek', 'nose'], "skinAdapt":0.2,"csv_filename":"mesh"+".csv"}

# -- instantiate the method
m = CHROM(**params)
#m = POS(**params)

# -- run the method offline to get BPM estimates and their timestamps
bpmES, timesES = m.runOffline(**params)

[ 1.34281074e-03  2.05452043e-01  3.53271945e-01  4.12088015e-01
  3.84101724e-01  3.00661291e-01  2.03909836e-01  1.26698156e-01
  8.11731124e-02  6.02957868e-02  4.88603067e-02  3.59296495e-02
  2.12352295e-02  1.29271247e-02  1.96890010e-02  4.31581441e-02
  7.54091980e-02  1.02331375e-01  1.10021939e-01  9.02250641e-02
  4.25333264e-02 -2.63639969e-02 -1.04810220e-01 -1.78636599e-01
 -2.33977574e-01 -2.60631589e-01 -2.55210913e-01 -2.22468210e-01
 -1.73582535e-01 -1.21788702e-01 -7.73701802e-02 -4.44749033e-02
 -2.11443138e-02 -2.12032979e-03  1.74103600e-02  3.91849869e-02
  6.14641653e-02  8.06135535e-02  9.34404436e-02  9.89100994e-02
  9.84552237e-02  9.49612939e-02  9.11645357e-02  8.83600260e-02
  8.60098109e-02  8.23095532e-02  7.53127585e-02  6.40188430e-02
  4.89323935e-02  3.18882261e-02  1.52595341e-02  9.11965786e-04
 -1.06212704e-02 -2.03212542e-02 -3.00416411e-02 -4.12595271e-02
 -5.38327737e-02 -6.55249376e-02 -7.27638991e-02 -7.24264872e-02
 -6.37563429e-02 -4.92932

<a id="gt"></a>
<h2>Compare with ground truth (GT)</h2>

This section shows how to define a new Dataset object of type SAMPLE, load and plot the BVP ground truth (GT) signal, and compute the GT BPMs used for comparison with those estimated by the method.
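
To make the idea of windowed BPM values concrete, here is a rough sketch that estimates BPM from the spectral peak of each window of a synthetic signal. It is not what `getBPM` necessarily does: the spectral-peak technique, the function `windowed_bpm`, the 0.7 to 4 Hz band, and the one-second stride are all assumptions made for illustration.

```python
import numpy as np

def windowed_bpm(sig, fs, win_sec, step_sec=1.0, nfft=4096):
    """Estimate BPM per window from the dominant frequency in a plausible heart-rate band."""
    M = int(round(win_sec * fs))
    step = int(round(step_sec * fs))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    band = (freqs >= 0.7) & (freqs <= 4.0)          # 42 to 240 BPM (assumed band)
    bpm, times = [], []
    for s in range(0, len(sig) - M + 1, step):
        seg = sig[s:s + M] - np.mean(sig[s:s + M])  # remove the DC component
        spec = np.abs(np.fft.rfft(seg * np.hanning(M), n=nfft))  # zero-padded spectrum
        bpm.append(60.0 * freqs[band][np.argmax(spec[band])])
        times.append((s + M / 2) / fs)              # window center, in seconds
    return np.array(bpm), np.array(times)

# Synthetic 20 s "BVP" at 60 Hz with a 1.25 Hz (75 BPM) pulse
fs = 60
t = np.arange(20 * fs) / fs
sig = np.sin(2 * np.pi * 1.25 * t)

bpm, times = windowed_bpm(sig, fs, win_sec=7)
print(bpm.round(1))
```

A longer window gives a finer frequency resolution but smooths over fast heart-rate changes, which is the trade-off behind the choice of `winSizeGT`.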

In [15]:
from pyVHR.datasets.sample import SAMPLE
from pyVHR.datasets.dataset import Dataset

# -- dataset object
dataset = SAMPLE(videodataDIR='../sampleDataset/', BVPdataDIR='../sampleDataset/')

# -- ground-truth (GT) signal
idx = 0   # index of signal within the list dataset.videoFilenames
fname = dataset.getSigFilename(idx)

# -- load signal and build a BVPsignal or ECGsignal object
sigGT = dataset.readSigfile(fname)
sigGT.plot()

# -- plot signal + peaks
sigGT.findPeaks(distance=20)
sigGT.plotBPMPeaks()

# -- compute BPM GT
winSizeGT = 7
bpmGT, timesGT = sigGT.getBPM(winSizeGT)

cms50


IndexError: list index out of range

In [None]:
from pyVHR.utils.errors import getErrors, printErrors, displayErrors

# -- error metrics
RMSE, MAE, MAX, PCC = getErrors(bpmES, bpmGT, timesES, timesGT)
printErrors(RMSE, MAE, MAX, PCC)
displayErrors(bpmES, bpmGT, timesES, timesGT)
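
For reference, the four metrics can be written out in NumPy. This is a sketch under stated assumptions, not the code behind `getErrors`: it assumes the GT values are linearly interpolated at the estimate timestamps and that MAX denotes the maximum absolute error. The function name `error_metrics` is hypothetical.

```python
import numpy as np

def error_metrics(bpm_es, bpm_gt, times_es, times_gt):
    """Return RMSE, MAE, maximum absolute error and Pearson correlation (PCC)."""
    bpm_es = np.asarray(bpm_es, dtype=float)
    # Align GT to the estimate timestamps (assumed alignment strategy)
    gt = np.interp(times_es, times_gt, bpm_gt)
    diff = bpm_es - gt
    rmse = np.sqrt(np.mean(diff ** 2))
    mae = np.mean(np.abs(diff))
    max_err = np.max(np.abs(diff))
    pcc = np.corrcoef(bpm_es, gt)[0, 1]
    return rmse, mae, max_err, pcc

# Toy data: GT sampled every 2 s, estimates at odd seconds
times_gt = np.array([0.0, 2.0, 4.0, 6.0])
bpm_gt = np.array([60.0, 62.0, 64.0, 66.0])
times_es = np.array([1.0, 3.0, 5.0])
bpm_es = np.array([62.0, 63.0, 68.0])

rmse, mae, max_err, pcc = error_metrics(bpm_es, bpm_gt, times_es, times_gt)
print(rmse, mae, max_err, pcc)
```

RMSE and MAE summarize the average deviation in BPM, MAX flags the worst window, and PCC indicates whether the estimate follows the trend of the ground truth.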

In [None]:
# -- print BPM
print("BPMs of the GT signal averaged on winSizeGT = %d sec:" %winSizeGT)
print(bpmGT)

# -- plot spectrogram
sigGT.displaySpectrum()