### This notebook is a walkthrough of how to use extract_faces.py.

You will need to have pliers and its face_recognition dependency installed.

To install, in command line run:

This script reads a features.json file that defines the frame sampling rate, the download path (where the movies are stored), and the save path (where the extracted features are written).
For convenience, the same .json file can also hold the other parameters you may need for extracting semantic or low-level visual features. Make sure that you give complete file paths both for where to find the movies and for where to save the features.

Note: You may want your sampling rate to match the TRs, so that each sampled frame lines up with one TR.
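As a rough illustration of the note above, the sketch below works out a sampling rate that gives one sampled frame per TR, and the resulting stride in movie frames. The TR of 1 second is an assumption about the acquisition (check your own protocol), the fps of 24 is the value used in this notebook's example configuration, and the helper names are hypothetical; they are not part of extract_faces.py.

```python
def samplerate_for_tr(tr_seconds: float) -> float:
    """Samples per second needed to get exactly one sample per TR."""
    if tr_seconds <= 0:
        raise ValueError("tr_seconds must be positive")
    return 1.0 / tr_seconds


def frame_stride(fps: float, samplerate: float) -> float:
    """Number of movie frames between two consecutive samples."""
    return fps / samplerate


tr_seconds = 1.0  # ASSUMPTION: repetition time in seconds; replace with your own
fps = 24          # frames per second of the movie ("fps" in the .json file)

samplerate = samplerate_for_tr(tr_seconds)
stride = frame_stride(fps, samplerate)
print(f"samplerate = {samplerate:g} per second, one sample every {stride:g} frames")
```

With these assumed values, the result corresponds to `"samplerate": 1` in the .json file.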

#### Here's an example of what should go in the .json file:


{"hcpmovies": {  
&emsp;      "hdim": 90,                ---> desired horizontal dimension of downsampled image  
&emsp;      "vdim": 128,               ---> desired vertical dimension of downsampled image  
&emsp;      "fps": 24,                 ---> frames per second of the movie  
&emsp;      "dir": [0, 30, 60,  90, 120, 150, 180, 210, 240, 270, 300, 330], ---> spatial directions of gabors (aka motion direction)  
&emsp;      "sf": [0,4,8,16],               ---> spatial frequency range for gabors  
&emsp;      "tf": [0,4],               ---> temporal frequency range for gabors  
&emsp;      "samplerate": 1,           ---> the number of frames per second to sample  
&emsp;      "downloadpath": "/home/jovyan/shared/hcp-7T_Movies/movie/unzip/Post_20140821_version/", ---> path to movies  
&emsp;      "movies": ["7T_MOVIE1_CC1_v2.mp4", ---> list of movie names  
&emsp; &emsp; &emsp; &emsp; &emsp;  "7T_MOVIE2_HO1_v2.mp4",   
&emsp; &emsp; &emsp; &emsp; &emsp;  "7T_MOVIE3_CC2_v2.mp4",   
&emsp; &emsp; &emsp; &emsp; &emsp;  "7T_MOVIE4_HO2_v2.mp4"],  
&emsp;      "savepath": "/home/jovyan/workingdirectory/", ---> where you want to save features  
&emsp;      "TRs": {"7T_MOVIE1_CC1_v2": {"train1": [20,265], "train2": [285,506], "train3": [526,714], "train4": [735,798], "test1":[818,901]},  
&emsp; &emsp; &emsp; &emsp; &emsp;  "7T_MOVIE2_HO1_v2": {"train5":[20,248],"train6":[267,526],"train7":[545,795],"test2":[815,898]},  
&emsp; &emsp; &emsp; &emsp; &emsp;  "7T_MOVIE3_CC2_v2": {"train8":[20,200],"train9": [220,405], "train10": [425,628], "train11": [650,793], "test3": [812,895]},  
&emsp; &emsp; &emsp; &emsp; &emsp;  "7T_MOVIE4_HO2_v2": {"train12":[20,254],"train13":[272,503],"train14":[522,777],"test4":[798,881]}  ---> list of movie names and the associated ranges of TRs to train and test on  
&emsp;      }  
&emsp; }  
}  
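The `--->` annotations in the example above are explanatory only and would have to be left out of a real .json file. As a minimal sketch of how such a file could be read and checked before running the extraction, the snippet below loads a trimmed-down configuration, verifies a few keys, and builds the full movie paths. The helper names, the set of required keys, and the placeholder paths are assumptions made for illustration; the actual loading logic in extract_faces.py may differ.

```python
import json
import os
import tempfile

# ASSUMPTION: the minimum set of keys the face extraction needs.
REQUIRED_KEYS = ("samplerate", "downloadpath", "savepath", "movies")


def load_dataset_params(json_filepath, dataset="hcpmovies"):
    """Load the parameter block for one dataset and check its keys."""
    with open(json_filepath) as f:
        config = json.load(f)
    params = config[dataset]
    missing = [key for key in REQUIRED_KEYS if key not in params]
    if missing:
        raise KeyError(f"missing keys in {dataset!r}: {missing}")
    return params


def movie_paths(params):
    """Join the download path with each movie file name."""
    return [os.path.join(params["downloadpath"], m) for m in params["movies"]]


# Trimmed-down stand-in for features.json (placeholder paths, one movie).
example = {
    "hcpmovies": {
        "fps": 24,
        "samplerate": 1,
        "downloadpath": "/path/to/movies/",
        "movies": ["7T_MOVIE1_CC1_v2.mp4"],
        "savepath": "/path/to/output/",
        "TRs": {"7T_MOVIE1_CC1_v2": {"train1": [20, 265], "test1": [818, 901]}},
    }
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "features.json")
    with open(path, "w") as f:
        json.dump(example, f)
    params = load_dataset_params(path)

print(movie_paths(params))
for movie, runs in params["TRs"].items():
    for run, (start, stop) in runs.items():
        print(f"{movie} {run}: TRs {start} to {stop}")
```

Running a check like this first can catch a missing quote, comma, or brace in the .json file before a long extraction starts.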

#### Package imports
Next, we import the packages we will need to extract and save the features we want.

In [1]:
import imageio
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
import face_recognition
import pliers
import os
from os.path import join

from pliers.stimuli import VideoStim
from pliers.graph import Graph
from pliers.filters import FrameSamplingFilter
from pliers.extractors import (FaceRecognitionFaceLocationsExtractor,
                               FaceRecognitionFaceEncodingsExtractor,
                               merge_results)

from pliers.converters import VideoToAudioConverter



#### To run the extractor in a Python notebook:

In [2]:
import utils

#### JSON file path
Set the file path to the specific .json file you want to load your parameters from.

Here is an example path:

In [3]:
json_filepath = '/home/jovyan/hackathon/visual-feature-decoding/extract_features/feature_AHH.json'

#### Extracting the face information
This script takes the movies specified in your .json file, loops over them, and outputs a .npz file containing the extracted information for each one. The .npz will contain, by column:

[order, sample duration, time of face onset, face identity, coordinates of the bounding box for the identified face (in pixels)]

The higher the sampling rate, the longer the extraction will take.

In [5]:
utils.extract_faces(json_filepath)

Stim: 921it [08:52,  1.73it/s]
Stim: 918it [08:44,  1.75it/s]
Stim: 915it [08:37,  1.77it/s]
Stim: 901it [08:43,  1.72it/s]
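To get a feel for how the sampling rate affects extraction time, the sketch below scales the first progress line above (921 items at about 1.73 items per second) to other sampling rates. It assumes that run sampled one frame per second and that the per-frame cost stays roughly constant, neither of which is guaranteed; the function name is hypothetical.

```python
def estimate_runtime_seconds(movie_seconds, samplerate, frames_per_second_processed):
    """Rough runtime: number of sampled frames divided by processing speed."""
    n_frames = movie_seconds * samplerate
    return n_frames / frames_per_second_processed


movie_seconds = 921  # ASSUMPTION: 921 items at 1 sample/s means about 921 s of movie
throughput = 1.73    # items per second, read off the progress bar above

for rate in (1, 4, 24):
    seconds = estimate_runtime_seconds(movie_seconds, rate, throughput)
    print(f"samplerate={rate:>2}: about {seconds / 60:.0f} min for this movie")
```

Under these assumptions the estimate grows linearly with the sampling rate, so sampling every frame of a 24 fps movie would take roughly 24 times as long as sampling once per second.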


If you want to take a quick look at the saved data...

In [12]:
import numpy as np

# Load the .npz file
data = np.load('/home/jovyan/hackathon/visual-feature-decoding/extract_features/extract_faces/extracted_data/7T_MOVIE1_CC1_v2_faces.npz', allow_pickle=True)

# Access the arrays stored in the .npz file using keys
features = data['features']
print(features)

# Close the file after using it
data.close()

[[nan 1.0 82.0 0 (204, 760, 590, 375)]
 [nan 1.0 84.0 0 (134, 812, 455, 491)]
 [nan 1.0 85.0 0 (76, 803, 461, 418)]
 ...
 [nan 1.0 895.0 2 (147, 250, 199, 199)]
 [nan 1.0 895.0 3 (155, 602, 229, 527)]
 [nan 1.0 895.0 4 (87, 467, 149, 405)]]
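For further inspection, it can help to put the array into a pandas DataFrame with named columns. The sketch below uses a few rows copied from the printed output above in place of the loaded `features` array. The column names are assumptions based on the column description in this notebook, and the (top, right, bottom, left) ordering of the bounding box is the convention used by face_recognition; whether the saved file keeps that ordering is also an assumption.

```python
import pandas as pd

# ASSUMPTION: column names follow the description given earlier in the notebook.
COLUMNS = ["order", "duration", "onset", "face_id", "bbox"]
# ASSUMPTION: bounding boxes keep face_recognition's (top, right, bottom, left) order.
BBOX_COLUMNS = ["top", "right", "bottom", "left"]


def features_to_frame(features):
    """Turn the rows of the features array into a DataFrame with one bbox column per edge."""
    df = pd.DataFrame(features, columns=COLUMNS)
    boxes = pd.DataFrame(df["bbox"].tolist(), columns=BBOX_COLUMNS, index=df.index)
    df = df.drop(columns="bbox").join(boxes)
    df["width"] = df["right"] - df["left"]
    df["height"] = df["bottom"] - df["top"]
    return df


# Rows copied from the printed output above (stand-in for data['features']).
rows = [
    (float("nan"), 1.0, 82.0, 0, (204, 760, 590, 375)),
    (float("nan"), 1.0, 84.0, 0, (134, 812, 455, 491)),
    (float("nan"), 1.0, 895.0, 2, (147, 250, 199, 199)),
    (float("nan"), 1.0, 895.0, 3, (155, 602, 229, 527)),
    (float("nan"), 1.0, 895.0, 4, (87, 467, 149, 405)),
]

df = features_to_frame(rows)
faces_per_onset = df.groupby("onset").size()

print(df[["onset", "face_id", "width", "height"]])
print(faces_per_onset)
```

Grouping by onset gives the number of face rows recorded at each sampled time point, which is a quick sanity check on the extraction.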
