# Inference 

---

Inference consists of three parts: a face detector, a spectrum translator (NIR to VIS and vice versa), and a facial expression recognition (FER) part.

### Face Detector
Face detection can currently be done with either of the following two models:
* **RetinaFace** - model from [here](https://github.com/serengil/deepface). It is accurate but slower. RetinaFace can also align the detected face.
* **CenterFace** - the original CenterFace from [here](https://gitlab.fit.cvut.cz/vadlemar/real-time-facial-expression-recognition-in-the-wild). It is fast but less accurate than RetinaFace. The ONNX model is stored in `models/face_detection/centerface.onnx`. Unlike RetinaFace, this option does not implicitly align faces in the `Inference` class.

Additional detectors from the DeepFace module, such as *MTCNN* or the *Viola-Jones* algorithm, can be added easily.

### Spectrum Translator
Translates images between the NIR and VIS spectra.

* **CycleGAN on CASIA+OuluCasia db** - Translates between both spectra.
* **FFE-CycleGAN on BUAA db** - This translates between NIR images and averaged grayscale image (averaged from color green illuminated RGB image). The translation is good, however not 

### Facial expression recognition
Classifies the categorical emotion and regresses the valence-arousal labels.
* **MobileNet** - FER net from the original work [here](https://github.com/serengil/deepface). Does not work on NIR images. The ONNX file is stored in `models/pretrained/mobilenet_simultaneous.onnx`.
* **MobileNet-NIR** - The original MobileNet trained further on NIR images from the Oulu-Casia database. Several ONNX versions are stored in the folder `models/mobilenet_NIR/`. Additional information is given below.

***

#### MobileNet-NIR on Oulu-Casia db
This model was initialized from the original MobileNet facial expression recognition model and trained further on Oulu-Casia NIR images.
It produces both categorical and dimensional (valence-arousal) predictions.

It was trained on 7 emotions (neutral, anger, disgust, fear, happy, sad, surprise), without contempt, which the original model includes.


Because the dataset this model was trained on has only categorical labels and no valence-arousal labels, these were assigned as follows. For each subject and each emotion, the dataset contains images extracted from a video that progresses from a neutral face to the peak expression. Valence-arousal values were assigned linearly along this progression.

The peak expression of each categorical emotion is anchored to the following V/A values:

* neutral: (0., 0.),
* anger: (-0.51, 0.59),
* disgust: (-0.60, 0.35),
* fear: (-0.64, 0.6),
* happiness: (0.81, 0.51),
* sadness: (-0.63, -0.27),
* surprise: (0.4, 0.67)

The VA labels above lie in [-1, 1] and have not yet been recalculated from the circumplex model.
The model can be retrained quickly; so far it has been trained for only 15 epochs.
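
The linear assignment described above can be sketched as follows. This is a minimal illustration, not the training code: the function name `interpolate_va`, the `PEAK_VA` dictionary, and the assumption that frames are evenly spaced from neutral (first frame) to the peak expression (last frame) are all hypothetical.

```python
# Hypothetical sketch of the linear V/A label assignment; not the original training code.

# Peak-expression anchors (valence, arousal) taken from the list above.
PEAK_VA = {
    "neutral": (0.0, 0.0),
    "anger": (-0.51, 0.59),
    "disgust": (-0.60, 0.35),
    "fear": (-0.64, 0.60),
    "happiness": (0.81, 0.51),
    "sadness": (-0.63, -0.27),
    "surprise": (0.40, 0.67),
}


def interpolate_va(emotion, n_frames):
    """Return one (valence, arousal) pair per frame of a sequence.

    Assumes the first frame is neutral (0, 0), the last frame shows the
    peak expression, and intensity grows linearly in between.
    """
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    peak_v, peak_a = PEAK_VA[emotion]
    if n_frames == 1:
        # A single frame is treated as the peak expression (assumption).
        return [(peak_v, peak_a)]
    labels = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        labels.append((t * peak_v, t * peak_a))
    return labels


labels = interpolate_va("happiness", 5)
```

Under these assumptions, the first frame of every sequence gets the neutral label and the last frame gets the anchor value of its emotion.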

## Definitions

Imports

In [1]:
from skeleton.inference import Inference

2024-03-14 13:38:29.035588: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-03-14 13:38:29.089118: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-03-14 13:38:29.089776: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


The import above provides the definitions of the inference classes of the individual networks and of one encapsulating `Inference` class.

## Inference examples

***

Before inference can run, each of the three parts must be defined: *face detector*, *spectrum translator* and *facial expression recognition* (FER). For each part, either set `"net_type": None` to skip it, or specify the model type together with the path to its ONNX file and any additional options.
These model definitions are shown below and are then passed to the `Inference` object.

The available model types are:
* **Face detector** - `Inference.net_type.FACE_DETECTOR_CENTERFACE` or `Inference.net_type.FACE_DETECTOR_RETINAFACE`
* **FER** - `Inference.net_type.FER_MOBILENET`
* **Spectrum translation** - `Inference.net_type.SPECTRUM_TRANSLATOR_FFE_CYCLEGAN` or `Inference.net_type.SPECTRUM_TRANSLATOR_ORIG_CYCLEGAN`

Each model is in the folder *models/*.
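
To illustrate how `"net_type": None` skips a part, the sketch below lists which parts of a `models` dictionary are enabled. The helper `enabled_parts` is hypothetical and not part of `skeleton.inference`; plain strings stand in for the `Inference.net_type` members so that the example runs without the project installed.

```python
# Hypothetical helper; not part of skeleton.inference.
PART_ORDER = ("face_detector", "spectrum_translator", "fer")


def enabled_parts(models):
    """Return the names of the parts whose 'net_type' is not None, in pipeline order."""
    return [
        name
        for name in PART_ORDER
        if models.get(name, {}).get("net_type") is not None
    ]


# Strings stand in for Inference.net_type members (assumption for illustration).
example_models = {
    "face_detector": {"net_type": "FACE_DETECTOR_RETINAFACE"},
    "spectrum_translator": {"net_type": "SPECTRUM_TRANSLATOR_ORIG_CYCLEGAN"},
    "fer": {"net_type": None},  # FER is skipped
}

active = enabled_parts(example_models)
print(active)  # ['face_detector', 'spectrum_translator']
```

A check like this can be handy before a long run, to confirm that the configuration enables exactly the parts that were intended.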

***

Once the `Inference` object is defined, inference can be run in one of three ways: *from_folder* loads images from a folder, *from_filenames* loads images from a list of file paths, and *from_array* takes an already loaded image as an `np.array`.

Inference can also be run in *infer_instant* mode, where each image is immediately passed through all parts of the pipeline:
<!-- Inference, as opposed to *infer* mode, where are first all images detected, then than those translated and then FER. -->

<!-- Thus following functions are available:
* `inf.infer_from_folder('path/to/folder')`
* `inf.infer_from_filenames(['path/to/image1', 'path/to/image2'])`
* `inf.infer_from_array(img_np_array)` -->

<!-- And for Inference of each image separately is: -->
* `inf.infer_instant_from_folder('path/to/folder', save_to_folder='pth/to/save_fld')`
* `inf.infer_instant_from_filenames(['path/to/image1', 'path/to/image2'], save_to_folder='pth/to/save_fld')`
* `inf.infer_instant_from_array(img_np_array, save_to_folder='pth/to/save_fld')`
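
The per-image flow of the *infer_instant* mode can be pictured with the sketch below. It is not the `Inference` implementation: `run_instant` and the stand-in callables are hypothetical, images are represented by strings so that the example is self-contained, and treating the whole image as one face when no detector is set is an assumption.

```python
# Hypothetical sketch of per-image ("instant") processing; not the Inference class itself.

def run_instant(images, detect=None, translate=None, recognize=None):
    """Pass each image through all enabled parts before moving to the next image.

    A part set to None is skipped, mirroring "net_type": None.
    """
    results = []
    for image in images:
        # Assumption: without a detector the whole image is treated as a single face.
        faces = detect(image) if detect is not None else [image]
        for face in faces:
            if translate is not None:
                face = translate(face)
            prediction = recognize(face) if recognize is not None else None
            results.append((face, prediction))
    return results


# Stand-in parts operating on strings instead of image arrays.
def fake_detect(image):
    return [f"{image}:face0", f"{image}:face1"]


def fake_translate(face):
    return face + ":nir"


out = run_instant(["a.jpg", "b.jpg"], detect=fake_detect, translate=fake_translate)
```

Here FER is skipped, so every result carries `None` as its prediction, while detection and translation are applied image by image.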

In [1]:
from skeleton.inference import Inference

# Define here which models to use. If you do not wish to use a certain model, set its 'net_type' to None.
# To apply a model, pick one of the Inference.net_type values listed above.
models = {
    "face_detector": {
        "net_type": Inference.net_type.FACE_DETECTOR_RETINAFACE,
        # If using the RetinaFace detector, align faces when True.
        # Should not be used together with 'remove_black_stripes' == True (artifacts will be created)
        "retina_face_align": True,
        # Returned images have square dimensions, but detectors return rectangles,
        # so there are black stripes on the sides. To remove those stripes, set True.
        "remove_black_stripes": False,
        # Display detected images in notebook
        "display_images": False,
        # Save images - when loaded from a filepath, saved as <original_filestem>_<face_idx>.<original_extension>,
        # otherwise as <detection_id>.jpg starting from 0
        "save_image_to_folder": None,
    },
    "spectrum_translator": {
        "net_type": Inference.net_type.SPECTRUM_TRANSLATOR_ORIG_CYCLEGAN,
        "pth_to_onnx": 'models/spectrum_translation/VIS2NIR_cyclegan_snellius_casia_oulucasia_double_gen_opt-GA-latestepoch.onnx',
        # First translates input image to grayscale and then translates spectra
        "input_as_avg_grayscale": False,
        # Translates newly translated image to averaged grayscale
        "output_as_avg_grayscale": True,
        "display_images": True,
        # Saves as <global idx_count>_<id of face starting from 0>.jpg
        "save_image_to_folder": "transtmp",
    },
    "fer": {
        "net_type": None, # Inference.net_type.FER_MOBILENET,
        "pth_to_onnx": 'models/_old/mobilenet_NIR/mobilenet_on_AffectNet-NIR/mobilenet_aff_nir-aff_continue.onnx',
        "display_images": True,
        # Saves as <global idx_count>_<id of face starting from 0>.jpg
        "save_image_to_folder": "fertmp",
    }
}
# The debug flag displays images; verbose prints additional secondary information
inf = Inference(models, None, verbose=True)



Using '{'net_type': <net_type.FACE_DETECTOR_RETINAFACE: 'G'>, 'retina_face_align': True, 'remove_black_stripes': False, 'display_images': False, 'save_image_to_folder': None}' as face detector model
Using '{'net_type': <net_type.SPECTRUM_TRANSLATOR_ORIG_CYCLEGAN: 'B'>, 'pth_to_onnx': 'models/spectrum_translation/VIS2NIR_cyclegan_snellius_casia_oulucasia_double_gen_opt-GA-latestepoch.onnx', 'input_as_avg_grayscale': False, 'output_as_avg_grayscale': True, 'display_images': True, 'save_image_to_folder': 'transtmp'}' as spectrum transfer model
Orig-CycleGAN model from 'models/spectrum_translation/VIS2NIR_cyclegan_snellius_casia_oulucasia_double_gen_opt-GA-latestepoch.onnx' loaded.


In [None]:
output = inf.infer_instant_from_folder('../DIP/Facial-expression-analysis-from-NIR-image/data/AffectNet-8Labels/train_set/images_small/')