# Real-Time Multi-Modal Audio, Video and Physiological Data Collection
*by Paul Helstab & Danila Mamontov*

<div class="alert alert-info"><b>Initial Goal</b><br />
To prepare <b>equipment and sensors</b> for complex data collection (<b>audio, video, physiology, eye and body tracking</b>) during a VR experiment that studies the user’s state while navigating a maze under stressful conditions.
</div>

**Our Goal:** 
* **Real-time** data acquisition and visualization of audio, video and physiological signals


**Initial thoughts:**

* Which sensors are we dealing with in the first place?
* What sensor data do we get?
* Check the SDKs / APIs of the sensors
* Is specific hardware needed?
* Real-time data and recording?
    * Visualisation
    * Consider delays and losses
    * How to record together in parallel via several connectors?
* Define architecture
* ...
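
One way to approach the parallel-recording question is one thread per sensor connector, all stamping samples with the same monotonic clock and writing into a shared queue. The following is a minimal sketch under that assumption, not the project's actual architecture: the sensors are simulated, and `record`, `record_parallel`, the sensor names and the rates are hypothetical.

```python
import queue
import threading
import time


def record(name, rate_hz, n_samples, out):
    """Simulate one sensor connector pushing timestamped samples."""
    period = 1.0 / rate_hz
    for i in range(n_samples):
        # time.monotonic() is shared by all threads, so timestamps are comparable
        out.put((time.monotonic(), name, i))
        time.sleep(period)


def record_parallel(sensors):
    """Run one thread per sensor and return all samples sorted by timestamp."""
    out = queue.Queue()
    threads = [
        threading.Thread(target=record, args=(name, rate, n, out))
        for name, rate, n in sensors
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    samples = []
    while not out.empty():
        samples.append(out.get())
    return sorted(samples)


# Hypothetical streams: (name, rate in Hz, number of samples)
samples = record_parallel([("hr", 100, 5), ("acc", 200, 10)])
```

Because every sample carries a timestamp from a common clock, delays and losses per stream can be inspected afterwards by comparing the gaps between consecutive timestamps with the expected sampling period.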

**Provided Sensors:**
 * **Empatica E4** (Real-time physiological signal sensing)
 * **Polar H10** (Real-time heart rate signal sensing)
 * **Azure Kinect Developer Kit** (3D spatial information sensor device to create computer vision and speech model applications)

| Device | Sensors | Data Stream | Frequency (set to) |Output|
|-----------|-----------|-----------|-----------|---|
| **Empatica E4** | Photoplethysmography sensor (PPG) | Blood Volume Pulse <b>(BVP)</b> |64Hz|.csv|
|   |   | Heart Rate <b>(HR)</b> | event-based (derived from BVP)|.csv|
|   |   | Interbeat Interval <b>(IBI)</b> | event-based (derived from BVP)|.csv|
|   | Electrodermal activity sensor (EDA) | Galvanic skin response <b>(GSR)</b> |4Hz|.csv|
|   | 3-axis accelerometer | XYZ-axes per timestep <b>(ACC)</b> |32Hz|.csv|
|   | Optical thermometer (probably infrared) | Skin temperature <br>in degrees Celsius per timestep <b>(TEMP)</b> |4Hz|.csv|
| **Polar H10** | Electrocardiography sensor (ECG) | ECG data <b>(ECG)</b><br>electric potential in microvolts (μV)  |130Hz|.csv|
|   |  | Interbeat Interval <b>(IBI)</b> | event-based (derived from ECG) |.csv|
|   |  | Heart Rate <b>(HR)</b> | event-based (derived from ECG) |.csv|
|   | 3-axis accelerometer (XYZ) | XYZ-axes per timestep <b>(ACC)</b> |200Hz|.csv|
| **Azure Kinect DK** | Depth camera: 1MP Time-of-flight | NFOV unbinned: 640x576 <br> NFOV 2x2 binned: 320x288<br> WFOV 2x2 binned: 512x512<br> WFOV unbinned: 1024x1024<br> Passive IR: 1024x1024 | 0, 5, 15, or 30 FPS |-|
|   | RGB camera: 12MP CMOS sensor rolling shutter | 3840x2160 (UHD)<br>2560x1440 (QHD)<br>1920x1080 (FHD)<br>1280x720 (HD)| 0, 5, 15, or 30 FPS |.mkv|
|   | IMU (Inertial Measurement Unit): 3D accelerometer and gyroscope |  XYZ-axes linear acceleration (m/s²);<br>XYZ-axes rotational speed (rad/s)  | Sampling rate: 1.6 kHz<br>Reporting rate to host: 208 Hz |-|
|   | Microphone: 7-microphone circular array | Channel Count: 7 channels<br>Sample Format: 16-bit integer (Int16)| Sample Rate: 16000 Hz (16 kHz) |.wav|

{cite}`FemtoBol29`
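
The HR and IBI rows in the table are event-based and derived from the underlying BVP or ECG signal. The textbook relation between the two is HR (bpm) = 60 / IBI (s). The sketch below illustrates only that relation; how the devices compute their HR streams internally is not documented here, and the function name `ibi_to_hr` and the sample values are hypothetical.

```python
def ibi_to_hr(ibi_seconds):
    """Convert interbeat intervals (seconds) to instantaneous heart rate (bpm)."""
    # Non-positive intervals are physically meaningless and are skipped
    return [60.0 / ibi for ibi in ibi_seconds if ibi > 0]


# Hypothetical interbeat intervals in seconds
ibis = [1.0, 0.75, 0.5]
hr = ibi_to_hr(ibis)
```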

* An important **distinction** is that between PPG (optical) and ECG (electrodes): ECG serves as the gold standard for accuracy.

**ECG:** 
 * The Polar H10 *potentially* uses two electrodes that make direct contact with the skin.
 * These electrodes are made of conductive materials, *potentially* silver or silver chloride (Ag/AgCl), that can detect the small electrical signals generated by the heart.

These details cannot be verified because no detailed public documentation of the sensor is available.

<div class="alert alert-info"><b>INFO</b><br />
The electrodes in contact with the skin detect the small electrical signals generated by the <b>heart's contractions</b>, measuring the voltage difference between them to produce an electrocardiogram signal.
</div>


**PPG:** 
* The E4's PPG sensor contains light-emitting diodes (LEDs) that emit light into the skin and photodiodes, with a total sensitive area of 14 mm², that measure the light coming back.
    * Specifically, it uses green (2 LEDs) and red (2 LEDs) wavelengths.

<div class="alert alert-info"><b>INFO</b><br />
As the heart pumps blood through the body, the volume of blood in the capillaries under the skin changes. <b>This affects how much light is absorbed or reflected back to the photodiodes.</b>
</div>
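
Each heartbeat therefore shows up as one pulse peak in the BVP signal, and the time between neighbouring peaks is the interbeat interval. The following toy sketch illustrates the idea on a synthetic sine wave sampled at the 64 Hz BVP rate from the table. It is an assumption-laden illustration, not the E4's proprietary algorithm: `find_peaks` is a hypothetical helper, and real PPG data would need filtering and artifact handling first.

```python
import math

FS = 64  # Hz, BVP sampling rate from the table above


def find_peaks(signal):
    """Return indices of local maxima that lie above the signal mean."""
    mean = sum(signal) / len(signal)
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] > mean and signal[i - 1] < signal[i] >= signal[i + 1]
    ]


# Synthetic pulse wave: 1.25 Hz sine (75 beats per minute), 4 seconds long
bvp = [math.sin(2 * math.pi * 1.25 * n / FS) for n in range(4 * FS)]

peaks = find_peaks(bvp)
# Distance between neighbouring peaks, converted from samples to seconds
ibis = [(b - a) / FS for a, b in zip(peaks, peaks[1:])]
```

The resulting intervals can only be resolved to one sample period (1/64 s here), which is one reason why a higher-rate ECG signal yields more precise IBI values.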

**Challenges:**
 * **Empatica E4**
     * Poor proprietary recording software with inaccurate data acquisition and no real-time function
     * Small to non-existent developer community around the SDK
     * Developing custom software is complicated and requires very specific hardware




 * **Polar H10**
     * Proprietary recording software works well but is intended more for athletes
       * No good real-time monitoring
     * Hardly any of the open source projects that use the sensor work reliably



 * **Azure Kinect Developer Kit**
   * Large developer community for projects of all kinds, but no projects with parallel audio and video recording
   * Very unstable recording software with very poor audio recording (audio/video drift)
     * Developed as a CLI application
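
The audio/video drift mentioned above can be quantified by comparing the duration implied by the recorded audio samples with the duration implied by the recorded video frames. This is a minimal sketch, assuming the 16 kHz audio rate and 30 FPS video rate from the table; `av_drift_seconds` and the sample counts are hypothetical.

```python
def av_drift_seconds(n_audio_samples, n_video_frames, sample_rate_hz=16000, fps=30):
    """Audio duration minus video duration; positive means audio runs longer."""
    return n_audio_samples / sample_rate_hz - n_video_frames / fps


# Hypothetical recording: 60 s of audio, but three video frames were dropped
drift = av_drift_seconds(n_audio_samples=960_000, n_video_frames=1_797)
```

A drift that grows with recording length points to mismatched clocks or dropped frames rather than a constant start-up offset.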

##  Defined Milestones

- [x] Lightweight real-time data acquisition from the **Polar H10** sensors
- [x] Lightweight real-time data acquisition from the **Empatica E4** sensors
- [x] Flawless simultaneous recording of **audio and video** from the **Azure Kinect DK**
- [x] Parallel capturing of **sessions** for experiments
- [x] **Real-time visual feedback** of data acquisition
- [x] **CI/CD pipeline** for versioning and fast deployment of features and fixes
- [x] **Free online book** for documentation and theoretical background on sensing


**Open To-Dos:**
- [ ] **Audio input selection**, e.g. for alternative microphone inputs (not only the Azure Kinect DK microphone)
- [ ] **Eye tracking** with the HTC Vive Pro

**Optional To-Dos:**
- [ ] **Triggering** data acquisition in the sensor software via the Unreal Engine endpoints
- [ ] Real-time **body tracking** with the Azure Kinect DK
    * Might work better as post-processing of the recorded video footage

## References

* {cite}`Elektris81`